stuart at stuartbishop.net
Fri Aug 21 14:07:58 CEST 2015
Sorry I'm late. pytz author here.
Gosh you guys write a lot. I've tried to skim things, and will default
to agreeing with Tim since it is usually the smart thing to do.
A few notes from my skimming:
- I want a boolean added to datetime instances, even if I don't like
the name, because I can then deprecate pytz and its confusing API and
implementation. I'm happy to work on Python implementation and
documentation. It will save me time and effort in the long run.
- Most of my thoughts got encoded in PEP-431. This would give us a
datetime module that operates exactly the way it does today, but with
the option of performing pytz style unambiguous datetime arithmetic
without pytz and its confusing API. If the developer explicitly set the
is_dst flag, then exceptions would be raised when trying to
instantiate an ambiguous or invalid timestamp. For code that does not
specify the new, optional flag, things work as they do today and a best
guess is made when the localized datetime is constructed.
- PEP-495 seems similar to PEP-431, except that it attempts to allow
things to continue in the face of an ambiguous or invalid localized
datetime. The boolean flag is not tristate, so there is no way to have
strict checking of input. It doesn't matter if the developer said
'whatever' and left the flag on the default, or cared enough to
explicitly override it.
- The rules in PEP-495 for utcoffset() and dst() to deal with
ambiguous times only work in simple cases, as there are dst offsets both
more and less than 1 hour, and there is no stdoffset since the offset
can change at the same time (eg. Europe/Vilnius 1941, where the clocks
ended up going backwards for summer time instead of forwards).
- Other APIs I know of, including Python's time module, use is_dst or
isdst as the required boolean flag. As do the timezone databases
containing the data we need. I think the argument against the is_dst
flag name in PEP-495 is flaccid.
- If there is an argument in favour of 'first' over 'is_dst', it is
because occasionally there are timezone changes without a dst
transition. If we call it is_dst, we agree that in a few rare
historical cases we are going to have to lie.
- My argument in favour of 'is_dst' over 'first' is that this is what
we have in the data we are trying to load. You commonly have a
timestamp with a timezone abbreviation and/or offset. This can easily
be converted to an is_dst flag. To convert it to a 'first' flag, we
need to first parse the datetime, determine the transition points that
year, and then which side of the nearest transition point it lies.
Note that there can be more than 2 transition points in a year, and no
api has been discussed for discovering them.
- I think datetime should consider 1 day == 24 hours and not have
concepts like years or months, just like it does today. As others
suggested, a separate module dealing with leap years and variable
length days may be useful to some people, as would leapsecond support
for astronomers and astrologers. But if the default implementation
gives different results to all the other tools on your system, people
will think the default is wrong.
- Offsets should ideally be declared in seconds. Last I looked, the
current Python implementation rounds them to the nearest minute and it
would be nice to fix that. These are almost always historical, dating
from when noon was when the sun was at its highest point above the
capital (eg. Europe/Amsterdam before 1938).
- There are cases where there are gaps at the end of DST, and folds at
the beginning of DST, when the timezone offsets were changed
simultaneously with the dst flag.
- Microsoft's timezone database does not contain historical
information, which is why databases that need support under Windows
like PostgreSQL include the IANA/Olson database.
- Thank you to everyone who has been working on this. I've wanted it
for a long, long time but never got around to remembering how to write
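
To make the strict-versus-guessing point in the notes above concrete, here
is a minimal sketch of a tristate is_dst flag. Everything in it is an
assumption made for illustration: the toy zone (UTC+2 in summer, UTC+1
otherwise, with one transition at 01:00 UTC), the helper names candidates()
and localize(), and the convention that is_dst=None means "refuse to
guess". It is not the PEP-431 or PEP-495 API, and it uses no real timezone
data.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical toy zone, not real tz data: UTC+2 in summer, UTC+1 otherwise.
SUMMER = timezone(timedelta(hours=2), "TOYST")
STANDARD = timezone(timedelta(hours=1), "TOYT")
# Summer time ends at 01:00 UTC: 03:00 summer time becomes 02:00 standard time.
TRANSITION_UTC = datetime(2015, 10, 25, 1, 0, tzinfo=timezone.utc)


def candidates(naive):
    """Return the (is_dst, aware datetime) readings valid for a wall-clock time."""
    valid = []
    for is_dst, tz in ((True, SUMMER), (False, STANDARD)):
        aware = naive.replace(tzinfo=tz)
        # Summer time applies strictly before the transition,
        # standard time from the transition onwards.
        if (aware < TRANSITION_UTC) == is_dst:
            valid.append((is_dst, aware))
    return valid


def localize(naive, is_dst=None):
    """Resolve a wall-clock time; is_dst=None means strict, refuse to guess."""
    valid = candidates(naive)
    if is_dst is None:
        if len(valid) != 1:
            raise ValueError("ambiguous or invalid wall-clock time: %s" % naive)
        return valid[0][1]
    for flag, aware in valid:
        if flag == is_dst:
            return aware
    raise ValueError("no reading of %s with is_dst=%s" % (naive, is_dst))


wall = datetime(2015, 10, 25, 2, 30)   # occurs twice in the toy zone
first = localize(wall, is_dst=True)    # the summer-time reading
second = localize(wall, is_dst=False)  # the standard-time reading
```

With the flag set, the same wall-clock time resolves to two instants one
hour apart; with the flag left at None, localize(wall) raises instead of
guessing. A two-valued flag cannot express that third state, which is the
limitation described above.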
Stuart Bishop <stuart at stuartbishop.net>