[Datetime-SIG] Are there any "correct" implementations of tzinfo?

Sun Sep 13 23:58:09 CEST 2015

[Tim]
>> Whatever time zone the traveler's railroad schedule uses, so long as
>> it sticks to just one

[Laura]
> This is what does not happen.  Which is why I have written a python
> app to perform conversions for my parents, in the past.

So how did they get the right time zone rules for Creighton?

>> But there's nothing new here:  datetime has been around for a dozen
>> years already, and nobody is proposing to add any new basic
>> functionality to tzinfos.  PEP 495 is only about adding a flag to
>> allow correct conversion of ambiguous local times (typically at the
>> end of DST, when the local clock repeats a span of times) to UTC.  So
>> if this were a popular use case, I expect we would already have heard
>> of it.  Note that Python zoneinfo wrappings are already available via,
>> at least, the pytz and dateutil packages.

> I am a happy user of pytz.  On the other hand, I think this means that
> my brain has gone through some sort of non-reversible transformation
> which makes me accurate, but not exactly sane on the issue.

pytz made some strange decisions, from the POV of datetime's intended
tzinfo design.  But it also solved a problem datetime left hanging:
how to disambiguate ambiguous local times.

The _intended_ way to model zones with UTC offset transitions was via
what the docs call a "hybrid" tzinfo:  a single object smart enough on
its own to figure out, e.g., whether a datetime's date and time are in
"daylight" or "standard" time.  However, there's currently no way for
such a tzinfo to know whether an ambiguous local time is intended to
be the earlier or the later of repeated times.  PEP 495 aims to plug
that hole.
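
To make the hole concrete, here is a minimal sketch of a hybrid tzinfo.
The class name Eastern2015 and its hard-coded 2015 US Eastern rules are
assumptions for illustration only, not anything shipped with datetime.
All the tzinfo ever sees is the wall-clock reading, so for a repeated
time it can give only one answer (this sketch arbitrarily picks
"standard"):

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)

# Naive wall-clock bounds of daylight time for 2015 (illustrative only).
DST_START = datetime(2015, 3, 8, 2)   # clocks jump from 02:00 to 03:00
DST_END = datetime(2015, 11, 1, 1)    # 01:00-01:59 is repeated this night


class Eastern2015(tzinfo):
    """Hypothetical hybrid tzinfo: one object covers both EST and EDT."""

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        return HOUR if DST_START <= naive < DST_END else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


eastern = Eastern2015()
summer = datetime(2015, 7, 1, 12, 0, tzinfo=eastern)
ambiguous = datetime(2015, 11, 1, 1, 30, tzinfo=eastern)

# 01:30 occurs twice on this date, but nothing in the datetime says which
# occurrence is meant, so the tzinfo always reports the same offset.
print(summer.tzname(), summer.utcoffset())
print(ambiguous.tzname(), ambiguous.utcoffset())
```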

pytz solves it by _never_ creating a hybrid tzinfo.  It only uses
eternally-fixed-offset tzinfos.  For example, for a conceptual zone
with two possible total UTC offsets (one for "daylight", one for
"standard"), there are two distinct eternally-fixed-offset tzinfo objects
in pytz.  Then an ambiguous time is resolved by _which_ specific
tzinfo object is attached:  typically the "daylight" tzinfo for the
first time a repeated local time appears, and the "standard" tzinfo
for its second appearance.
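
The idea can be sketched with nothing but the stdlib's fixed-offset
datetime.timezone class.  This is not pytz's code, just an illustration
of the principle; the names EDT and EST are stand-ins for the two
tzinfo objects pytz would keep for such a zone:

```python
from datetime import datetime, timedelta, timezone

# Two eternally-fixed-offset tzinfos for one conceptual zone.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# The same wall-clock reading, 01:30 on the night DST ended in 2015.
first = datetime(2015, 11, 1, 1, 30, tzinfo=EDT)    # first appearance
second = datetime(2015, 11, 1, 1, 30, tzinfo=EST)   # second appearance

# Which tzinfo is attached settles which instant is meant.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(first_utc, second_utc, second - first)
```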

In return, you have to use .localize() and .normalize() at various
times, because pytz's tzinfo objects themselves are completely blind
to the possibility of the total UTC offset changing. .localize() and
.normalize() are needed to possibly _replace_ the tzinfo object in
use, depending on the then-current date and time.
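
A rough stdlib-only sketch of what such a "replace the tzinfo" step has
to do.  The normalize() function below is a hypothetical stand-in, not
pytz's implementation, and it hard-codes the 2015 US Eastern
transitions as an assumption:

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# 2015 US Eastern transitions expressed as UTC instants (illustrative).
DST_START_UTC = datetime(2015, 3, 8, 7, tzinfo=timezone.utc)
DST_END_UTC = datetime(2015, 11, 1, 6, tzinfo=timezone.utc)


def normalize(dt):
    """Hypothetical helper: reattach the fixed-offset tzinfo that is
    actually in effect at the instant dt denotes."""
    utc = dt.astimezone(timezone.utc)
    zone = EDT if DST_START_UTC <= utc < DST_END_UTC else EST
    return utc.astimezone(zone)


before = datetime(2015, 10, 31, 12, 0, tzinfo=EDT)
later = before + timedelta(days=1)   # arithmetic leaves EDT attached
fixed = normalize(later)             # same instant, now labeled EST

print(later, later.tzname())
print(fixed, fixed.tzname())
```

The arithmetic result still carries the "daylight" tzinfo even though
DST has ended by then; only the explicit extra step swaps it out.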

OTOH, `dateutil` does create hybrid tzinfo objects.  No dances are
ever needed to possibly replace them.  But it's impossible for
dateutil's tzinfos to disambiguate times in a fold.  Incidentally,
dateutil also makes no attempt to account for transitions other than
DST (e.g., sometimes a zone may change its _base_ ("standard") offset
from UTC).

So, yup, if you're thoroughly indoctrinated in pytz behavior, you will
be accurate but appear insane to Guido ;-)  At a semantic level, a
pytz tzinfo doesn't capture the notion of a zone with offset changes -
it doesn't even try to.  All knowledge about offset changes is inside
the .localize() and .normalize() dances.

> I think I have misunderstood Alexander Belopolsky as saying that
> datetime had functionality which I don't think it has. Thus I thought
> we must be planning to add some functionality here.  Sorry about this.

Guido told Alex to stop saying that ;-)  You can already get
eternally-fixed-offset classes, like pytz does, on (at least) Linux
systems by setting os.environ['TZ'] and then exploiting that
.astimezone() without an argument magically synthesizes an
eternally-fixed-offset tzinfo for the current total UTC offset of "the
system zone" (which the TZ envar specifies).  That's not really
comparable to what pytz does, except at a level that makes a lot of
sense in theory but not much at all in practice ;-)
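
A sketch of that trick, assuming a Unix-like system where time.tzset()
exists.  The helper name local_fixed_offset and the POSIX-style TZ rule
string are choices made for this illustration:

```python
import os
import time
from datetime import datetime, timedelta, timezone


def local_fixed_offset(utc_dt, tz_string):
    """Hypothetical helper: view utc_dt in the zone named by tz_string,
    using the fixed-offset tzinfo that .astimezone() synthesizes."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz_string
    time.tzset()                       # Unix only
    try:
        return utc_dt.astimezone()     # no argument: "the system zone"
    finally:
        # Put the process's zone back the way we found it.
        if old is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = old
        time.tzset()


if hasattr(time, "tzset"):
    rule = "EST5EDT,M3.2.0,M11.1.0"    # POSIX TZ rule for US Eastern
    summer = local_fixed_offset(
        datetime(2015, 7, 1, 12, tzinfo=timezone.utc), rule)
    winter = local_fixed_offset(
        datetime(2015, 12, 1, 12, tzinfo=timezone.utc), rule)
    # Each result carries its own fixed-offset datetime.timezone instance.
    print(type(summer.tzinfo).__name__, summer.utcoffset())
    print(type(winter.tzinfo).__name__, winter.utcoffset())
```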

> However, people do need to be aware, if they are not already, that
> people with 3 times in 3 different tz will want to sort them.  Telling
> them that they must convert them to UTC before they do so is, in my
> opinion, a very fine idea. Expecting them to work this out by themselves
> via an assertion that the comparison operator is not transitive, is,
> I think, asking a lot of them.

Of course.  Note that it's _not_ a problem in pytz, though:  there are
no sorting (or transitivity) problems if the only tzinfos you ever use
have eternally fixed UTC offsets.  There are no gaps or folds then,
and everything works in an utterly obvious way - except that you have
to keep _replacing_ tzinfos when they become inappropriate for the
current dates and times in the datetimes they're attached to.
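
For instance, with three fixed-offset tzinfos (the zone names and
timestamps below are made up for illustration), sorting the aware
datetimes directly agrees with sorting them after an explicit
conversion to UTC:

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")
CEST = timezone(timedelta(hours=2), "CEST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")

stamps = [
    datetime(2015, 9, 13, 18, 0, tzinfo=EDT),     # 22:00 UTC
    datetime(2015, 9, 13, 23, 58, tzinfo=CEST),   # 21:58 UTC
    datetime(2015, 9, 14, 3, 0, tzinfo=IST),      # 21:30 UTC, Sep 13
]

# Aware datetimes with different tzinfos compare by the instant they
# denote, so with fixed offsets the ordering is unambiguous.
by_instant = sorted(stamps)
by_utc_key = sorted(stamps, key=lambda d: d.astimezone(timezone.utc))

print([d.tzname() for d in by_instant])
```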