[Datetime-SIG] Version check (was Re: PEP 495: What's left to resolve)

Tim Peters tim.peters at gmail.com
Tue Sep 8 21:13:45 CEST 2015

>> Which reminds me:  the PEP should add a way for a post-495 tzinfo to
>> say it supplies post-495 semantics, so users can check whether they're
>> getting a tzinfo they require (if they need fold disambiguation) or
>> can't tolerate (if they need folds to be ignored for legacy reasons).

> We may end up providing something like this, but I hope developing this
> mechanism can be left to the tzinfo implementers.

Python defines the tzinfo API and minimal tzinfo semantics.  If Python
doesn't also resolve tzinfo discoverability issues created by its own
new requirements, then tzinfo implementers will create a Tower of
Babel.  Far better for Python to define _the_ way to check whether a
tzinfo implements 495 semantics.  This should be nearly trivial, both
to specify and for tzinfo authors to implement - unless we go out of
our way to create complications that aren't _inherent_ to the problem
at hand ("which version did I get?").

> (Which can as well be us but in another PEP.)

Disagree.  PEP 495 is _creating_ a new "discoverability" problem.  So
that's the place to fix it too, before it becomes a real problem.

> I am not sure a tzinfo object will need a persistent attribute rather than
> just a way to require specific capabilities at the construction time.

"Tower of Babel" - Python has no business specifying how a tzinfo
object "must be" obtained to begin with, and there are already
multiple ways out in the field.  But Python is requiring a change to
semantics.  Some tzinfo authors may choose to provide an explicit way
to ask for PEP 495 semantics, while others may not, etc.  User code
needs a uniform way to ask whether what they get in the end meets
their requirements.  When their requirements depend only on things
where Python itself changed its mind, it's Python's proper responsibility
to give the user a way to tell which they got.

> For example, a hypothetical zoneinfo() constructor or a
> factory function can take a "fold_aware" boolean argument and let the user
> specify what kind of tzinfo is requested.  It will then become a QOI issue
> of whether zoneinfo() supports both pre- and post-PEP semantics or not.

Yes, Tower of Babel.  There's no need to inflict this potential
confusion on users.  Just specify a way to check that _all_ post-495
tzinfos must support.

> Note that zoneinfo() providers may end up extending  the tzinfo API to
> include queries such as give me all folds between year A and year B.

Different issue, because _Python_ isn't specifying anything about
that.  We can't do anything about Towers of Babel tzinfo authors
choose to create on their own.  We can do something about new
semantics Python is forcing them to supply.  BTW, I've never yet
seen a tzinfo that supplied any functionality beyond the minimum
required by the docs.

> The downside of a persistent run-time attribute that differentiates between
> pre-PEP and post-PEP tzinfos is that it may promote writing code that tries
> to cope with the presence of pre-PEP and post-PEP tzinfos in the same
> program.   This is a recipe for a combinatorial disaster.

If a user chooses to embrace that, that's on them.  Far better to give
them a uniform way to check the tzinfos they get so they can
absolutely avoid mixing pre-495 and post-495 tzinfos to begin with.

> Note that on top of pre-PEP/post-PEP distinction a good tzinfo() library
> will probably also supply a TZ database version.  Imagine writing a
> simple "within(t, start, stop)" function that should account for the
> three arguments possibly having different "fold_aware" attribute
> and different tzversion?

Again, how can a sane user ensure they're _not_ getting into such a
mess if they can't even ask "is this a pre- or post-495 tzinfo?" in a
uniform way?  Assume 495 is successful.  Some general-purpose library code
will be _passed_ datetimes with tzinfos it had nothing to do with
creating, and general-purpose libraries can't assume more than the
minimum the Python docs require.  The library has no control at all
over the tzinfos it sees, but may _need_ to know whether they're pre-
or post-495.

495 can make that simple instead of nearly impossible.
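
A minimal sketch of the library-side guard described above, assuming
the integer `__version__` attribute floated in this thread (a proposal
here, not an established API).  The names `FixedOffset`,
`semantics_versions` and the error message are hypothetical; `within`
borrows its name from the quoted example.  A tzinfo lacking the
attribute is treated as pre-495 (version 1).

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Toy fixed-offset zone; `version` stands in for the proposed attribute."""

    def __init__(self, hours, version=None):
        self._offset = timedelta(hours=hours)
        if version is not None:
            # Hypothetical marker: 2 would mean "implements PEP 495 semantics".
            self.__version__ = version

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "fixed"


def semantics_versions(*dts):
    """Collect the claimed semantics versions of all aware datetimes."""
    return {
        getattr(dt.tzinfo, "__version__", 1)  # assumption: missing means pre-495
        for dt in dts
        if dt.tzinfo is not None
    }


def within(t, start, stop):
    """Range test that refuses to mix pre-495 and post-495 tzinfos."""
    versions = semantics_versions(t, start, stop)
    if len(versions) > 1:
        raise ValueError("mixed tzinfo semantics versions: %r" % sorted(versions))
    return start <= t <= stop
```

The library never needs to know how each tzinfo was constructed; it
only inspects what it was handed, which is the situation described
above.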

>> I guess requiring a new `__version__ = 2` attribute would be OK.

> I generally dislike "version" constants or attributes.

Me too, but far better than nothing.
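
For illustration, a check based on the proposed `__version__ = 2`
attribute could be as small as the sketch below.  The helper
`tzinfo_version` and both example classes are hypothetical, and
treating a missing attribute as version 1 (pre-495) is an assumption,
not something this thread specifies.

```python
from datetime import timedelta, tzinfo


def tzinfo_version(tz):
    """Return the semantics version a tzinfo claims (1 if it claims none)."""
    return getattr(tz, "__version__", 1)


class LegacyUTC(tzinfo):
    """A pre-495 style tzinfo: no version attribute at all."""

    def utcoffset(self, dt):
        return timedelta(0)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"


class FoldAwareUTC(LegacyUTC):
    """Same zone, but declaring post-495 semantics via the proposed attribute."""

    __version__ = 2


print(tzinfo_version(LegacyUTC()))     # 1
print(tzinfo_version(FoldAwareUTC()))  # 2
```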

>  My preferred solution would be to provide a generic PEP 495 compliant
> fromutc() in a tzinfo subclass and ask PEP 495 compliant implementations
> to derive from that.

That would be fine, except it's no longer trivial - for us.  It would
be better to supply a new marker class in the stdlib a PEP 495
compliant tzinfo had to derive from, but whose .fromutc() _must_ be
overridden.  All the industrial-strength zone wrappings are dealing
with databases for which overriding .fromutc() is by far the best
approach anyway.  So, if we wanted to be _useful_, it would do more
good for more people if we supplied a horridly slow default
.utcoffset() instead.

But this is "creating complications that aren't _inherent_ to the
problem at hand".   And if this isn't the last change Python ever
makes to tzinfo semantics, a plain integer version number is probably
easier for most people to grasp and live with than a graph of marker
classes anyway.
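
To make the marker-class alternative concrete, here is a rough sketch
under stated assumptions: `PostPEP495` is a hypothetical name for the
marker class (nothing by that name exists in the stdlib), and
`FixedZone` and `is_post_495` are likewise invented for illustration.
The marker's .fromutc() only raises, so a compliant subclass has to
override it.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class PostPEP495(tzinfo):
    """Hypothetical marker base class for PEP 495 compliant tzinfos."""

    def fromutc(self, dt):
        # Deliberately unusable: compliant subclasses must override this.
        raise NotImplementedError("PEP 495 tzinfos must override fromutc()")


class FixedZone(PostPEP495):
    """Trivial compliant zone: a fixed offset has no folds or gaps."""

    def __init__(self, hours):
        self._offset = timedelta(hours=hours)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "fixed"

    def fromutc(self, dt):
        # dt carries UTC clock values with tzinfo=self; shift to local time.
        return dt + self._offset


def is_post_495(tz):
    """The uniform check, expressed as an isinstance() test."""
    return isinstance(tz, PostPEP495)
```

With a plain integer, a later semantics change is a comparison such as
`version >= 3`; with marker classes it means another class and another
isinstance() test, which is the growing graph mentioned above.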

>> Or (preferably "and") add an optional `fold=None` argument to
>> .utcoffset()  (by default, use the datetime's .fold attribute, else
>> use the passed value).

> I thought about this as an optimization.  dt.utcoffset(fold=1) being an
> equivalent of dt.replace(fold=1).utcoffset() which avoids copying of the
> entire dt object into a temporary.  I think this is a minor issue.  I can go
> either way on this.

It's a poor way to do version-checking, so I shouldn't have mentioned
it.  Alas, Guido's time machine is tied up preventing by-magic
interzone comparison from ever being implemented :-(
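
The equivalence in the quoted text can be sketched as a free function,
since .utcoffset() itself takes no `fold` argument.  This assumes a
Python where datetime has the `fold` attribute (3.6 or later);
`ToyZone` and `utcoffset_with_fold` are hypothetical names, and the toy
zone reports a different offset for fold=1 at _every_ time purely to
make the effect visible.

```python
from datetime import datetime, timedelta, tzinfo


class ToyZone(tzinfo):
    """Toy zone: -4h when fold=0, -5h when fold=1, regardless of the time."""

    def utcoffset(self, dt):
        return timedelta(hours=-5 if dt.fold else -4)

    def dst(self, dt):
        return timedelta(hours=0 if dt.fold else 1)

    def tzname(self, dt):
        return "EST" if dt.fold else "EDT"


def utcoffset_with_fold(dt, fold=None):
    """Emulate the suggested dt.utcoffset(fold=...) via a temporary copy."""
    if fold is None:
        # Default: use the datetime's own .fold attribute.
        return dt.utcoffset()
    return dt.replace(fold=fold).utcoffset()


dt = datetime(2015, 11, 1, 1, 30, tzinfo=ToyZone())
print(utcoffset_with_fold(dt))          # -1 day, 20:00:00  (i.e. -4h)
print(utcoffset_with_fold(dt, fold=1))  # -1 day, 19:00:00  (i.e. -5h)
```

The temporary created by .replace() is the copy the proposed argument
was meant to avoid.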
