[Datetime-SIG] PEP-431/495

Fri Aug 28 04:22:01 CEST 2015

[Stuart Bishop <stuart at stuartbishop.net>]
> The fixed-offset classes and sorting arithmetic were the only way to
> get things round tripping.

Arithmetic has nothing to do with round tripping.  I can't explain
that any better than I already have.  Therefore it must already be
clear.  Therefore I'll go on to pretend it isn't ;-)

> Take a datetime. Convert it to another timezone.

That _alone_ can't always work right.  I gave you a specific, detailed
example before.  More generally, pick _any_ case where a UTC time `u`
corresponds to an ambiguous time in some other timezone T.  Convert
`u` to time `t` in T and back to UTC again.  The chance of getting `u`
back again is one in two.  Without a disambiguation bit, it's
impossible for the conversion back to UTC to know which UTC time this
dance started with.  That's what "ambiguous time" _means_:  there's
more than one UTC spelling of `t`.  Without a bit to specify _which_ UTC
time `t` was derived from, it's impossible to do better than just
guess which UTC time was intended.
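
A minimal sketch of that dance, using naive datetimes and a made-up zone
with a single fall-back transition.  The zone, the constants, and the
helper names `utc_to_local` and `local_to_utc` are all hypothetical,
invented only to illustrate the point; this is not any library's API.

```python
from datetime import datetime, timedelta

# Hypothetical zone: UTC-4 until 06:00 UTC on 2015-11-01, UTC-5 afterwards.
TRANSITION_UTC = datetime(2015, 11, 1, 6, 0)
DST_OFFSET = timedelta(hours=-4)
STD_OFFSET = timedelta(hours=-5)


def utc_to_local(u):
    """Return (naive local time, is_dst) for the naive UTC time u."""
    if u < TRANSITION_UTC:
        return u + DST_OFFSET, True
    return u + STD_OFFSET, False


def local_to_utc(t, is_dst):
    """Invert utc_to_local; the is_dst bit picks the offset to undo."""
    return t - (DST_OFFSET if is_dst else STD_OFFSET)


u1 = datetime(2015, 11, 1, 5, 30)   # half an hour before the transition
u2 = datetime(2015, 11, 1, 6, 30)   # half an hour after it

t1, dst1 = utc_to_local(u1)
t2, dst2 = utc_to_local(u2)

# Two distinct UTC times, one local spelling: that's "ambiguous".
same_wall_time = (t1 == t2)

# With the bit, each round trip recovers where it started.
back1 = local_to_utc(t1, dst1)
back2 = local_to_utc(t2, dst2)

# Without the bit, the best you can do is pick one offset and hope.
guess = local_to_utc(t2, True)
```

Here `back1` and `back2` recover `u1` and `u2`, while `guess` lands on
`u1` even though the dance started from `u2`.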

> Add one hour to both.

But that's off in the weeds.  If you use a form of arithmetic
inappropriate for the problem you're trying to solve, _of course_
that's going to cause problems.  Use, e.g., integer arithmetic for a
problem that requires floating arithmetic, and nothing good can come
of it.  Likewise, in many cases, using floating arithmetic for a
problem that requires integer arithmetic.  Same thing using classic
arithmetic for a problem that requires timeline arithmetic.  That's
"pilot error".

> Compare.  The results were inconsistent.

Conversion alone can cause problems.  Using inappropriate arithmetic
is a distinct source of "garbage in, garbage out".

> You would only get correct results with fixed offset timezones,

If appropriate (timeline) arithmetic had been used instead to "add an
hour", then only the errors due to conversion endcases would have
remained.

> because the builtin arithmetic ignored the timezone,
> because there was no is_dst flag and without it it is impossible to
> get correct results.

is_dst is necessary and sufficient to repair the conversion errors.
Sorry, but all other errors were self-inflicted (using inappropriate
arithmetic).  Granted, the Python docs never did scream about this.
As Guido said in an earlier message, we pretty much just assumed
people who wanted timeline arithmetic would use UTC, or plain old
timestamps, instead.

And they still should.  It's easy to write 1-line Python functions to
implement timeline arithmetic (modulo that errors due to conversion
alone still remain), but that's a grossly inefficient way to avoid
best practice too.
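
For the record, here's a sketch of such a 1-liner.  It assumes a tzinfo
that honors the disambiguation bit the way PEP 495 proposes (the `fold`
attribute, as it later shipped in Python 3.6).  `ToyZone` is a
hypothetical zone with one fall-back transition, and `timeline_add` is a
name I made up for illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
DST_OFFSET = timedelta(hours=-4)               # in effect before the transition
STD_OFFSET = timedelta(hours=-5)               # in effect from the transition on
TRANSITION_UTC = datetime(2015, 11, 1, 6, 0)   # naive UTC instant of the fall-back
AMBIGUOUS_START = TRANSITION_UTC + STD_OFFSET  # wall clock 01:00
AMBIGUOUS_END = TRANSITION_UTC + DST_OFFSET    # wall clock 02:00


class ToyZone(tzinfo):
    """Hypothetical zone with a single fall-back transition."""

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        # In the repeated hour, fold=0 means the first (daylight) pass.
        if wall < AMBIGUOUS_START or (wall < AMBIGUOUS_END and dt.fold == 0):
            return DST_OFFSET
        return STD_OFFSET

    def dst(self, dt):
        return HOUR if self.utcoffset(dt) == DST_OFFSET else timedelta(0)

    def tzname(self, dt):
        return "TOY-DST" if self.utcoffset(dt) == DST_OFFSET else "TOY-STD"

    def fromutc(self, dt):
        # dt carries UTC values with tzinfo=self.
        if dt.replace(tzinfo=None) < TRANSITION_UTC:
            return dt + DST_OFFSET
        local = dt + STD_OFFSET
        in_fold = local.replace(tzinfo=None) < AMBIGUOUS_END
        return local.replace(fold=1) if in_fold else local


def timeline_add(dt, delta):
    """Timeline arithmetic: hop to UTC, add, hop back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


zone = ToyZone()
start = datetime(2015, 11, 1, 1, 30, tzinfo=zone)   # first 01:30 (fold=0)

classic = start + HOUR                 # wall clock 02:30
timeline = timeline_add(start, HOUR)   # wall clock 01:30 again, fold=1
```

Measured in UTC, `timeline` is one hour after `start` and `classic` is
two.  Neither is "wrong"; they answer different questions.  Note the two
conversions per addition, which is where the inefficiency comes from.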

> The burden was left on tzinfo implementations to deal with the problem.

There's more than one problem here.

> You could have naive times and do arithmetic correctly,

At least that part got communicated ;-)

> or you could have zone aware times and do conversions correctly,

No.  Not in all cases.  See the start of this msg.

> but to do both developers had to always convert to and from
> utc to do the arithmetic.

Using timeline arithmetic removes all errors _due_ to using
inappropriate arithmetic, but is of no help for the errors due to
conversion alone.  PEP 495 aims to fix the latter.  There is no
"arithmetic problem" beyond programmers needlessly shooting kittens in
their cute, furry heads.

> And developers being lazy creatures wouldn't bother because it
> would normally work, or even always work in their particular
> timezone, and systems would crash at 4am killing innocent
> kittens.

This isn't unique to datetime code.  Programmers who don't learn and
adopt best practices are responsible for a great deal of damage in the
real world.  No programming language can stop that (although some
academic ones have made heroic efforts).

> And this was a problem with my tzinfo implementation, because
> the only way you could possibly experience the problem was by using my
> tzinfo implementation. Python had avoided this clearly documented
> problem by not supplying any tzinfo implementations, even though it
> would have been easy to create a 'local' one using the information
> already exposed in the time module, and I'd always assumed that fixing
> it was a requirement of adding timezone implementations to the
> standard library. So I fixed it.

I do admire the hack!  But the magic it's trying to perform strikes
me as more of an "attractive nuisance" than a real aid to writing
correct code.  If users needing timeline arithmetic _did_ bite the
bullet and work in UTC internally, they would first find out it's a
very small & squishy bullet to bite, easy to swallow and digest.
That ain't no Unicode nightmare.  A simple .astimezone() on input &
output and they're golden.  In return they'd enjoy cleaner, more
maintainable, shorter, and more likely correct code.  It would run
faster too.  As is, what if they forget a .normalize()?  Try to use
datetimes obtained from other packages?  Try to pass pytz datetimes
_to_ other packages?  Forget to check after a .replace() or .combine()
or ...?
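
The "small & squishy bullet" looks roughly like this.  It's a sketch
only: the two fixed-offset zones and the helper names `to_internal` and
`to_display` are assumptions made up for illustration, and a real
program would use whatever tzinfo objects it already has.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed-offset zones standing in for real ones.
INPUT_ZONE = timezone(timedelta(hours=-4), "EDT")
OUTPUT_ZONE = timezone(timedelta(hours=2), "CEST")


def to_internal(dt):
    """Convert once, on input, to UTC."""
    return dt.astimezone(timezone.utc)


def to_display(dt, tz):
    """Convert once, on output, to whatever zone the reader wants."""
    return dt.astimezone(tz)


received = datetime(2015, 8, 27, 22, 22, 1, tzinfo=INPUT_ZONE)

internal = to_internal(received)             # 2015-08-28 02:22:01 UTC
deadline = internal + timedelta(hours=36)    # plain arithmetic, all in UTC
shown = to_display(deadline, OUTPUT_ZONE)    # 2015-08-29 16:22:01+02:00
```

Everything between the two conversions is ordinary arithmetic on UTC
values, so there's nothing to remember to normalize.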

Of course you can't possibly prevent programmers from slaughtering
kittens either.  But in return for enabling a lazy programmer to avoid
using UTC, that programmer has to litter their code with .localize()
and .normalize() calls.  pytz did a real service for those who couldn't
afford _any_ errors in conversions alone, and by wrapping the Olson
database, but I don't think it does any _real_ favors by making slow &
complex timeline arithmetic more attractive to the terminally lazy.

> Drunk on my own cleverness and relative youth, it never occurred
> to me that it was possible to rationalize the existing behaviour
> with a straight face, where after going to all the effort of constructing
> and adding a tzinfo to your datetime it would sit there entirely ignored
> by Python, except for conversion operations,

If you think _you're_ drunk on your own cleverness, try working for Guido ;-)

In any case, catering to timeline arithmetic was not a use case for
datetime's design.  Conversions were.  Believe it or not, across
datetime's extensive public design phase, timeline arithmetic barely
came up.

> consistently giving you answers that are demonstrably incorrect using
> most modern timekeeping systems. I'm still not capable of conjuring
> up such a monumental rationalization ;)

That's why I repeat mine so often.  Pretty soon you'll be able to just
cut & paste pieces of mine to create trillions of rationalizations
that all sound remarkably similar yet are provably distinct ;-)

...

>>> As far as I know, normalize() is not necessary after astimezone() even
>>> now
>>> https://answers.launchpad.net/pytz/+question/249229

> Yeah, I'm putting off answering that one because I'm not sure if I'll
> get the answer right. People sometimes think I actually know what I'm
> doing. I'll have a look after I get the overdue pytz release out.

My guess is it all depends on what your .fromutc() does, since that's
the last step .astimezone() performs.  That is, if your .fromutc()
attaches "the right" tzinfo, then .astimezone() inherits that
goodness.
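
A small sketch showing that hand-off.  `RecordingZone` is a hypothetical
fixed-offset tzinfo written only to log what .astimezone() passes to
.fromutc(); it says nothing about what pytz's own .fromutc() does.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class RecordingZone(tzinfo):
    """Hypothetical UTC+2 zone that records each fromutc() call."""

    def __init__(self):
        self.calls = []

    def utcoffset(self, dt):
        return timedelta(hours=2)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "TOY+02"

    def fromutc(self, dt):
        # dt arrives holding UTC values, with this zone already attached.
        self.calls.append(dt.replace(tzinfo=None))
        return dt + self.utcoffset(dt)


zone = RecordingZone()
u = datetime(2015, 8, 28, 2, 22, 1, tzinfo=timezone.utc)
local = u.astimezone(zone)
```

After the call, `zone.calls` holds the single UTC value that was handed
over, and `local` is whatever .fromutc() chose to return, tzinfo
included.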