[Datetime-SIG] PEP 495: What's left to resolve
alexander.belopolsky at gmail.com
Tue Sep 8 03:57:12 CEST 2015
The good news is that, other than a few editorial changes, there is only
one issue which keeps me from declaring PEP 495 complete. The bad news is
that the remaining issue is subtle, and while several solutions have been
proposed, none stands out as obviously right.
PEP 495 requires that the value of the fold attribute be ignored when two
aware datetime objects that share tzinfo are compared. This is motivated
by backward compatibility: we want the value of fold to matter only in
conversions from one zone to another and not in arithmetic within a single
timezone.
As Tim pointed out, this rule is in conflict with the only requirement that
a hash function must satisfy: if two objects compare as equal, their hashes
should be equal as well.
Let t0 and t1 be two times in the fold that differ only by the value of
their fold attribute: t0.fold == 0, t1.fold == 1. Let u0 =
t0.astimezone(utc) and u1 = t1.astimezone(utc). PEP 495 requires that u0 <
u1. (In fact, the main purpose of the PEP is to disambiguate between t0
and t1 so that conversion to UTC is well defined.) However, by the current
PEP 495 rules, t0 == t1 is True, and by the pre-PEP rule (together with the
PEP rule that fold is ignored in comparisons) we also have t0 == u0 and t1
== u1. So we have two problems. (a) Transitivity of == is violated: u0 ==
t0 == t1 == u1 does not imply u0 == u1, which is bad enough by itself. (b)
Since hash(u0) can equal hash(u1) only by a lucky coincidence, the rule
"equality of objects implies equality of hashes" leads to a contradiction:
applying it to the chain u0 == t0 == t1 == u1 gives hash(u0) == hash(t0)
== hash(t1) == hash(u1). That is a chain of equalities of integers, and ==
on integers is transitive, so hash(u0) == hash(u1), which, as we said, can
only happen by a lucky coincidence.
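The chain above can be reproduced with a small model. This is only a
sketch under stated assumptions: ToyEastern is a hypothetical tzinfo with a
single fall-back fold, and to_utc() and draft_eq() are illustrative helpers
that spell out the comparison rules as described in this message. They are
not part of the datetime module, and draft_eq() is not claimed to match
what == does in any released Python.

```python
from datetime import datetime, timedelta, timezone, tzinfo

UTC = timezone.utc


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 until one fall-back at 02:00 local, then UTC-5."""

    FOLD_START = datetime(2015, 11, 1, 1, 0)  # first ambiguous local time
    FOLD_END = datetime(2015, 11, 1, 2, 0)    # first unambiguous time after it

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.FOLD_START:
            return timedelta(hours=-4)
        if naive >= self.FOLD_END:
            return timedelta(hours=-5)
        # Inside the fold, the fold attribute selects the offset.
        return timedelta(hours=-5 if dt.fold else -4)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "ToyEastern"


local = ToyEastern()


def to_utc(dt):
    """Convert an aware datetime to UTC by subtracting its UTC offset."""
    return (dt.replace(tzinfo=None) - dt.utcoffset()).replace(tzinfo=UTC)


def draft_eq(a, b):
    """Equality as the rules in this message describe it (illustrative only)."""
    if a.tzinfo is b.tzinfo:
        # Same zone: compare wall-clock fields, ignoring fold.
        return a.replace(tzinfo=None) == b.replace(tzinfo=None)
    # Different zones: compare the corresponding UTC times.
    return to_utc(a) == to_utc(b)


t0 = datetime(2015, 11, 1, 1, 30, tzinfo=local)  # fold == 0
t1 = t0.replace(fold=1)
u0, u1 = to_utc(t0), to_utc(t1)

# Each link of the chain u0 == t0 == t1 == u1 holds under draft_eq ...
chain = [draft_eq(u0, t0), draft_eq(t0, t1), draft_eq(t1, u1)]
# ... yet the endpoints are one hour apart in UTC.
print(chain, draft_eq(u0, u1), u0 < u1)
```

Every link of the chain holds on its own, but the endpoints differ, so no
hash function can be consistent with all three equalities unless hash(u0)
and hash(u1) happen to collide.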
The Root of the Problem
The rules of arithmetic on aware datetime objects already cause some basic
mathematical identities to break. Without PEP 495, the problem described
above is avoided because there is no way to represent u1 in the timezone
where u0 and u1 map to the same local time. We still have the surprising
combination of u0 < u1 and u0.astimezone(local) == u1.astimezone(local),
but it does not rise to the level of a hash invariant violation, because
u0.astimezone(local) and u1.astimezone(local) are not only equal: they are
identical in every way, and if we convert them back to UTC, they both
convert to u0.
The root of the hash problem is not in the t0 == t1 is True rule. It is in
u0 == t0. The latter equality is just too fragile: if you add
timedelta(hours=1) to both sides of this equation, then (assuming an
ordinary 1 hour fall-back fold) you will get two datetime objects that are
no longer equal. (Indeed, local to utc equality t == u is defined as t -
t.utcoffset() == u.replace(tzinfo=t.tzinfo), but when you add 1 hour to t0,
utcoffset() changes, so the equality that held for t0 and u0 will no longer
hold for t0 + timedelta(hours=1) and u0 + timedelta(hours=1).)
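A minimal sketch of this fragility, assuming a hypothetical ToyEastern
tzinfo with one ordinary 1 hour fall-back fold. The helper
local_equals_utc() is an illustrative name that simply evaluates the
definition quoted in the parenthetical; it is not a datetime API.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 before 01:00 local, UTC-5 from 02:00 local."""

    FOLD_START = datetime(2015, 11, 1, 1, 0)
    FOLD_END = datetime(2015, 11, 1, 2, 0)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.FOLD_START:
            return timedelta(hours=-4)
        if naive >= self.FOLD_END:
            return timedelta(hours=-5)
        # Inside the fold, the fold attribute selects the offset.
        return timedelta(hours=-5 if dt.fold else -4)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "ToyEastern"


def local_equals_utc(t, u):
    """t == u as defined above: t - t.utcoffset() == u.replace(tzinfo=t.tzinfo)."""
    return t - t.utcoffset() == u.replace(tzinfo=t.tzinfo)


local = ToyEastern()
hour = timedelta(hours=1)

t0 = datetime(2015, 11, 1, 1, 30, tzinfo=local)         # in the fold, fold == 0
u0 = datetime(2015, 11, 1, 5, 30, tzinfo=timezone.utc)  # its UTC counterpart

before = local_equals_utc(t0, u0)               # offset is -4h: equality holds
after = local_equals_utc(t0 + hour, u0 + hour)  # offset is now -5h: it breaks
print(before, after)
```

Adding the same timedelta to both sides moves t0 past the end of the fold,
where utcoffset() returns the post-transition value, so the two sides end
up one hour apart.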
PEP 495 gives us a way to break the u0 == t0 equality by replacing t0 with
an "equal" object t1 and simultaneously have u0 == t0, t0 == t1 and t1 != u0.
Tim suggested several solutions to this problem, but by his own admission
none is more than "grudgingly acceptable." For completeness, I will also
present my "non-solution."
Solution 0: Ignore the problem. Since PEP 495 does not by itself introduce
any tzinfo implementations with variable utcoffset(), it does not create a
hash invariant violation. I call this a non-solution because it would once
again punt an unsolvable problem to tzinfo implementors. It is unsolvable
for *them* because without some variant of the rejected PEP 500, they will
have no control over datetime comparisons or hashing.
Solution 1: Make t1 > t0.
Solution 2: Leave t1 == t0, but make t1 != u1.
Request for Comments
I will not discuss the pros and cons of the two solutions because my goal
here was only to state the problem, identify the root cause and indicate
the possible solutions. Those interested in details can read Tim's
excellent explanations in the "Another round on error-checking" and
"Another approach to 495's glitches" threads.
I "bcc" python-dev in the hope that someone in the expanded forum will
either say "of course solution N is the right one and here is why" or "here
is an obviously right solution - how could you guys miss it."