[Datetime-SIG] Another round on error-checking

Tim Peters tim.peters at gmail.com
Tue Sep 1 19:26:51 CEST 2015


[Guido]
> I could not accept a PEP that leads to different datetime being considered
> == but having a different hash (*unless* due to a buggy tzinfo subclass
> implementation -- however no historical timezone data should ever depend on
> such a bug).
>
> I'm much less concerned about < being intransitive in edge cases.

Offhand I don't know whether it can be (probably).  The case I
stumbled into yesterday showed that equality ("==") could be
intransitive:

    assert a == b == c == d  and  a < d

While initially jarring, I called it a "minor wart", because the
middle "==" there is working in classic arithmetic but the other two
are working in timeline arithmetic.  But _a_ wart all the same, since
transitivity doesn't fail today.


> I also don't particularly care about == following from the difference being zero.
> Still, unless we're constrained by backward compatibility, I would rather
> not add equivalence between *any* two datetimes whose tzinfo is not the same
> object -- even if we can infer that they both must refer to the same
> instant.

Assuming "equivalent" means "compare equal", we're highly constrained.
For datetimes x and y with distinct non-None tzinfos, it's always been
the case that:

1. x-y effectively converted both to UTC before subtraction.

2. comparison effectively interpreted x-y as a __cmp__ result
2a.  various comparison transitivities essentially followed from that

3. Because of #2, to maintain __hash__'s contract datetime.__hash__
    also effectively converted to UTC before hashing
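
To make the three points above concrete, here is a small sketch using only fixed-offset zones from the standard library. The names `eastern`, `x` and `y` are made up for illustration, and a fixed offset sidesteps any question of ambiguous times; it shows only the long-standing interzone behavior, nothing about `fold`.

```python
from datetime import datetime, timedelta, timezone

# Two aware datetimes with distinct tzinfo objects that name the same instant.
eastern = timezone(timedelta(hours=-5))  # illustrative fixed offset, no DST
x = datetime(2015, 9, 1, 12, 0, tzinfo=eastern)
y = datetime(2015, 9, 1, 17, 0, tzinfo=timezone.utc)

# 1. Interzone subtraction behaves as if both were converted to UTC first.
diff = x - y

# 2. Comparison agrees with the sign of that difference.
same = x == y

# 3. Equal objects must hash equal, so hashing works on the UTC equivalent too.
same_hash = hash(x) == hash(y)

print(diff, same, same_hash)  # 0:00:00 True True
```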

All of that would (well, "should") continue to work fine, except that
fold=1 is being ignored in intrazone arithmetic (subtraction and
comparisons) and by hash().  Maybe there are other surprises.  I just
happened to notice the hash() problem, and equality intransitivity,
both yesterday, via thought experiments.

On the face of it, it's a conceptual mess to try to make fold=1 "mean
something" in some contexts but not in others.  In particular,
arithmetic, comparison, and hashing are usually deeply interrelated,
and have been in datetime so far.  Ignoring `fold` in single-zone
arithmetic, comparisons and hashing works fine (in "naive time", where
`fold` is senseless), but when going across zones `fold` cannot be
ignored.

That's a huge problem for hash(), because it can have no idea whether
the pattern of later equality comparisons relying on hash results
_will_ be using classic or timeline rules (or a mix of both).

That didn't matter before, because _a_ unique UTC equivalent always
existed (the possibility of ambiguous times was effectively ignored).

Now it does matter, because the UTC equivalent can differ depending on
the `fold` value.  Ignoring it sometimes but not others leads to the
current quandary.
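
A minimal sketch of that last point, assuming a Python new enough to have the `fold` attribute (3.6 or later). `ToyEastern` and `utc_equivalent` are hypothetical names invented here; the class hard-codes a single fall-back transition rather than consulting real timezone data, so it is an illustration and not a usable tzinfo.

```python
from datetime import datetime, timedelta, tzinfo

class ToyEastern(tzinfo):
    """Hypothetical zone: clocks fall back from 02:00 to 01:00 on 2015-11-01."""
    FOLD_START = datetime(2015, 11, 1, 1)
    FOLD_END = datetime(2015, 11, 1, 2)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self.FOLD_START:
            return timedelta(hours=-4)
        if naive < self.FOLD_END:
            # The repeated hour: fold=0 is the first pass, fold=1 the second.
            return timedelta(hours=-5 if dt.fold else -4)
        return timedelta(hours=-5)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

def utc_equivalent(dt):
    """Naive UTC wall time for an aware datetime (illustrative helper)."""
    return dt.replace(tzinfo=None) - dt.utcoffset()

tz = ToyEastern()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)  # fold=0
second = first.replace(fold=1)                   # same wall time, second pass

print(first == second)         # same tzinfo object, so fold is ignored
print(utc_equivalent(first))   # 2015-11-01 05:30:00
print(utc_equivalent(second))  # 2015-11-01 06:30:00
```

Within the zone the two values compare equal, yet their UTC equivalents are an hour apart, so a hash computed from "the" UTC equivalent has to pick one of them.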

