Just so people know, over at the datetime-sig I've accepted PEP 495, which adds a fold flag to datetime objects to distinguish ambiguous times. This enables round-tripping of conversions for those times where the local clock is moved backward (creating ambiguous times that could not be distinguished before).

I would like to thank Alexander and Tim for their unrelenting work on this. The idea seems simple, but the details were excruciatingly hard to get right, given the strict backwards compatibility requirements.

There may well be additional beneficial changes to the datetime module. The datetime-sig is now open for their discussion. However, proposals that break backwards compatibility are a waste of everybody's time, so be prepared to explain how your proposal does not break existing code that works under Python 3.5.

--
--Guido van Rossum (python.org/~guido)
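For concreteness, here is a minimal sketch of the fold semantics being announced; it uses the zoneinfo tzinfo provider that only arrived later (Python 3.9), so treat it purely as an illustration rather than anything available to this thread:

    from datetime import datetime, timezone
    from zoneinfo import ZoneInfo  # Python 3.9+, used here only for illustration

    tz = ZoneInfo("America/New_York")
    # 2020-11-01 01:30 local happens twice: clocks fall back from 02:00 EDT to 01:00 EST.
    earlier = datetime(2020, 11, 1, 1, 30, tzinfo=tz)         # fold=0: first reading (EDT)
    later = datetime(2020, 11, 1, 1, 30, fold=1, tzinfo=tz)   # fold=1: second reading (EST)

    # The two readings carry different UTC offsets and map to different UTC instants.
    print(earlier.utcoffset(), later.utcoffset())
    print(earlier.astimezone(timezone.utc))   # 2020-11-01 05:30:00+00:00
    print(later.astimezone(timezone.utc))     # 2020-11-01 06:30:00+00:00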
More context ;-)

PEP 0495 -- Local Time Disambiguation

Abstract

This PEP adds a new attribute fold to instances of the datetime.time and datetime.datetime classes that can be used to differentiate between two moments in time for which local times are the same. The allowed values for the fold attribute will be 0 and 1 with 0 corresponding to the earlier and 1 to the later of the two possible readings of an ambiguous local time.

https://www.python.org/dev/peps/pep-0495/

Victor

2015-09-22 0:03 GMT+02:00 Guido van Rossum <guido@python.org>:
Just so people know, over at the datetime-sig I've accepted PEP 495, which adds a fold flag to datetime objects to distinguish ambiguous times. This enables round-tripping of conversions for those times where the local clock is moved backward (creating ambiguous times that could not be distinguished before).
I would like to thank Alexander and Tim for their unrelenting work on this. The idea seems simple, but the details were excruciatingly hard to get right, given the strict backwards compatibility requirements.
There may well be additional beneficial changes to the datetime module. The datetime-sig is now open for their discussion. However, proposals that break backwards compatibility are a waste of everybody's time, so be prepared to explain how your proposal does not break existing code that works under Python 3.5.
-- --Guido van Rossum (python.org/~guido)
On 22 September 2015 at 08:03, Guido van Rossum <guido@python.org> wrote:
Just so people know, over at the datetime-sig I've accepted PEP 495, which adds a fold flag to datetime objects to distinguish ambiguous times. This enables round-tripping of conversions for those times where the local clock is moved backward (creating ambiguous times that could not be distinguished before).
Hurrah, and congratulations in particular on finding a name for the flag which is memorable, meaningful and succinct.
I would like to thank Alexander and Tim for their unrelenting work on this. The idea seems simple, but the details were excruciatingly hard to get right, given the strict backwards compatibility requirements.
I don't think I've seen a collision between mathematical and language level invariants that complex since the first time I had to figure out the conflict between container membership invariants and floating point NaN values (and this one is even more subtle).

I'm reading through the full PEP now, and really appreciating the thorough write up. Thanks to Alexander and Tim, and to all the folks involved in the extensive discussions!

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 22 September 2015 at 13:33, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 22 September 2015 at 08:03, Guido van Rossum <guido@python.org> wrote:
Just so people know, over at the datetime-sig I've accepted PEP 495, which adds a fold flag to datetime objects to distinguish ambiguous times. This enables round-tripping of conversions for those times where the local clock is moved backward (creating ambiguous times that could not be distinguished before).
Hurrah, and congratulations in particular on finding a name for the flag which is memorable, meaningful and succinct.
I would like to thank Alexander and Tim for their unrelenting work on this. The idea seems simple, but the details were excruciatingly hard to get right, given the strict backwards compatibility requirements.
I don't think I've seen a collision between mathematical and language level invariants that complex since the first time I had to figure out the conflict between container membership invariants and floating point NaN values (and this one is even more subtle).
I'm reading through the full PEP now, and really appreciating the thorough write up. Thanks to Alexander and Tim, and to all the folks involved in the extensive discussions!
It turns out there's one aspect of the accepted proposal that I *think* I understand, but want to confirm: the datetime -> POSIX timestamp -> datetime roundtrip for missing times.

If I'm reading the PEP correctly, the defined invariant for local times that actually exist is:

    dt == datetime.fromtimestamp(dt.timestamp())

No confusion there for the unambiguous times, or for times in a fold. In the latter case, the timestamps produced match the points where the UTC times match the local times in the "In the Fold" UTC/local diagram.

The subtle part is the handling of the "timestamp()" method for the "missing" times where the given time doesn't actually correspond to a valid time in the applicable timezone (local time for a naive datetime object).

Based on the UTC/local diagram from the "Mind the Gap" section, am I correct in thinking that the modified invariant that also covers times in a gap is:

    dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())

That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".

Regards,
Nick.

P.S. Thanks to whoever drew the diagrams for the In the Fold/Mind the Gap sections - I found them incredibly helpful in understanding the change!

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
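As a quick sanity check on a current Python, a minimal sketch of that first invariant for a naive local time assumed to be well clear of any DST transition (the chosen date is arbitrary):

    from datetime import datetime

    dt = datetime(2015, 6, 15, 12, 30)   # naive local time, assumed not in a gap or fold
    assert dt == datetime.fromtimestamp(dt.timestamp())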
On Tue, Sep 22, 2015 at 12:01 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
It turns out there's one aspect of the accepted proposal that I *think* I understand, but want to confirm: the datetime -> POSIX timestamp -> datetime roundtrip for missing times.
If I'm reading the PEP correctly, the defined invariant for local times that actually exist is:
dt == datetime.fromtimestamp(dt.timestamp())
Yup, except for floating point errors! Those have been fixed, finally: http://bugs.python.org/issue23517 is now closed (the fix didn't make 3.5.0; it will be in 3.5.1 though).
No confusion there for the unambiguous times, or for times in a fold. In the latter case, the timestamps produced match the points where the UTC times match the local times in the "In the Fold" UTC/local diagram.
And this is where the fold flag is essential for the roundtripping.
The subtle part is the handling of the "timestamp()" method for the "missing" times where the given time doesn't actually correspond to a valid time in the applicable timezone (local time for a naive datetime object).
Based on the UTC/local diagram from the "Mind the Gap" section, am I correct in thinking that the modified invariant that also covers times in a gap is:
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
That can't be right -- there is no way any fromtimestamp() call can return a time in the gap. I think about the only useful invariant here is

    dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
Regards, Nick.
P.S. Thanks to whoever drew the diagrams for the In the Fold/Mind the Gap sections - I found them incredibly helpful in understanding the change!
You're welcome. It was a collaboration between Alexander and me. I drew the first version by hand because I couldn't follow the math without a visual aid. :-)

--
--Guido van Rossum (python.org/~guido)
On Tue, Sep 22, 2015 at 10:43 AM, Guido van Rossum <guido@python.org> wrote:
Based on the UTC/local diagram from the "Mind the Gap" section, am I
correct in thinking that the modified invariant that also covers times in a gap is:
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
That can't be right -- There is no way any fromtimestamp() call can return a time in the gap.
I don't think Nick said that.
I think about the only useful invariant here is
dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
Yes, this is just another way to say that .astimezone() conversions are now "lossless."
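A small sketch of that invariant using fixed-offset zones (fixed offsets have no gaps or folds, so this holds with or without PEP 495; the particular offsets are arbitrary):

    from datetime import datetime, timedelta, timezone

    utc = timezone.utc
    other = timezone(timedelta(hours=9))   # an arbitrary fixed-offset zone
    dt = datetime(2015, 9, 22, 8, 3, tzinfo=timezone(timedelta(hours=2)))

    # The timestamp is unchanged by astimezone() conversions...
    assert dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(other).timestamp()
    # ...and the round trip through UTC is lossless.
    assert dt == dt.astimezone(utc).astimezone(dt.tzinfo)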
On Tue, Sep 22, 2015 at 10:55 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 22, 2015 at 10:43 AM, Guido van Rossum <guido@python.org> wrote:
Based on the UTC/local diagram from the "Mind the Gap" section, am I
correct in thinking that the modified invariant that also covers times in a gap is:
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
That can't be right -- There is no way any fromtimestamp() call can return a time in the gap.
I don't think Nick said that.
On the second reading, it looks like Nick's second sentence contradicts his first. Guido is right. Moreover, there is no way to get a time in the gap as a result of any conversion, including astimezone() and fromutc() in addition to fromtimestamp(). Such datetimes may appear if you construct them explicitly, use .replace() to transplant a datetime to another timezone (or to modify other components), or as the result of a datetime + timedelta operation.
On 23 September 2015 at 01:09, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 22, 2015 at 10:55 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 22, 2015 at 10:43 AM, Guido van Rossum <guido@python.org> wrote:
Based on the UTC/local diagram from the "Mind the Gap" section, am I correct in thinking that the modified invariant that also covers times in a gap is:
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
That can't be right -- There is no way any fromtimestamp() call can return a time in the gap.
I don't think Nick said that.
On the second reading, it looks like Nick's second sentence contradicts his first. Guido is right. Moreover, there is no way to get a time in the gap as a result of any conversion, including astimezone() and fromutc() in addition to fromtimestamp(). Such datetimes may appear if you construct them explicitly, use .replace() to transplant a datetime to another timezone (or to modify other components), or as the result of a datetime + timedelta operation.
Sorry, what I wrote in the code wasn't what I wrote in the text, but I didn't notice until Guido pointed out the discrepancy. To get the right universal invariant, I should have normalised the LHS, not the RHS:

    dt.astimezone(utc).astimezone(dt.tzinfo) == datetime.fromtimestamp(dt.timestamp())

For unambiguous times and times in the fold, that's a subset of the stronger invariant:

    dt == dt.astimezone(utc).astimezone(dt.tzinfo) == datetime.fromtimestamp(dt.timestamp())

That stronger invariant is the one that *doesn't* hold for times in the gap, as with fold=0 they'll get normalised to use the right UTC offset (same UTC time, nominally an hour later local time), while with fold=1 they get mapped to an hour earlier in both UTC and local time.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
[Nick Coghlan <ncoghlan@gmail.com>]
... Sorry, what I wrote in the code wasn't what I wrote in the text, but I didn't notice until Guido pointed out the discrepancy. To get the right universal invariant, I should have normalised the LHS, not the RHS:
dt.astimezone(utc).astimezone(dt.tzinfo) == datetime.fromtimestamp(dt.timestamp())
That's always False, since it's comparing an aware datetime (on the left) with a naive datetime (on the right). There's also that, without a tzinfo argument, .fromtimestamp() creates a naive datetime via converting to the current system zone (which may have nothing to do with dt.tzinfo).

So add a dt.tzinfo argument to the .fromtimestamp() call, and then it will work as intended. But then it's just saying that two ways of _spelling_ "convert to UTC and back" are equivalent, which isn't saying much ;-)

Guido's reply gave a clearer invariant:

    dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()

That's the key point. If the timestamps are equivalent, then it follows that conversions to UTC are equivalent (UTC calendar notation is just another way to spell a POSIX timestamp), and so also that conversions back from UTC are equivalent.
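Spelled out as a small sketch with a fixed-offset zone (so no gap/fold complications), passing dt.tzinfo keeps the round trip aware-to-aware:

    from datetime import datetime, timedelta, timezone

    tz = timezone(timedelta(hours=-5))   # arbitrary fixed-offset zone
    dt = datetime(2015, 9, 22, 9, 30, tzinfo=tz)

    # Without a tzinfo argument the result is naive, so == against an aware dt is simply False.
    assert dt != datetime.fromtimestamp(dt.timestamp())
    # With dt.tzinfo supplied, the round trip compares aware-to-aware and succeeds.
    assert dt == datetime.fromtimestamp(dt.timestamp(), dt.tzinfo)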
For unambiguous times and times in the fold, that's a subset of the
You meant "ambiguous" there?
stronger invariant:
dt == dt.astimezone(utc).astimezone(dt.tzinfo) == datetime.fromtimestamp(dt.timestamp())
Same notes as above (always False because ...).
That stronger invariant is the one that *doesn't* hold for times in the gap, as with fold=0 they'll get normalised to use the right UTC offset (same UTC time,
Same UTC time as what? There is no UTC time corresponding to a local gap time, since the latter "doesn't really exist". We're making one up, in the fold=0 case acting as if the user had remembered to move their clock hands forward when the gap started. "Garbage in, good guess out" ;-)
nominally an hour later local time),
Bingo.
while with fold=1 they get mapped to an hour earlier in both UTC and local time.
Yup. All obvious to the most casual observer ;-)
On 24 Sep 2015 01:21, "Tim Peters" <tim.peters@gmail.com> wrote:
Guido's reply gave a clearer invariant:
dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
That's the key point. If the timestamps are equivalent, then it follows that conversions to UTC are equivalent (UTC calendar notation is just another way to spell a POSIX timestamp), and so also that conversions back from UTC are equivalent.
Thanks for the additional clarifications all, I'm confident I understand now :) Might it be worth mentioning Guido's invariant in the section of the PEP about the timestamp method? Cheers, Nick.
[Tim Peters]
Guido's reply gave a clearer invariant:
dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
[ Nick Coghlan]
Might it be worth mentioning Guido's invariant in the section of the PEP about the timestamp method?
The case of missing time in Guido's invariant is rather subtle. What is happening is that the .timestamp() and .astimezone(..) methods use the same "normalization" to interpret what dt means. This is not obvious in the expression above, particularly in dt.astimezone(<any other tz>).timestamp(). Here, if instead of <any other tz> we pass dt.tzinfo, then .astimezone(..) becomes a no-op and the "normalization" happens in .timestamp().

I don't think exposing all this in the PEP will help. Let's return to this when it is time to write the reference documentation.
On 09/24/2015 04:47 AM, Nick Coghlan wrote:
On 24 Sep 2015 01:21, "Tim Peters" <tim.peters@gmail.com> wrote:
Guido's reply gave a clearer invariant:
dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
Might it be worth mentioning Guido's invariant in the section of the PEP about the timestamp method?
Or maybe worth adding it as a unittest?

Regards,
francis
[Nick Coghlan]
Based on the UTC/local diagram from the "Mind the Gap" section, am I correct in thinking that the modified invariant that also covers times in a gap is:
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
That is, for local times that exist, the invariant "dt == dt.astimezone(utc).astimezone(dt.tzinfo)" holds, but for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
[Guido]
That can't be right -- There is no way any fromtimestamp() call can return a time in the gap.
[Alexander Belopolsky]
I don't think Nick said that.
I do, except that he didn't ;-) Count the parens carefully. The top-level operation on the RHS is datetime.fromtimestamp(). However, it didn't pass a tzinfo, so it creates a naive datetime. Assuming dt was aware to begin with, the attempt to compare will always (gap or not) raise an exception. If it had passed dt.tzinfo, then Guido is right.
I think about the only useful invariant here is
dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(<any other tz>).timestamp()
Yes, this is just another way to say that .astimezone() conversions are now "lossless."
Nick, to be very clear, there are two scenarios here after PEP 495 is implemented in Python:

1. You're using a pre-495 tzinfo. Then nothing changes from what happens today.

2. You're using a new 495-conforming tzinfo. Then the discussion starts to apply.
[Tim]
... The top-level operation on the RHS is datetime.fromtimestamp(). However, it didn't pass a tzinfo, so it creates a naive datetime. Assuming dt was aware to begin with, the attempt to compare will always (gap or not) raise an exception.
Oops! In current Python, comparing naive and aware via `==` just returns False. That's even more confusing ;-)
On Tue, Sep 22, 2015 at 8:16 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
... The top-level operation on the RHS is datetime.fromtimestamp(). However, it didn't pass a tzinfo, so it creates a naive datetime. Assuming dt was aware to begin with, the attempt to compare will always (gap or not) raise an exception.
Oops! In current Python, comparing naive and aware via `==` just returns False. That's even more confusing ;-)
Hm, but that's in general how == is *supposed* to work between objects of incompatible types. < and > are supposed to fail but == is supposed to return False (the __eq__ should return NotImplemented). If == ever raises an exception, having two different objects as dict keys can cause random, hard-to-debug failures.

--
--Guido van Rossum (python.org/~guido)
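That behavior is easy to check; a minimal sketch (nothing here is PEP 495 specific):

    from datetime import datetime, timezone

    naive = datetime(2015, 9, 22, 12, 0)
    aware = datetime(2015, 9, 22, 12, 0, tzinfo=timezone.utc)

    print(naive == aware)   # False: mixed naive/aware equality quietly returns False
    try:
        naive < aware
    except TypeError as exc:
        print(exc)          # can't compare offset-naive and offset-aware datetimes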
[Tim]
... The top-level operation on the RHS is datetime.fromtimestamp(). However, it didn't pass a tzinfo, so it creates a naive datetime. Assuming dt was aware to begin with, the attempt to compare will always (gap or not) raise an exception.
[Tim]
Oops! In current Python, comparing naive and aware via `==` just returns False. That's even more confusing ;-)
[Guido]
Hm, but that's in general how == is *supposed* to work between objects of incompatible types. < and > are supposed to fail but == is supposed to return False (the __eq__ should return NotImplemented). If == ever raises an exception, having two different objects as dict keys can cause random, hard-to-debug failures.
Sure - no complaint. I was just saying that in the specific, complicated, contrived expression Nick presented, that it always returns False (no matter which aware datetime he starts with) would be more of a head-scratcher than if it raised a "can't compare naive and aware datetimes" exception instead.

That's why, whenever anyone is confused by anything they see in a Python program, they should post all their code verbatim to Python-Dev, prefaced with a "Doesn't work! Fix it." comment ;-)
On Tue, Sep 22, 2015 at 9:47 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
... The top-level operation on the RHS is datetime.fromtimestamp(). However, it didn't pass a tzinfo, so it creates a naive datetime. Assuming dt was aware to begin with, the attempt to compare will always (gap or not) raise an exception.
[Tim]
Oops! In current Python, comparing naive and aware via `==` just returns False. That's even more confusing ;-)
[Guido]
Hm, but that's in general how == is *supposed* to work between objects of incompatible types. < and > are supposed to fail but == is supposed to return False (the __eq__ should return NotImplemented). If == ever raises an exception, having two different objects as dict keys can cause random, hard-to-debug failures.
Sure - no complaint. I was just saying that in the specific, complicated, contrived expression Nick presented, that it always returns False (no matter which aware datetime he starts with) would be more of a head-scratcher than if it raised a "can't compare naive and aware datetimes" exception instead.
And yet I think the desired behavior of == requires us to return False. I think we should change this in the PEP, except I can't find where the PEP says == should raise an exception in this case.
That's why, whenever anyone is confused by anything they see in a Python program, they should post all their code verbatim to Python-Dev, prefaced with a "Doesn't work! Fix it." comment ;-)
Oh, it would be so much better if they posted their code!

--
--Guido van Rossum (python.org/~guido)
[Tim]
Sure - no complaint. I was just saying that in the specific, complicated, contrived expression Nick presented, that it always returns False (no matter which aware datetime he starts with) would be more of a head-scratcher than if it raised a "can't compare naive and aware datetimes" exception instead.
[Guido]
And yet I think the desired behavior of == requires us to return False.
Yes - we remain in violent agreement on all points here.
I think we should change this in the PEP, except I can't find where the PEP says == should raise an exception in this case.
It doesn't - the only comparison behavior changed by the PEP is in case of interzone comparison when at least one comparand is a "problem time" (which can only happen with a post-495 tzinfo). Then "==" is always False. That hack is the ugliest part of the PEP, but was needed to preserve the hash invariant (d1 == d2 implies hash(d1) == hash(d2)).

BTW, while the PEP doesn't spell this out, trichotomy can fail in some such cases (those where "==" would have returned True had it not been forced to return False - then "<" and ">" will also be False).

In any case, nothing changes for any case of aware-vs-naive comparison.
On Tue, Sep 22, 2015 at 10:34 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
Sure - no complaint. I was just saying that in the specific, complicated, contrived expression Nick presented, that it always returns False (no matter which aware datetime he starts with) would be more of a head-scratcher than if it raised a "can't compare naive and aware datetimes" exception instead.
[Guido]
And yet I think the desired behavior of == requires us to return False.
Yes - we remain in violent agreement on all points here.
I think we should change this in the PEP, except I can't find where the PEP says == should raise an exception in this case.
It doesn't - the only comparison behavior changed by the PEP is in case of interzone comparison when at least one comparand is a "problem time" (which can only happen with a post-495 tzinfo). Then "==" is always False. That hack is the ugliest part of the PEP, but was needed to preserve the hash invariant (d1 == d2 implies hash(d1) == hash(d2)).
BTW, while the PEP doesn't spell this out, trichotomy can fail in some such cases (those where "==" would have returned True had it not been forced to return False - then "<" and ">" will also be False).
In any case, nothing changes for any case of aware-vs-naive comparison.
And I guess we can't make < and > raise an exception for backward compatibility reasons. :-(

--
--Guido van Rossum (python.org/~guido)
On Tue, Sep 22, 2015 at 1:57 PM, Guido van Rossum <guido@python.org> wrote:
BTW, while the PEP doesn't spell this out, trichotomy can fail in some
such cases (those where "==" would have returned True had it not been forced to return False - then "<" and ">" will also be False).
In any case, nothing changes for any case of aware-vs-naive comparison.
And I guess we can't make < and > raise an exception for backward compatibility reasons. :-(
Just to make it clear, naive to aware comparison is an error now and will still be an error:
>>> from datetime import *
>>> datetime.now() > datetime.now(timezone.utc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
What would be nice is if datetime.now(tz1) > datetime.now(tz2) were an error whenever tz1 is not tz2, but this is not possible for backward compatibility reasons. I was toying with an idea to make t > s an error whenever the result depends on the value of t.fold or s.fold, but the resulting rules were even uglier than the hash invariant compromise.

At the end of the day, this is a case of practicality beating purity. We overload > and - in datetime for the convenience of interzone operations. (Want to know the number of microseconds since the epoch? Easy: (t - datetime(1970, 1, 1, tzinfo=timezone.utc)) // timedelta.resolution.) We pay for this convenience by a loss of some properties that we expect from mathematical operations (e.g. s - t != (s - u) - (t - u) is possible.) I think this is a fair price to pay for the convenience of s > t and s - t over s.is_later(t) and s.timediff(t). Arguably, requiring s.astimezone(utc) > t.astimezone(utc) would be "explicit is better than implicit," but you cannot deny the convenience of plain s > t.
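That "microseconds since the epoch" convenience, spelled out as a runnable sketch:

    from datetime import datetime, timedelta, timezone

    t = datetime.now(timezone.utc)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # timedelta.resolution is one microsecond, so this is microseconds since the epoch.
    print((t - epoch) // timedelta.resolution)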
On Tue, Sep 22, 2015 at 11:26 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 22, 2015 at 1:57 PM, Guido van Rossum <guido@python.org> wrote:
BTW, while the PEP doesn't spell this out, trichotomy can fail in some
such cases (those where "==" would have returned True had it not been forced to return False - then "<" and ">" will also be False).
In any case, nothing changes for any case of aware-vs-naive comparison.
And I guess we can't make < and > raise an exception for backward compatibility reasons. :-(
Just to make it clear, naive to aware comparison is an error now and will still be an error:
Ah, I just realized one of the confusions here is the use of the word "comparison", since it could refer to == or to < and >.
>>> from datetime import *
>>> datetime.now() > datetime.now(timezone.utc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
IIUC the < and > operators raise here, and == returns False -- which is exactly as it should be.
What would be nice is if datetime.now(tz1) > datetime.now(tz2) were an error whenever tz1 is not tz2, but this is not possible for backward compatibility reasons.
I was toying with an idea to make t > s an error whenever the result depends on the value of t.fold or s.fold, but the resulting rules were even uglier than the hash invariant compromise.
At the end of the day, this is a case of practicality beating purity. We overload > and - in datetime for the convenience of interzone operations. (Want to know the number of microseconds since the epoch? Easy: (t - datetime(1970, 1, 1, tzinfo=timezone.utc)) // timedelta.resolution.) We pay for this convenience by a loss of some properties that we expect from mathematical operations (e.g. s - t != (s - u) - (t - u) is possible.) I think this is a fair price to pay for the convenience of s > t and s - t over s.is_later(t) and s.timediff(t). Arguably, requiring s.astimezone(utc) > t.astimezone(utc) would be "explicit is better than implicit," but you cannot deny the convenience of plain s > t.
But the convenience is false -- it papers over important details. And it is broken, due to the confusion about classic vs. timeline arithmetic -- these have different needs but there's only one > operator.

--
--Guido van Rossum (python.org/~guido)
On Tue, Sep 22, 2015 at 3:32 PM, Guido van Rossum <guido@python.org> wrote:
it is broken, due to the confusion about classic vs. timeline arithmetic -- these have different needs but there's only one > operator.
I feel silly trying to defend a design against its author. :-) Yes, a language with more than one > symbol would not have some of these problems. Similarly a language with a special symbol for string catenation would not have a non-commutative + and non-distributive *. All I am saying is that I can live with the choices made in datetime.
On Sep 22, 2015 1:09 PM, "Alexander Belopolsky" <alexander.belopolsky@gmail.com> wrote:
On Tue, Sep 22, 2015 at 3:32 PM, Guido van Rossum <guido@python.org> wrote:
it is broken, due to the confusion about classic vs. timeline arithmetic -- these have different needs but there's only one > operator.
I feel silly trying to defend a design against its author. :-) Yes, a language with more than one > symbol would not have some of these problems. Similarly a language with a special symbol for string catenation would not have a non-commutative + and non-distributive *. All I am saying is that I can live with the choices made in datetime.

Is there a good argument against at least deprecating inequality comparisons and subtraction between mixed timezone datetimes? It seems like a warning that would be likely to catch real bugs.

-n
On Tue, Sep 22, 2015 at 4:14 PM, Nathaniel Smith <njs@pobox.com> wrote:
Is there a good argument against at least deprecating inequality comparisons and subtraction between mixed timezone datetimes?
That's the wrong question. The right question is: "Is the current behavior sufficiently broken to justify a backward incompatible change?" We've historically been very conservative with datetime. I would say a proposal to change the way binary operators work with datetimes should face a similar scrutiny as a proposal to change that for a builtin type.
[Guido]
it is broken, due to the confusion about classic vs. timeline arithmetic -- these have different needs but there's only one > operator.
[Alex]
I feel silly trying to defend a design against its author. :-)
"Design" may be an overstatement in this specific case ;-) I remember implementing this stuff, getting to the comparison operators, and noting that the spec was silent about what to do in case the tzinfos differed. I looked at Guido and explained that, and asked "so whaddya wanna do?". One of us (I don't recall which) said "well, we could convert to UTC first - that would make sense". "Ya, sure," said the other. And I said "and then, of course, interzone subtraction should do the same." "Of course," said Guido, now annoyed that I was bothering him with the obvious ;-) Note that, near that time, Python blithely compared _any_ two objects, even if of wildly different types. Compared to that, doing _anything_ arguably sane with datetime objects seemed wildly desirable. Ironically, the datetime implementation was Python's first library type to _refuse_ to compare its objects to others of wildly different types. So, in all, I'd say well under a minute's thought - between us - went into this decision. And we've been living in Paradise ever since :-)
Yes, a language with more than one > symbol would not have some of these problems. Similarly a language with a special symbol for string catenation would not have a non-commutative + and non-distributive *. All I am saying is that I can live with the choices made in datetime.
Given that the alternative is suicide, I approve of that life decision.
Yeah, sadly the point where we *should* have made this clean break was in 3.0. That's where e.g. inequalities between a string and a number, or between either type and None, were changed from returning something "arbitrary but stable" into raising TypeError. It's much harder to break it now, even with endless deprecation warnings. We might want to try again in 4.0. :-)

On Tue, Sep 22, 2015 at 1:25 PM, Tim Peters <tim.peters@gmail.com> wrote:
[Guido]
it is broken, due to the confusion about classic vs. timeline arithmetic -- these have different needs but there's only one > operator.
[Alex]
I feel silly trying to defend a design against its author. :-)
"Design" may be an overstatement in this specific case ;-)
I remember implementing this stuff, getting to the comparison operators, and noting that the spec was silent about what to do in case the tzinfos differed. I looked at Guido and explained that, and asked "so whaddya wanna do?".
One of us (I don't recall which) said "well, we could convert to UTC first - that would make sense". "Ya, sure," said the other. And I said "and then, of course, interzone subtraction should do the same." "Of course," said Guido, now annoyed that I was bothering him with the obvious ;-)
Note that, near that time, Python blithely compared _any_ two objects, even if of wildly different types. Compared to that, doing _anything_ arguably sane with datetime objects seemed wildly desirable. Ironically, the datetime implementation was Python's first library type to _refuse_ to compare its objects to others of wildly different types.
So, in all, I'd say well under a minute's thought - between us - went into this decision. And we've been living in Paradise ever since :-)
Yes, a language with more than one > symbol would not have some of these problems. Similarly a language with a special symbol for string catenation would not have a non-commutative + and non-distributive *. All I am saying is that I can live with the choices made in datetime.
Given that the alternative is suicide, I approve of that life decision.
-- --Guido van Rossum (python.org/~guido)
[Guido]
I think we should change this in the PEP, except I can't find where the PEP says == should raise an exception in this case.
[Tim]
It doesn't - the only comparison behavior changed by the PEP is in case of interzone comparison when at least one comparand is a "problem time" (which can only happen with a post-495 tzinfo). Then "==" is always False. That hack is the ugliest part of the PEP,
Correction: that's the _only_ "ugly part" of the PEP. Except for that, it's quite elegant :-)
but was needed to preserve the hash invariant (d1 == d2 implies hash(d1) == hash(d2)).
BTW, while the PEP doesn't spell this out, trichotomy can fail in some such cases (those where "==" would have returned True had it not been forced to return False - then "<" and ">" will also be False).
[Guido]
And I guess we can't make < and > raise an exception for backward compatibility reasons. :-(
Bingo. But, in its favor, that would be less incompatible than removing hash() and dicts from the language ;-)

Another oddity is that interzone subtraction always gets "the right" result, even in interzone cases where "x == y" is forced to return False despite that "x - y == timedelta(0)". cmp-like comparison still enjoys trichotomy in all cases.

Note that, for a week or two, we _tried_ to get around all that by making x != y for intrazone x and y differing only in `fold`. But that was so at odds with the naive time model that worse kinds of incompatibility snuck in.

So we can make intrazone "naive time" work as expected in all cases, or we can make by-magic interzone subtraction and comparison work as expected in all cases. We started and ended with the former, with a painful abandoned attempt at the latter in between. If there's a sane(*) way to do both simultaneously, in a backward-compatible way, it eluded everyone.

(*) An "insane" way would be to partition all aware datetimes into equivalence classes based on lumping together all possible spellings of a "problem time" in all zones, so that hash() could treat them all the same. But that requires knowing all possible tzinfos that can ever be used before the first time hash() is called. Even then it would be messy to do.
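A sketch of that oddity with a 495-conforming tzinfo (none existed in the stdlib when this thread was written; the example below uses zoneinfo from Python 3.9, purely as an illustration):

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo  # Python 3.9+, a 495-conforming tzinfo provider

    # An ambiguous ("problem") local time: 01:30 occurs twice on this date in New York.
    problem = datetime(2020, 11, 1, 1, 30, fold=1, tzinfo=ZoneInfo("America/New_York"))
    as_utc = problem.astimezone(timezone.utc)   # the same instant, spelled in UTC

    print(problem - as_utc == timedelta(0))     # True: interzone subtraction is exact
    print(problem == as_utc)                    # False: interzone == is forced to False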
On Tue, Sep 22, 2015 at 12:47 PM, Tim Peters <tim.peters@gmail.com> wrote:
I was just saying that in the specific, complicated, contrived expression Nick presented, that it always returns False (no matter which aware datetime he starts with) would be more of a head-scratcher than if it raised a "can't compare naive and aware datetimes" exception instead.
The current behavior is no fault of yours. Guido, Antoine and I share all the blame and credit for it. [1,2]

[1]: http://bugs.python.org/issue15006
[2]: https://mail.python.org/pipermail/python-dev/2012-June/119933.html
On Tue, Sep 22, 2015 at 11:11 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Nick Coghlan]
...
dt == datetime.fromtimestamp(dt.astimezone(utc).astimezone(dt.tzinfo).timestamp())
...

[Guido]
That can't be right -- There is no way any fromtimestamp() call can return a time in the gap.
[Alexander Belopolsky]
I don't think Nick said that.
[Tim Peters]
I do, except that he didn't ;-) Count the parens carefully.
OK, it looks like Nick has managed to confuse both authors of the PEP, but not Guido. :-)

The .astimezone() conversions in Nick's expression are a red herring. They don't change the value of the timestamp. That's the invariant Guido mentioned:

    dt.timestamp() == dt.astimezone(utc).timestamp() == dt.astimezone(utc).astimezone(dt.tzinfo).timestamp()

Now, if dt is in its tzinfo gap, then

    dt != datetime.fromtimestamp(dt.timestamp(), dt.tzinfo)

Instead, you get something like this:

    datetime.fromtimestamp(dt.timestamp(), dt.tzinfo) == dt + (1 - 2*dt.fold) * gap

where gap is the size of the gap expressed as a timedelta (typically gap = timedelta(hours=1)). In words, when you ask for 2:40 AM, but the clock jumps from 01:59 AM to 03:00 AM, the round trip through timestamp gives you 03:40 AM if fold=0 and 01:40 AM if fold=1.

This rule is somewhat arbitrary, but it has many nice mathematical and "human" properties. (There is an (imperfect) analogy with the roots of a quadratic equation here: when no real solutions exist, the two complex solutions are a ± i*b and the "nice to have" real values are a ± b.)
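A sketch of that round trip with a concrete gap (again using zoneinfo, which post-dates this thread, so purely illustrative):

    from datetime import datetime
    from zoneinfo import ZoneInfo  # Python 3.9+, used here only for illustration

    tz = ZoneInfo("America/New_York")
    # 2021-03-14 02:40 does not exist: clocks jump from 02:00 EST straight to 03:00 EDT.
    for fold in (0, 1):
        dt = datetime(2021, 3, 14, 2, 40, fold=fold, tzinfo=tz)
        print(fold, datetime.fromtimestamp(dt.timestamp(), tz))
    # fold=0 -> 2021-03-14 03:40:00-04:00  (pushed forward past the gap)
    # fold=1 -> 2021-03-14 01:40:00-05:00  (pulled back before the gap)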
On Tue, Sep 22, 2015 at 3:01 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
... for times that don't exist, "dt.astimezone(utc).astimezone(dt.tzinfo)" will normalise them to be a time that actually exists in the original time zone, and that normalisation also effectively happens when calling "dt.timestamp()".
Yes. In fact, if you consider the canonical bijection between timestamps and datetimes

    t = EPOCH + s * timedelta(0, 1)
    s = (t - EPOCH) / timedelta(0, 1)

t.astimezone(utc) and t.timestamp() become the same up to some annoying numerical details. The same logic applies to u.astimezone(tzinfo) and datetime.fromtimestamp(s). Note that I deliberately did not mark the units on the sketches: you can think of the UTC axis as being labeled by datetimes or by numeric timestamps.

Note that dt != dt.astimezone(utc).astimezone(dt.tzinfo) is one way to detect that dt is in a gap, but I recommend dt.replace(fold=0).utcoffset() < dt.replace(fold=1).utcoffset().
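A sketch of that check, wrapped in a tiny helper (the helper name is mine, not anything from the PEP, and it assumes a 495-conforming tzinfo such as the later zoneinfo):

    from datetime import datetime
    from zoneinfo import ZoneInfo  # Python 3.9+

    def classify(dt):
        """Return 'gap', 'fold' or 'ok' for an aware datetime under a PEP 495 tzinfo."""
        off0 = dt.replace(fold=0).utcoffset()
        off1 = dt.replace(fold=1).utcoffset()
        if off0 == off1:
            return "ok"
        # Clocks only jump forward across a gap, so the fold=0 (pre-transition) offset
        # is the smaller one there; in a fold the relationship is reversed.
        return "gap" if off0 < off1 else "fold"

    tz = ZoneInfo("America/New_York")
    print(classify(datetime(2021, 3, 14, 2, 30, tzinfo=tz)))   # gap
    print(classify(datetime(2021, 11, 7, 1, 30, tzinfo=tz)))   # fold
    print(classify(datetime(2021, 7, 1, 12, 0, tzinfo=tz)))    # ok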
participants (7)
- Alexander Belopolsky
- francismb
- Guido van Rossum
- Nathaniel Smith
- Nick Coghlan
- Tim Peters
- Victor Stinner