Hi!

I wrote PEP 431 two years ago, and never got around to implementing it. This year I got some renewed motivation after Berker Peksağ made an effort to implement it. I'm planning to work more on this during the PyCon sprints, and also to have a BoF session or similar during the conference.

Anyone interested in a session on this, mail me and we'll set up a time and place!

//Lennart

------------------

If anyone is interested in the details of the problem, this is it.

The big problem is the ambiguous times, like 02:30 at a time when you move the clock back one hour, as there are two different 02:30's that day. I wrote down my experiences with looking into and trying to implement several different solutions. The problem there is actually how to tell the datetime whether it is before or after the changeover.

== How others have solved it ==

=== dateutil.tz: Ignore the problem ===

dateutil.tz simply ignores the problems with ambiguous datetimes, keeping them ambiguous.

=== pytz: One timezone instance per changeover ===

Pytz implements ambiguous datetimes by having one class per timezone. Each change in the UTC offset, either because of a DST changeover or because the timezone changes, is represented as one instance of the class.

All instances are held in a list which is a class attribute of the timezone class. You flag which DST changeover you are in by using different instances as the datetime's tzinfo. Since the timezone this way knows if it is DST or not, the datetime as a whole knows if it's DST or not.

Benefits:
- Only known possible implementation without modifying stdlib, which of course was a requirement, as pytz is a third-party library.
- DST offset can be quickly returned, as it does not need to be calculated.

Drawbacks:
- A complex and highly magical implementation of timezones that is hard to understand.
- Required new normalize()/localize() functions on the timezone, and hence the API is not stdlib's API.
- Hundreds of instances per timezone means slightly more memory usage.

== Options for PEP 431 ==

=== Stdlib option 0: Ignore it ===

I don't think this is an option, really. Listed for completeness.

=== Stdlib option 1: One timezone instance per changeover ===

Option 1 is to do it like pytz and have one timezone instance per changeover. However, this is likely not possible to do without fundamentally changing the datetime API, or making it very hard to use.

For example, when creating a datetime instance and passing in a tzinfo today, this tzinfo is just attached to the datetime. But when there are multiple instances of tzinfos, you have to select the correct one to pass in. pytz solves this with the .localize() method, which lets the timezone class choose which instance to pass in.

We can't pass the timezone class into datetime(), because that would require datetime.__new__ to create new datetimes as a part of the timezone arithmetic. These, in turn, would create new datetimes in __new__ as a part of the timezone arithmetic, which in turn, yeah, you get it...

I haven't been able to solve that issue without either changing the API/usage, or getting infinite recursions.

Benefits:
- Proven solution through pytz.
- Fast dst() call.

Drawbacks:
- Trying to use this technique with the current API tends to create infinite recursions. It seems to require big API changes.
- Slow datetime() instance creation.

=== Stdlib option 2: A datetime _is_dst flag ===

By having a flag on the datetime instance that says "this is in DST or not" the timezone implementation can be kept simpler.

You also have to calculate whether the datetime is in DST or not, either when creating it, which demands datetime object creations and causes infinite recursions, or when needed, which means you can get "Ambiguous date time" errors at unexpected times later.

Also, when trying to implement this, I get bogged down in the complexities of how tzinfo and datetime are calling each other back and forth, and when to pass in the current is_dst and when to pass in the desired is_dst, etc. The API and current implementation are not designed with this case in mind, and it gets very tricky.

Benefits:
- Simpler tzinfo() implementations.

Drawbacks:
- It seems likely that we must change some APIs.
- This in turn may affect the pytz implementation. Or not, hard to say.
- The DST offset must use slow timezone calculations. However, since datetimes are immutable it can be a cached, lazy, one-time operation.

=== Stdlib option 3: UTC internal representation ===

Having UTC as the internal representation makes the whole issue go away. Datetimes are no longer ambiguous, except when creating them, so checks need to be done during creation, but that should be possible without datetime creation in this case, resolving the infinite recursion problem.

Benefits:
- Problem solved.
- Minimal API changes.

Drawbacks:
- Backwards compatibility with pickles.
- Possible other backwards incompatibility problems.
- Both DST offset and date time display representation must use slow timezone calculations. However, since datetimes are immutable it can be a cached, lazy, one-time operation.

I'm currently trying to implement solution #2 above. Feedback is welcome.
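To make the ambiguity concrete, here is a minimal sketch using only the stdlib. It assumes the 2002 US/Eastern fall-back (06:00 UTC on 27 October) and models the two halves of the zone with fixed-offset `timezone` objects; the names `EDT` and `EST` are just labels chosen for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two halves of US/Eastern (illustrative names).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two UTC instants on either side of the 2002 fall-back at 06:00 UTC.
before = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc).astimezone(EDT)
after = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc).astimezone(EST)

# Both read 01:30 on the wall clock, yet they are one real hour apart.
same_wall_clock = before.replace(tzinfo=None) == after.replace(tzinfo=None)
elapsed = after - before

print(before.isoformat(), after.isoformat(), same_wall_clock, elapsed)
```

Without the offset attached, the two values cannot be told apart, which is the information the options above try to preserve.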
On Wed, Apr 8, 2015 at 11:18 AM, Lennart Regebro
=== Stdlib option 2: A datetime _is_dst flag ===
By having a flag on the datetime instance that says "this is in DST or not" the timezone implementation can be kept simpler.
I floated this idea [1] back in the days when we discussed the datetime.timestamp() method. The attraction was that such an API would be familiar to the users of POSIX mktime and struct tm, but history has shown that these POSIX APIs were insufficient in many situations, and struct tm was extended by many libraries to include the non-standard tm_gmtoff and tm_zone fields.

With datetime, we also have a problem that POSIX APIs don't have to deal with: local time arithmetic. What is t + timedelta(1) when t falls on the day before a DST change? How would you set the isdst flag in the result?

[1] http://bugs.python.org/issue2736#msg124237
Hi Lennart, On 04/08/2015 09:18 AM, Lennart Regebro wrote:
I wrote PEP-431 two years ago, and never got around to implement it. This year I got some renewed motivation after Berker Peksağ made an effort of implementing it. I'm planning to work more on this during the PyCon sprints, and also have a BoF session or similar during the conference.
Anyone interested in a session on this, mail me and we'll set up a time and place!
I'm interested in the topic, and would probably attend a BoF at PyCon. Comments below:
If anyone is interested in the details of the problem, this is it.
The big problem is the ambiguous times, like 02:30 a time when you move the clock back one hour, as there are two different 02:30's that day. I wrote down my experiences with looking into and trying to implement several different solutions. And the problem there is actually how to tell the datetime if it is before or after the changeover.
== How others have solved it ==
=== dateutil.tz: Ignore the problem ===
dateutil.tz simply ignores the problems with ambiguous datetimes, keeping them ambiguous.
=== pytz: One timezone instance per changeover ===
Pytz implements ambiguous datetimes by having one class per timezone. Each change in the UTC offset, either because of a DST changeover or because the timezone changes, is represented as one instance of the class.
All instances are held in a list which is a class attribute of the timezone class. You flag which DST changeover you are in by using different instances as the datetime's tzinfo. Since the timezone this way knows if it is DST or not, the datetime as a whole knows if it's DST or not.
Benefits:
- Only known possible implementation without modifying stdlib, which of course was a requirement, as pytz is a third-party library.
- DST offset can be quickly returned, as it does not need to be calculated.

Drawbacks:
- A complex and highly magical implementation of timezones that is hard to understand.
- Required new normalize()/localize() functions on the timezone, and hence the API is not stdlib's API.
- Hundreds of instances per timezone means slightly more memory usage.
== Options for PEP 431 ==
=== Stdlib option 0: Ignore it ===
I don't think this is an option, really. Listed for completeness.
=== Stdlib option 1: One timezone instance per changeover ===
Option 1 is to do it like pytz, have one timezone instance per changeover. However, this is likely not possible to do without fundamentally changing the datetime API, or making it very hard to use.
For example, when creating a datetime instance and passing in a tzinfo today, this tzinfo is just attached to the datetime. But when there are multiple instances of tzinfos, you have to select the correct one to pass in. pytz solves this with the .localize() method, which lets the timezone class choose which instance to pass in.
We can't pass in the timezone class into datetime(), because that would require datetime.__new__ to create new datetimes as a part of the timezone arithmetic. These in turn, would create new datetimes in __new__ as a part of the timezone arithmetic, which in turn, yeah, you get it...
I haven't been able to solve that issue without either changing the API/usage, or getting infinite recursions.
Benefits:
- Proven solution through pytz.
- Fast dst() call.

Drawbacks:
- Trying to use this technique with the current API tends to create infinite recursions. It seems to require big API changes.
- Slow datetime() instance creation.
I think "proven solution" is a significant benefit. Today, anyone who is serious about correct timezone handling in Python is almost certainly using pytz. So is adopting pytz's expanded API into the stdlib really a big problem? It probably presents _fewer_ back-compatibility issues with real-world code than taking a different approach from pytz would.
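As a rough illustration of the one-instance-per-changeover idea discussed above, here is a minimal sketch in plain stdlib Python. It is not pytz's actual implementation: `Period` and this `localize()` helper are hypothetical names, and the helper only deals with the ambiguous hour of the 2002 US/Eastern fall-back.

```python
from datetime import datetime, timedelta, tzinfo

class Period(tzinfo):
    """One fixed-offset tzinfo instance per period between changeovers (toy model)."""

    def __init__(self, name, offset, dst):
        self._name, self._offset, self._dst = name, offset, dst

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No calculation needed: the instance itself knows whether it is DST.
        return self._dst

    def tzname(self, dt):
        return self._name

EDT = Period("EDT", timedelta(hours=-4), timedelta(hours=1))
EST = Period("EST", timedelta(hours=-5), timedelta(0))

def localize(naive, is_dst):
    """Hypothetical helper: pick the instance for an ambiguous wall-clock time."""
    return naive.replace(tzinfo=EDT if is_dst else EST)

first = localize(datetime(2002, 10, 27, 1, 30), is_dst=True)
second = localize(datetime(2002, 10, 27, 1, 30), is_dst=False)
gap = second - first  # different tzinfo objects, so this is computed via UTC
```

The extra `localize()` step is the API cost mentioned in the drawbacks: the caller can no longer just pass "the timezone" to `datetime()`.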
=== Stdlib option 2: A datetime _is_dst flag ===
By having a flag on the datetime instance that says "this is in DST or not" the timezone implementation can be kept simpler.
Is this really adequate? pytz's implementation handles far more than "is DST or not", it also correctly handles historical timezone changes. How would those be handled under this proposal?
You also have to calculate whether the datetime is in DST or not, either when creating it, which demands datetime object creations and causes infinite recursions, or when needed, which means you can get "Ambiguous date time" errors at unexpected times later.
Also, when trying to implement this, I get bogged down in the complexities of how tzinfo and datetime are calling each other back and forth, and when to pass in the current is_dst and when to pass in the desired is_dst, etc. The API and current implementation are not designed with this case in mind, and it gets very tricky.
Benefits:
- Simpler tzinfo() implementations.

Drawbacks:
- It seems likely that we must change some APIs.
- This in turn may affect the pytz implementation. Or not, hard to say.
- The DST offset must use slow timezone calculations. However, since datetimes are immutable it can be a cached, lazy, one-time operation.
=== Stdlib option 3: UTC internal representation ===
Having UTC as the internal representation makes the whole issue go away. Datetimes are no longer ambiguous, except when creating, so checks need to be done during creation, but that should be possible without datetime creation in this case, resolving the infinite recursion problem.
Benefits:
- Problem solved.
- Minimal API changes.

Drawbacks:
- Backwards compatibility with pickles.
- Possible other backwards incompatibility problems.
- Both DST offset and date time display representation must use slow timezone calculations. However, since datetimes are immutable it can be a cached, lazy, one-time operation.
If designing a library from scratch without any back-compat considerations, this is probably the first approach I would try. I would favor either solution 1 or 3. Carl
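For readers who want to see what option 3 could look like, here is a minimal sketch under stated assumptions. `UTCBacked` and `offset_at()` are hypothetical names invented for this illustration, and the offset rule is a toy covering only the autumn 2002 US/Eastern changeover; it is not the PEP's design.

```python
from datetime import datetime, timedelta

# 2002 US/Eastern fall-back, expressed as a naive UTC datetime.
FALL_BACK_UTC = datetime(2002, 10, 27, 6)

def offset_at(utc):
    """Toy rule for autumn 2002 only: EDT before the fall-back, EST after."""
    return timedelta(hours=-4) if utc < FALL_BACK_UTC else timedelta(hours=-5)

class UTCBacked:
    """Hypothetical aware datetime that stores only the UTC instant."""

    def __init__(self, utc):
        self.utc = utc  # naive datetime holding UTC values

    @property
    def wall_clock(self):
        # Local time is derived on demand; it could be cached because the
        # stored instant never changes.
        return self.utc + offset_at(self.utc)

    def __add__(self, delta):
        return UTCBacked(self.utc + delta)

    def __sub__(self, delta):
        return UTCBacked(self.utc - delta)

    def __eq__(self, other):
        return isinstance(other, UTCBacked) and self.utc == other.utc

dt = UTCBacked(datetime(2002, 10, 27, 6))      # 01:00 EST on the wall clock
two_days_earlier = dt - timedelta(days=2)      # 02:00 EDT on 25 October
round_trip = two_days_earlier + timedelta(days=2)
```

Because every operation works on the stored instant, the ambiguous hour only matters when constructing the object from a wall-clock time, which this sketch sidesteps by taking UTC directly.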
On Wed, Apr 8, 2015 at 7:37 PM, Carl Meyer
Hi Lennart,
On 04/08/2015 09:18 AM, Lennart Regebro wrote:
I wrote PEP-431 two years ago, and never got around to implement it. This year I got some renewed motivation after Berker Peksağ made an effort of implementing it. I'm planning to work more on this during the PyCon sprints, and also have a BoF session or similar during the conference.
Anyone interested in a session on this, mail me and we'll set up a time and place!
I'm interested in the topic, and would probably attend a BoF at PyCon.
Cool!
So is adopting pytz's expanded API into the stdlib really a big problem?
Maybe, maybe not. But that API is also needlessly complicated, precisely because it's working around the limitations of datetime.tzinfo. In the PEP I remove those limitations but keep the simpler API. With a solution based on how pytz does it, I don't think that's possible.
Is this really adequate? pytz's implementation handles far more than "is DST or not", it also correctly handles historical timezone changes. How would those be handled under this proposal?
Those would still be handled. The flag is only to flag if it's DST or not in a timestamp that is otherwise ambiguous. //Lennart
On 9 April 2015 at 04:38, Lennart Regebro
So is adopting pytz's expanded API into the stdlib really a big problem?
Maybe, maybe not. But that API is also needlessly complicated, precisely because it's working around the limitations of datetime.tzinfo. In the PEP I remove those limitations but keep the simpler API. With a solution based on how pytz does it, I don't think that's possible.
Speaking as the pytz author and maintainer, I can categorically state that pytz's extended API sucks.
Is this really adequate? pytz's implementation handles far more than "is DST or not", it also correctly handles historical timezone changes. How would those be handled under this proposal?
Those would still be handled. The flag is only to flag if it's DST or not in a timestamp that is otherwise ambiguous.
Yeah. I look forward to datetime growing an isdst flag, which if
nothing else means they can round trip with the 9-tuples used by the
time module and a simple LocalTz class written that uses time.mktime
and friends. I'd have done it myself if I remembered any C. At that
point, the most disgusting parts of pytz can be torn out, giving it
the standard documented API and I'm more than happy to do that bit.
--
Stuart Bishop
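The 9-tuple round trip mentioned above can be seen from the datetime side with `datetime.timetuple()`, which fills `tm_isdst` from the tzinfo's `dst()` method. The sketch below is only an illustration: `FixedDST` is a hypothetical fixed-offset class that always reports one hour of DST.

```python
from datetime import datetime, timedelta, tzinfo

class FixedDST(tzinfo):
    """Hypothetical fixed-offset zone that always reports one hour of DST."""

    def utcoffset(self, dt):
        return timedelta(hours=-4)

    def dst(self, dt):
        return timedelta(hours=1)

    def tzname(self, dt):
        return "EDT"

naive = datetime(2002, 10, 27, 1, 30)
aware = naive.replace(tzinfo=FixedDST())

# tm_isdst is -1 when nothing is known, 1 when dst() is non-zero, 0 otherwise.
naive_flag = naive.timetuple().tm_isdst
aware_flag = aware.timetuple().tm_isdst
```

The tuple can carry the flag out of a datetime, but the datetime constructor currently has no way to take it back in.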
The Open Space was good, and the conclusion was that solution #2 indeed seems to be the right one. We also concluded that the datetime() constructor itself likely needs to grow an is_dst flag. There was no further insight into whether the attribute should be an offset or an is_dst flag; I think that will become clear during the implementation. //Lennart
OK, so I realized another thing today, and that is that arithmetic doesn't necessarily round trip.

For example, 2002-10-27 01:00 US/Eastern comes both in DST and STD.

But 2002-10-27 01:00 US/Eastern STD minus two days is 2002-10-25 01:00 US/Eastern DST. However, 2002-10-25 01:00 US/Eastern DST plus two days is 2002-10-27 01:00 US/Eastern, but it is ambiguous whether you want DST or not. And you can't pass in an is_dst flag to __add__, so the arithmetic must just pick one, and the sensible one is to keep to the same DST.

That means that:

    tz = get_timezone('US/Eastern')
    dt = datetime(2002, 10, 27, 1, 0, tz=tz, is_dst=False)
    dt2 = dt - 420 + 420
    assert dt == dt2

will fail, which will be unexpected for most people.

I think there is no way around this, but I thought I should flag it. This is a good reason to do all your date time arithmetic in UTC.

//Lennart
On 14 April 2015 at 21:04, Lennart Regebro
OK, so I realized another thing today, and that is that arithmetic doesn't necessarily round trip.
For example, 2002-10-27 01:00 US/Eastern comes both in DST and STD.
But 2002-10-27 01:00 US/Eastern STD minus two days is 2002-10-25 01:00 US/Eastern DST. However, 2002-10-25 01:00 US/Eastern DST plus two days is 2002-10-27 01:00 US/Eastern, but it is ambiguous whether you want DST or not. And you can't pass in an is_dst flag to __add__, so the arithmetic must just pick one, and the sensible one is to keep to the same DST.
    >>> import pytz
    >>> from datetime import datetime, timedelta
    >>> tz = pytz.timezone('US/Eastern')
    >>> dt = tz.localize(datetime(2002, 10, 27, 1, 0), is_dst=False)
    >>> dt2 = tz.normalize(dt - timedelta(days=2) + timedelta(days=2))
    >>> dt == dt2
    True
    >>> tz.normalize(dt - timedelta(days=2))
    datetime.datetime(2002, 10, 25, 2, 0, tzinfo=)
    >>> tz.normalize(tz.normalize(dt - timedelta(days=2)) + timedelta(days=2))
    datetime.datetime(2002, 10, 27, 1, 0, tzinfo=)
2002-10-27 01:00 US/Eastern is_dst=0 is after the DST transition
(EST). Subtracting 48 hours from it crosses the DST boundary and
should give you 2002-10-25 02:00 US/Eastern is_dst=1, prior to the DST
transition (EDT). Adding 48 hours again goes past 2002-10-27 01:00
EDT, crosses the DST boundary, and gives you back 2002-10-27 01:00
EST.
--
Stuart Bishop
Yeah, I just realized this. As long as you use timedelta, the
difference is of course not one day, but 24 hours. That solves the
problem, but it is surprising in other ways.
In US/Eastern datetime.datetime(2002, 10, 27, 1, 0) -
datetime.timedelta(1) needs to become datetime.datetime(2002, 10, 26,
2, 0)
(Note the hour change)
I was thinking in calendrical arithmetic, which the datetime module
doesn't need to care about.
OK, so I just had a realization.

Because we want some internal flag to tell if the datetime is in DST or not, the datetime pickle format will change. And the datetime pickle format changing is the biggest reason I had against changing the internal representation to UTC.

So because of this, perhaps we actually *should* change the internal representation to UTC, because that makes the issues I'm fighting with now so much simpler. (I'm currently trying to get arithmetic to do the right thing in all cases, which is crazy complicated.)

We can add support to unpickle previous datetimes, but we won't be able to add forwards compatibility, meaning that pickles saved in Python 3.5 cannot be unpickled in Python 3.4.

Please discuss.

//Lennart
On Thu, Apr 16, 2015 at 1:00 AM, Lennart Regebro
So because of this, perhaps we actually *should* change the internal representation to UTC, because that makes the issues I'm fighting with now so much simpler. (I'm currently trying to get arithmetic to do the right thing in all cases, which is crazy complicated).
If I understand you correctly, then, an aware datetime would represent a unique instant in time (modulo relativity), coupled with some metadata stating what civil timezone it should be understood in terms of. This is the same as a PostgreSQL "timestamp with time zone" field, and IMO is a pretty reliable way to do things. So count me as +1 for this proposal. Bikeshed: Would arithmetic be based on UTC time or Unix time? It'd be more logical to describe it as "adding six hours means adding six hours to the UTC time", but it'd look extremely odd when there's a leap second. ChrisA
On Wed, Apr 15, 2015 at 11:10 AM, Chris Angelico
Bikeshed: Would arithmetic be based on UTC time or Unix time? It'd be more logical to describe it as "adding six hours means adding six hours to the UTC time", but it'd look extremely odd when there's a leap second.
It would ignore leap seconds. If you want to call that unix time or not is a matter of opinion. Hm. I guess the internal representation *could* be EPOCH + offset, and local times could be calculated properties, which could be cached (or possibly calculated at creation). //Lennart
On Wed, Apr 15, 2015 at 11:43 AM, Lennart Regebro
On Wed, Apr 15, 2015 at 11:10 AM, Chris Angelico
wrote: Bikeshed: Would arithmetic be based on UTC time or Unix time? It'd be more logical to describe it as "adding six hours means adding six hours to the UTC time", but it'd look extremely odd when there's a leap second.
It would ignore leap seconds. If you want to call that unix time or not is a matter of opinion. Hm. I guess the internal representation *could* be EPOCH + offset, and local times could be calculated properties, which could be cached (or possibly calculated at creation).
In any case there would probably need to be a PEP on that, and that means PEP 431 wouldn't make it into 3.5, unless somebody smarter than me wants to take a shot at it. //Lennart
On Thu, Apr 16, 2015 at 1:43 AM, Lennart Regebro
On Wed, Apr 15, 2015 at 11:10 AM, Chris Angelico
wrote: Bikeshed: Would arithmetic be based on UTC time or Unix time? It'd be more logical to describe it as "adding six hours means adding six hours to the UTC time", but it'd look extremely odd when there's a leap second.
It would ignore leap seconds. If you want to call that unix time or not is a matter of opinion. Hm. I guess the internal representation *could* be EPOCH + offset, and local times could be calculated properties, which could be cached (or possibly calculated at creation).
I was just talking about leap seconds, here (which Unix time ignores), not about the internal representation, which is an implementation detail. If a timedelta is represented as a number of seconds, then "adding six hours" really means "adding 6*3600 seconds", and most people would be VERY surprised if one of those is "consumed" by a leap second; but it ought at least to be acknowledged in the docs. ChrisA
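The point about leap seconds can be checked directly: `timedelta` counts plain seconds, and `datetime` cannot represent second 60. A small sketch, using the leap second inserted at the end of 30 June 2012 as the example:

```python
from datetime import datetime, timedelta, timezone

# A leap second (23:59:60 UTC) was inserted at the end of 30 June 2012.
start = datetime(2012, 6, 30, 21, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=6)

# The arithmetic is exactly 6 * 3600 seconds; the leap second is not counted.
elapsed_seconds = (end - start).total_seconds()

# datetime cannot represent the leap second itself.
try:
    datetime(2012, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
```

So "adding six hours" across that boundary lands on 03:00 UTC, even though 21601 SI seconds actually passed.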
On 15 April 2015 at 17:00, Lennart Regebro
OK, so I just had a realization.
Because we want some internal flag to tell if the datetime is in DST or not, the datetime pickle format will change. And the datetime pickle format changing is the biggest reason I had against changing the internal representation to UTC.
So because of this, perhaps we actually *should* change the internal representation to UTC, because that makes the issues I'm fighting with now so much simpler. (I'm currently trying to get arithmetic to do the right thing in all cases, which is crazy complicated).
Huh. I didn't think you would need to change any arithmetic (but haven't looked at this for quite some time). You can already add or subtract timedeltas to timezone aware datetime instances. The problem with the existing implementation is the tzinfo instance does not have enough information to do correct conversions when the time is ambiguous, so it has to guess. With the addition of the is_dst hint to the datetime instance, it will no longer need to guess. Arithmetic remains 'add the timedelta to the naive datetime, and then punt it to the tzinfo to make any necessary adjustments' and I thought this would not need to be changed at all.
We can add support to unpickle previous datetimes, but we won't be able to add forwards compatibility, meaning that pickles saved in Python 3.5 will not be unpicklable in Python 3.4.
I don't think this can be avoided entirely. Any ideas I can come up
with that might help are worse than requiring devs to convert their
datetimes to strings in the rare case they need their 3.5 pickles read
with 3.4.
--
Stuart Bishop
On Wed, Apr 15, 2015 at 3:23 PM, Stuart Bishop
Huh. I didn't think you would need to change any arithmetic
Not really, the problem is in keeping the date normalized after each call, and doing so the right way.
Arithmetic remains 'add the timedelta to the naive datetime, and then punt it to the tzinfo to make any necessary adjustments' and I thought this would not need to be changed at all.
Just punting it to tzinfo to make adjustments, i.e. effectively just doing what normalize() does, creates infinite recursion as there is more arithmetic in there, so it's not quite that simple.
I don't think this can be avoided entirely. Any ideas I can come up with that might help are worse than requiring devs to convert their datetimes to strings in the rare case they need their 3.5 pickles read with 3.4.
Pickle forward compatibility isn't really expected anyway...
On 15 April 2015 at 21:51, Lennart Regebro
On Wed, Apr 15, 2015 at 3:23 PM, Stuart Bishop
wrote:
Just punting it to tzinfo to make adjustments, i.e. effectively just doing what normalize() does, creates infinite recursion as there is more arithmetic in there, so it's not quite that simple.
This sounds familiar. It's infinite recursion if the tzinfo does its
calculations using localized datetimes. If the tzinfo is stripped for
the calculations, there is no tzinfo to recurse into. At least this
was how I hoped it would work, and it sucks if it doesn't. You could
be right that using the UTC representation internally for datetimes
with a tzinfo makes the most sense.
--
Stuart Bishop
On Wed, Apr 15, 2015 at 5:28 PM, Stuart Bishop
On 15 April 2015 at 21:51, Lennart Regebro
wrote: On Wed, Apr 15, 2015 at 3:23 PM, Stuart Bishop
wrote:
Just punting it to tzinfo to make adjustments, i.e. effectively just doing what normalize() does, creates infinite recursion as there is more arithmetic in there, so it's not quite that simple.
This sounds familiar. It's infinite recursion if the tzinfo does its calculations using localized datetimes. If the tzinfo is stripped for the calculations, there is no tzinfo to recurse into. At least this was how I hoped it would work, and it sucks if it doesn't. You could be right that using the UTC representation internally for datetimes with a tzinfo makes the most sense.
There is no infinite recursion in the way the datetime module deals with zone conversions. However, implementors of tzinfo subclasses often overlook the fact that the datetime module design mandates specific rules for what utcoffset() should return for the missing and ambiguous hours.

Granted, the relevant section in the manual [1] is not an easy read, and in fact for a long time that documentation itself displayed a buggy implementation of the LocalTimezone class. [2]

Understanding how the design works requires a bit of algebra [3], but I strongly recommend that anyone trying to improve the timezone support in the datetime module print out those 200 lines of comments and go through them with a pencil, following the proofs.

Note that one of the key assumptions [3.2] in that write-up does not hold in real life. The assumption is that the "standard time" offset does not depend on the point in time. However, I do believe that this assumption can be relaxed without invalidating the main result. I believe we can still have an unambiguous fromutc() as long as the standard time offset does not change "too often." Basically, if we (generously) allow utcoffset to vary from -24h to +24h, then a "sane" zone can be defined as one where utcoffset changes at most once in any 48 hour period.

If I am right about this and the algebra works out, then we don't need to change the datetime module design to properly support all world timezones.

[1] https://docs.python.org/3/library/datetime.html#datetime.tzinfo.fromutc
[2] http://bugs.python.org/issue9063
[3] https://hg.python.org/cpython/file/132b5376bf34/Lib/datetime.py#l1935
[3.2] https://hg.python.org/cpython/file/132b5376bf34/Lib/datetime.py#l1948
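To see the rules for the ambiguous hour in action, here is a small sketch with a toy tzinfo subclass. `ToyEastern2002` is a hypothetical class that hard-codes only the 2002 US/Eastern transitions and follows the convention of treating the repeated hour as standard time; the conversion itself goes through the default `tzinfo.fromutc()`.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class ToyEastern2002(tzinfo):
    """Toy zone with only the 2002 US/Eastern rules (illustration, not a tz database)."""

    DST_START = datetime(2002, 4, 7, 2)   # wall-clock start of DST
    DST_END = datetime(2002, 10, 27, 1)   # 01:00 standard time == 02:00 DST

    def dst(self, dt):
        # The repeated hour 01:00-02:00 is interpreted as standard time.
        wall = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= wall < self.DST_END
        return timedelta(hours=1) if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

tz = ToyEastern2002()

# Two different UTC instants, one hour apart, straddling the fall-back.
first = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc).astimezone(tz)
second = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc).astimezone(tz)

print(first, second, first == second)
```

Both conversions produce the correct wall-clock reading of 01:30, but the two results compare equal and report the same offset, so the information about which 01:30 was meant is lost once the conversion is done.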
Lennart Regebro
OK, so I realized another thing today, and that is that arithmetic doesn't necessarily round trip.
For example, 2002-10-27 01:00 US/Eastern comes both in DST and STD.
But 2002-10-27 01:00 US/Eastern STD minus two days is 2002-10-25 01:00 US/Eastern DST
"two days" is ambiguous here. It is incorrect if you mean 48 hours (the difference is 49 hours): #!/usr/bin/env python3 from datetime import datetime, timedelta import pytz tz = pytz.timezone('US/Eastern') then_isdst = False # STD then = tz.localize(datetime(2002, 10, 27, 1), is_dst=then_isdst) now = tz.localize(datetime(2002, 10, 25, 1), is_dst=None) # no utc transition print((then - now) // timedelta(hours=1)) # -> 49
However, 2002-10-25 01:00 US/Eastern DST plus two days is 2002-10-27 01:00 US/Eastern, but it is ambiguous if you want DST or not DST.
It is not ambiguous if you know what "two days" *in your particular application* should mean (`day+2` vs. +48h exactly):

    print(tz.localize(now.replace(tzinfo=None) + timedelta(2), is_dst=then_isdst))
    # -> 2002-10-27 01:00:00-05:00 # +49h
    print(tz.normalize(now + timedelta(2))) # +48h
    # -> 2002-10-27 01:00:00-04:00

Here's a simple mental model that can be used for date arithmetic:

- naive datetime + timedelta(2) == "same time, elapsed hours unknown"
- aware utc datetime + timedelta(2) == "same time, +48h"
- aware datetime with timezone that may have different utc offsets at different times + timedelta(2) == "unknown time, +48h"

"unknown" means that you can't tell without knowing the specific timezone. It ignores leap seconds.

The 3rd case behaves *as if* the calculations are performed using these steps (the actual implementation may be different):

1. convert an aware datetime object to utc (dt.astimezone(pytz.utc))
2. do the simple arithmetic using utc time
3. convert the result to the original pytz timezone (utc_dt.astimezone(tz))

You don't need `.localize()` or `.normalize()` calls here.
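The three steps above can be sketched without pytz, using a plain stdlib tzinfo subclass. Everything named here is an assumption for illustration: `ToyEastern2002` hard-codes only the 2002 US/Eastern transitions, and `add_elapsed()` is a hypothetical helper. Note that with a plain tzinfo subclass like this one, the built-in `+` keeps the wall-clock time, so the helper is what provides the "+48h"-style behaviour.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class ToyEastern2002(tzinfo):
    """Toy zone with only the 2002 US/Eastern rules (illustration only)."""

    DST_START = datetime(2002, 4, 7, 2)
    DST_END = datetime(2002, 10, 27, 1)   # repeated hour counted as standard time

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= wall < self.DST_END
        return timedelta(hours=1) if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

def add_elapsed(dt, delta):
    """Hypothetical helper: add real elapsed time to an aware datetime."""
    utc = dt.astimezone(timezone.utc)       # 1. convert to UTC
    shifted = utc + delta                   # 2. simple arithmetic in UTC
    return shifted.astimezone(dt.tzinfo)    # 3. convert back to the zone

tz = ToyEastern2002()
noon = datetime(2002, 10, 26, 12, 0, tzinfo=tz)   # the day before fall-back

wall_clock_result = noon + timedelta(days=1)      # same wall-clock time next day
elapsed_result = add_elapsed(noon, timedelta(days=1))

print(wall_clock_result, elapsed_result)
```

The first result is noon on the 27th (25 real hours later), the second is 11:00 on the 27th (exactly 24 hours later); which one is right depends on what the application means by "one day".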
And you can't pass in an is_dst flag to __add__, so the arithmetic must just pick one, and the sensible one is to keep to the same DST.
That means that:
    tz = get_timezone('US/Eastern')
    dt = datetime(2002, 10, 27, 1, 0, tz=tz, is_dst=False)
    dt2 = dt - 420 + 420
    assert dt == dt2
Will fail, which will be unexpected for most people.
I think there is no way around this, but I thought I should flag for it. This is a good reason to do all your date time arithmetic in UTC.
//Lennart
It won't fail:

    from datetime import datetime, timedelta
    import pytz

    tz = pytz.timezone('US/Eastern')
    dt = tz.localize(datetime(2002, 10, 27, 1), is_dst=False)
    delta = timedelta(seconds=420)
    assert dt == tz.normalize(tz.normalize(dt - delta) + delta)

The only reason `tz.normalize()` is used is so that tzinfo would be correct for the resulting datetime object; it does not affect the comparison otherwise:

    assert dt == (dt - delta + delta) #XXX tzinfo may be incorrect
    assert dt == tz.normalize(dt - delta + delta) # correct tzinfo for the final result
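The reason the comparison is unaffected is that aware datetimes with different tzinfo objects are compared by the instant they denote, not by their fields. A stdlib-only sketch, with fixed-offset `timezone` objects standing in for pytz's per-period instances (the names `EDT` and `EST` are just labels for this illustration):

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# What a normalized result would look like: the correct offset for 25 October.
normalized = datetime(2002, 10, 25, 2, 0, tzinfo=EDT)

# What un-normalized arithmetic leaves behind: the EST offset carried over
# from 27 October, with the wall-clock fields simply shifted by two days.
stale = datetime(2002, 10, 27, 1, 0, tzinfo=EST) - timedelta(days=2)

same_instant = normalized == stale
```

Both values denote 06:00 UTC on 25 October, so they compare equal even though only one of them displays the wall-clock time a person in US/Eastern would have seen.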
On Wed, Apr 8, 2015 at 11:18 AM, Lennart Regebro
I wrote PEP-431 two years ago, and never got around to implement it. This year I got some renewed motivation after Berker Peksağ made an effort of implementing it. I'm planning to work more on this during the PyCon sprints, and also have a BoF session or similar during the conference.
For those who were not at the conference, can someone summarize the post-PyCon status of this PEP? Is Barry still the "BDFL-Delegate"? Is there an updated draft? Should this discussion move to python-ideas?
On Tue, Apr 14, 2015 at 4:52 PM, Alexander Belopolsky
For those who were not at the conference, can someone summarize the post-PyCon status of this PEP?
Is Barry still the "BDFL-Delegate"? Is there an updated draft? Should this discussion move to python-ideas?
There is no status change except that it will need updates, but I'm holding off on that until I know what updates are needed, and that means an implementation is needed first. //Lennart
participants (7)
- Akira Li
- Alexander Belopolsky
- Carl Meyer
- Chris Angelico
- Lennart Regebro
- Ryan Hiebert
- Stuart Bishop