Re: [Python-Dev] Aware datetime from naive local time Was: Status on PEP-431 Timezones

On Thu, Apr 9, 2015 at 4:51 PM, Isaac Schwabacher <ischwabacher@wisc.edu> wrote:
I am changing the subject so that we can focus on one question without diverting to PEP-size issues that are better suited for python-ideas. I would like to add functionality to the datetime module that would solve a seemingly simple problem: given a naive datetime instance assumed to be in local time, construct the corresponding aware datetime object with tzinfo set to an appropriate fixed-offset datetime.timezone instance. Python 3 has this functionality implemented in the email package since version 3.3, and it appears to work well even in the ambiguous hour
However, in a location with a more interesting history, you can get a situation tha

Sorry for a truncated message. Please scroll past the quoted portion. On Thu, Apr 9, 2015 at 10:21 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
would look like this in the zoneinfo database:

    $ zdump -v -c 1992 Europe/Kiev
    ...
    Europe/Kiev Sat Mar 24 22:59:59 1990 UTC = Sun Mar 25 01:59:59 1990 MSK isdst=0
    Europe/Kiev Sat Mar 24 23:00:00 1990 UTC = Sun Mar 25 03:00:00 1990 MSD isdst=1
    Europe/Kiev Sat Jun 30 21:59:59 1990 UTC = Sun Jul 1 01:59:59 1990 MSD isdst=1
    Europe/Kiev Sat Jun 30 22:00:00 1990 UTC = Sun Jul 1 01:00:00 1990 EEST isdst=1
    Europe/Kiev Sat Sep 28 23:59:59 1991 UTC = Sun Sep 29 02:59:59 1991 EEST isdst=1
    Europe/Kiev Sun Sep 29 00:00:00 1991 UTC = Sun Sep 29 02:00:00 1991 EET isdst=0
    ...

Look what happened on July 1, 1990. At 2 AM, the clocks in Ukraine were moved back one hour. So times like 01:30 AM happened twice there on that day. Let's see how Python handles this situation:

    $ TZ=Europe/Kiev python3
So far so good, I've got the first of the two 01:30AM's. But what if I want the other 01:30AM? Well,
    localtime(datetime(1990,7,1,1,30), isdst=0).strftime('%c %z %Z')
    'Sun Jul 1 01:30:00 1990 +0300 EEST'
gives me "the other 01:30AM", but it is counter-intuitive: I have to ask for the standard (winter) time to get the daylight saving (summer) time.

The uncertainty about how to deal with the repeated hour was the reason why an email.utils.localtime-like interface did not make it to the datetime module. The main objection to the isdst flag was that in most situations, determining whether DST is in effect is as hard as finding the UTC offset, so reducing the problem of finding the UTC offset to the one of finding the value for isdst does not solve much.

I now realize that the problem is simply in the name for the flag. While we often cannot tell what isdst should be, and in some situations the actual DST status does not differentiate between the two possible times, we can always say whether we want to get the first or the second time. In other words, instead of localtime(dt, isdst=-1), we may want localtime(dt, which=0) where "which" is used to resolve the ambiguity: "which=0" means return the first (in UTC order) of the two times and "which=1" means return the second. (In the non-ambiguous cases "which" is ignored.)

An alternative solution would be to make localtime(dt) return a list of 0, 1 or 2 instances, but this will probably make a common usage (the case when the user does not care which time she gets) more cumbersome.

On 9 Apr 2015 23:15, "Alexander Belopolsky" <alexander.belopolsky@gmail.com> wrote:
It actually took me a long time to understand that the "isdst" flag in this context related to the following chain of reasoning:

1. Due to various reasons, local time offsets relative to UTC may change, thus repeating certain subsets of local time

2. Repeated local times usually relate to winding clocks back an hour at the end of a DST period

3. "isdst=True" thus refers to "before the local time change winds the clocks back", while "isdst=False" refers to *after* the clocks are wound back

As Alexander says, you can reduce the amount of assumed knowledge needed to understand the API by focusing on the ambiguity resolution directly without assuming that the *reason* for the ambiguity is "end of DST period".

[...] problem, and that timezone offset changes (whether historical or cyclical) mean that the mapping isn't 1:1 - some expressible local times never actually happen, while others happen more than once.

For the normal APIs, NonExistentTimeError would then correspond with the case where the record lookup API returned no results, while the suggested "which" index would handle the two results case without assuming the repeated local time was specifically due to the end of a DST period.

Regards,
Nick.

On Fri, Apr 10, 2015 at 6:38 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
As Alexander says, you can reduce the amount of assumed knowledge needed to
understand the API by focusing on the ambiguity resolution directly without assuming that the *reason* for the ambiguity is "end of DST period".
This is an excellent summary of my original post. (It is in fact better than the post itself which I therefore did not include in the quote.) For the mathematically inclined, I can reformulate the problem as follows. For any given geographical location, loc, and a moment in time t expressed as UTC time, one can tell what time was shown on a "local clock-tower." This defines a function wall(loc, t). This function is a piece-wise linear function which may have regular or irregular discontinuities. Because of these discontinuities, an equation wall(loc, t) = lt may have 0, 1 or 2 solutions. The DST switchovers are an example of regular discontinuities. In most locations, they follow a somewhat predictable pattern with two discontinuities per year. Irregular discontinuities happen in locations with activist governments and don't follow any general rules. For most world locations past discontinuities are fairly well documented for at least a century and future changes are published with at least 6 months lead time.
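To make the wall(loc, t) = lt equation concrete, here is a minimal sketch for a single location. It assumes a hand-written transition table copied from the Europe/Kiev zdump listing quoted earlier in the thread, not a real tz database lookup; the names TRANSITIONS, INITIAL_OFFSET and solutions are made up for illustration.

```python
from datetime import datetime, timedelta

# UTC instants at which the local offset changed, with the offset in effect
# from that instant on (values taken from the zdump listing in the thread).
TRANSITIONS = [
    (datetime(1990, 3, 24, 23, 0), timedelta(hours=4)),  # MSK  -> MSD
    (datetime(1990, 6, 30, 22, 0), timedelta(hours=3)),  # MSD  -> EEST
    (datetime(1991, 9, 29, 0, 0), timedelta(hours=2)),   # EEST -> EET
]
INITIAL_OFFSET = timedelta(hours=3)  # MSK, before the first listed transition


def wall(t):
    """Return the naive local clock reading for the naive UTC time t."""
    offset = INITIAL_OFFSET
    for when, new_offset in TRANSITIONS:
        if t >= when:
            offset = new_offset
    return t + offset


def solutions(lt):
    """Return every UTC time t with wall(t) == lt, in UTC order."""
    offsets = {INITIAL_OFFSET} | {off for _, off in TRANSITIONS}
    candidates = sorted(lt - off for off in offsets)
    return [t for t in candidates if wall(t) == lt]


for lt in (
    datetime(1990, 5, 1, 12, 0),   # ordinary time: one solution
    datetime(1990, 3, 25, 2, 30),  # skipped by the jump forward: none
    datetime(1990, 7, 1, 1, 30),   # repeated by the jump back: two
):
    print(lt, "->", solutions(lt))
```

Because wall() is piecewise linear, each candidate solution is simply lt minus one of the offsets that ever applied; a candidate counts only if wall() maps it back to lt.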
The downside of this API is that naively written code is prone to crashes. Someone unaware of the invalid local times and not caring about the choice between ambiguities may write code like t = utc_times_from_local(lt)[0] which may work fine for many years before someone gets an IndexError and a backtrace in her server log.
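The failure mode is easy to reproduce with a toy version of such a list-returning lookup. This is only a sketch: utc_times_from_local here is a hypothetical helper, and the two hard-coded transitions are the 2015 US Eastern DST switchovers, not data read from zoneinfo.

```python
from datetime import datetime, timedelta

EST, EDT = timedelta(hours=-5), timedelta(hours=-4)
DST_START = datetime(2015, 3, 8, 7, 0)  # UTC instant of "spring forward"
DST_END = datetime(2015, 11, 1, 6, 0)   # UTC instant of "fall back"


def wall(t):
    """Local clock reading for the naive UTC time t (2015 rules only)."""
    return t + (EDT if DST_START <= t < DST_END else EST)


def utc_times_from_local(lt):
    """Hypothetical lookup: all UTC times whose local reading is lt."""
    return [t for t in sorted({lt - EDT, lt - EST}) if wall(t) == lt]


def naive_lookup(lt):
    # The "works for years" code from the message above.
    return utc_times_from_local(lt)[0]


for lt in (datetime(2015, 7, 1, 12, 0), datetime(2015, 3, 8, 2, 30)):
    try:
        print(lt, "->", naive_lookup(lt))
    except IndexError:
        print(lt, "-> IndexError: this local time never happened")
```

The second input falls in the hour skipped on March 8, so the list is empty and the [0] index fails.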
NonExistentTimeError has the same problem as an API that returns an empty list. Seeing NonExistentTimeError in a server log is not a big improvement over seeing an IndexError. Moreover, a program that rejects invalid times on input, but stores them for a long time, may see its database silently corrupted after a zoneinfo update.

Now it is time to make a specific proposal. I would like to extend the datetime.astimezone() method to work on naive datetime instances. Such instances will be assumed to be in local time, and discontinuities will be handled as follows:

1. wall(t) == lt has a single solution. This is the trivial case, and lt.astimezone(utc) and lt.astimezone(utc, which=i) for i=0,1 should return that solution.

2. wall(t) == lt has two solutions t1 and t2 such that t1 < t2. In this case lt.astimezone(utc) == lt.astimezone(utc, which=0) == t1 and lt.astimezone(utc, which=1) == t2.

3. wall(t) == lt has no solution. This happens when there is a UTC time t0 such that wall(t0) < lt and wall(t0+epsilon) > lt (a positive discontinuity at time t0). In this case lt.astimezone(utc) should return t0 + lt - wall(t0). I.e., we ignore the discontinuity and extend wall(t) linearly past t0. Obviously, in this case the invariant wall(lt.astimezone(utc)) == lt won't hold. The "which" flag should be handled as follows: lt.astimezone(utc) == lt.astimezone(utc, which=0) and lt.astimezone(utc, which=0) == t0 + lt - wall(t0+eps).

With the proposed features in place, one can use the naive code

    t = lt.astimezone(utc)

and get predictable behavior in all cases and no crashes.

A more sophisticated program can be written like this:

    t1 = lt.astimezone(utc, which=0)
    t2 = lt.astimezone(utc, which=1)
    if t1 == t2:
        t = t1
    elif t2 > t1:
        # ask the user to pick between t1 and t2
        # or raise AmbiguousLocalTimeError
        pass
    else:
        t = t1
        # warn the user that time was invalid and changed
        # or raise InvalidLocalTimeError
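The three rules can be tried out without touching datetime itself. The sketch below handles a single transition at a time, with the 2015 US Eastern switchovers hard-coded; local_to_utc and classify are hypothetical stand-ins, not the proposed method. One assumption is labeled loudly: rule 3 as written gives which=0 two different values, so the sketch reads the second equation as applying to which=1. That reading is the one under which the "else" branch of the sample program can be reached.

```python
from datetime import datetime, timedelta

EST, EDT = timedelta(hours=-5), timedelta(hours=-4)
# One transition = (UTC instant t0, offset before t0, offset from t0 on).
SPRING_FORWARD = (datetime(2015, 3, 8, 7, 0), EST, EDT)  # positive jump
FALL_BACK = (datetime(2015, 11, 1, 6, 0), EDT, EST)      # negative jump


def local_to_utc(lt, transition, which=0):
    """Hypothetical stand-in for the proposed lt.astimezone(utc, which=...)."""
    t0, before, after = transition
    first = lt - before   # lt read on the pre-transition clock
    second = lt - after   # lt read on the post-transition clock
    first_ok = first < t0
    second_ok = second >= t0
    if first_ok != second_ok:
        # Rule 1: exactly one solution, "which" is ignored.
        return first if first_ok else second
    # Rule 2 (both valid, first < second) and rule 3 (neither valid,
    # second < first): "which" picks the pre- or post-transition clock.
    # ASSUMPTION: which=1 means t0 + lt - wall(t0+eps) in rule 3.
    return first if which == 0 else second


def classify(lt, transition):
    """Mirror of the 'more sophisticated program' in the message above."""
    t1 = local_to_utc(lt, transition, which=0)
    t2 = local_to_utc(lt, transition, which=1)
    if t1 == t2:
        return "unique", t1
    if t2 > t1:
        return "ambiguous", t1
    return "invalid", t1


print(classify(datetime(2015, 11, 1, 12, 0), FALL_BACK))
print(classify(datetime(2015, 11, 1, 1, 30), FALL_BACK))
print(classify(datetime(2015, 3, 8, 2, 30), SPRING_FORWARD))
```

In the gap case the default answer is 07:30 UTC, which the post-transition clock shows as 03:30 local, i.e. the old rules extended past t0.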

Sorry to be brain dead here, but I'm a bit lost: On Fri, Apr 10, 2015 at 4:32 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
got it.
This is where I'm confused -- I can see how going from "wall" time ("local" time, etc) to UTC has 0, 1, or 2 solutions:

One solution most of the time

Zero solutions when we "spring forward" -- i.e. there is no 2:30 am on March 8, 2015 in the US timezones that use DST

Two solutions when we "fall back", i.e. there are two 1:30 am on Nov 1, 2015 in the US timezones that use DST

But I can't see where there are multiple solutions the other way around -- doesn't a given UTC time map to one and only one "wall time" in a given timezone?

Am I wrong, or is this a semantic question as to what "wall" time means?

Thanks for any clarification,

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker@noaa.gov

On Mon, Apr 13, 2015 at 1:24 PM, Chris Barker <chris.barker@noaa.gov> wrote:
You are right about what wall() means, but I should have been more explicit about knowns and unknowns in the wall(loc, t) = lt equation. In that equation I considered loc (the geographical place) and lt (the time on the clock tower) to be known and t (the universal (UTC) time) to be unknown. A solution to the equation is the value of the unknown (t) given the values of the knowns (loc and lt). The rest of your exposition is correct including "a given UTC time maps to one and only one 'wall time' in a given timezone." However, different UTC times may map to the same wall time and some expressible wall times are not results of a map of any UTC time.

On Mon, Apr 13, 2015 at 10:45 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
got it. I suggest you perhaps word it something like:

    wall_time = f( location, utc_time)

and

    utc_time = f( location, utc_time )

These are two different problems, and one is much harder than the other! (though both are ugly!)

You can, of course, shorten the names to "wall" and "utc" and "loc" if you like, but I kind of like long, readable names.

-Chris

On Mon, Apr 13, 2015 at 2:05 PM, Chris Barker <chris.barker@noaa.gov> wrote:
You probably meant "utc_time = f( location, wall_time)" in the last equation, but that would still be wrong. A somewhat more correct equation would be utc_time = f^(-1)( location, wall_time) where f^(-1) is the inverse function of f, but since f is not monotonic, no such inverse exists. Finding the inverse of f is the same as solving the equation f(x) = y for any given y. If f is such that this equation has exactly one solution for every possible value of y, then an inverse exists, but this is not so in our case.

On Mon, Apr 13, 2015 at 12:14 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
oops, yes.
but that would still be wrong.
A somewhat more correct equation would be
utc_time = f^(-1)( location, wall_time)
In this case I meant "f" as "a function of", so the two fs were not intended to be the same. Yes, one is the inverse of another, and in this case the inverse is not definable (at least not uniquely). I have no doubt you understand all this (better than I do), I'm just suggesting that in the discussion we find a way to be as clear as possible as to which function is being discussed when. But anyway -- thanks all for hashing this out -- getting something reasonable into datetime will be very nice. -Chris

Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
It is important to note that the different versions of the tz database may lead to different tzinfo (utc offset, tzname) even for *past* dates. i.e., (lt, tzid, isdst) is not enough because the result for (lt, tzid(2015b), isdst) may be different from (lt, tzid(X), isdst) where

    lt = local time e.g., naive datetime
    tzid = timezone from the tz database e.g., Europe/Kiev
    isdst = a boolean flag for disambiguation
    X != 2015b

In other words, a fixed utc offset might not be sufficient even for past dates.
In pytz terms: `which = not isdst` (end-of-DST-like transition: isdst changes from True to False in the direction of utc time). It resolves AmbiguousTimeError raised by `tz.localize(naive, is_dst=None)`.
It is inconsistent with the previous case: here `which = isdst` but `which = not isdst` above.

`lt.astimezone(utc, which=0) == t0 + lt - wall(t0+eps)` corresponds to:

    result = tz.normalize(tz.localize(lt, is_dst=False))

i.e., `which = isdst` (t0 is at the start of DST and therefore isdst changes from False to True). It resolves NonExistentTimeError raised by `tz.localize(naive, is_dst=None)`.

start-of-DST-like transition ("Spring forward"). For example,

    from datetime import datetime, timedelta
    import pytz

    tz = pytz.timezone('America/New_York')

    # 2am -- non-existent time
    print(tz.normalize(tz.localize(datetime(2015, 3, 8, 2), is_dst=False)))
    # -> 2015-03-08 03:00:00-04:00  # after the jump (wall(t0+eps))
    print(tz.localize(datetime(2015, 3, 8, 3), is_dst=None))
    # -> 2015-03-08 03:00:00-04:00  # same time, unambiguous

    # 2:01am -- non-existent time
    print(tz.normalize(tz.localize(datetime(2015, 3, 8, 2, 1), is_dst=False)))
    # -> 2015-03-08 03:01:00-04:00
    print(tz.localize(datetime(2015, 3, 8, 3, 1), is_dst=None))
    # -> 2015-03-08 03:01:00-04:00  # same time, unambiguous

    # 2:59am -- non-existent time
    dt = tz.normalize(tz.localize(datetime(2015, 3, 8, 2, 59), is_dst=True))
    print(dt)
    # -> 2015-03-08 01:59:00-05:00  # before the jump (wall(t0-eps))
    print(tz.normalize(dt + timedelta(minutes=1)))
    # -> 2015-03-08 03:00:00-04:00

Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
It looks incorrect. Here's the corresponding pytz code:

    from datetime import datetime
    import pytz

    tz = pytz.timezone('Europe/Kiev')
    print(tz.localize(datetime(1990, 7, 1, 1, 30), is_dst=False).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(tz.localize(datetime(1990, 7, 1, 1, 30), is_dst=True).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0400 MSD

See also "Enhance support for end-of-DST-like ambiguous time" [1]

[1] https://bugs.launchpad.net/pytz/+bug/1378150

`email.utils.localtime()` is broken:

    from datetime import datetime
    from email.utils import localtime

    print(localtime(datetime(1990, 7, 1, 1, 30)).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=0).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=1).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=-1).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST

Versions:

    $ ./python -V
    Python 3.5.0a3+
    $ dpkg -s tzdata | grep -i version
    Version: 2015b-0ubuntu0.14.04

"repeated hour" (time jumps back) can be treated like an end-of-DST transition, to resolve ambiguities [1].

On Wed, Apr 15, 2015 at 4:46 PM, Akira Li <4kir4.1i@gmail.com> wrote:
If you think there is a bug in email.utils.localtime - please open an issue at <bugs.python.org>.
I don't understand what you are complaining about. It is quite possible that pytz uses is_dst flag differently from the way email.utils.localtime uses isdst. I was not able to find a good description of what is_dst means in pytz, but localtime's isdst is documented as follows: a positive or zero value for *isdst* causes localtime to presume initially that summer time (for example, Daylight Saving Time) is or is not (respectively) in effect for the specified time. Can you demonstrate that email.utils.localtime does not behave as documented?

On Thu, Apr 16, 2015 at 1:14 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
Your question below suggests that you believe it is not a bug, i.e., `email.utils.localtime()` is broken *by design*, unless you think it is ok to ignore `+0400 MSD`. pytz works for me (I can get both `+0300 EEST` and `+0400 MSD`). I don't think `localtime()` can be fixed without the tz database. I don't know whether it should be fixed; let somebody else who can't use pytz pioneer the issue. The purpose of the code example is to **inform** that `email.utils.localtime()` fails (it returns only +0300 EEST) in this case:
No need to be so defensive about it. *""repeated hour" (time jumps back) can be treated like an end-of-DST transition, to resolve ambiguities [1]."* is just *an example* of how to fix the problem in the same way it is done in pytz:
Here's "summer time" in both cases, i.e., it is not a *true* end-of-DST transition (that is why I've used the word *"like"* above). If we ignore ambiguous times that may occur more than twice, then a boolean flag such as pytz's is_dst is *always* enough to resolve the ambiguity, assuming we have access to the tz database.

And yes, the example demonstrates that the behavior of pytz's is_dst and localtime()'s isdst is different. The example just shows that the current behavior of localtime() doesn't make it possible to get `+0400 MSD` (on my system, see the software versions above) and how to get it (*adopt* the pytz behavior -- you need zoneinfo for that), i.e., the message is a problem and a possible solution -- no complaints.

[1] https://bugs.launchpad.net/pytz/+bug/1378150

-- Akira.

On Fri, Apr 17, 2015 at 8:19 PM, Akira Li <4kir4.1i@gmail.com> wrote:
There is nothing "defensive" in my question. I simply don't understand what you are complaining about, other than that your code using pytz produces different results from some other code of yours using email.utils.localtime. If you think you found a bug in email.utils.localtime, please explain it without referring to a third-party library. It will also help if you do it at the bug tracker.

Sorry for a truncated message. Please scroll past the quoted portion. On Thu, Apr 9, 2015 at 10:21 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
would look like this in the zoneinfo database: $ zdump -v -c 1992 Europe/Kiev ... Europe/Kiev Sat Mar 24 22:59:59 1990 UTC = Sun Mar 25 01:59:59 1990 MSK isdst=0 Europe/Kiev Sat Mar 24 23:00:00 1990 UTC = Sun Mar 25 03:00:00 1990 MSD isdst=1 Europe/Kiev Sat Jun 30 21:59:59 1990 UTC = Sun Jul 1 01:59:59 1990 MSD isdst=1 Europe/Kiev Sat Jun 30 22:00:00 1990 UTC = Sun Jul 1 01:00:00 1990 EEST isdst=1 Europe/Kiev Sat Sep 28 23:59:59 1991 UTC = Sun Sep 29 02:59:59 1991 EEST isdst=1 Europe/Kiev Sun Sep 29 00:00:00 1991 UTC = Sun Sep 29 02:00:00 1991 EET isdst=0 ... Look what happened on July 1, 1990. At 2 AM, the clocks in Ukraine were moved back one hour. So times like 01:30 AM happened twice there on that day. Let's see how Python handles this situation $ TZ=Europe/Kiev python3
So far so good, I've got the first of the two 01:30AM's. But what if I want the other 01:30AM? Well,
localtime(datetime(1990,7,1,1,30), isdst=0).strftime('%c %z %Z') 'Sun Jul 1 01:30:00 1990 +0300 EEST'
gives me "the other 01:30AM", but it is counter-intuitive: I have to ask for the standard (winter) time to get the daylight savings (summer) time. The uncertainty about how to deal with the repeated hour was the reason why email.utils.localtime-like interface did not make it to the datetime module. The main objection to the isdst flag was that in most situations, determining whether DST is in effect is as hard as finding the UTC offset, so reducing the problem of finding the UTC offset to the one of finding the value for isdst does not solve much. I now realize that the problem is simply in the name for the flag. While we cannot often tell what isdst should be and in some situations the actual DST status does not differentiate between the two possible times, we can always say whether we want to get the first or the second time. In other words, instead of localtime(dt, isdst=-1), we may want localtime(dt, which=0) where "which" is used to resolve the ambiguity: "which=0" means return the first (in UTC order) of the two times and "which=1" means return the second. (In the non-ambiguous cases "which" is ignored.) An alternative solution would be make localtime(dt) return a list of 0, 1 or 2 instances, but this will probably make a common usage (the case when the user does not care which time she gets) more cumbersome.

On 9 Apr 2015 23:15, "Alexander Belopolsky" <alexander.belopolsky@gmail.com> wrote:
Sorry for a truncated message. Please scroll past the quoted portion.
On Thu, Apr 9, 2015 at 10:21 PM, Alexander Belopolsky <
alexander.belopolsky@gmail.com> wrote: diverting to PEP-size issues that are better suited for python ideas.
It actually took me a long time to understand that the "isdst" flag in this context related to the following chain of reasoning: 1. Due to various reasons, local time offsets relative to UTC may change, thus repeating certain subsets of local time 2. Repeated local times usually relate to winding clocks back an hour at the end of a DST period 3. "isdst=True" thus refers to "before the local time change winds the clocks back", while "isdst=False" refers to *after* the clocks are wound back As Alexander says, you can reduce the amount of assumed knowledge needed to understand the API by focusing on the ambiguity resolution directly without assuming that the *reason* for the ambiguity is "end of DST period". problem, and that timezone offset changes (whether historical or cyclical) mean that the mapping isn't 1:1 - some expressible local times never actually happen, while others happen more than once. For the normal APIs, NonExistentTimeError would then correspond with the case where the record lookup API returned no results, while the suggested "which" index would handle the two results case without assuming the repeated local time was specifically due to the end of a DST period. Regards, Nick.

On Fri, Apr 10, 2015 at 6:38 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
As Alexander says, you can reduce the amount of assumed knowledge needed to
understand the API by focusing on the ambiguity resolution directly without assuming that the *reason* for the ambiguity is "end of DST period".
This is an excellent summary of my original post. (It is in fact better than the post itself which I therefore did not include in the quote.) For the mathematically inclined, I can reformulate the problem as follows. For any given geographical location, loc, and a moment in time t expressed as UTC time, one can tell what time was shown on a "local clock-tower." This defines a function wall(loc, t). This function is a piece-wise linear function which may have regular or irregular discontinuities. Because of these discontinuities, an equation wall(loc, t) = lt may have 0, 1 or 2 solutions. The DST switchovers are an example of regular discontinuities. In most locations, they follow a somewhat predictable pattern with two discontinuities per year. Irregular discontinuities happen in locations with activist governments and don't follow any general rules. For most world locations past discontinuities are fairly well documented for at least a century and future changes are published with at least 6 months lead time.
The downside of this API is that naively written code is prone to crashes. Someone unaware of the invalid local times and not caring about the choice between ambiguities may write code like t = utc_times_from_local(lt)[0] which may work fine for many years before someone gets an IndexError and a backtrace in her server log.
The NonExistentTimeError has a similar problem as an API returning an empty list. Seeing NonExistentTimeError in a server log is not a big improvement over seeing an IndexError. Moreover, a program that rejects invalid times on input, but stores them for a long time may see its database silently corrupted after a zoneinfo update. Now it is time to make specific proposal. I would like to extend datetime.astimezone() method to work on naive datetime instances. Such instances will be assumed to be in local time and discontinuities will be handled as follows: 1. wall(t) == lt has a single solution. This is the trivial case and lt.astimezone(utc) and lt.astimezone(utc, which=i) for i=0,1 should return that solution. 2. wall(t) == lt has two solutions t1 and t2 such that t1 < t2. In this case lt.astimezone(utc) == lt.astimezone(utc, which=0) == t1 and lt.astimezone(utc, which=1) == t2. 3. wall(t) == lt has no solution. This happens when there is UTC time t0 such that wall(t0) < lt and wall(t0+epsilon) > lt (a positive discontinuity at time t0). In this case lt.astimezone(utc) should return t0 + lt - wall(t0). I.e., we ignore the discontinuity and extend wall(t) linearly past t0. Obviously, in this case the invariant wall(lt.astimezone(utc)) == lt won't hold. The "which" flag should be handled as follows: lt.astimezone(utc) == lt.astimezone(utc, which=0) and lt.astimezone(utc, which=0) == t0 + lt - wall(t0+eps). With the proposed features in place, one can use the naive code t = lt.astimezone(utc) and get predictable behavior in all cases and no crashes. A more sophisticated program can be written like this: t1 = lt.astimezone(utc, which=0) t2 = lt.astimezone(utc, which=1) if t1 == t2: t = t1 elif t2 > t1: # ask the user to pick between t1 and t2 or raise AmbiguousLocalTimeError else: t = t1 # warn the user that time was invalid and changed or raise InvalidLocalTimeError

Sorry to be brain dead here, but I'm a bit lost: On Fri, Apr 10, 2015 at 4:32 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
got it.
This is where I'm confused -- I can see how going from "wall" time ("local" time, etc) to UTC has 0, 1, or 2 solutions: One solution most of the time Zero solutions when we "spring forward" -- i.e. there is no 2:30 am on March 8, 2015 in the US timezones that use DST Two solutions when we "fall back", i.e. there are two 2:30 am Nov 1, 2015 in the US timezones that use DST But I can't see where there are multiple solutions the other way around -- doesn't a given UTC time map to one and only one "wall time" in a given timezone? Am I wrong, or is this a semantic question as to what "wall" time means? Thanks for any clarification, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Apr 13, 2015 at 1:24 PM, Chris Barker <chris.barker@noaa.gov> wrote:
You are right about what wall() means, but I should have been more explicit about knowns and unknowns in the wall(loc, t) = lt equation. In that equation I considered loc (the geographical place) and lt (the time on the clock tower) to be known and t (the universal (UTC) time) to be unknown. A solution to the equation is the value of the unknown (t) given the values of the knowns (loc and lt). The rest of your exposition is correct including "a given UTC time maps to one and only one 'wall time' in a given timezone." However, different UTC times may map to the same wall time and some expressible wall times are not results of a map of any UTC time.

On Mon, Apr 13, 2015 at 10:45 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
got it. I suggest you perhaps word it something like: wall_time = f( location, utc_time) and utc_time = f( location, utc_time ) These are two different problems, and one is much harder than the other! (though both are ugly!) you can, of course shoreten the names to "wall" and "utc" and "loc" if you like, but I kind of like long, readable names.. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Apr 13, 2015 at 2:05 PM, Chris Barker <chris.barker@noaa.gov> wrote:
You probably meant "utc_time = f( location, wall_time)" in the last equation, but that would still be wrong. A somewhat more correct equation would be utc_time = f^(-1)( location, wall_time) where f^(-1) is the inverse function of f, but since f in not monotonic, no such inverse exists. Finding the inverse of f is the same as solving the equation f(x) = y for any given y. If f is such that this equation has only one solution for all possible values of y then an inverse exists, but this is not so in our case.

On Mon, Apr 13, 2015 at 12:14 PM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
oops, yes.
but that would still be wrong.
A somewhat more correct equation would be
utc_time = f^(-1)( location, wall_time)
In this case I meant "f" as "a function of", so the two fs were not intended to be the same. Yes, one is the inverse of another, and in this case the inverse is not definable (at least not uniquely). I have no doubt you understand all this (better than I do), I'm just suggesting that in the discussion we find a way to be as clear as possible as to which function is being discussed when. But anyway -- thanks all for hashing this out -- getting something reasonable into datetime will be very nice. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
It is important to note that the different versions of the tz database may lead to different tzinfo (utc offset, tzname) even for *past* dates. i.e., (lt, tzid, isdst) is not enough because the result for (lt, tzid(2015b), isdst) may be different from (lt, tzid(X), isdst) where lt = local time e.g., naive datetime tzid = timezone from the tz database e.g., Europe/Kiev isdst = a boolean flag for disambiguation X != 2015b In other words, a fixed utc offset might not be sufficient even for past dates.
In pytz terms: `which = not isdst` (end-of-DST-like transition: isdst changes from True to False in the direction of utc time). It resolves AmbiguousTimeError raised by `tz.localize(naive, is_dst=None)`.
It is inconsistent with the previous case: here `which = isdst` but `which = not isdst` above. `lt.astimezone(utc, which=0) == t0 + lt - wall(t0+eps)` corresponds to: result = tz.normalize(tz.localize(lt, isdst=False)) i.e., `which = isdst` (t0 is at the start of DST and therefore isdst changes from False to True). It resolves NonExistentTimeError raised by `tz.localize(naive, is_dst=None)`. start-of-DST-like transition ("Spring forward"). For example, from datetime import datetime, timedelta import pytz tz = pytz.timezone('America/New_York') # 2am -- non-existent time print(tz.normalize(tz.localize(datetime(2015, 3, 8, 2), is_dst=False))) # -> 2015-03-08 03:00:00-04:00 # after the jump (wall(t0+eps)) print(tz.localize(datetime(2015, 3, 8, 3), is_dst=None)) # -> 2015-03-08 03:00:00-04:00 # same time, unambiguous # 2:01am -- non-existent time print(tz.normalize(tz.localize(datetime(2015, 3, 8, 2, 1), is_dst=False))) # -> 2015-03-08 03:01:00-04:00 print(tz.localize(datetime(2015, 3, 8, 3, 1), is_dst=None)) # -> 2015-03-08 03:01:00-04:00 # same time, unambiguous # 2:59am non-existent time dt = tz.normalize(tz.localize(datetime(2015, 3, 8, 2, 59), is_dst=True)) print(dt) # -> 2015-03-08 01:59:00-05:00 # before the jump (wall(t0-eps)) print(tz.normalize(dt + timedelta(minutes=1))) # -> 2015-03-08 03:00:00-04:00

Alexander Belopolsky <alexander.belopolsky@gmail.com> writes:
It looks incorrect. Here's the corresponding pytz code:

    from datetime import datetime
    import pytz

    tz = pytz.timezone('Europe/Kiev')
    print(tz.localize(datetime(1990, 7, 1, 1, 30), is_dst=False).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(tz.localize(datetime(1990, 7, 1, 1, 30), is_dst=True).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0400 MSD

See also "Enhance support for end-of-DST-like ambiguous time" [1]

[1] https://bugs.launchpad.net/pytz/+bug/1378150

`email.utils.localtime()` is broken:

    from datetime import datetime
    from email.utils import localtime

    print(localtime(datetime(1990, 7, 1, 1, 30)).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=0).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=1).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST
    print(localtime(datetime(1990, 7, 1, 1, 30), isdst=-1).strftime('%c %z %Z'))
    # -> Sun Jul 1 01:30:00 1990 +0300 EEST

Versions:

    $ ./python -V
    Python 3.5.0a3+
    $ dpkg -s tzdata | grep -i version
    Version: 2015b-0ubuntu0.14.04

"repeated hour" (time jumps back) can be treated like an end-of-DST transition, to resolve ambiguities [1].

On Wed, Apr 15, 2015 at 4:46 PM, Akira Li <4kir4.1i@gmail.com> wrote:
If you think there is a bug in email.utils.localtime - please open an issue at <bugs.python.org>.
I don't understand what you are complaining about. It is quite possible that pytz uses is_dst flag differently from the way email.utils.localtime uses isdst. I was not able to find a good description of what is_dst means in pytz, but localtime's isdst is documented as follows: a positive or zero value for *isdst* causes localtime to presume initially that summer time (for example, Daylight Saving Time) is or is not (respectively) in effect for the specified time. Can you demonstrate that email.utils.localtime does not behave as documented?

On Thu, Apr 16, 2015 at 1:14 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
Your question below suggests that you believe it is not a bug, i.e., `email.utils.localtime()` is broken *by design*, unless you think it is ok to ignore `+0400 MSD`. pytz works for me (I can get both `+0300 EEST` and `+0400 MSD`). I don't think `localtime()` can be fixed without the tz database. I don't know whether it should be fixed; let somebody else who can't use pytz pioneer the issue. The purpose of the code example is to **inform** that `email.utils.localtime()` fails (it returns only +0300 EEST) in this case:
No need to be so defensive about it. *""repeated hour" (time jumps back) can be treated like an end-of-DST transition, to resolve ambiguities [1]."* is just *an example* of how to fix the problem in the same way it is done in pytz:
Here's "summer time" in both cases, i.e., it is not a *true* end-of-DST transition (that is why I've used the word *"like"* above). If we ignore ambiguous times that may occur more than twice, then a boolean flag such as pytz's is_dst is *always* enough to resolve the ambiguity, assuming we have access to the tz database. And yes, the example demonstrates that the behavior of pytz's is_dst and localtime()'s isdst is different. The example just shows that the current behavior of localtime() doesn't allow getting `+0400 MSD` (on my system, see the software versions above) and how to get it (*adopt* the pytz behavior -- you need zoneinfo for that), i.e., the message is a problem and a possible solution -- no complaints.

[1] https://bugs.launchpad.net/pytz/+bug/1378150

-- Akira.
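To make the "a flag is enough once you have the transition data" point concrete without any third-party library, here is a minimal sketch. It assumes the single 1990-07-01 Europe/Kiev transition, with offsets copied by hand from the zdump output quoted at the start of the thread. The `candidates` and `resolve` helpers and the `which` argument are hypothetical names used only for illustration; they are not an existing datetime or pytz API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: one hardcoded transition taken from the quoted zdump output
# (MSD = UTC+4 before 1990-06-30 22:00 UTC, EEST = UTC+3 from then on).
TRANSITION_UTC = datetime(1990, 6, 30, 22, 0, tzinfo=timezone.utc)
BEFORE = timezone(timedelta(hours=4), "MSD")
AFTER = timezone(timedelta(hours=3), "EEST")


def candidates(naive):
    """Return every aware datetime whose wall clock reads `naive`, in UTC order."""
    found = []
    for tz in (BEFORE, AFTER):
        aware = naive.replace(tzinfo=tz)
        # Keep this reading only if `tz` is really in effect at that instant.
        in_effect = BEFORE if aware < TRANSITION_UTC else AFTER
        if in_effect is tz:
            found.append(aware)
    return sorted(found)


def resolve(naive, which=0):
    """Hypothetical helper: first (which=0) or second (which=1) reading in UTC order."""
    found = candidates(naive)
    if not found:
        raise ValueError("non-existent local time")
    # `which` is ignored when the local time is unambiguous.
    return found[min(which, len(found) - 1)]


lt = datetime(1990, 7, 1, 1, 30)
print(resolve(lt, which=0).strftime("%z %Z"))  # +0400 MSD
print(resolve(lt, which=1).strftime("%z %Z"))  # +0300 EEST
```

Both readings are summer time here, so the flag selects by UTC order instead of by DST status, which is the distinction this thread is about.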

On Fri, Apr 17, 2015 at 8:19 PM, Akira Li <4kir4.1i@gmail.com> wrote:
There is nothing "defensive" in my question. I simply don't understand what you are complaining about, other than that your code using pytz produces different results from some other code of yours using email.utils.localtime. If you think you found a bug in email.utils.localtime, please explain it without referring to a third-party library. It will also help if you do it at the bug tracker.
participants (5)
- Akira Li
- Alexander Belopolsky
- Chris Barker
- MRAB
- Nick Coghlan