Re: [Python-Dev] Status on PEP-431 Timezones
On 15-04-15, Akira Li <4kir4.1i@gmail.com> wrote:
Isaac Schwabacher <ischwabacher@wisc.edu> writes:
On 15-04-15, Akira Li <4kir4.1i@gmail.com> wrote:
Isaac Schwabacher <ischwabacher@wisc.edu> writes:
...
I know that you can do datetime.now(tz), and you can do datetime(2013, 11, 3, 1, 30, tzinfo=zoneinfo('America/Chicago')), but not being able to add a time zone to an existing naive datetime is painful (and strptime doesn't even let you pass in a time zone).
`.now(tz)` is correct. `datetime(..., tzinfo=tz)` is wrong: if tz is a pytz timezone then you may get a wrong tzinfo (LMT); you should use `tz.localize(naive_dt, is_dst=False|True|None)` instead.
The whole point of this thread is to finalize PEP 431, which fixes the problem for which `localize()` and `normalize()` are workarounds. When this is done, `datetime(..., tzinfo=tz)` will be correct.
ijs
The input time is ambiguous. Even if we assume PEP 431 is implemented in some form, your code is still missing the is_dst parameter (or its analog). PEP 431 won't fix it; it can't resolve the ambiguity by itself. Notice the is_dst parameter in the `tz.localize()` call (current API).
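The ambiguity being pointed at here can be shown with nothing but stdlib fixed-offset zones; the offsets below are America/Chicago's values around its 2013 fall-back, chosen for illustration (this is a sketch of the underlying problem, not pytz's API):

```python
from datetime import datetime, timedelta, timezone

# America/Chicago's two offsets around the 2013-11-03 fall-back transition.
cdt = timezone(timedelta(hours=-5))  # Central Daylight Time
cst = timezone(timedelta(hours=-6))  # Central Standard Time

# 01:30 local occurs twice that night, so the naive datetime alone is
# ambiguous: the same wall-clock reading maps to two distinct UTC instants.
naive = datetime(2013, 11, 3, 1, 30)
as_dst = naive.replace(tzinfo=cdt).astimezone(timezone.utc)
as_std = naive.replace(tzinfo=cst).astimezone(timezone.utc)
print(as_dst)  # 2013-11-03 06:30:00+00:00
print(as_std)  # 2013-11-03 07:30:00+00:00
```

The hour between the two results is exactly the information the is_dst flag supplies.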
...yeah, I forgot to throw that in there. It was supposed to be there all along. Nothing to see here, move along.
.now(tz) works even during end-of-DST transitions (current API) when the local time is ambiguous.
I know that. That's what I was complaining about: I was trying to talk about how astimezone() was going to be inadequate even after the PEP was implemented because it couldn't turn naive datetimes into aware ones, and people were giving examples that started with aware datetimes generated by now(tz), which completely went around the point I was trying to make. But it looks like astimezone() is going to grow an is_dst parameter, and everything will be OK. ijs
Sorry for reviving this thread, but I was cornered at EuroPython with a question about the status of this PEP. It seems the proposal ran out of steam and has now missed the Python 3.5 train. What happened? Is the problem unsolvable? Or could we get this into 3.6???
On Thu, Jul 23, 2015 at 5:07 PM, Guido van Rossum <guido@python.org> wrote:
Sorry for reviving this thread, but I was cornered at EuroPython with a question about the status of this PEP. It seems the proposal ran out of steam and has now missed the Python 3.5 train. What happened? Is the problem unsolvable? Or could we get this into 3.6???
It turns out it's very complex to solve this when internally storing the time as the local time. Basically you have to normalize the time (ie check if daylight savings have changed) when doing arithmetic, but normalize is doing arithmetic, and you get infinite recursion. I've tried various ways to solve this but ran out of steam/brainpower. I think we should look into storing as UTC internally instead. It's a big change (and also needs handling pickles in a backwards compatible way) but I think that's the way forward. //Lennart
Can you update the PEP with a small note about this and update the status to Postponed? Switching to UTC is a big change indeed. Or could you leave this issue unsolved and still make progress with the tz database? In any case, new discussion should then go back to python-ideas.

On Thu, Jul 23, 2015 at 6:22 PM, Lennart Regebro <regebro@gmail.com> wrote:
On Thu, Jul 23, 2015 at 5:07 PM, Guido van Rossum <guido@python.org> wrote:
Sorry for reviving this thread, but I was cornered at EuroPython with a question about the status of this PEP. It seems the proposal ran out of steam and has now missed the Python 3.5 train. What happened? Is the problem unsolvable? Or could we get this into 3.6???
It turns out it's very complex to solve this when internally storing the time as the local time. Basically you have to normalize the time (ie check if daylight savings have changed) when doing arithmetic, but normalize is doing arithmetic, and you get infinite recursion. I've tried various ways to solve this but ran out of steam/brainpower.
I think we should look into storing as UTC internally instead. It's a big change (and also needs handling pickles in a backwards compatible way) but I think that's the way forward.
//Lennart
--
--Guido van Rossum (python.org/~guido)
On Thu, Jul 23, 2015 at 12:22 PM, Lennart Regebro <regebro@gmail.com> wrote:
It turns out it's very complex to solve this when internally storing the time as the local time. Basically you have to normalize the time (ie check if daylight savings have changed) when doing arithmetic, but normalize is doing arithmetic, and you get infinite recursion.
This is not true. Tim's analysis, immortalized [1] at the end of the datetime.py file, shows that the UTC to local mapping can be unambiguously recovered from the local to UTC rules using a simple finite algorithm. Tim assumes [2] that the standard (non-DST) time offset is constant throughout history, but this requirement can be relaxed to the offset changing no more than once in any 48-hour period (if you generously allow timezones from -24 to 24 hours). Actually, it looks like I am repeating what I wrote back in April, so I'll stop with a reference [3] to that post.

[1]: https://hg.python.org/cpython/file/v3.5.0b1/Lib/datetime.py#l1935
[2]: https://hg.python.org/cpython/file/v3.5.0b1/Lib/datetime.py#l1948
[3]: https://mail.python.org/pipermail/python-dev/2015-April/139171.html
Well, I was going to stay silent, but math is something I can do without wasting anyone's time or embarrassing myself. I don't think this mail answers Lennart's concerns, but I do want to get it out there to compete with the comment in `datetime.py`. I apologize if the LaTeX density is too high; I don't trust that my e-mail client would transmit the message faithfully were I to render it myself.

I disagree with the view Tim had of time zones when he wrote that comment (and that code). It sounds like he views US/Eastern and US/Central as time zones (which they are), but thinks of the various America/Indiana zones as switching back and forth between them, rather than being time zones in their own right. I think the right perspective is that a time zone *is* the function that its `fromutc()` method implements, although of course we need additional information in order to actually compute (rather than merely mathematically define) its inverse. Daylight Saving Time is a red herring, and assumptions 2 and 4 in that exposition are just wrong from this point of view. In the worst case, Asia/Riyadh's two years of solar time completely shatter these assumptions.

I'm convinced that the right viewpoint on this is to view local time and UTC time each as isomorphic to $\RR$ (i.e., effectively as UNIX timestamps, minus the oft-violated guarantee that timestamps are in UTC), and to consider the time zone as \[ fromutc : \RR \to \RR. \] (Leap seconds are a headache for this perspective, but it can still support them with well-placed epicycles.) Then our assumptions (inspired by zoneinfo) about the nature of this map are as follows:

* $fromutc$ is piecewise defined, with each piece being continuous and strictly monotonic increasing. Let us call the set of discontinuities $\{ utc_i \in \RR | i \in \ZZ \}$, where the labels are in increasing order, and define $fromutc_i$ to be the $i$-th piece. (The theoretical treatment doesn't suffer if there are only finitely many discontinuities, since we can place additional piece boundaries at will where no discontinuities exist; obviously, an implementation would not take this view.)

* The piece $fromutc_i : [utc_i, utc_{i+1}) \to [local_{start, i}, local_{end, i})$ and its inverse, which we will call $fromlocal_i$, are both readily computable. In particular, this means that $local_{start, i} = fromutc(utc_i)$ and $local_{end, i}$ is the limit of $fromutc(t)$ as $t$ approaches $utc_{i+1}$ from the left, and that these values are known. Note that the tzfile(5) [1] and zic(8) [2] formats both assume that $fromutc_i$ is of the form $t \mapsto t + off_i$, where $off_i$ is a constant. This assumption is true in practice, but is stronger than we actually need.

* The sequences $\{ local_{start, i} | i \in \ZZ \}$ and $\{ local_{end, i} | i \in \ZZ \}$ are strictly increasing, and $local_{end, i-1} < local_{start, i+1}$ for all $i \in \ZZ$. This final condition is enough to guarantee that the preimage of any local time under $fromutc$ contains at most two UTC times. This assumption would be violated if, for example, some jurisdiction decided to fall back two hours by falling back one hour and then immediately falling back a second hour. I recommend the overthrow of any such jurisdiction and its annexation by the Netherlands [3].

Without the third assumption, it's impossible to specify a UTC time by a (local time, time zone, DST flag) triple, since there may be more than two UTC times corresponding to the same local time, and computing $fromlocal$ becomes more complicated; but the problem can still be solved by replacing the DST flag by an index into the preimage. (Lennart, I think this third assumption is the important part of your "no changes within 48 hours of each other" assumption, which is violated by Asia/Riyadh. Is it enough?)

Once we take this view, computing $fromutc(t)$ is trivial: find $i$ with $utc_i \le t < utc_{i+1}$ by binary search (presumably optimized to an $O(1)$ average case by using a good initial guess), and compute $fromutc_i(t)$.

Computing $fromlocal(t)$ is somewhat more difficult. The first thing to address is that, as written, $fromlocal$ is not a function; in order to make it one, we need to pass it more information. We could define $fromlocal(t, i) = fromlocal_i(t)$, but that's too circular to be useful. Likewise with my (silly) earlier proposal to store $(local, offset)$ pairs: then $fromlocal(t, off) = t - off$. What we really need is a (partial, which is better than multi-valued!) function $fromlocal : \RR \times \{True, False\} \to \RR$ that takes a local time and a DST flag and returns a UTC time. We define $fromlocal(local, flag)$ to be the first $utc \in \RR$ such that $fromutc(utc) = local$ when $flag$ is $True$ and the last such $utc$ when $flag$ is $False$. (Our implementation will presumably also allow $flag$ to be $None$, in which case we require $utc$ to be unique.)

To compute $fromlocal$, we'll begin with a lemma: if $t < utc_i$, then $fromutc(t) < local_{end, i-1}$; and likewise, if $utc_i \le t$, then $local_{start, i} \le fromutc(t)$. For the first part: if $t < utc_i$, then we must have $utc_j \le t < utc_{j+1}$ for some $j \le i-1$, so $local_{start, j} \le fromutc(t) < local_{end, j} \le local_{end, i-1}$ because $local_{end, i}$ is increasing in $i$. The proof of the second part is similar, using the fact that $local_{start, i}$ is increasing.

Now we compute $fromlocal(t, True)$ by finding the minimal $i$ such that $t < local_{end, i}$. Then for $s < utc_i$ we must have by the lemma that $fromutc(s) < local_{end, i-1} \le t$, so $s < fromlocal(t, True)$ if the latter exists. For $s \ge utc_i$ the lemma gives us $fromutc(s) \ge local_{start, i}$, so if $t < local_{start, i}$ we see that there is no $s$ with $fromutc(s) = t$, so $fromlocal(t, True)$ is undefined (which we will implement as a NonexistentTimeError). If on the other hand $t \ge local_{start, i}$, we have $t \in [local_{start, i}, local_{end, i})$, so that $fromutc(fromlocal_i(t)) = t$. Because $fromlocal_i$ is monotonic on this interval, $fromlocal(t, True) = fromlocal_i(t)$ is minimal.

An analogous argument establishes that if $i$ is maximal with $local_{start, i} \le t$, then $fromlocal(t, False) = fromlocal_i(t)$ if $t < local_{end, i}$, and $fromlocal(t, False)$ is undefined otherwise. All of these computations can be accomplished by searches of ordered lists and applications of $fromlocal_i$. We can also include the option for $fromlocal(t, None)$ to require uniqueness by computing $fromlocal(t, True)$ and verifying that $t < local_{start, i+1}$, where $i$ is as in the computation of $fromlocal(t, True)$, raising an AmbiguousTimeError if this condition is not met.

Notice that the definition of time zone that I've given here does not mention Daylight Saving Time. This isn't a problem in most cases, because ambiguity happens almost exclusively at "fall back" transitions, in which the first period is DST and the second period is STD. I argue that the rare ambiguities for which this does not hold are best resolved by divorcing our DST flag from Daylight Saving Time; that is, by defining the flag to mean "choose the first ambiguous time"; otherwise, we fail to handle jurisdictional transitions not involving DST.

With this perspective, arithmetic becomes "translate to UTC, operate, translate back", which is as it should be. The only arithmetic that is performed outside of this framework is in the implementation of $fromutc_i$ and $fromlocal_i$, which operate on naive timestamps.
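The algorithm described above can be sketched in a few lines of Python. The transition table below is a hypothetical two-piece zone modeled on US/Central's 2013 fall-back, not real tzdata, and the error strings stand in for the NonexistentTimeError/AmbiguousTimeError classes mentioned in the text:

```python
import bisect

# Hypothetical two-piece zone modeled on US/Central's 2013-11-03 fall-back.
# Each entry is (utc_i, off_i): from utc_i onward, local = utc + off_i.
TRANSITIONS = [
    (float("-inf"), -5 * 3600),  # the "DST" piece (UTC-5)
    (1383462000.0, -6 * 3600),   # 2013-11-03 07:00 UTC: fall back to UTC-6
]
UTCS = [u for u, _ in TRANSITIONS]
SENTINEL = float("inf")

def fromutc(t):
    # Find i with utc_i <= t < utc_{i+1} by binary search, apply fromutc_i.
    i = bisect.bisect_right(UTCS, t) - 1
    return t + TRANSITIONS[i][1]

def fromlocal(local, first):
    # Collect every utc whose image under fromutc is `local`; under the
    # third assumption above there are at most two, in increasing order.
    hits = []
    for i, (utc_i, off) in enumerate(TRANSITIONS):
        utc_next = TRANSITIONS[i + 1][0] if i + 1 < len(TRANSITIONS) else SENTINEL
        cand = local - off  # invert the affine piece fromutc_i
        if utc_i <= cand < utc_next:
            hits.append(cand)
    if not hits:
        raise ValueError("NonexistentTimeError")  # local time in a gap
    if first is None and len(hits) > 1:
        raise ValueError("AmbiguousTimeError")    # flag=None requires uniqueness
    return hits[0] if first else hits[-1]

# Local 2013-11-03 01:30 (as a naive local timestamp) is in the repeated hour:
u = 1383442200.0
assert fromlocal(u, True) == 1383460200.0   # 06:30 UTC, first preimage
assert fromlocal(u, False) == 1383463800.0  # 07:30 UTC, last preimage
assert fromutc(fromlocal(u, True)) == u
```

With real tzdata the transition list is longer and the search benefits from a good initial guess, but the shape of the computation is the same.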
But IIUC what Lennart is complaining about is the fact that the DST flag isn't part of and can't be embedded into a local time, so it's impossible to fold the second parameter to $fromlocal$ into $t$. Without that, a local time isn't rich enough to designate a single point in time, and the whole edifice breaks.

ijs

[1]: http://linux.die.net/man/5/tzfile
[2]: http://linux.die.net/man/8/zic
[3]: https://what-if.xkcd.com/53/

Top-posted from Microsoft Outlook Web App; may its designers be consigned for eternity to that circle of hell in which their dog food is consumed.
On 25/07/2015 00:06, ISAAC J SCHWABACHER wrote:

I got to "Daylight Saving Time is a red herring," and stopped reading.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
[ISAAC J SCHWABACHER <ischwabacher@wisc.edu>]
... I disagree with the view Tim had of time zones when he wrote that comment (and that code). It sounds like he views US/Eastern and US/Central as time zones (which they are), but thinks of the various America/Indiana zones as switching back and forth between them, rather than being time zones in their own right.
You can think of them any way you like. The point of the code was to provide a simple & efficient way to convert from UTC to local time in all "time zones" in known actual use at the time; the point of the comment was to explain the limitations of the code. Although, as Alexander noted, the stated assumptions are stronger than needed.
I think the right perspective is that a time zone *is* the function that its `fromutc()` method implements,
Fine by me ;-)
although of course we need additional information in order to actually compute (rather than merely mathematically define) its inverse. Daylight Saving Time is a red herring,
Overstated. DST is in fact the _only_ real complication in 99.99% of time zones (perhaps even 99.9913% ;-) ). As the docs say, if you have some crazy-ass time zone in mind, fine, that's why fromutc() was exposed (so your crazy-ass tzinfo class can override it).
and assumptions 2 and 4
Nitpick: 4 is a consequence of 2, not an independent assumption.
in that exposition are just wrong from this point of view.
As above, there is no particular POV in this code: just a specific fromutc() implementation, comments that explain its limitations, and an invitation in the docs to override it if it's not enough for your case.
In the worst case, Asia/Riyadh's two years of solar time completely shatter these assumptions.
Sure. But, honestly, who cares? Riyadh Solar Time was so off-the-wall that even the Saudis gave up on it 25 years ago (after a miserable 3-year experiment with it). "Practicality beats purity".
[eliding a more-general view of what time zones "really" are]
I'm not eliding it because I disagree with it, but because time zones are political constructions. "The math" we make up may or may not be good enough to deal with all future political abominations; for example:
... This assumption would be violated if, for example, some jurisdiction decided to fall back two hours by falling back one hour and then immediately falling back a second hour. I recommend the overthrow of any such jurisdiction and its (annexation by the Netherlands)[3].
That's not objectively any more bizarre than Riyadh Solar Time. Although, if I've lived longer than you, I may be more wary about the creative stupidity of political schemes ;-)
... (Lennart, I think this third assumption is the important part of your "no changes within 48 hours of each other" assumption,
The "48 hours" bit came from Alexander. I'm personally unclear on what Lennart's problems are.
... All of these computations can be accomplished by searches of ordered lists and applications of $fromlocal_i$.
Do you have real-world use cases in mind beyond supporting long-abandoned Riyadh Solar time?
... With this perspective, arithmetic becomes "translate to UTC, operate, translate back", which is as it should be.
There _was_ a POV in the datetime design about that: no, that's not how it should be. Blame Guido ;-) If I add, say, 24 hours to noon today, I want to get noon tomorrow, and couldn't care less whether DST started or stopped (or any other political adjustment was made) in between. For that reason, it was wholly intentional that datetime + timedelta treats datetime as "naive". If that's not what someone wants, fine, but then they don't want Python's datetime arithmetic. BTW, there's no implication that they're "wrong" for wanting something different; what would be wrong is insisting that datetime's POV is "wrong". Both views are valid and useful, depending on the needs of the application. One had to be picked as the built-in behavior, and "naive" won.
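The naive-arithmetic behavior described above is easy to check with stdlib datetimes; the dates below are chosen to straddle the 2013 US fall-back and are purely illustrative:

```python
from datetime import datetime, timedelta

# Naive arithmetic never consults a time zone: noon plus 24 hours is noon
# tomorrow, even though US DST ended overnight on 2013-11-03 and 25 hours
# of real time elapsed on the wall in Chicago.
noon = datetime(2013, 11, 2, 12, 0)
later = noon + timedelta(hours=24)
assert later == datetime(2013, 11, 3, 12, 0)
```

An application that wants elapsed-time semantics instead would convert to UTC, add, and convert back, which is exactly the alternative POV being debated here.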
... But IIUC what Lennart is complaining about
I don't, and I wish he would be more explicit about what "the problem(s)" is(are).
is the fact that the DST flag isn't part of and can't be embedded into a local time, so it's impossible to fold the second parameter to $fromlocal$ into $t$. Without that, a local time isn't rich enough to designate a single point in time and the whole edifice breaks.
You can blame Guido for that too ;-) , but in this case I disagree(d) with him: Guido was overly (IMO) annoyed that the only apparent purpose for a struct tm's tm_isdst flag was to disambiguate local times in a relative handful of cases. His thought: an entire bit just for that?! My thought: get over it, it's one measly bit. my-kingdom-for-bit-ingly y'rs - tim
On Fri, Jul 24, 2015 at 9:39 PM, Tim Peters <tim.peters@gmail.com> wrote:
But IIUC what Lennart is complaining about
I don't, and I wish he would be more explicit about what "the problem(s)" is(are).
is the fact that the DST flag isn't part of and can't be embedded into a local time, so it's impossible to fold the second parameter to $fromlocal$ into $t$. Without that, a local time isn't rich enough to designate a single point in time and the whole edifice breaks.
You can blame Guido for that too ;-) , but in this case I disagree(d) with him: Guido was overly (IMO) annoyed that the only apparent purpose for a struct tm's tm_isdst flag was to disambiguate local times in a relative handful of cases. His thought: an entire bit just for that?! My thought: get over it, it's one measly bit.
IIUC, Lennart came to (a wrong, IMHO) conclusion that one bit is not enough and you must either keep datetime in UTC or store the UTC offset with the datetime. My position is that one bit is enough to disambiguate local time in all sane situations, but the name "isdst" is misleading, because discontinuities in the UTC to local function (from now on called L(t)) may be due to causes other than DST transitions.

The math here is very simple: there are two kinds of discontinuities: you either move the local clock forward by a certain amount or you move it back. Let's call these (unimaginatively) discontinuities of the first and second kind.

When you have a discontinuity of the first kind, you have a range of values u for which the equation u = L(t) has no solution for t. However, if we linearly extrapolate L(t) from before the discontinuity forward, we get a linear function Lb(t), and we can solve u = Lb(t) for any value of u. The problem, however, is that we can also extend L(t) linearly from the time after the discontinuity to all times and get another function La(t), which will also allow you to solve the equation u = La(t) for all times. Without user input, there is no way to tell which solution the user expects. This is the one bit of information that we need.

The situation with a discontinuity of the second kind is similar, but even simpler. Here, u = L(t) has two solutions and we need one bit of information to disambiguate them.
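The La(t)/Lb(t) construction above can be made concrete with a toy L(t) that has a single discontinuity of the second kind; the timestamps are illustrative, modeled on the 2013-11-03 US/Central fall-back:

```python
# A toy L(t) with one discontinuity of the second kind (clock moved back).
T0 = 1383462000                  # 2013-11-03 07:00 UTC, the discontinuity

def L(t):                        # UTC -> local; jumps back one hour at T0
    return t - 5 * 3600 if t < T0 else t - 6 * 3600

def Lb(t):                       # the "before" piece extended to all times
    return t - 5 * 3600

def La(t):                       # the "after" piece extended to all times
    return t - 6 * 3600

u = 1383442200                   # local 01:30, inside the repeated hour
t_before = u + 5 * 3600          # solves u == Lb(t)
t_after = u + 6 * 3600           # solves u == La(t)

# Both are genuine preimages of u under L; one bit picks between them.
assert L(t_before) == u and L(t_after) == u
assert t_before < T0 <= t_after
```

For a discontinuity of the first kind, the same two extrapolations yield the two candidate interpretations of a nonexistent local time, so a single bit again suffices.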
And I would want to remind everyone again that this is not a question of the problem being impossible. It's just really complex to get right in all cases, and always having the UTC timestamp around gets rid of most of that complexity.
[Lennart Regebro <regebro@gmail.com>]
And I would want to remind everyone again that this is not a question of the problem being impossible. It's just really complex to get right in all cases, and that always having the UTC timestamp around gets rid of most of that complexity.
Could you please be explicit about what "the problem" is? Everyone here is guessing at what you think "the problem" is.
On Sat, Jul 25, 2015 at 7:12 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Lennart Regebro <regebro@gmail.com>]
And I would want to remind everyone again that this is not a question of the problem being impossible. It's just really complex to get right in all cases, and that always having the UTC timestamp around gets rid of most of that complexity.
Could you please be explicit about what "the problem" is? Everyone here is guessing at what you think "the problem" is.
The problem is that it is exceedingly complicated to get all the calculations back and forth between local time and UTC to be correct at all times and for all cases. It really doesn't get more specific than that. I don't remember which exact problem it was that made me decide that this was not the correct solution and that we should use UTC internally, but I don't think that matters, because I'm also sure that it was not the last case, as I was far from the end in adding test cases.

Once again, I'm sure it's not impossible to somehow come up with an implementation and an API that can do this based on local time, but once again I am of the opinion that it is the wrong thing to do. We should switch to using UTC internally, because that will make everything so much simpler. I am in no way against other people implementing this PEP, but I think you will end up with very complex code that will be hard to maintain.

There really is a reason every other datetime implementation I know of uses UTC internally, and there really is a reason why everyone always recommends storing datetimes in UTC with the time zone or offset separately.

//Lennart
On Sat, Jul 25, 2015 at 2:40 AM, Lennart Regebro <regebro@gmail.com> wrote:
There really is a reason every other date time implementation I know of uses UTC internally, and there really is a reason why everyone always recommends storing date times in UTC with the time zone or offset separately.
Current datetime design does not prevent your application from storing date-times in UTC. You can store them in naive datetime instances, but the recommended approach is to use datetime instances with tzinfo=timezone.utc.
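A minimal sketch of the recommended approach, using only the stdlib (the timestamp is arbitrary):

```python
from datetime import datetime, timedelta, timezone

# An aware datetime stored in UTC via the stdlib's fixed timezone.utc,
# rather than a naive datetime that merely promises to be UTC.
dt = datetime(2015, 7, 25, 14, 40, tzinfo=timezone.utc)
assert dt.utcoffset() == timedelta(0)
assert dt.isoformat() == "2015-07-25T14:40:00+00:00"
```

The local-time rendering for display is then a separate, per-zone concern applied at the edges of the application.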
On Jul 25, 2015, at 09:15, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Sat, Jul 25, 2015 at 2:40 AM, Lennart Regebro <regebro@gmail.com> wrote: There really is a reason every other date time implementation I know of uses UTC internally, and there really is a reason why everyone always recommends storing date times in UTC with the time zone or offset separately.
Current datetime design does not prevent your application from storing date-times in UTC. You can store them in naive datetime instances, but the recommended approach is to use datetime instances with tzinfo=timezone.utc.
Yes, and now he wants to do the same thing for the internals of the datetime module, for the same reasons that it's the best thing most everywhere else. It's just going to take some significant effort to make that happen.
[Lennart Regebro <regebro@gmail.com>]
And I would want to remind everyone again that this is not a question of the problem being impossible. It's just really complex to get right in all cases, and that always having the UTC timestamp around gets rid of most of that complexity.
[Tim]
Could you please be explicit about what "the problem" is? Everyone here is guessing at what you think "the problem" is.
[Lennart]
The problem is that it is exceedingly complicated to get all the calculations back and forth between local time and UTC to be correct at all times and for all cases. It really doesn't get more specific than that. I don't remember which exact problem it was that made me decide that this was not the correct solution and that we should use UTC internally, but I don't think that matters, because I'm also sure that it was not the last case, as I was far from near the end in adding testcases.
I believe everyone here is saying it "shouldn't be" exceedingly complicated, or even particularly hard, if you add the is_dst flags the PEP says it would add. But is the PEP complete? Under the "Postponement" section, it says:

    The implementation has turned out to be exceedingly complex, due to having to convert back and forth between the local time and UTC during arithmetic and adjusting the DST for each arithmetic step, with ambiguous times being particularly hard to get right.

However, the _body_ of the PEP said nothing whatsoever about altering arithmetic. The body of the PEP sounds like it's mainly just proposing to fold the pytz package into the core. Perhaps doing _just_ that much would get this project unstuck? Hope springs eternal :-)
Once again I'm sure it's not impossible to somehow come up with an implementation and an API that can do this based on local time, but once again I am of the opinion that it is the wrong thing to do. We should switch to using UTC internally, because that will make everything so much simpler.
Like what? I'm still looking for a concrete example of what "the problem" is (or even "a" problem).
I am in no way against other people implementing this PEP, but I think you will end up with very complex code that will be hard to maintain.
Somebody first needs to define what "the problem" is ;-)
There really is a reason every other date time implementation I know of uses UTC internally,
Yes, but the fundamental reason datetime does not is that Guido consciously and deliberately decided that "naive datetime" would be most useful most often for most users. That's why "naive" objects are the default. And even for "aware" objects, arithmetic staying within a single time zone was deliberately specified to be "naive" too. My guess is that all other datetime implementations you know of have no concept of "naive" datetimes, let alone make naive datetimes primary. Small wonder, if so, that they're all different in this way. That's a design decision not everyone likes, and certainly isn't suitable for all purposes, but the debate over that ended a dozen years ago when the decision was made. If your vision of PEP 431 _changes_ that design decision (which it sure _sounds_ like it wants to based on what you're typing here, but which PEP 431 itself does not appear to say - impossible to tell which from here without any specific example(s)), that may account for all sorts of complications that aren't apparent to me.
and there really is a reason why everyone always recommends storing date times in UTC with the time zone or offset separately.
Well, that's the second thing they recommend - and they can already do that. The first thing to recommend is to use naive objects in any application where that's possible, so that you don't have to bother with _any_ time zone esoterica, surprises, complications or overheads. After all, it's 7:54 PM as I type this, and that's perfectly clear to me ;-)
At the risk of being off-topic, I realize I really DO NOT currently understand datetime in its current incarnation. It's too bad PEP 431 proves so difficult to implement.

Even using `pytz`, is there any way currently to get sensible answers to, e.g.:

    from datetime import *
    from pytz import timezone
    pacific = timezone('US/Pacific')
    pacific.localize(datetime(2015, 11, 1, 1, 30))  # Ambiguous time
    pacific.localize(datetime(2015, 3, 8, 2, 30))   # Non-existent time

That is, what if I had *not* just looked up when the time change happens, and was innocently trying to define one of those datetimes above? Is there ANY existing way to have an error raised—or check in some other way—for the fact that one of the times occurs twice on my clock, and the other never occurs at all?

On Sat, Jul 25, 2015 at 8:31 PM, Lennart Regebro <regebro@gmail.com> wrote:
On Sun, Jul 26, 2015 at 2:56 AM, Tim Peters <tim.peters@gmail.com> wrote:
However, the _body_ of the PEP said nothing whatsoever about altering arithmetic. The body of the PEP sounds like it's mainly just proposing to fold the pytz package into the core. Perhaps doing _just_ that much would get this project unstuck? Hope springs eternal :-)
The pytz package has an API and a usage that is different from the datetime() module. One of the things you need to do is that after each time you do arithmetic, you have to normalize the result. This is done because the original API design did not realize the difficulties and complexities of timezone handling and therefore left out things like ambiguous times.
The PEP attemps to improved the datetime modules API so that it can handle the ambiguous times. It also says that the implementation will be based on pytz, because it was my assumption that this would be easy, since pytz already handles ambiguous times. During my attempt of implementing it I realized it wasn't easy at all, and it wasn't as easy as folding pytz into the core.
Yes, the PEP gives that impression, because that was the assumption when I wrote the draft. Just folding pytz into the core without modifying the API defeats the whole purpose of the PEP, since installing pytz is a trivial task.
Like what? I'm still looking for a concrete example of what "the problem" is (or even "a" problem).
A problem is that you have a datetime, and add a timedelta to it, and it should then result in a datetime that is actually that timedelta later. And if you subtract the same timedelta from the result, it should return a datetime that is equal to the original datetime.
This sounds ridiculously simple, and is ridiculously difficult to make happen in all cases that we want to support (Riyadh time zone and leap seconds not included). That IS the specific, concrete problem, and if you don't believe me, there is nothing I can do to convince you. Perhaps I am a complete moron and simply incompetent to do this, and in that case I'm sure you could implement this over a day, and then please do so, but for the love of the founders of computing I'm not going to spend more time repeating it on this mailing list, because then we would do better in having you implement this instead of reading emails. Me repeating this is a waste of time for everyone involved, and I will now stop.
<discussing why Python's datetime is different>
I was not involved in the discussion then, and even if I had been, that's still before I knew anything about the topic. I don't know what the arguments were, and I don't think it's constructive to try to figure out exactly why that decision was made. That is all too similar to assigning blame, which only makes people feel bad. Those who get blamed feel bad, and those who blame feel like dicks and onlookers get annoyed. Let us look forward instead.
I am operating both without any need to defend that decision, as I was not involved in it, and I am operating with 20/20 hindsight as I am one of the few people having tried to implement a timezone implementation that supports ambiguous datetimes based on that decision. And then it is perfectly clear and obvious that the decision was a mistake and that we should rectify it.
The only question for me is how and when.
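As an aside on the question David Mertz raises above: under the current API, pytz's localize() does raise for both cases when passed is_dst=None. A minimal sketch, assuming pytz is installed (the code skips itself otherwise):

```python
from datetime import datetime

try:
    import pytz
except ImportError:  # pytz not installed; nothing to demonstrate
    pytz = None

if pytz is not None:
    pacific = pytz.timezone('US/Pacific')

    # 2015-11-01 01:30 occurs twice (clocks roll back): AmbiguousTimeError.
    try:
        pacific.localize(datetime(2015, 11, 1, 1, 30), is_dst=None)
        ambiguous_raised = False
    except pytz.exceptions.AmbiguousTimeError:
        ambiguous_raised = True

    # 2015-03-08 02:30 never occurs (clocks spring forward): NonExistentTimeError.
    try:
        pacific.localize(datetime(2015, 3, 8, 2, 30), is_dst=None)
        nonexistent_raised = False
    except pytz.exceptions.NonExistentTimeError:
        nonexistent_raised = True
```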
[Tim]
However, the _body_ of the PEP said nothing whatsoever about altering arithmetic. The body of the PEP sounds like it's mainly just proposing to fold the pytz package into the core. Perhaps doing _just_ that much would get this project unstuck? Hope springs eternal :-)
[Lennart Regebro <regebro@gmail.com>]
The pytz package has an API and a usage that is different from the datetime() module. One of the things you need to do is that after each time you do arithmetic, you have to normalize the result. This is done because the original API design did not realize the difficulties and complexities of timezone handling and therefore left out things like ambiguous times.
Oh, they were realized - indeed, the pytz docs point to Python's tzinfo docs to explain the ambiguities, and the latter docs existed before ;-) day 1. The Python docs also are quite clear that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back. You never _have_ to normalize() in pytz. But it's needed if you _don't_ follow pytz's explicit advice ("The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans") and want to do "aware" arithmetic directly in a non-UTC time zone. Python's datetime never intended to support that directly. Quite the contrary. I know people who feel otherwise tend to think of that as a lazy compromise (or some such), but naive arithmetic was intended to be "a feature". Fight the design every step of the way, and, yup, you get problems every step of the way.
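That intended pattern can be sketched with stdlib pieces only. The fixed-offset EST below is just a stand-in for a real zone, so naive and aware arithmetic happen to agree here; they diverge only across a DST transition:

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), 'EST')  # fixed offset, illustrative only

local = datetime(2015, 3, 7, 23, 0, tzinfo=EST)

# Intended "aware" arithmetic: convert to UTC, add, convert back.
aware = (local.astimezone(timezone.utc) + timedelta(hours=3)).astimezone(EST)

# Naive arithmetic: move the clock hands, keep the tzinfo.
naive = local + timedelta(hours=3)

assert aware == naive  # equal only because EST never changes its offset
```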
The PEP attempts to improve the datetime module's API so that it can handle the ambiguous times.
No problem with that. I always thought the lack of storing is_dst-like info was datetime's biggest wart.
It also says that the implementation will be based on pytz, because it was my assumption that this would be easy, since pytz already handles ambiguous times. During my attempt of implementing it I realized it wasn't easy at all, and it wasn't as easy as folding pytz into the core.
Is it the case that pytz also "fails" in the cases your attempts "fail"? In any case, if you're trying to change how "aware" datetime arithmetic works, that's a major and backward-incompatible change. Does Guido realize it? As before, it's not at all clear from the PEP.
Yes, the PEP gives that impression, because that was the assumption when I wrote the draft. Just folding pytz into the core without modifying the API defeats the whole purpose of the PEP, since installing pytz is a trivial task.
"Batteries included" has some attractions all on its own. On top of that, adding is_dst-like flags to appropriate methods may have major attractions. Changing the semantics of datetime arithmetic has major attractions to some people, but also major drawbacks - regardless, since changing it turns Guido's original design on its head, he really needs to Pronounce on that part.
Like what? I'm still looking for a concrete example of what "the problem" is (or even "a" problem).
A problem is that you have a datetime, and add a timedelta to it, and it should then result in a datetime that is actually that timedelta later. And if you subtract the same timedelta from the result, it should return a datetime that is equal to the original datetime.
This sounds ridiculously simple
Ah, but it already happens that way - because the builtin datetime arithmetic is "naive". The docs have always promised this:

    """
    datetime2 = datetime1 + timedelta  (1)
    datetime2 = datetime1 - timedelta  (2)

    1) datetime2 is a duration of timedelta removed from datetime1, moving
       forward in time if timedelta.days > 0, or backward if timedelta.days < 0.
       The result has the same tzinfo attribute as the input datetime, and
       datetime2 - datetime1 == timedelta after. OverflowError is raised if
       datetime2.year would be smaller than MINYEAR or larger than MAXYEAR.
       Note that no time zone adjustments are done even if the input is an
       aware object.

    2) Computes the datetime2 such that datetime2 + timedelta == datetime1.
       As for addition, the result has the same tzinfo attribute as the input
       datetime, and no time zone adjustments are done even if the input is
       aware. This isn't quite equivalent to datetime1 + (-timedelta), because
       -timedelta in isolation can overflow in cases where datetime1 -
       timedelta does not.
    """
, and is ridiculously difficult to make happen in all cases that we want to support (Riyadh time zone and leap seconds not included). That IS the specific, concrete problem, and if you don't believe me, there is nothing I can do to convince you.
I apologize if I've come off as unduly critical - I truly have been _only_ trying to find out what "the problem" is. That helps! Thank you. Note that I've had nothing to do with datetime (except to use it) for about a decade. I have no idea what you, or anyone else, has said about it for years & years until this very thread caught my attention this week. Heck, for all I know, Guido _demanded_ that datetime arithmetic be changed - although I doubt it ;-)
Perhaps I am a complete moron and simply incompetent to do this, and in that case I'm sure you could implement this over a day, and then please do so, but for the love of the founders of computing I'm not going to spend more time repeating it on this mailing list, because then we would do better in having you implement this instead of reading emails. Me repeating this is a waste of time for everyone involved, and I will now stop.
Then special thanks for repeating it one more time, since it's the first time I've heard it :-) FWIW, I'm sure Isaac and Alexander already know how to make this work, although it requires (as you found out the hard way) more than just copying in bits of pytz, and DST transitions aren't the only non-insane potential problem. But whether, specifically, the semantics of datetime arithmetic _should_ change is a major question, one not explicitly mentioned in the PEP, so this project may have to move back to square 1 on that specific question. Resolving ambiguities is a different question.
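The documented invariants quoted above can be checked directly; a stdlib-only sketch using a fixed-offset zone:

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=-5), 'EST')  # illustrative fixed offset
dt1 = datetime(2015, 11, 1, 0, 30, tzinfo=tz)
dt2 = dt1 + timedelta(hours=1)

assert dt2 - dt1 == timedelta(hours=1)  # "datetime2 - datetime1 == timedelta after"
assert dt2.tzinfo is dt1.tzinfo         # same tzinfo, no zone adjustment
assert dt2 - timedelta(hours=1) == dt1  # subtraction round-trips
```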
... I was not involved in the discussion then, and even if I had been, that's still before I knew anything about the topic. I don't know what the arguments were, and I don't think it's constructive to try to figure out exactly why that decision was made. That is all too similar to assigning blame, which only makes people feel bad. Those who get blamed feel bad, and those who blame feel like dicks and onlookers get annoyed. Let us look forward instead.
No problem: there's only Guido to blame, and he's so used to that he doesn't even notice it anymore ;-)
I am operating both without any need to defend that decision, as I was not involved in it, and I am operating with 20/20 hindsight as I am one of the few people having tried to implement a timezone implementation that supports ambiguous datetimes based on that decision. And then it is perfectly clear and obvious that the decision was a mistake and that we should rectify it.
There's more than one decision affecting this. In cases where a single local time corresponds to more than one UTC time (typically at the end of DST, when a local hour repeats), datetime never did give any clear way to do "the intended" conversion from that local time _to_ UTC. But resolving such ambiguities has nothing to do with how arithmetic works: it's utterly unsolvable by any means short of supplying new info ("which UTC value is intended?" AKA is_dst).
The only question for me is how and when.
Well, since there's more than one decision, I'm afraid there's also more than one question ;-)
On Sun, Jul 26, 2015 at 8:05 AM, Tim Peters <tim.peters@gmail.com> wrote:
The Python docs also are quite clear about that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back.
We can't explicitly implement incorrect timezone aware arithmetic and then expect people to not use it. We can make the arithmetic correct, and we can raise an error when doing tz-aware arithmetic in a non-fixed timezone. But having an implementation we know is incorrect and telling people "don't do that" doesn't seem like a good solution here.

Why do we even have timezone aware datetimes if we don't intend them for usage? There could just be naive datetimes, and timezones, and let strftime take a timezone that is used when formatting. And we could make date-time creation into a function that parses data including a timezone, and returns the UTC time of that data.

But then again, if we do that, we could just as well have that timezone as an attribute on the datetime object, and let strftime use it so it doesn't have to be passed in. And we could let the __init__ of the datetime take a timezone and do that initial conversion to UTC.
Python's datetime never intended to support that directly.
I think it should. It's expected that it supports it, and there is no real reason not to support it. The timezone handling becomes complicated if you base yourself on localtime, and simple if you base yourself on UTC. As you agree, we recommend to people to use UTC at all times, and only use timezones for input and output. Well, what I'm now proposing is to take that recommendation to heart, and change datetime's implementation so it does exactly that. I saw the previous mention of "pure" vs "practical", and that is often a concern. Here it clearly is not. This is a choice between impure, complicated and impractical, and pure, simple and practical.
Is it the case that pytz also "fails" in the cases your attempts "fail"?
No, that is not the case. And if you wonder why I just don't do it like pytz does it, it's because that leads to infinite recursion, much as discussions on this mailing list do. ;-) And this is because we need to normalize the datetime after arithmetic, but normalizing is arithmetic.
"Batteries included" has some attractions all on its own. On top of that, adding is_dst-like flags to appropriate methods may have major attractions.
Ah, but it already happens that way
No, in fact it does not. Pytz makes that happen only through a separate explicit normalize() call (and some deep cleverness to keep track of which timezone offset it is located in). dateutil.tz can't guarantee these things to be true, because it doesn't keep track of ambiguous times. So no, it does not already happen that way.
    >>> from dateutil.zoneinfo import gettz
    >>> from datetime import *
    >>> est = gettz('America/New_York')
    >>> dt = datetime(2015, 11, 1, 0, 30, tzinfo=est)
    >>> dt2 = dt + timedelta(hours=1)
    >>> utc = gettz('Etc/UTC')
    >>> dtutc = dt.astimezone(utc)
    >>> dt2utc = dt2.astimezone(utc)
    >>> (dt2utc - dtutc).total_seconds()
    7200.0
You add one hour, and you get a datetime that happens two hours later. So no, it does not already happen that way. In pytz the datetime will be adjusted after you do the normalize call.
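A sketch of that pytz pattern (assuming pytz is installed; the code skips itself otherwise): plain arithmetic leaves the stale offset in place, and normalize() re-resolves it.

```python
from datetime import datetime, timedelta

try:
    import pytz
except ImportError:  # pytz not installed; nothing to demonstrate
    pytz = None

if pytz is not None:
    eastern = pytz.timezone('US/Eastern')
    dt = eastern.localize(datetime(2015, 11, 1, 0, 30))  # 00:30 EDT, unambiguous
    raw = dt + timedelta(hours=2)    # tzinfo still claims EDT: "02:30 EDT"
    fixed = eastern.normalize(raw)   # re-resolved: 01:30 EST

    assert (fixed.hour, fixed.minute) == (1, 30)
    assert fixed.utcoffset() == timedelta(hours=-5)
    # Exactly two hours elapse in UTC either way:
    assert fixed.astimezone(pytz.utc) - dt.astimezone(pytz.utc) == timedelta(hours=2)
```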
I apologize if I've come off as unduly critical - I truly have been _only_ trying to find out what "the problem" is. That helps! Thank you. Note that I've had nothing to do with datetime (except to use it) for about a decade. I have no idea what you, or anyone else, has said about it for years & years until this very thread caught my attention this week. Heck, for all I know, Guido _demanded_ that datetime arithmetic be changed - although I doubt it ;-)
It's not a question of changing datetime arithmetic per se. The PEP does indeed mean it has to be changed, but only to support ambiguous and non-existent times. It's helpful to me to understand, which I hadn't done before, that this was never intended to work. That helps me argue for changing datetime's internal implementation, once I get time to do that. (I'm currently moving, renovating a new house, trying to fix up a garden that has been neglected for years, and, insanely, writing my own code editor, all at the same time, so it won't be anytime soon).
There's more than one decision affecting this. In cases where a single local time corresponds to more than one UTC time (typically at the end of DST, when a local hour repeats), datetime never did give any clear way to do "the intended" conversion from that local time _to_ UTC. But resolving such ambiguities has nothing to do with how arithmetic works: it's utterly unsolvable by any means short of supplying new info ("which UTC value is intended?" AKA is_dst).
The "changing arithmetic" discussion is a red herring. Now my wife insists I help her pack, so this is the end of this discussion for me. If I continue it will be only as a part of discussing how we change how datetime works internally. //Lennart
On 26 July 2015 at 18:12, Lennart Regebro <regebro@gmail.com> wrote:
On Sun, Jul 26, 2015 at 8:05 AM, Tim Peters <tim.peters@gmail.com> wrote:
The Python docs also are quite clear about that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back.
We can't explicitly implement incorrect timezone aware arithmetic and then expect people to not use it. We can make the arithmetic correct, and we can raise an error when doing tz-aware arithmetic in a non-fixed timezone. But having an implementation we know is incorrect and telling people "don't do that" doesn't seem like a good solution here.
Why do we even have timezone aware datetimes if we don't intend them for usage? There could just be naive datetimes, and timezones, and let strftime take a timezone that is used when formatting. And we could make date-time creation into a function that parses data including a timezone, and returns the UTC time of that data.
But then again, if we do that, we could just as well have that timezone as an attribute on the datetime object, and let strftime use it so it doesn't have to be passed in. And we could let the __init__ of the datetime take a timezone and do that initial conversion to UTC.
I think we need to make sure to separate out the question of the semantic model presented to users from the internal implementation model here.

As a user, if the apparent semantics of time zone aware date time arithmetic are accurately represented by "convert time to UTC -> perform arithmetic -> convert back to stated timezone", then I *don't care* how that is implemented internally. This is the aspect Tim is pointing out is a change from the original design of the time zone aware arithmetic in the datetime module. I personally think it's a change worth making that reflects additional decades of experience with time zone aware datetime arithmetic, but the PEP should be clear that it *is* a change.

As Alexander points out, the one bit of information which needs to be provided by me as a *user* of such an API (rather than its implementor), is how to handle ambiguities in the initial conversion to UTC (whether to interpret any ambiguous time reference I supply as a pre-rollback or post-rollback time). Similarly, the API needs to tell *me* whether a returned time in a period of ambiguity is pre-rollback or post-rollback.

At the moment the "pre-rollback" flag is specifically called "is_dst", since rolling clocks back at the end of a DST period is the most common instance of ambiguous times. That then causes confusion since "DST" in common usage refers to the entire period from the original roll forward to the eventual roll back, but the extra bit is only relevant to time zone arithmetic during the final two overlapping hours when the clocks are rolled back each year (and is in fact relevant any time a clock rollback occurs, even if the reason for the rollback has nothing to do with DST).
The above paragraphs represent the full extent of my *personal* interest in the matter of the datetime module changing the way it handles timezones - I think there's a right answer from a usability perspective, and I think it involves treating UTC as the universal time zone used for all datetime arithmetic, and finding a less confusing name for the "isdst" flag (such as "prerollback", or inverting the sense of it to "postrollback", such that 0/False referred to the first time encountered, and 1/True referred to the second time encountered).

There's a *separate* discussion, which relates to how best to *implement* those semantics, given the datetime module implementation we already have. For the original decimal module, we went with the approach of storing the data in a display friendly format, and then converting it explicitly as needed to and from a working representation for arithmetic purposes. While it seems plausible to me that such an approach may also work well for datetime arithmetic that presents the appearance of all datetime arithmetic taking place in terms of UTC, that's a guess based on general principles, not something based on a detailed knowledge of datetime in particular (and, in particular, with no knowledge of the performance consequences, or if we have any good datetime focused benchmarks akin to the telco benchmark that guided the original decimal module implementation).

Regards, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Jul 26, 2015 at 11:33 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
As a user, if the apparent semantics of time zone aware date time arithmetic are accurately represented by "convert time to UTC -> perform arithmetic -> convert back to stated timezone", then I *don't care* how that is implemented internally.
This is the aspect Tim is pointing out is a change from the original design of the time zone aware arithmetic in the datetime module. I personally think it's a change worth making that reflects additional decades of experience with time zone aware datetime arithmetic, but the PEP should be clear that it *is* a change.
These semantics are already available in python 3:
    >>> t = datetime(2015, 3, 7, 17, tzinfo=timezone.utc).astimezone()
    >>> t.strftime('%D %T %z %Z')
    '03/07/15 12:00:00 -0500 EST'
    >>> (t + timedelta(1)).strftime('%D %T %z %Z')
    '03/08/15 12:00:00 -0500 EST'  # a valid time, but not what you see on the wall clock
    >>> (t + timedelta(1)).astimezone().strftime('%D %T %z %Z')
    '03/08/15 13:00:00 -0400 EDT'  # this is what the wall clock would show
Once CPython starts vendoring a complete timezone database, it would be trivial to extend .astimezone() so that things like t.astimezone('US/Eastern') work as expected. What is somewhat more challenging, is implementing a tzinfo subclass that can be used to construct datetime instances with the following behavior:
    >>> t = datetime(2015, 3, 7, 12, tzinfo=timezone('US/Eastern'))
    >>> t.strftime('%D %T %z %Z')
    '03/07/15 12:00:00 -0500 EST'
    >>> (t + timedelta(1)).strftime('%D %T %z %Z')
    '03/08/15 12:00:00 -0400 EDT'
The solution to this problem has been provided as a documentation example [1] for many years, but also for many years it contained a subtle bug [2], which illustrates that one has to be careful implementing those things. Although the examples [1] in the documentation only cover simple US timezones, they cover a case of changing DST rules, and changing STD rules can be implemented similarly. Whether we want such tzinfo implementations in stdlib is a valid question, but it should be completely orthogonal to the question of vendoring a TZ database. If we agree that vendoring a TZ database is a good thing, we can make .astimezone() understand how to construct a fixed offset timezone from a location and call it a day.

[1]: https://hg.python.org/cpython/file/default/Doc/includes/tzinfo-examples.py
[2]: http://bugs.python.org/issue9063
On 26 July 2015 at 16:33, Nick Coghlan <ncoghlan@gmail.com> wrote:
As a user, if the apparent semantics of time zone aware date time arithmetic are accurately represented by "convert time to UTC -> perform arithmetic -> convert back to stated timezone", then I *don't care* how that is implemented internally.
This is the aspect Tim is pointing out is a change from the original design of the time zone aware arithmetic in the datetime module. I personally think it's a change worth making that reflects additional decades of experience with time zone aware datetime arithmetic, but the PEP should be clear that it *is* a change.
I think the current naive semantics are useful and should not be discarded lightly. At an absolute minimum, there should be a clear, documented way to get the current semantics under any changed implementation.

As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1). If the time changes to DST overnight, I still want the alarm to go off at 7am. Even though +1 day is in this case actually + 25 (or is it 23?) hours. That's the current semantics.

To be honest, I would imagine, from experience with programmers writing naive algorithms, that the current semantics is a lot less prone to error when used by such people. People forget about timezones until they are bitten by them, and if they are using the convert to UTC->calculate->convert back model, their code ends up with off-by-1-hour bugs. Certainly such mistakes can be fixed, and the people who make them educated, but I like the fact that Python's typical behaviour is to do what a non-expert would expect.

By all means have the more sophisticated approach available, but if it's the default then naive users have to either (1) learn the subtleties of timezones, or (2) learn how to code naive datetime behaviour in Python before they can write their code. If the current behaviour remains the default, then *when* the naive user learns about the subtleties of timezones, they can switch to the TZ-aware datetime - but that's a single learning step, and it can be taken when the user is ready.

Paul

PS I don't think the above is particularly original - IIRC, it's basically Guido's argument for naive datetimes from when they were introduced. I think his example was checking his watch while on a transatlantic plane flight, but the principle is the same.
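The alarm example works today precisely because naive arithmetic ignores transitions; a minimal sketch (the dates are illustrative, with US DST starting overnight on 2015-03-08):

```python
from datetime import datetime, timedelta

todays_alarm = datetime(2015, 3, 7, 7, 0)           # naive: 7am local
tomorrows_alarm = todays_alarm + timedelta(days=1)  # DST begins overnight

# Naive semantics: the wall clock still reads 7am, even though only
# 23 real hours elapse in US zones that spring forward that night.
assert tomorrows_alarm == datetime(2015, 3, 8, 7, 0)
```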
[Paul Moore <p.f.moore@gmail.com>]
I think the current naive semantics are useful and should not be discarded lightly. At an absolute minimum, there should be a clear, documented way to get the current semantics under any changed implementation.
Realistically, default arithmetic behavior can't change in Python 3 (let alone Python 2). Pushing for a different design is fine, but that can't be sold on the grounds that current behavior is "a bug" - it's working as designed, as intended, and as documented, and hasn't materially changed in the dozen-or-so years since it was introduced. It's not even that the proposed alternative arithmetic is "better", either: while it's certainly more suitable for some applications, it's certainly worse for others. Making an incompatible change would be (& should be) a hard sell even if there were a much stronger case for it than there is here.

But that's just arithmetic. Some way to disambiguate local times, and support for most zoneinfo time zones, are different issues.
As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1).
If the time changes to DST overnight, I still want the alarm to go off at 7am. Even though +1 day is in this case actually + 25 (or is it 23?) hours. That's the current semantics.
There was a long list of use cases coming to the same conclusion. The current arithmetic allows uniform patterns in local time to be coded in uniform, straightforward ways. Indeed, in "the obvious" ways. The alternative behavior favors uniform patterns in UTC, but who cares? ;-) Few local clocks show UTC. Trying to code uniform local-time behaviors using "aware arithmetic" (which is uniform in UTC, but may be "lumpy" in local time) can be a nightmare.

The canonical counterexample is a nuclear reactor that needs to be vented every 24 hours. To which the canonical rejoinder is that the programmer in charge of that system is criminally incompetent if they're using _any_ notion of time other than UTC ;-)
To be honest, I would imagine, from experience with programmers writing naive algorithms, that the current semantics is a lot less prone to error when used by such people. People forget about timezones until they are bitten by them, and if they are using the convert to UTC->calculate->convert back model, their code ends up with off-by-1-hour bugs. Certainly such mistakes can be fixed, and the people who make them educated, but I like the fact that Python's typical behaviour is to do what a non-expert would expect. By all means have the more sophisticated approach available, but if it's the default then naive users have to either (1) learn the subtleties of timezones, or (2) learn how to code naive datetime behaviour in Python before they can write their code. If the current behaviour remains the default, then *when* the naive user learns about the subtleties of timezones, they can switch to the TZ-aware datetime - but that's a single learning step, and it can be taken when the user is ready.
There is a design flaw here, IMO: when they switch to a TZ-aware datetime, they _still_ get "naive" arithmetic within that time zone. It's at best peculiar that such a datetime is _called_ "aware" yet still ignores the time zone rules when doing arithmetic. I would have preferred a sharper distinction, like "completely naive" (tzinfo absent) versus "completely aware" (tzinfo present). But, again, it's working as designed, intended and documented.

One possibility to get "the other" behavior in a backward-compatible way: recognize a new magic attribute on a tzinfo instance, say, __aware_arithmetic__. If it's present, arithmetic on a datetime with such a tzinfo member "acts as if" arithmetic were done by converting to UTC first, doing the arithmetic, then converting back. Otherwise (magic new attribute not present) arithmetic remains naive.

Bonus: then you could stare at datetime code and have no idea which kind of arithmetic is being used ;-)
PS I don't think the above is particularly original - IIRC, it's basically Guido's argument for naive datetimes from when they were introduced. I think his example was checking his watch while on a transatlantic plane flight, but the principle is the same.
Yup, your account is fair (according to me ;-) ). Here's Guido's first message on the topic: https://mail.python.org/pipermail/python-dev/2002-March/020648.html
On Mon, Jul 27, 2015 at 4:04 AM, Tim Peters <tim.peters@gmail.com> wrote:
Realistically, default arithmetic behavior can't change in Python 3 (let alone Python 2).
Then we can't implement timezones in a reasonable way with the current API, but have to have something like pytz's normalize() function or similar. I'm sorry I've wasted everyone's time with this PEP. //Lennart
On 27 Jul 2015, at 04:04, Tim Peters <tim.peters@gmail.com> wrote:
As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1).
If the time changes to DST overnight, I still want the alarm to go off at 7am. Even though +1 day is in this case actually + 25 (or is it 23?) hours. That's the current semantics.
There was a long list of use cases coming to the same conclusion. The current arithmetic allows uniform patterns in local time to be coded in uniform, straightforward ways. Indeed, in "the obvious" ways. The alternative behavior favors uniform patterns in UTC, but who cares? ;-) Few local clocks show UTC. Trying to code uniform local-time behaviors using "aware arithmetic" (which is uniform in UTC, but may be "lumpy" in local time) can be a nightmare.
The canonical counterexample is a nuclear reactor that needs to be vented every 24 hours. To which the canonical rejoinder is that the programmer in charge of that system is criminally incompetent if they're using _any_ notion of time other than UTC ;-)
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that. “+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshedding). Ronald
On 27 July 2015 at 15:57, Ronald Oussoren <ronaldoussoren@mac.com> wrote:
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that. “+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshedding).
Precisely. Paul
Paul Moore writes:
On 27 July 2015 at 15:57, Ronald Oussoren <ronaldoussoren@mac.com> wrote:
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that. “+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshedding).
Precisely.
Er, to be exact, it's an accurate statement of imprecision (although when talking about time and computation, both "accuracy" and "precision" are ambiguous). Fortunately, there's a "tim" in "time"!
On Tue, Jul 28, 2015 at 12:57 AM, Ronald Oussoren <ronaldoussoren@mac.com> wrote:
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that. “+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshedding).
https://www.youtube.com/watch?v=ppfpa5XgZHI

MATLAB defines "+ 1 month" as, if I'm not mistaken, "add the time it would take to go from the beginning of time to the beginning of January of the year 0 (which is totally a thing, by the way)". I'm fairly sure that this is the most WAT-worthy definition possible, as it means that adding one month does nothing, and adding two months adds the length of January (31 days)... and adding three months adds January + February, *in a leap year*.

But I agree that adding days and adding hours are different things. If I add one day, I expect that the time portion should not change, in the given timezone. (With the exception that DST switches might mean that that time doesn't exist.) If I add 86400 seconds, I expect that it should add 86400 ISO seconds to the time period, which might not be the same thing.

If you convert a datetime to a different timezone, add 86400 seconds, and convert back to the original timezone, I would expect the result to be the same as adding 86400 seconds to the original, unless there's something seriously bizarre going on with the size of the second. But if you convert, add 1 day, and convert back, you will get a different result if the two differ on DST. Does that sound plausible?

ChrisA
I agree, and my 2 cents: I can expect something different depending on the timezone and DST if I add years, months, weeks, days, hours, minutes, or seconds to a given datetime. Even though, in 90% of the cases, there is a more or less obvious conversion formula between all of them. But consider months to days. That is not clear at all.

On 27.07.2015 19:11, Chris Angelico wrote:
On Tue, Jul 28, 2015 at 12:57 AM, Ronald Oussoren <ronaldoussoren@mac.com> wrote:
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that. “+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshedding).

https://www.youtube.com/watch?v=ppfpa5XgZHI
MATLAB defines "+ 1 month" as, if I'm not mistaken, "add the time it would take to go from the beginning of time to the beginning of January of the year 0 (which is totally a thing, by the way)". I'm fairly sure that this is the most WAT-worthy definition possible, as it means that adding one month does nothing, and adding two months adds the length of January (31 days)... and adding three months adds January + February, *in a leap year*.
But I agree that adding days and adding hours are different things. If I add one day, I expect that the time portion should not change, in the given timezone. (With the exception that DST switches might mean that that time doesn't exist.) If I add 86400 seconds, I expect that it should add 86400 ISO seconds to the time period, which might not be the same thing. If you convert a datetime to a different timezone, add 86400 seconds, and convert back to the original timezone, I would expect the result to be the same as adding 86400 seconds to the original, unless there's something seriously bizarre going on with the size of the second. But if you convert, add 1 day, and convert back, you will get a different result if the two differ on DST. Does that sound plausible?
ChrisA
[Ronald Oussoren <ronaldoussoren@mac.com>]
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that.
But it's a fact that they _are_ the same in naive time, which Python's datetime single-timezone arithmetic implements:

- A minute is exactly 60 seconds.
- An hour is exactly 60 minutes.
- A day is exactly 24 hours.
- A week is exactly 7 days.

No context is necessary: those are always true in naive time, and that lack of mess is "a feature" to those who accept it for what it is.
“+ 1 month” is another example of that (which the datetime module punts completely and can be a source of endless bikeshidding).
Note that the only units timedelta accepts have clear (utterly inarguable) meanings in naive time. That's intentional too. For example, "a month" and "a year" have no clear meanings (as durations) in naive time, so timedelta doesn't even pretend to support them. Despite all appearance to the contrary in this thread, naive time is bikeshed-free: it's easy for someone to know all there is to know about it by the time they're 12 ;-)

datetime + timedelta(days=1) is equivalent to
datetime + timedelta(hours=24) is equivalent to
datetime + timedelta(minutes=60*24) is equivalent to
datetime + timedelta(seconds=60*60*24) is equivalent to
datetime + timedelta(microseconds=1000000*60*60*24)

Naive time is easy to understand, reason about, and work with. When it comes to the real world, political adjustments to and within time zones can make the results dodgy, typically in the two DST-transition hours per year when most people living in a given time zone are sleeping. How much complexity do you want to endure in case they wake up? ;-) Guido's answer was "none in arithmetic - push all the complexity into conversions - then most uses can blissfully ignore the complexities".

And note that because DST transitions "cancel out" over the span of a year, the benefits and the few dodgy cases don't really change regardless of whether you add one week or a hundred thousand weeks (although there's no way to predict what governments will decide the local clock "should say" a hundred thousand weeks from now - it's only predictable in naive time).
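Tim's chain of equivalences is easy to verify: in naive arithmetic the conversions between timedelta units are exact, so all five spellings produce the same datetime.

```python
from datetime import datetime, timedelta

d = datetime(2015, 7, 27, 12, 0)  # any datetime will do

# In naive time a day is exactly 24 hours, 1440 minutes, 86400 seconds, ...
assert (d + timedelta(days=1)
        == d + timedelta(hours=24)
        == d + timedelta(minutes=60 * 24)
        == d + timedelta(seconds=60 * 60 * 24)
        == d + timedelta(microseconds=1000000 * 60 * 60 * 24))
```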
On Tue, Jul 28, 2015 at 4:49 AM, Tim Peters <tim.peters@gmail.com> wrote:
But it's a fact that they _are_ the same in naive time, which Python's datetime single-timezone arithmetic implements:
- A minute is exactly 60 seconds.
No leap second support, presumably. Also feature? ChrisA
On Mon, Jul 27, 2015 at 12:07 PM, Chris Angelico <rosuav@gmail.com> wrote:
- A minute is exactly 60 seconds.
No leap second support, presumably. Also feature?
Leap seconds come in when you convert to a Calendar representation -- a minute is 60 seconds, always -- even when passing over a leap second. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
[Tim]
But it's a fact that they _are_ the same in naive time, which Python's datetime single-timezone arithmetic implements:
- A minute is exactly 60 seconds. ...
[Chris Angelico <rosuav@gmail.com>]
No leap second support, presumably. Also feature?
Absolutely none, and absolutely "a feature", but that didn't start with the datetime module. Read Guido's original 13+ year old message about "naive time": https://mail.python.org/pipermail/python-dev/2002-March/020648.html

Note especially this part:

"""
I'm thinking that for most *business* uses of date and time, we should have the same attitude towards DST that we've already decided to take towards leap seconds.
"""

Guido has never had the slightest use for leap seconds in any part of Python's implementation, and has consistently opposed attempts to incorporate them. This was well established long before datetime was even an idea. Here's another, later quote from him:

"""
Python's datetime objects will not support leap seconds in any way, shape or form. A tzinfo object that does support leap seconds is on its own, but I don't see the point since Python will never represent a time as a number of seconds since some epoch. (If you want to get a POSIX time_t value, you'll have to convert first to local time, then to a struct tm, and then use mktime().)
"""
On 27 jul. 2015, at 20:49, Tim Peters <tim.peters@gmail.com> wrote:
[Ronald Oussoren <ronaldoussoren@mac.com>]
IMHO “+ 1 days” and “+ 24 hours” are two different things. Date arithmetic is full of messy things like that.
But it's a fact that they _are_ the same in naive time, which Python's datetime single-timezone arithmetic implements:
...
Naive time is easy to understand, reason about, and work with. When it comes to the real world, political adjustments to and within time zones can make the results dodgy, typically in the two DST-transition hours per year when most people living in a given time zone are sleeping. How much complexity do you want to endure in case they wake up? ;-) Guido's answer was "none in arithmetic - push all the complexity into conversions - then most uses can blissfully ignore the complexities".
I totally agree with that, having worked on applications that had to deal with time a lot, including some where the end of a day was at 4am the following day. That app never had to deal with DST because not only are the transitions at night, they are also during the weekend.

Treating time as UTC with conversions at the application edge might be "cleaner" in some sense, but can make code harder to read for application domain experts. It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.

That said, I have had the need for date delta objects that can deal with deltas expressed as days or months, but it is easy enough to write your own library for that that can deal with the local conventions for those.

Ronald
On 07/27/2015 06:11 PM, Ronald Oussoren wrote:
Treating time as UTC with conversions at the application edge might be "cleaner" in some sense, but can make code harder to read for application domain experts.
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
Interesting. My experience is exactly the opposite: the datetimes which "application domain experts" cared about *always* needed to be non-naive (zone captured explicitly or from the user's machine and converted to UTC/GMT for storage). As with encoded bytes, allowing a naive instance inside the borders of the system was always a time-bomb bug (stuff would blow up at a point far removed from where it was introduced). The instances which could have safely been naive were all logging-related, where the zone was implied by the system's timezone (nearly always UTC).

I guess the difference is that I'm usually writing apps whose users can't be presumed to be in any one timezone. Even in those cases, having the logged datetimes be incomparable to user-facing ones would make them less useful.

Tres.

--
Tres Seaver +1 540-429-0999 tseaver@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
[Ronald Oussoren]
Treating time as UTC with conversions at the application edge might be "cleaner" in some sense, but can make code harder to read for application domain experts.
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
[Tres Seaver <tseaver@palladion.com>]
Interesting. My experience is exactly the opposite: the datetimes which "application domain experts" cared about *always* needed to be non-naive (zone captured explicitly or from the user's machine and converted to UTC/GMT for storage). As with encoded bytes, allowing a naive instance inside the borders the system was always a time-bomb bug (stuff would blow up at a point far removed from which it was introduced).
I strongly suspect that by "naive objects" here Ronald was really talking about "naive _arithmetic_": the current behavior where adding 24 hours (or 1 day - exactly the same thing for a timedelta) to an aware datetime yields a new datetime with the same tzinfo member and the same local time components, but where the day (and possibly month, and possibly year) components have moved forward by a day (in the plain English meaning of "a day"). That behavior has been discussed approximately infinitely often so far in this thread, and was the overriding context in the message from which Ronald's quote was pulled.
The instances which could have safely been naive were all logging-related, where the zone was implied by the system's timezone (nearly always UTC). I guess the difference is that I'm usually writing apps whose users can't be presumed to be in any one timezone. Even in those cases, having the logged datetimes be incomparable to user-facing ones would make them less useful.
I bet the same is true for Ronald in some of his apps. So what do _you_ do with datetime arithmetic, Tres? Do you do datetime calculations at all, or just store/retrieve values as-is? If the former, are you disturbed that adding timedelta(hours=24) to an aware datetime object never changes the time components (only the day, and possibly also month, and possibly also year)? If that has disturbed you, did you find a way to accomplish what you wanted instead - or are you still stuck? ;-)
On 07/27/2015 09:36 PM, Tim Peters wrote:
So what do _you_ do with datetime arithmetic, Tres? Do you do datetime calculations at all, or just store/retrieve values as-is? If the former, are you disturbed that adding timedelta(hours=24) to an aware datetime object never changes the time components (only the day, and possibly also month, and possibly also year)? If that has disturbed you, did you find a way to accomplish what you wanted instead - or are you still stuck? ;-)
Sample use cases:

- Embargo a pre-prepared story until 8:00 AM US/Central next Monday.
- Likewise, but allow it to run for three weeks.
- Create a recurring event which occurs from 7:00 - 9:00 PM US/Eastern on the last Thursday of each month.
- Issue a bid for a commodity lot N days before its expiration date; update that bid (if another bid has occurred) at the same time each day until expiration.
- Mark messages published on a distributed event channel to allow clients to sequence them unambiguously.
- For a given sequence of events: if no subsequent matching event occurs within five calendar days of the last event in the sequence, issue a "resolved" event, terminating the sequence.
- The same, except define the interval using "business days" (including applying a user-defined holiday calendar).
- Measure / bucket widgets produced across multiple production lines by quarter / month / day / shift / hour, and generate reports comparing results week-over-week, quarter-over-quarter, etc.

In none of those cases involving "days" was the "one day is 24 hours, exactly" a sufficient approximation, and none of them could tolerate naive datetimes. Typically, the application used a "date interval" object (or a recurrence object) which generated date offsets without assuming that a "day" was 86400 seconds.

Tres.
On 28 Jul 2015, at 03:13, Tres Seaver <tseaver@palladion.com> wrote:
On 07/27/2015 06:11 PM, Ronald Oussoren wrote:
Treating time as UTC with conversions at the application edge might be "cleaner" in some sense, but can make code harder to read for application domain experts.
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
Interesting. My experience is exactly the opposite: the datetimes which "application domain experts" cared about *always* needed to be non-naive (zone captured explicitly or from the user's machine and converted to UTC/GMT for storage). As with encoded bytes, allowing a naive instance inside the borders the system was always a time-bomb bug (stuff would blow up at a point far removed from which it was introduced).
The instances which could have safely been naive were all logging-related, where the zone was implied by the system's timezone (nearly always UTC). I guess the difference is that I'm usually writing apps whose users can't be presumed to be in any one timezone. Even in those cases, having the logged datetimes be incomparable to user-facing ones would make them less useful.
I usually write applications used by local users where the timezone is completely irrelevant, including DST. Stuff needs to be done at (say) 8PM, and that’s 8PM local time. Switching to and from UTC just adds complications. I’m lucky enough that most datetime calculations happen within one work week and therefore never have to cross DST transitions. For longer periods I usually only care about dates, and almost never about the number of seconds between two datetime instances. That makes the naive datetime from the stdlib a very convenient programming model. And I’m in a country that’s small enough to have only one timezone.

IMHO Unicode is different in that regard: there the application logic can clearly be expressed as text, and the encoding to/from bytes can safely be hidden in the I/O layer. Often the users I deal with can follow the application logic w.r.t. text handling, but have no idea about encodings (but do care about accented characters). With some luck they can provide a sample file that allows me to deduce the encoding that should be used, and most applications are moving to UTF-8.

BTW. Note that I’m not saying that a timezone aware datetime is bad, just that they are not always necessary.

Ronald
On Tue, Jul 28, 2015 at 12:11 AM, Ronald Oussoren <ronaldoussoren@mac.com> wrote:
I totally agree with that, having worked on applications that had to deal with time a lot and including some where the end of a day was at 4am the following day. That app never had to deal with DST because not only are the transitions at night, they are also during the weekend.
If you don't have to deal with DST, then you don't have to have tzinfo's in your date objects. You can have just truly naive objects without DST information, and this will work just fine. I think most people's expectations are that datetimes that *are* time zone aware should actually deal correctly with those time zones.
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
Yes, they most certainly should. I will try to shut up now, but let me be clear on that the time zone support as it stands now is intentionally broken. Not naive, *broken*. All the use cases people have here for supporting "naive" objects would work just as well if they actually used naive objects, i.e. datetimes with no timezone info. If you explicitly do NOT want the datetime object to care about timezones, then you should not add a timezone to the object.
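Lennart's suggestion - if you don't want timezone behavior, don't attach a timezone - is a one-liner with the current API. A small sketch: `replace(tzinfo=None)` strips the zone, and Python then refuses to mix the two kinds of objects rather than silently guessing.

```python
from datetime import datetime, timedelta, timezone

aware = datetime(2015, 7, 28, 8, 0, tzinfo=timezone(timedelta(hours=2)))
naive = aware.replace(tzinfo=None)  # drop the zone: a truly naive object
assert naive.tzinfo is None

# Naive and aware datetimes refuse to mix in comparisons/arithmetic:
mixing_fails = False
try:
    _ = naive < aware
except TypeError:
    mixing_fails = True
assert mixing_fails
```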
On 07/27/2015 10:08 PM, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 12:11 AM, Ronald Oussoren wrote:
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
Yes, they most certainly should. I will try to shut up now, but let me be clear on that the time zone support as it stands now is intentionally broken. Not naive, *broken*. All the use cases people have here for supporting "naive" objects would work just as well if they actually used naive objects, i.e. datetimes with no timezone info. If you explicitly do NOT want the datetime object to care about timezones, then you should not add a timezone to the object.
Lennart, are you saying you would leave naive objects alone, and "fix" the tz-aware objects only? -- ~Ethan~
On Tue, Jul 28, 2015 at 7:27 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
Lennart, are you saying you would leave naive objects alone, and "fix" the tz-aware objects only?
Naive objects are not broken, so they can't be fixed. Which I guess means "yes". :-) //Lennart
On 07/27/2015 10:47 PM, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 7:27 AM, Ethan Furman wrote:
Lennart, are you saying you would leave naive objects alone, and "fix" the tz-aware objects only?
Naive objects are not broken, so they can't be fixed. Which I guess means "yes". :-)
Ah, cool! I'm on board with that! So the next question is how much of the current tz-aware datetime's behavior can be changed? -- ~Ethan~
[Ronald Oussoren]
I totally agree with that, having worked on applications that had to deal with time a lot, including some where the end of a day was at 4am the following day. That app never had to deal with DST because not only are the transitions at night, they are also during the weekend.
[Lennart Regebro]
If you don't have to deal with DST, then you don't have to have tzinfo's in your date objects.
There are no tzinfos on date objects. I assume Ronald is talking about datetime objects.
You can have just truly naive objects without DST information, and this will work just fine.
I suspect not at all: Ronald pretty obviously wants to mirror the local clock, he just doesn't care about what happens in the tiny number of cases adjacent to a DST transition boundary, because those boundaries occur "at night ... during the weekend", times his app predictably never needed to worry about.
I think most people's expectations are that datetime's that *are* time zone aware, should actually deal correctly with those time zones.
They "almost always" do, you know ;-) You want perfection in every case. Others are delighted to trade off perfection in "twice-a-year wee hour on a weekend" cases for straightforward "move N units in local time" local-time arithmetic.
It might be nice to have time zone aware datetime objects with the right(TM) semantics, but those can and should not replace the naive objects we know and love.
Yes, they most certainly should.
They can't, at least not before Python 4. Nobody can break over a decade's worth of user code for something so objectively minor. Even in Python 4, you'd still need to get Guido's approval to reject his design, and I doubt he'd be willing (but, of course, I may be wrong about that). There are other, backward-compatible, ways to get what you want (although I don't really see a need: it's a one-line Python function for each kind of tz-perfection-in-all-cases arithmetic you favor). For example, I offhandedly suggested adding a new magic attribute to tzinfo objects; if present, datetime would compute "the other" kind of arithmetic. Another idea I saw last night was to create a new timedelta-like class, then change datetime's arithmetic to act differently when asked to use an instance of that new class in arithmetic (although, unlike the magic tzinfo attribute, that couldn't affect the behavior of datetime-datetime subtraction).
I will try to shut up now, but let me be clear that the time zone support as it stands now is intentionally broken. Not naive, *broken*.
It does indeed intentionally deviate from reality in some cases.
All the use cases people have here for supporting "naive" objects would work just as well if they actually used naive objects, i.e. datetimes with no timezone info. If you explicitly do NOT want the datetime object to care about timezones, then you should not add a timezone to the object.
As at the start, I'm sure Ronald does/did care about mirroring local time, including DST. He just doesn't care about what happens at the relative handful of "problem times". Lots of apps are similar. Someone yesterday asked for an example of _how_ he could code a cross-timezone app to schedule remote meetings with his students. They need to occur at the same local time (for the student) once per week, and he wanted a 5-minute (something like that) warning before the meeting started in his own time zone. I'm not sure whether anyone answered him yet. This is almost certainly another case where nobody cares what happens _near_ DST transition boundaries (these are humans, so neither side would agree to meet routinely at wee hours on a weekend). So it's all easy to do with Python as it exists: "naive arithmetic" is exactly what he needs to schedule a series of meetings at the same local times separated by (naive) local weeks.

    first_meeting_time = datetime(Y, M, D, H, tzinfo=student_timezone)
    student_times = [first_meeting_time + timedelta(weeks=i) for i in range(NUM_MEETINGS)]
    my_times = [student_time.astimezone(my_timezone) for student_time in student_times]

DST transitions are vital to track on both sides, but no time in use will appear near a transition boundary - "naive time" is a perfectly adequate approximation, because it agrees with reality at every time the app cares about. And "naive datetime arithmetic" is the only kind of arithmetic of use here.
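Tim's sketch can be made runnable. A hedged version: the concrete zones, dates and meeting count are illustrative choices, and it uses the stdlib zoneinfo module (added in Python 3.9, years after this thread; in 2015 pytz filled this role):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

student_tz = ZoneInfo("America/Chicago")
my_tz = ZoneInfo("Europe/Stockholm")

# The student's meetings are pinned to 13:00 *local* time each week.
first_meeting_time = datetime(2015, 3, 5, 13, 0, tzinfo=student_tz)

# Naive (wall-clock) arithmetic: every element lands on 13:00 local time,
# even though the US DST transition (8 March 2015) falls inside the range.
student_times = [first_meeting_time + timedelta(weeks=i) for i in range(4)]

# Convert each meeting into my own zone for the reminder.
my_times = [t.astimezone(my_tz) for t in student_times]

# Wall clocks agree on both sides, although the second meeting is only
# 167 real hours after the first (the spring-forward hour vanished):
span = (student_times[1].astimezone(timezone.utc)
        - student_times[0].astimezone(timezone.utc))
print(span)  # 6 days, 23:00:00
```

This is exactly the behaviour Tim describes: same-zone addition preserves the local clock reading, while conversions (astimezone) remain DST-correct.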
On Tue, Jul 28, 2015 at 9:00 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Lennart Regebro]
If you don't have to deal with DST, then you don't have to have tzinfo's in your date objects.
There are no tzinfos on date objects. I assume Ronald is talking about datetime objects.
Of course, I meant datetime objects. In everything else, I stand by my original claim. If you want naive datetime objects, you should use naive datetime objects. My opinion is and remains that intentionally breaking datetime arithmetic to make non-naive objects behave in a naive way was a mistake. //Lennart
[Lennart Regebro <regebro@gmail.com>]
Of course, I meant datetime objects. In everything else, I stand by my original claim. If you want naive datetime obejcts, you should use naive datetime objects.
That's tautological ("if you want X, you should use X"). I'm not sure what you intended to say. But it's a fact that some apps do need DST-aware datetime objects, and also need naive datetime arithmetic on those objects. The point isn't that there's no way to get the latter if Python datetime arithmetic changed; the point is that it _already works_ for them, and has for 12 years. You can't break apps without overwhelmingly compelling reason(s). Please move on to think about backward-compatible ways to get what you want instead. In the meantime, writing little functions to do the convert/arithmetic/convert dance is "the obvious" way to get what you want.
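The "little functions" doing the convert/arithmetic/convert dance are indeed short. A sketch of one (the function name is an editorial invention, and zoneinfo postdates the thread; pytz was the 2015-era option):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

def add_exact(dt, delta):
    """The convert/arithmetic/convert dance: add an exact elapsed duration.

    Routing through UTC makes the addition count real elapsed time, so the
    result can land on a different wall-clock hour across a DST change.
    """
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)

denver = ZoneInfo("America/Denver")
before = datetime(2015, 3, 7, 7, 0, tzinfo=denver)  # day before spring-forward

naive_result = before + timedelta(days=1)              # default: 07:00 again
exact_result = add_exact(before, timedelta(hours=24))  # 24 elapsed hours later
print(naive_result.hour, exact_result.hour)  # 7 8
```

The two results differ by exactly the skipped spring-forward hour, which is the whole disagreement in this thread in four lines.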
My opinion is and remains that intentionally breaking datetime arithmetic to make non-naive objects behave in a naive way was a mistake.
While other people think it was a fine and useful compromise. It's certainly fortunate that repetition changes minds ;-) Regardless, that decision is ancient history now.
On Mon, Jul 27, 2015 at 12:15 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I think the current naive semantics are useful and should not be discarded lightly. At an absolute minimum, there should be a clear, documented way to get the current semantics under any changed implementation.
As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1).
That's a calendar operation made with a timedelta. The "days" attribute here is indeed confusing as it doesn't mean 1 day, it means 24 hours. //Lennart
[Paul Moore <p.f.moore@gmail.com>]
.... As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1).
[Lennart Regebro <regebro@gmail.com>]
That's a calendar operation made with a timedelta.
It's an instance of single-timezone datetime arithmetic, of the datetime + timedelta form. Your examples have been of the same form. Note that after Paul's

    tomorrows_alarm = todays_alarm + timedelta(days=1)

it's guaranteed that

    assert tomorrows_alarm - todays_alarm == timedelta(days=1)

will succeed too.
The "days" attribute here is indeed confusing as it doesn't mean 1 day, it means 24 hours.
Which, in naive arithmetic, are exactly the same thing. That's essentially why naive arithmetic is the default: it doesn't insist on telling people that everything they know is wrong ;-) There's nothing confusing about Paul's example _provided that_ single-timezone arithmetic is naive. It works exactly as he intends every time, and obviously so. Seriously, try this exercise: how would you code Paul's example if "your kind" of arithmetic were in use instead? For a start, you have no idea in advance how many hours you may need to add to get to "the same local time tomorrow" - 24 won't always work. Indeed, no _whole_ number of hours may work (according to one source I found, Australia's Lord Howe Island uses a 30-minute DST adjustment). So maybe you don't want to do it by addition. What then? Pick apart the year, month and day components, then simulate "naive arithmetic" by hand? The point is that there's no _obvious_ way to do it then. I'd personally strip off the tzinfo member, leaving a wholly naive datetime where arithmetic "works correctly" ;-), add the day, then attach the original tzinfo member again. But for a dozen years it's sufficed to do what Paul did.
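Tim's "strip the tzinfo member, add the day, reattach" recipe, written out (the helper name is an editorial choice, and zoneinfo is stdlib only since Python 3.9):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

def same_time_next_day(dt):
    """Drop tzinfo, do naive arithmetic, reattach tzinfo."""
    wall = dt.replace(tzinfo=None) + timedelta(days=1)
    return wall.replace(tzinfo=dt.tzinfo)

tz = ZoneInfo("Europe/London")
alarm = datetime(2015, 3, 28, 7, 0, tzinfo=tz)  # the day before BST begins

tomorrow = same_time_next_day(alarm)
# Under the stdlib's actual semantics this is exactly what `+ timedelta(days=1)`
# already does, which is Tim's point: the dance would only be *needed* if
# same-zone arithmetic were changed to count elapsed time instead.
assert tomorrow == alarm + timedelta(days=1)
assert tomorrow.hour == 7
```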
On Mon, Jul 27, 2015 at 9:09 AM, Tim Peters <tim.peters@gmail.com> wrote:
It's an instance of single-timezone datetime arithmetic, of the datetime + timedelta form.
No, because in that case it would always move 24 hours, and it doesn't. It sometimes moves 23 hours or 25 hours, when crossing a DST border. That is a calendar operation, in the disguise of datetime arithmetic. But this discussion is now moot: if we can't change this, then we can't change this, and PEP 431 is dead. The only reasonable way out of this mess is a completely new module for dates and times that doesn't make these kinds of fundamental mistakes. I sincerely doubt I have the time to implement that this decade. //Lennart
On Mon, Jul 27, 2015 at 9:09 AM, Tim Peters <tim.peters@gmail.com> wrote:
But for a dozen years it's sufficed to do what Paul did.
No, it never did "suffice". It's just that people have been doing various kinds of workarounds to compensate for these design flaws. I guess they need to continue to do that for the time being. //Lennart
On 27 July 2015 at 08:34, Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 9:09 AM, Tim Peters <tim.peters@gmail.com> wrote:
But for a dozen years it's sufficed to do what Paul did.
No, it never did "suffice". It's just that people have been doing various kinds of workarounds to compensate for these design flaws. I guess they need to continue to do that for the time being.
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock? I understand your point that "it's a calendar operation", but that doesn't help me. I still don't know how you want me to *do* the operation. Whatever the outcome of this discussion, any PEP needs to explain how to implement this operation, because at the moment, it's done with +timedelta(days=1) and that won't be the case under the PEP. I'm not trying to shoot down your proposal here, just trying to understand it. Paul
On Mon, Jul 27, 2015 at 10:47 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock?
PEP-431 does not propose to implement calendar operations, and hence does not address that question. //Lennart
On 27 July 2015 at 09:54, Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 10:47 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock?
PEP-431 does not propose to implement calendar operations, and hence does not address that question.
OK, I see. But it does propose to remove that operation from datetime. Thanks for the clarification. Am I right to think that because you say "implement calendar operations" this is not, as far as you are aware, something that already exists in the stdlib (outside of datetime)? I'm certainly not aware of an alternative way of doing it. Paul
On Mon, Jul 27, 2015 at 11:05 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Am I right to think that because you say "implement calendar operations" this is not, as far as you are aware, something that already exists in the stdlib (outside of datetime)? I'm certainly not aware of an alternative way of doing it.
Right, I think you need to use relativedelta (or rrule) from dateutil, unless you want to do it yourself, which of course in most cases is quite easy. //Lennart
On Mon, Jul 27, 2015 at 10:54:02AM +0200, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 10:47 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock?
PEP-431 does not propose to implement calendar operations, and hence does not address that question.
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow). To my naive mind, I would have thought of calendar operations to be things like: - print a calendar; - add or remove an appointment; - send, accept or decline an invitation What do you think calendar operations are, and how do they differ from datetime operations? And most importantly, how can we tell them apart? Thanks, -- Steve
On Jul 27, 2015, at 9:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Jul 27, 2015 at 10:54:02AM +0200, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 10:47 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock?
PEP-431 does not propose to implement calendar operations, and hence does not address that question.
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
To my naive mind, I would have thought of calendar operations to be things like:
- print a calendar; - add or remove an appointment; - send, accept or decline an invitation
What do you think calendar operations are, and how do they differ from datetime operations? And most importantly, how can we tell them apart?
The way I interpreted it is that "calendar operations" require knowledge of events like daylight savings time that require a more complete knowledge of the calendar, rather than just a naive notion of what a date and time are.
Am I the only one feeling like this entire thread should be moved to python-ideas at this point? Top-posted from my Windows Phone
On 28 July 2015 at 00:27, Steve Dower <Steve.Dower@microsoft.com> wrote:
Am I the only one feeling like this entire thread should be moved to python-ideas at this point?
Since this is an area where the discussion of implementation details and the discussion of the developer experience can easily end up at cross purposes, I'm wondering if there may be value in actually splitting those two discussions into different venues by creating a datetime-sig, and specifically inviting the pytz and dateutil developers to participate in the SIG as well. The traffic on a similarly niche group like import-sig is only intermittent, but it means that by the time we bring suggestions to python-ideas or python-dev, we've already thrashed out the low level arcana and know that whatever we're proposing *can* be made to work, leaving the core lists to focus on the question of whether or not the change *should* be made. Whether or not to do that would be up to the folks with a specific interest in working with dates and times, though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, Jul 27, 2015 at 4:45 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 28 July 2015 at 00:27, Steve Dower <Steve.Dower@microsoft.com> wrote:
Am I the only one feeling like this entire thread should be moved to python-ideas at this point?
Since this is an area where the discussion of implementation details and the discussion of the developer experience can easily end up at cross purposes, I'm wondering if there may be value in actually splitting those two discussions into different venues by creating a datetime-sig, and specifically inviting the pytz and dateutil developers to participate in the SIG as well.
+1 for that.
On Mon, Jul 27, 2015 at 7:49 AM Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 4:45 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 28 July 2015 at 00:27, Steve Dower <Steve.Dower@microsoft.com> wrote:
Am I the only one feeling like this entire thread should be moved to python-ideas at this point?
Since this is an area where the discussion of implementation details and the discussion of the developer experience can easily end up at cross purposes, I'm wondering if there may be value in actually splitting those two discussions into different venues by creating a datetime-sig, and specifically inviting the pytz and dateutil developers to participate in the SIG as well.
+1 for that.
Alexander and Tim, you okay with moving this conversation to a datetime-sig if we got one created?
On Mon, Jul 27, 2015 at 5:13 PM, Tim Peters <tim.peters@gmail.com> wrote:
[Brett Cannon <brett@python.org>]
Alexander and Tim, you okay with moving this conversation to a datetime-sig if we got one created?
Fine by me!
+1 Didn't datetime-sig exist some 12 years ago? It would be nice to get some continuity from that effort.
On 27/07/2015 15:45, Nick Coghlan wrote:
On 28 July 2015 at 00:27, Steve Dower <Steve.Dower@microsoft.com> wrote:
Am I the only one feeling like this entire thread should be moved to python-ideas at this point?
Since this is an area where the discussion of implementation details and the discussion of the developer experience can easily end up at cross purposes, I'm wondering if there may be value in actually splitting those two discussions into different venues by creating a datetime-sig, and specifically inviting the pytz and dateutil developers to participate in the SIG as well.
The traffic on a similarly niche group like import-sig is only intermittent, but it means that by the time we bring suggestions to python-ideas or python-dev, we've already thrashed out the low level arcana and know that whatever we're proposing *can* be made to work, leaving the core lists to focus on the question of whether or not the change *should* be made.
Whether or not to do that would be up to the folks with a specific interest in working with dates and times, though.
Cheers, Nick.
Would it be worth doing a straw poll to gauge how many people really need this (from my perspective, anyway) level of complexity? I've used datetimes a lot, but I don't even need naive timezones; completely dumb suits me. Alternatively, just go ahead, knowing that if the proposal isn't accepted into the stdlib it can at least go on PyPI. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
On Mon, Jul 27, 2015 at 4:27 PM, Steve Dower <Steve.Dower@microsoft.com> wrote:
Am I the only one feeling like this entire thread should be moved to python-ideas at this point?
Well, there isn't any idea to discuss. :-) It's just an explanation of the problem space. Perhaps it should be moved somewhere else though.
On Tue, Jul 28, 2015 at 12:13 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Jul 27, 2015 at 10:54:02AM +0200, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 10:47 AM, Paul Moore <p.f.moore@gmail.com> wrote:
I'm confused by your position. If it's 7am on the clock behind me, right now, then how (under the model proposed by the PEP) do I find the datetime value where it will next be 7am on the clock?
PEP-431 does not propose to implement calendar operations, and hence does not address that question.
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
To my naive mind, I would have thought of calendar operations to be things like:
- print a calendar; - add or remove an appointment; - send, accept or decline an invitation
What do you think calendar operations are, and how do they differ from datetime operations? And most importantly, how can we tell them apart?
Concrete example: I meet with my students on a weekly basis, with the scheduled times being defined in the student's timezone. For instance, I might meet with a student every Thursday at 1PM America/Denver. When there's a DST switch in that timezone, our meetings will be either 167 or 169 hours apart. I want a script that waits until five minutes before my next meeting (with anyone) and then plays "Let It Go" in a randomly-selected language. How would I go about doing this with pytz and/or datetime? How would I do this under the new proposal? (FWIW, I currently have exactly such a script, but it's backed by my Google Calendar. I'm seriously considering rewriting it to use a simple text file as its primary source, in which case my options are Python (datetime+pytz) and Pike (built-in Calendar module). This does not strike me as an altogether unusual kind of script to write.) And no, Steve, you're not the only one. But maybe the entire thread should completely halt until there's a PEP draft to discuss, and *then* reopen on -ideas. ChrisA
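Chris's scheduling problem can be sketched with the stdlib alone. A hedged version: the helper name and the "my" timezone are editorial choices, and it uses zoneinfo (stdlib since Python 3.9; in 2015 this would have needed pytz):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

def next_occurrence(now, weekday, hour, tz):
    """Next `weekday` (0=Monday) at `hour`:00 *local* time in `tz`, after `now`."""
    local_now = now.astimezone(tz)
    candidate = local_now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Wall-clock ("naive") arithmetic keeps the local hour fixed even across
    # a DST transition -- exactly what a recurring local meeting needs.
    candidate += timedelta(days=(weekday - candidate.weekday()) % 7)
    if candidate <= local_now:
        candidate += timedelta(days=7)
    return candidate

denver = ZoneInfo("America/Denver")
my_tz = ZoneInfo("Australia/Brisbane")

# Friday 6 March 2015, two days before the US spring-forward transition.
now = datetime(2015, 3, 6, 12, 0, tzinfo=denver)
meeting = next_occurrence(now, 3, 13, denver)           # Thursday 1PM Denver
warning = meeting.astimezone(my_tz) - timedelta(minutes=5)
print(meeting)  # 2015-03-12 13:00:00-06:00
```

Because the DST flip falls between "now" and the meeting, the meeting is only 167 real hours after the previous Thursday 1PM, yet it still reads 13:00 in Denver: the pattern Chris describes.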
On Mon, Jul 27, 2015 at 4:13 PM, Steven D'Aprano <steve@pearwood.info> wrote:
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
Well, OK, let's propose these wordings: It looks like a date operation, i.e., add one to the date, but in reality it's a time operation, i.e., add 86400 seconds to the time. These things sound similar but are very different. I called it a "calendar" operation, because these operations include such things as "add one year", where you expect to get the 27th of July 2016, but you will get the 26th if you use a timedelta, because 2016 is a leap year. So we need to separate date (or calendar) operations from time operations. The same thing goes with months: add 30 days, and you'll sometimes get the same result as if you add one month, and sometimes not. timedelta adds time, not days, months or years. Except when you cross a DST border, where it suddenly, surprisingly and intentionally may add more or less time than you told it to. //Lennart
On 2015-07-27 15:46, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 4:13 PM, Steven D'Aprano <steve@pearwood.info> wrote:
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
Well, OK, let's propose these wordings: It looks like a date operation, i.e., add one to the date, but in reality it's a time operation, i.e., add 86400 seconds to the time. These things sound similar but are very different.
I called it a "calendar" operation, because these operations include such things as "add one year", where you expect to get the 27th of July 2016, but you will get the 26th if you use a timedelta, because 2016 is a leap year. So we need to separate date (or calendar) operations from time operations. The same thing goes with months: add 30 days, and you'll sometimes get the same result as if you add one month, and sometimes not.
Also, if you "add one year" to 29 February 2016, what date do you get?
timedelta adds time, not days, months or years. Except when you cross a DST border, where it suddenly, surprisingly and intentionally may add more or less time than you told it to.
On 7/27/2015 11:21 AM, MRAB wrote:
Also, if you "add one year" to 29 February 2016, what date do you get?
I believe the 'conventional' answer is 1 March 2017. That is also 1 Mar 2016 + 1 year. 1 March 2017 - 1 year would be 1 Mar 2016. Leap days get cheated. -- Terry Jan Reedy
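Terry's "conventional" rule is easy to write down in stdlib terms (the helper name is an editorial invention; dateutil's relativedelta, mentioned elsewhere in the thread, answers Feb 28 instead):

```python
from datetime import date, timedelta

def add_years(d, years):
    """Add calendar years; Feb 29 rolls forward to Mar 1 in a non-leap year."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # only possible for Feb 29 -> non-leap target year
        return d.replace(year=d.year + years, month=3, day=1)

assert add_years(date(2016, 2, 29), 1) == date(2017, 3, 1)  # leap day cheated
assert add_years(date(2016, 3, 1), 1) == date(2017, 3, 1)   # Terry's collision
assert add_years(date(2017, 3, 1), -1) == date(2016, 3, 1)
# Contrast with timedelta, which counts elapsed days, not calendar years:
assert date(2015, 7, 27) + timedelta(days=365) == date(2016, 7, 26)
```

The last line is Lennart's leap-year example from earlier in the thread: a timedelta of 365 days falls one day short across 2016.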
Terry Reedy writes:
On 7/27/2015 11:21 AM, MRAB wrote:
Also, if you "add one year" to 29 February 2016, what date do you get?
I believe the 'conventional' answer is 1 March 2017. That is also 1 Mar 2016 + 1 year. 1 March 2017 - 1 year would be 1 Mar 2016. Leap days get cheated.
I doubt there is a *single* "conventional" answer. With respect to *calendar* years, I suspect that more children born on February 29 celebrate their birthdays on February 28 than on March 1. Another angle: I would imagine that a "real" conversation would go something like "A: Let's meet one year from today." "B: Today's the 29th, you know. How about the 28th?" If it wasn't mentioned, and one arrived on the 28th, and another on the 1st, we'd consider that a comedy, not one person's error of calculation according to "conventional" rules. I suspect that what you are calling "conventional" is actually due to observing the cognitive bias called "substitution" (of a solvable problem for an insoluble one). That is, the problem of "computing the date 1 timedelta year later" is substituted for the problem of "computing the calendar date 1 year later". But that's inappropriate, because these are two different use cases. Calendar time is for coordination, such as birthday parties, and timedelta time is for process control, such as venting nuclear reactors. Of course timedelta can be considered as a calendar by assigning a place and an epoch, and it's convenient to assign one more or less consistent with the "consensus calendar" in some "reasonable" place. Eg, UTC as a calendar is approximately the calendar in London. But this is fundamentally arbitrary. Consider the events that define "Easter Sunday". In this framework, I suppose one could characterize datetimes as allowing "simple calculations with time intervals suitable for people who don't hold meetings at 1:30am and who don't have birthdays or wedding anniversaries on Feb 29, and are not separated from important events by astronomical distances, nor likely to move at speeds greater than 0.01% of the speed of light before the event happens". Ie, almost all of us almost all of the time. I'm completely satisfied by Tim's answers and think "almost all" is good enough for the stdlib for now. 
If a separate module appears that actually succeeds in eliminating the discrepancies between datetime and calendar time, as required to reconcile "coordinating process control time" with human requirements for "simultaneous presence at meetings", I'd be for including that calendar module in the stdlib, and deprecating datetime. But *elimination* is a high bar, and I think the stdlib would have to be backward-compatible if it fails that criterion (probably by including both modules).
Hi, As it's very hard to keep up with the pace of this thread, instead of addressing any particular response I would like to add some (hopefully) useful context. While Java was historically known for the worst date/time handling ever (e.g. months starting with 0), in Java 8 a new module was added named java.time[1]; It contains (amongst others) the following classes:

LocalTime (= datetime.time)
LocalDate (= datetime.date)
LocalDateTime (= datetime.datetime without tzinfo)
OffsetDateTime (= datetime.datetime + datetime.timezone)
ZonedDateTime (AFAIU equivalent to how Lennart wants the datetime.datetime + IANA timezone to work)
Instant (a calendar-independent representation of a point in time using UTC-SLS)
Duration (= datetime.timedelta)
Period (e.g. 1 year, 2 months and 3 days - no real counterpart in Python)

(I'm not sure which class would be equivalent to what Tim describes.) While having some Java-style boilerplate, this API is both pure and very practical. Each class serves a slightly different purpose and covers different use cases without ambiguity and implicit assertions. Maybe instead of trying to decide who is "wrong" and which approach is "broken", Python just needs a clearer separation between timezone-aware objects and "naive" ones?

[1]: https://docs.oracle.com/javase/8/docs/api/java/time/package-summary.html

Best Regards, Łukasz Rekucki
On Tue, Jul 28, 2015 at 10:25 AM, Łukasz Rekucki <lrekucki@gmail.com> wrote:
Maybe instead of trying to decide who is "wrong" and which approach is "broken", Python just needs a more clear separation between timezone aware objects and "naive" ones?
Well, the separation is pretty clear already. Tim wants to have naive timezone-aware objects, i.e. datetime objects that have a time zone but ignore the time zone, except when converting to other time zones. I have yet to see a use case for that. //Lennart
[Łukasz Rekucki <lrekucki@gmail.com>]
Maybe instead of trying to decide who is "wrong" and which approach is "broken", Python just needs a more clear separation between timezone aware objects and "naive" ones?
[Lennart Regebro <regebro@gmail.com>]
Well, the separation is pretty clear already.
I preemptively ;-) agreed with Lukasz on this: it's downright strange that we have datetime objects the docs call "aware" that nevertheless _act_ as if they were "naive" in some cases of arithmetic. That was the intended design, but it is strange in that respect. So it goes.
Tim wants to have naive timezone-aware objects, i.e. datetime objects that have a time zone but ignore the time zone, except when converting to other time zones.
That's not about what I want. It's what Python _does_. We can't wish away how the language already works. Guido designed it that way (with an enormous amount of public input & debate), and I did the vast bulk of the implementations. It so happens I like Guido's design, but none of that ever depended on what Tim wanted datetime to do (except in minor respects, like adding the .replace() and .fromutc() methods, which weren't part of the original design - I made those up based on painful experience with many iterations of ever-changing prototypes - but my role in arithmetic was just to implement the design specs).
I have yet to see a use case for that.
Of course you have. When you address them, you usually dismiss them as "calendar operations" (IIRC). In some of those cases, you correctly pointed out that the user could have done as well (based on all they explicitly revealed about their app's requirements) with a naive datetime object. In other cases, you made the same claim but seemingly incorrectly (like any app that needs to track the local clocks across multiple timezones with inter-zone conversions, and also needs to do "calendar operations" in those timezones, and - unsurprisingly! - uses Python's datetime arithmetic to implement such operations. It's unsurprising they do so because that was always the intended and documented way to do so - and it has always worked for these purposes).

But it doesn't matter whether you _call_ them "calendar operations", or anything else. What they're called doesn't change any of the high-order bits: they are use cases, they already work, they have worked that way for a dozen years (an eternity in "computer time"), they were always intended to work that way, and the docs have always said they work that way.

I do think you're missing my fundamental objection: no matter what intended and documented thing datetime (or any other module) has done all along, and regardless of whether I loved it or hated it, I'd be just as annoying about insisting we cannot intentionally break existing code using that thing in non-trivial ways without a justification so compelling that I can't recall a case of it ever happening. If Python had been doing "Lennart arithmetic" all along, and Guido proposed changing to "Guido arithmetic" in Python 2 and/or Python 3, I'd be equally opposed to his proposal, despite that I prefer Guido's vision of how datetime arithmetic should work: "tough luck - it's too late". And he'd eventually agree. Be like Guido ;-)
Given Guido's brief message, I'll cut the weasel words, confident that I'm still channeling his intent in this area: major backward-incompatible changes (like altering the meaning of arithmetic using already-existing objects and operations) are not going to happen. It's clear now that he'd even remain opposed to such a change in a hypothetical everything-can-change Python 4, because he still likes his original design. So please move on. New objects, new operations, new functions, new methods, new optional arguments on existing methods, even new modules ... all remain on the table. Changing the meaning of code that some users are perfectly happy with was never really _on_ the table (which would have been made clear before if the PEP had spelled out that changes to the results of arithmetic wouldn't be restricted to a tiny number of annoying-to-everyone-regardless edge cases). Even convincing _everyone_ that "it's broken" (which will never happen either) wouldn't change one jot of this. Look:
>>> 0.1 + 0.1 + 0.1 == 0.3
False
That will remain "broken" forever too (except, perhaps, until Python 4), and despite that it's one of the most frequent of all user complaints. So at worst you're in good company ;-)
On Tue, Jul 28, 2015 at 10:26 PM, Tim Peters <tim.peters@gmail.com> wrote:
I have yet to see a use case for that.
Of course you have. When you address them, you usually dismiss them as "calendar operations" (IIRC).
Those are not use cases for this broken behaviour.

I agree there is a use case where you want to add one day to an 8am datetime and get 8am the next day. Calling them "date operations" or "calendar operations" is not dismissing them. I got into this whole mess because I implemented calendars; that use case is the main use case for those operations.

But that use case is easily handled in several ways. Already today, with how datetime works, you have two solutions: the first is to use a timezone-naive datetime. This is what most people who want to ignore time zones should use. The other is to separate the datetime into date and time objects and add a day to the date object.

But most importantly, there are better ways to solve this that datetime today doesn't support at all:

1. You can use something like dateutil.rrule for repeating events (works today, but requires a third-party module).
2. timedelta could stop incorrectly pretending that one day is always 24 hours, and instead handle days separately, always increasing the day count when days are given. (This would, however, mean we could no longer support fractional days.)
3. There could be a datedelta() object that operates on dates, leaving hours, minutes, seconds and microseconds alone, or something like lubridate's Period and Duration objects, where a Period() essentially operates on wall time, and a Duration() operates on real time.

So that's not the use case I'm asking for. I am specifically asking for a use case where I want an object that is timezone aware, but ignores the timezone for everything other than conversion to other timezones. Because that's what datetime implements, and that's what I want a use case for. "I want the same time next day" is not such a use case.

And I don't want that use case for you to convince me that we shouldn't change datetime. You say it breaks too much; OK, if you say so, I don't know. I want to know whether that use case actually exists, because I don't think it does.
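Lennart's second stdlib workaround (split the datetime into date and time objects and bump the date) can be sketched for naive datetimes. This is only an illustration; next_day_same_time is a hypothetical helper name, not an existing API:

```python
from datetime import datetime, timedelta

def next_day_same_time(dt):
    """'Same wall time tomorrow' via the date/time split (naive datetimes only)."""
    # date arithmetic handles month/year rollover; the time part is untouched
    return datetime.combine(dt.date() + timedelta(days=1), dt.time())

alarm = datetime(2015, 3, 7, 8, 0)
print(next_day_same_time(alarm))  # 2015-03-08 08:00:00
```

For naive datetimes this gives the same result as adding timedelta(days=1) directly; the split only becomes interesting once DST-aware zones enter the picture.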
But it doesn't matter whether you _call_ them "calendar operations", or anything else. What they're called doesn't change any of the high-order bits: they are use cases, they already work, they have worked that way for a dozen years (an eternity in "computer time"), they were always intended to work that way, and the docs have always said they work that way.
They only work like that because people have adapted to how datetime does things. If datetime had done this properly from the start, it would have worked even better.
I do think you're missing my fundamental objection: no matter what intended and documented thing datetime (or any other module) has done all along, and regardless of whether I loved it or hated it, I'd be just as annoying about insisting we cannot intentionally break existing code
I stopped arguing for changing datetime two days ago. I've also mentioned that several times.
using that thing in non-trivial ways without a justification so compelling that I can't recall a case of it ever happening.
Well, I've seen several of those on Stackoverflow. //Lennart
On Wed, 29 Jul 2015 06:26:44 +0200, Lennart Regebro <regebro@gmail.com> wrote:
On Tue, Jul 28, 2015 at 10:26 PM, Tim Peters <tim.peters@gmail.com> wrote:
I have yet to see a use case for that.
Of course you have. When you address them, you usually dismiss them as "calendar operations" (IIRC).
Those are not usecases for this broken behaviour.
I agree there is a usecase for where you want to add one day to an 8am datetime, and get 8am the next day. Calling them "date operations" or "calendar operations" is not dismissing them. I got into this whole mess because I implemented calendars. That use case is the main usecase for those operations.
But that usecase is easily handled in several ways. Already today in how datetime works, you have two solutions: The first is to use a time zone naive datetime. This is what most people who want to ignore time zones should use. The other is to separate the datetime into date and time objects and add a day to the date object.
I said I was done commenting, and this is supposed to move to the datetime-sig, but this "lack of use cases" keeps coming up, so I'm going to make one more comment, repeating something I said earlier.

What *I* want aware datetimes to do is give me the correct timezone label when I format them, given the date and time information they hold. The naive arithmetic is perfect, the conversion between timezones is fine; the only thing missing from my point of view is a timezone database such that if I tell a datetime it is in zone X, it will print the correct offset and/or timezone label when I format it as a string.

That's my use case; and it is, I would venture to guess, *the* most common use case that datetime does not currently support, and I was very disappointed to find that pytz didn't support it either (except via 'normalize' calls, but why should I have to call normalize every time I do datetime arithmetic? It should just *do* it.) Anything more would be gravy from my point of view.

Now, maybe tzinfo can't actually support this, but I haven't heard Tim say that it can't. Since the datetime is always passed to tzinfo, I think it can. --David

PS: annoying story: I scheduled an event six months in advance on my tablet's calendar, and it scheduled it using what was my *current* GMT offset (it calls it a time zone) even though it knew what date I was scheduling it on. Which meant the alarm in my calendar was off by an hour. I hope they have fixed this bug. I relay this because it is exactly the same problem I find to be present in pytz. If the calendar had been using aware datetimes and naive arithmetic as I describe above, the alarm would not have been off by an hour.
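For what it's worth, the behavior David asks for (naive arithmetic plus a correct offset/label on formatting, no normalize call) is exactly what the zoneinfo module eventually delivered in Python 3.9 (PEP 615), long after this thread. A minimal sketch:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+; not available when this thread was written

tz = ZoneInfo("America/New_York")
t = datetime(2015, 3, 7, 12, tzinfo=tz)   # the day before the 2015 spring-forward
print(t.strftime("%D %T %z %Z"))          # 03/07/15 12:00:00 -0500 EST

t2 = t + timedelta(days=1)                # naive arithmetic: same wall clock, next day
print(t2.strftime("%D %T %z %Z"))         # 03/08/15 12:00:00 -0400 EDT
```

The offset and label are looked up from the zone database at formatting time, so the arithmetic stays naive and the string comes out right, with no intermediate normalize step.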
On 07/27/2015 07:46 AM, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 4:13 PM, Steven D'Aprano wrote:
To me, Paul's example is a datetime operation: you start with a datetime (7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
Well, OK, let's propose these wordings: It looks like a date operation, ie, add one to the date, but in reality it's a time operation, ie add 86400 seconds to the time. These things sound similar but are very different.
I have to disagree. If I have my alarm at 7am (localtime ;) so I can be at work at 8am I don't care exactly how many seconds have passed, that alarm better go off at 7am local time. -- ~Ethan~
On Mon, Jul 27, 2015 at 9:47 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 07/27/2015 07:46 AM, Lennart Regebro wrote:
Well, OK, let's propose these wordings: It looks like a date operation, ie, add one to the date, but in reality it's a time operation, ie add 86400 seconds to the time. These things sound similar but are very different.
I have to disagree. If I have my alarm at 7am (localtime ;) so I can be at work at 8am I don't care exactly how many seconds have passed, that alarm better go off at 7am local time.
Right. And then adding 86400 seconds to it is not the right thing to do. //Lennart
On 07/27/2015 01:42 PM, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 9:47 PM, Ethan Furman wrote:
On 07/27/2015 07:46 AM, Lennart Regebro wrote:
Well, OK, let's propose these wordings: It looks like a date operation, ie, add one to the date, but in reality it's a time operation, ie add 86400 seconds to the time. These things sound similar but are very different.
I have to disagree. If I have my alarm at 7am (localtime ;) so I can be at work at 8am I don't care exactly how many seconds have passed, that alarm better go off at 7am local time.
Right. And then adding 86400 seconds to it is not the right thing to do.
Yes, it is, because that's the number that will get me to 7am the next day. My program has no control over the computer's clock -- it merely works with what it is told by the computer's clock. -- ~Ethan~
To use Alexander's example:
--> t = datetime(2015, 3, 7, 12, tzinfo=timezone('US/Eastern'))
--> t.strftime('%D %T %z %Z')
'03/07/15 12:00:00 -0500 EST'
--> (t + timedelta(1)).strftime('%D %T %z %Z')
'03/08/15 12:00:00 -0400 EDT'
The data (aka the time) should act naively, but the metadata (aka the timezone) is what should be changing [1]. -- ~Ethan~

[1] Which is to say that naive datetimes should continue as-is, and aware datetimes should exhibit the above behavior.
On 7/27/2015 1:42 PM, Lennart Regebro wrote:
On Mon, Jul 27, 2015 at 9:47 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
On 07/27/2015 07:46 AM, Lennart Regebro wrote:
Well, OK, let's propose these wordings: It looks like a date operation, ie, add one to the date, but in reality it's a time operation, ie add 86400 seconds to the time. These things sound similar but are very different.

I have to disagree. If I have my alarm at 7am (localtime ;) so I can be at work at 8am I don't care exactly how many seconds have passed, that alarm better go off at 7am local time.

Right. And then adding 86400 seconds to it is not the right thing to do.
It is the right thing to do... but one also adds/subtracts 3600 seconds from it before going to bed 2 days a year, due to government interference, unless it is an atomic clock or cell-phone, which do those updates automatically.
On Mon, Jul 27, 2015 at 12:47 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
To me, Paul's example is a datetime operation: you start with a datetime
(7am today), perform arithmetic on it by adding a period of time (one day), and get a datetime as the result (7am tomorrow).
Well, OK, let's propose these wordings: It looks like a date operation, ie, add one to the date, but in reality it's a time operation, ie add 86400 seconds to the time. These things sound similar but are very different.
I have to disagree. If I have my alarm at 7am (localtime ;) so I can be at work at 8am I don't care exactly how many seconds have passed, that alarm better go off at 7am local time.
sure, but that is very much a Calendar operation -- "7am tomorrow". On the other hand, if you wanted to sleep a particular length of time, then you might want your alarm to go off "in 8 hours" -- that is a different operation. Calendar operations are very, very useful, but not on the table in this discussion, are they? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Mon, 27 Jul 2015 02:09:19 -0500, Tim Peters <tim.peters@gmail.com> wrote:
Seriously, try this exercise: how would you code Paul's example if "your kind" of arithmetic were in use instead? For a start, you have no idea in advance how many hours you may need to add to get to "the same local time tomorrow". 24 won't always work. Indeed, no _whole_ number of hours may work (according to one source I found, Australia's Lord Howe Island uses a 30-minute DST adjustment). So maybe you don't want to do it by addition. What then? Pick apart the year, month and day components, then simulate "naive arithmetic" by hand?
The point is that there's no _obvious_ way to do it then. I'd personally strip off the tzinfo member, leaving a wholly naive datetime where arithmetic "works correctly" ;-) , add the day, then attach the original tzinfo member again.
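Tim's recipe can be written as a small helper; add_wall_time is a hypothetical name for illustration, not an existing API:

```python
from datetime import datetime, timedelta, timezone

def add_wall_time(dt, delta):
    """Tim's recipe: strip the tzinfo, do plainly naive arithmetic, reattach it."""
    naive = dt.replace(tzinfo=None)      # drop the zone
    return (naive + delta).replace(tzinfo=dt.tzinfo)  # add, then reattach

t = datetime(2015, 3, 7, 12, tzinfo=timezone(timedelta(hours=-5)))
next_day = add_wall_time(t, timedelta(days=1))
print(next_day)  # 2015-03-08 12:00:00-05:00
```

With the stdlib's own (already naive) same-zone arithmetic this is equivalent to adding the timedelta directly; the strip/reattach dance matters for libraries whose tzinfo objects would otherwise need a normalize-style fixup afterward.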
I *think* I'd be happy with that solution. I'm not sure if the opinion of a relatively inexperienced timezone user (whose head hurts when thinking about these things) is relevant, but in case it is:

My brief experience with pytz is that it gets this all "wrong". (Wrong is in quotes because it isn't wrong, it just doesn't match my use cases.) If I add a timedelta to a pytz datetime that crosses the DST boundary, IIRC I get something that still claims it is in the previous "timezone" (which it therefore seemed to me was really a UTC offset), and I have to call 'normalize' to get it to be in the correct "timezone" (UTC offset). I don't remember what that does to the time, and I have no intuition about it (I just want it to do the naive arithmetic!)

This makes no sense to me, since I thought a tzinfo object was supposed to represent the timezone including the DST transitions. I presume this comes from the fact that datetime does naive arithmetic and pytz is trying to paste non-naive arithmetic on top?

So, would it be possible to take the timezone database support from pytz, and continue to implement naive-single-zone arithmetic the way Tim proposes, and have it automatically produce the correct UTC offset and UTC-offset-label afterward, without a normalize call? I assumed that was what the PEP was proposing, but I never read it so I can't complain that I was wrong :)

I have a feeling that I'm completely misunderstanding things, since tzinfo is still a bit of a mystery to me. Based on this discussion it seems to me that (1) datetime has to/should continue to do naive arithmetic and (2) if you need to do non-naive UTC based calculations (or conversions between timezones) you should be converting to UTC *anyway* (explicit is better than implicit). The addition of the timezone DB would then give us the information to *add* tools that do conversions between time zones &c.
As Tim says, the issue of disambiguation is separate, and I seem to recall a couple of proposals from the last time this thread went around for dealing with that. --David
On Mon, Jul 27, 2015 at 3:59 PM, R. David Murray <rdmurray@bitdance.com> wrote:
I'm not sure if the opinion of a relatively inexperienced timezone user (whose head hurts when thinking about these things) is relevant, but in case it is:
My brief experience with pytz is that it gets this all "wrong". (Wrong is in quotes because it isn't wrong, it just doesn't match my use cases). If I add a timedelta to a pytz datetime that crosses the DST boundary, IIRC I get something that still claims it is in the previous "timezone" (which it therefore seemed to me was really a UTC offset), and I have to call 'normalize' to get it to be in the correct "timezone" (UTC offset).
Right.
I don't remember what that does to the time, and I have no intuition about it (I just want it to do the naive arithmetic!)
But what is it that you expect? That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstances this may mean that you go from 1AM to 1AM, or from 1AM to 3AM. Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? This may actually leave you with a datetime that does not exist, so that is not something you can consistently do.

Dates and times are tricky, and especially with DST it is simply not possible to make an implementation that is intuitive in all cases to those who don't know about the problems. What we should aim for is an implementation that is easy to understand and hard to make errors with.
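The ambiguity Lennart describes (the 1AM hour occurring twice at the end of DST) is what the later PEP 495 fold attribute (Python 3.6+) was added to resolve; a sketch using the still-later zoneinfo module (Python 3.9+) shows the two readings of the same wall time:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+; fold is PEP 495, neither existed in this thread

tz = ZoneInfo("America/New_York")
# 01:30 on 2015-11-01 happens twice; fold selects which occurrence is meant.
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)           # fold=0: the earlier, EDT 01:30
second = datetime(2015, 11, 1, 1, 30, fold=1, tzinfo=tz)  # fold=1: the later, EST 01:30
print(first.tzname(), second.tzname())  # EDT EST
```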
This makes no sense to me, since I thought a tzinfo object was supposed to represent the timezone including the DST transitions. I presume this comes from the fact that datetime does naive arithmetic and pytz is trying to paste non-naive arithmetic on top?
Exactly.
So, would it be possible to take the timezone database support from pytz, and continue to implement naive-single-zone arithmetic the way Tim proposes, and have it automatically produce the correct UTC offset and UTC-offset-label afterward, without a normalize call?
That depends on your definition of "correct". //Lennart
On Mon, 27 Jul 2015 16:37:47 +0200, Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 3:59 PM, R. David Murray <rdmurray@bitdance.com> wrote:
I don't remember what that does to the time, and I have no intuition about it (I just want it to do the naive arithmetic!)
But what is it that you expect?
"I just want it to do the naive arithmetic"
So, would it be possible to take the timezone database support from pytz, and continue to implement naive-single-zone arithmetic the way Tim proposes, and have it automatically produce the correct UTC offset and UTC-offset-label afterward, without a normalize call?
That depends on your definition of "correct".
If I have a time X on date Y in timezone Z, it is either this UTC offset or that UTC offset, depending on what the politicians decided. Couple that with naive arithmetic, and I think you have something easily understandable from the end user perspective, and useful for a wide variety (but far from all) use cases. I'll stop now :) --David
On 27 July 2015 at 15:37, Lennart Regebro <regebro@gmail.com> wrote:
That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstance this may mean that you go from 1AM to 1AM, or from 1AM to 3AM.
Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? This may actually leave you with a datetime that does not exist, so that is not something you can consistently do.
OK, that pretty much validates what I thought might be the case at the end of my recent lengthy email. What you're saying is that the idea of a timedelta is insufficiently rich here - just saying "1 hour" doesn't give you enough information to know which of the two expectations the user has. (The fact that one of those expectations isn't viable is not actually relevant here). OK, I see what your point is now. No idea how to solve it, but at least I understand what you're getting at (I think). Thanks for not giving up on the thread! Does thinking of the problem in terms of timedeltas not containing enough information to make a_time + a_timedelta a well-defined operation if a_time uses a non-fixed-offset timezone, make it any easier to find a way forward? Paul.
On Mon, Jul 27, 2015 at 5:12 PM, Paul Moore <p.f.moore@gmail.com> wrote:
Does thinking of the problem in terms of timedeltas not containing enough information to make a_time + a_timedelta a well-defined operation if a_time uses a non-fixed-offset timezone, make it any easier to find a way forward?
Well, I think it is a well-defined operation, but that datetime currently does it wrongly, to be honest. Adding 3600 seconds to a datetime should move that datetime 3600 seconds forward in actual time. I just do not see a use case for doing anything else, to be honest. But if somebody has one, I'm all ears.

The problem here is that the issue of "get the next day" has been mixed into timedeltas, when in my opinion it is an entirely different issue that should be kept separate from timedeltas. It is possible to implement something so that you can have both "realtimedeltas" and "walltimedeltas", where adding one hour might give you two hours (or an error), but as per above I can't think of a use case. //Lennart
On Jul 27 2015, Lennart Regebro <regebro@gmail.com> wrote:
That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstance this may mean that you go from 1AM to 1AM, or from 1AM to 3AM.
Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? This may actually leave you with a datetime that does not exist, so that is not something you can consistently do.
Apologies for asking yet another dumb question about this, but I have the impression that a lot of other people are struggling with the basics here too. Can you tell us which of the two operations datetime currently implements? And when people talk about explicitly converting to UTC and back, does that mean that if you're (again, with the current implementation) converting to UTC, *then* add the one hour, and then convert back, you get the other operation (that you don't get when you directly add 1 day)? Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
On Mon, Jul 27, 2015 at 6:15 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Jul 27 2015, Lennart Regebro <regebro@gmail.com> wrote:
That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstance this may mean that you go from 1AM to 1AM, or from 1AM to 3AM.
Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? This may actually leave you with a datetime that does not exist, so that is not something you can consistently do.
Apologies for asking yet another dumb question about this, but I have the impression that a lot of other people are struggling with the basics here too.
Can you tell us which of the two operations datetime currently implements?
It's intended that the first one is implemented, meaning that datetime.now() + timedelta(hours=24) can result in a datetime somewhere between 23 and 25 hours into the future. Or well, any amount, in theory, I guess some changes are more than an hour, but that's very unusual.
And when people talk about explicitly converting to UTC and back, does that mean that if you're (again, with the current implementation) converting to UTC, *then* add the one hour, and then convert back, you get the other operation (that you don't get when you directly add 1 day)?
Yes, exactly. //Lennart
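Lennart's answer can be made concrete. A sketch contrasting the two operations across the 2015 fall-back transition in America/New_York, using the later zoneinfo module (Python 3.9+, not available when this thread was written) as a stand-in for a DST-aware tzinfo:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, used here only for illustration

tz = ZoneInfo("America/New_York")
start = datetime(2015, 10, 31, 12, tzinfo=tz)   # noon EDT, the day before fall-back

# Wall-clock ("naive") arithmetic: what datetime's + does directly.
wall = start + timedelta(days=1)                # 2015-11-01 12:00 EST - 25 real hours later

# Absolute-time arithmetic: round-trip through UTC, as discussed above.
absolute = (start.astimezone(timezone.utc) + timedelta(days=1)).astimezone(tz)
# 2015-11-01 11:00 EST - exactly 24 real hours later
```

The same timedelta(days=1) lands on two different wall times, which is precisely the distinction Nikolaus was asking about.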
On Mon, Jul 27, 2015 at 12:30 PM, Lennart Regebro <regebro@gmail.com> wrote:
On Mon, Jul 27, 2015 at 6:15 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Jul 27 2015, Lennart Regebro <regebro@gmail.com> wrote:
(The *first* option)
That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstance this may mean that you go from 1AM to 1AM, or from 1AM to 3AM.
(The *second* option)
Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? ... Can you tell us which of the two operations datetime currently implements?
It's intended that the first one is implemented, meaning that datetime.now() + timedelta(hours=24) can result in a datetime somewhere between 23 and 25 hours into the future.
I think this describes what was originally your *second*, not *first* option. It will also help if you focus on one use case at a time. Your original example dealt with adding 1 hour, but now you switch to adding 24. In my previous email, I explained what is currently doable using the datetime module:
>>> t = datetime(2014, 11, 2, 5, tzinfo=timezone.utc).astimezone()
>>> t.strftime("%D %T%z %Z")
'11/02/14 01:00:00-0400 EDT'
>>> (t + timedelta(hours=1)).astimezone().strftime("%D %T%z %Z")
'11/02/14 01:00:00-0500 EST'
Is this your *first* or your *second* option? Note that this is not what is "intended". This is an actual Python 3.4.3 session.
On Mon, Jul 27, 2015 at 6:42 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
I think this describes what was originally your *second*, not *first* option.
Yes, you are absolutely correct, I didn't read my own description of the options carefully enough.
On 27 July 2015 at 17:30, Lennart Regebro <regebro@gmail.com> wrote:
Apologies for asking yet another dumb question about this, but I have the impression that a lot of other people are struggling with the basics here too.
Can you tell us which of the two operations datetime currently implements?
It's intended that the first one is implemented, meaning that datetime.now() + timedelta(hours=24) can result in a datetime somewhere between 23 and 25 hours into the future. Or well, any amount, in theory, I guess some changes are more than an hour, but that's very unusual.
Maybe that's what the PEP intends. But the stdlib as currently implemented simply adds the appropriate number to the relevant field (i.e., "increase the hour count with one"). It's not possible to detect the difference between these two using only stdlib timezones, though, as those are all fixed-offset. Both Tim and I have pointed out that Guido's original intention was precisely what is implemented, for better or worse. What Guido's view was on DST-aware timezones and whether the behaviour was appropriate for those, I don't know personally (maybe Tim does). It may well be that it was "let's not think about it for now". If we can ever straighten out what the question is, maybe Guido can chip in and answer it :-) Paul PS Ideally, Guido could pop into his time machine and fix the whole issue at source. But apparently he can't, because the time machine isn't DST-aware...
On Mon, Jul 27, 2015 at 12:15 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
On Jul 27 2015, Lennart Regebro <regebro@gmail.com> wrote:
That you add one hour to it, and the datetime moves forward one hour in actual time? That's doable, but during certain circumstance this may mean that you go from 1AM to 1AM, or from 1AM to 3AM.
Or do you expect that adding one hour will increase the hour count with one, ie that the "wall time" increases with one hour? This may actually leave you with a datetime that does not exist, so that is not something you can consistently do.
Apologies for asking yet another dumb question about this, but I have the impression that a lot of other people are struggling with the basics here too.
I believe your questions are addressed to Lennart, but let me offer my answer to the first:
Can you tell us which of the two operations datetime currently implements?
The first one, but not as directly as one might wish. (I think the situation is similar to that of pytz's normalize(), but I am not an expert on pytz.)
>>> t = datetime(2014, 11, 2, 5, tzinfo=timezone.utc).astimezone()
>>> t.strftime("%D %T%z %Z")
'11/02/14 01:00:00-0400 EDT'
>>> (t + timedelta(hours=1)).astimezone().strftime("%D %T%z %Z")
'11/02/14 01:00:00-0500 EST'
On 27 July 2015 at 14:59, R. David Murray <rdmurray@bitdance.com> wrote:
I have a feeling that I'm completely misunderstanding things, since tzinfo is still a bit of a mystery to me.
You're not the only one :-) I think the following statements are true. If they aren't, I'd appreciate clarification. I'm going to completely ignore leap seconds in the following - I hope that's OK; I don't understand leap seconds *at all* and I don't work in any application areas where they are relevant (to my knowledge), so I feel that for my situation, ignoring them (and being able to) is reasonable. Note that I'm not talking about internal representations - this is purely about user-visible semantics.

1. "Naive" datetime arithmetic means treating a day as 24 hours, an hour as 60 minutes, etc. Basically base-24/60/60 arithmetic.
2. If you're only working in a single timezone that's defined as UTC or a fixed offset from UTC, naive arithmetic is basically all there is.
3. Converting between (fixed offset) timezones is a separate issue from calculation - but it's nothing more than applying the relevant offsets.
4. Calculations involving 2 different timezones (fixed-offset ones as above) are like any other exercise involving values on different scales. Convert both values to a common scale (in this case, a common timezone) and do the calculation there. Simple enough.
5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).

Are we OK to this point? This much comprises what I would class as a "naive" (i.e. 99% of the population ;-)) understanding of datetimes. The stdlib datetime module handles naive datetime values, and fixed-offset timezones, fine, as far as I can see. (I'm not sure that the original implementation included fixed-offset tzinfo objects, but the 3.4 docs say they are there now, so that's fine.)

Looking at the complicated cases, the only ones I'm actually aware of in practice are the ones that switch to DST and back, so typically have two offsets that differ by an hour, switching between the two at some essentially arbitrary points.
If there are other more complex forms of timezone, I'd like to never need to know about them, please ;-) The timezones we're talking about here are things like "Europe/London", not "GMT" or "BST" (the latter two are fixed-offset). There are two independent issues with complex timezones: 1. Converting to and from them. That's messy because the conversion to UTC needs more information than just the date & time (typically, for example, there is a day when 01:45:00 maps to 2 distinct UTC times). This is basically the "is_dst" bit that Tim discussed in an earlier post. The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used). Once we have the extra information, though, doing conversions is just a matter of applying a set of rules. 2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations. So you have 2 mutually incompatible semantic options - 1 day after 4pm is 3pm the following day, or adding 1 day adds 25 hours - either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode. It seems to me that the problem is that for this latter issue, it's the *timedelta* object that's not rich enough. You can't say "add 1 day, and by 1 day I mean keep the same time tomorrow" as opposed to "add 1 day, and by that I mean 24 hours"[1]. In some ways, it's actually no different from the issue of adding 1 month to a date (which is equally ill-defined, but people "know what they mean" to just as great an extent). Python bypasses the latter by not having a timedelta for "a month". 
C (and the time module) bypasses the former by limiting all time offsets to numbers of seconds - datetime gave us a richer timedelta object and hence has extra problems. I don't have any solutions to this final issue. But hopefully the above analysis (assuming it's accurate!) helps clarify what the actual debate is about, for those bystanders like me who are interested in following the discussion. With luck, maybe it also gives the experts an alternative perspective from which to think about the problem - who knows? Paul [1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones. Is that what Lennart has been trying to say in his posts?
On 2015-07-27 15:59, Paul Moore wrote:
On 27 July 2015 at 14:59, R. David Murray <rdmurray@bitdance.com> wrote:
I have a feeling that I'm completely misunderstanding things, since tzinfo is still a bit of a mystery to me.
You're not the only one :-)
I think the following statements are true. If they aren't, I'd appreciate clarification. I'm going to completely ignore leap seconds in the following - I hope that's OK, I don't understand leap seconds *at all* and I don't work in any application areas where they are relevant (to my knowledge) so I feel that for my situation, ignoring them (and being able to) is reasonable.
Note that I'm not talking about internal representations - this is purely about user-visible semantics.
Would it help if it was explicit and we had LocalDateTime and UTCDateTime?
1. "Naive" datetime arithmetic means treating a day as 24 hours, an hour as 60 minutes, etc. Basically base-24/60/60 arithmetic. 2. If you're only working in a single timezone that's defined as UTC or a fixed offset from UTC, naive arithmetic is basically all there is. 3. Converting between (fixed offset) timezones is a separate issue from calculation - but it's nothing more than applying the relevant offsets. 4. Calculations involving 2 different timezones (fixed-offset ones as above) is like any other exercise involving values on different scales. Convert both values to a common scale (in this case, a common timezone) and do the calculation there. Simple enough. 5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).
Are we OK to this point? This much comprises what I would class as a "naive" (i.e. 99% of the population ;-)) understanding of datetimes.
The stdlib datetime module handles naive datetime values, and fixed-offset timezones, fine, as far as I can see. (I'm not sure that the original implementation included fixed-offset tzinfo objects, but the 3.4 docs say they are there now, so that's fine).
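[Editor's sketch, not part of the original mail: statements 1-4 above can be checked with the stdlib's fixed-offset `timezone` class. The zone names here are illustrative labels only.]

```python
from datetime import datetime, timedelta, timezone

cst = timezone(timedelta(hours=-6), "CST")            # fixed offset (statement 2)
ist = timezone(timedelta(hours=5, minutes=30), "IST")

dt = datetime(2015, 7, 27, 9, 0, tzinfo=cst)

# 1: naive base-24/60/60 arithmetic
assert dt + timedelta(hours=26) == datetime(2015, 7, 28, 11, 0, tzinfo=cst)

# 3: conversion is nothing more than applying the relevant offsets
assert dt.astimezone(ist) == datetime(2015, 7, 27, 20, 30, tzinfo=ist)

# 4: cross-zone comparison happens on a common scale (UTC under the hood)
assert dt == dt.astimezone(ist)
```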
Looking at the complicated cases, the only ones I'm actually aware of in practice are the ones that switch to DST and back, so typically have two offsets that differ by an hour, switching between the two at some essentially arbitrary points. If there are other more complex forms of timezone, I'd like to never need to know about them, please ;-)
The timezones we're talking about here are things like "Europe/London", not "GMT" or "BST" (the latter two are fixed-offset).
There are two independent issues with complex timezones:
1. Converting to and from them. That's messy because the conversion to UTC needs more information than just the date & time (typically, for example, there is a day when 01:45:00 maps to 2 distinct UTC times). This is basically the "is_dst" bit that Tim discussed in an earlier post. The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used). Once we have the extra information, though, doing conversions is just a matter of applying a set of rules.
2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations. So you have 2 mutually incompatible semantic options - 1 day after 4pm is 3pm the following day, or adding 1 day adds 25 hours - either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode.
It seems to me that the problem is that for this latter issue, it's the *timedelta* object that's not rich enough. You can't say "add 1 day, and by 1 day I mean keep the same time tomorrow" as opposed to "add 1 day, and by that I mean 24 hours"[1]. In some ways, it's actually no different from the issue of adding 1 month to a date (which is equally ill-defined, but people "know what they mean" to just as great an extent). Python bypasses the latter by not having a timedelta for "a month". C (and the time module) bypasses the former by limiting all time offsets to numbers of seconds - datetime gave us a richer timedelta object and hence has extra problems.
I don't have any solutions to this final issue. But hopefully the above analysis (assuming it's accurate!) helps clarify what the actual debate is about, for those bystanders like me who are interested in following the discussion. With luck, maybe it also gives the experts an alternative perspective from which to think about the problem - who knows?
Paul
[1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones. Is that what Lennart has been trying to say in his posts?
On 27 July 2015 at 16:26, MRAB <python@mrabarnett.plus.com> wrote:
Note that I'm not talking about internal representations - this is purely about user-visible semantics.
Would it help if it was explicit and we had LocalDateTime and UTCDateTime?
I don't see how. Why should I care about the internal representation? Paul
On Mon, Jul 27, 2015 at 10:59 AM, Paul Moore <p.f.moore@gmail.com> wrote:
The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used). Once we have the extra information, though, doing conversions is just a matter of applying a set of rules.
It is slightly more complicated than that. There are locations (even in the US, I've heard) where clocks have been moved back for reasons other than DST. I described one such example in my earlier post [1]. On July 1, 1990, at 2AM, the people of Ukraine celebrated their newly acquired independence by moving their clocks back to 1AM, thus living through "1990-07-01T01:45" local time twice. This happened in the middle of summer and daylight saving time was in effect before and after the transition, so you cannot use isdst to disambiguate between the first and the second "01:45". On the other hand, these rare events are not that different from more or less regular DST transitions. You still have either a non-existent or ambiguous local-time interval and you can resolve the ambiguity by adding 1 bit of information. The only question is what should we call the flag that will supply that information? IMO, "isdst" is a wrong name for dealing with the event I described above. [1]: https://mail.python.org/pipermail/python-dev/2015-April/139099.html
On Jul 27, 2015, at 10:37 AM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On the other hand, these rare events are not that different from more or less regular DST transitions. You still have either a non-existent or ambiguous local times interval and you can resolve the ambiguity by adding 1 bit of information. The only question is what should we call the flag that will supply that information? IMO, "isdst" is a wrong name for dealing with the event I described above.
While I see your point that isdst is the wrong name in that it doesn't describe what's actually happening in all cases, it is the most well known instance of the issue, and I personally think that using isdst for the other cases makes sense, and that they would disambiguate in the same direction that it would in a dst transition of the same type (clocks forward or backward).
On Mon, Jul 27, 2015 at 11:42 AM, Ryan Hiebert <ryan@ryanhiebert.com> wrote:
On Jul 27, 2015, at 10:37 AM, Alexander Belopolsky < alexander.belopolsky@gmail.com> wrote:
On the other hand, these rare events are not that different from more or less regular DST transitions. You still have either a non-existent or ambiguous local times interval and you can resolve the ambiguity by adding 1 bit of information. The only question is what should we call the flag that will supply that information? IMO, "isdst" is a wrong name for dealing with the event I described above.
While I see your point that isdst is the wrong name in that it doesn't describe what's actually happening in all cases, it is the most well known instance of the issue, and I personally think that using isdst for the other cases makes sense, and that they would disambiguate in the same direction that it would in a dst transition of the same type (clocks forward or backward).
Well, my specific proposal in [1] was to also change the semantics. The proposed "which" flag would have the following meaning:
1. If local time is valid and unambiguous, "which" is ignored.
2. If local time is ambiguous, which=0 means the first and which=1 means the second (chronologically).
3. If local time is invalid, which=0 means the time extrapolated from before the transition and which=1 means the time extrapolated from after the transition.
Note that these rules have some nice properties: if t is ambiguous, UTC(t, which=0) < UTC(t, which=1) and if t is invalid, UTC(t, which=0) > UTC(t, which=1). This property can be used to take different actions in those cases. The result for ambiguous t and which=0 has a natural interpretation as time specified by a user not aware of the clock change. I think these rules are simpler and more natural than those for isdst, which takes 3 values: 0, 1 and -1, and the rules for -1 vary between implementations. Under my proposal unspecified "which" means which=0. [1]: https://mail.python.org/pipermail/python-dev/2015-April/139099.html
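[Editor's note: this 1-bit flag is essentially what later shipped as the `fold` attribute in PEP 495 (Python 3.6+). A toy sketch with a hand-written tzinfo modeling a single clocks-back transition like the Ukraine example; the offsets are illustrative, not historically exact:]

```python
from datetime import datetime, timedelta, timezone, tzinfo

class BackwardJump(tzinfo):
    """Toy zone: UTC+4 until 1990-07-01 02:00 (old local clock), then UTC+3.
    Local times in [01:00, 02:00) on 1990-07-01 occur twice."""
    OLD, NEW = timedelta(hours=4), timedelta(hours=3)
    TRANSITION = datetime(1990, 7, 1, 2, 0)  # naive, pre-transition wall time

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall >= self.TRANSITION:
            return self.NEW
        if wall >= self.TRANSITION - timedelta(hours=1):
            # Ambiguous hour: fold=0 -> first occurrence (old offset),
            # fold=1 -> second occurrence (new offset) -- "which" in disguise.
            return self.NEW if dt.fold else self.OLD
        return self.OLD

    def dst(self, dt):
        return timedelta(0)

tz = BackwardJump()
first = datetime(1990, 7, 1, 1, 45, tzinfo=tz)           # fold=0 (default)
second = datetime(1990, 7, 1, 1, 45, fold=1, tzinfo=tz)  # fold=1
# The proposed ordering property for ambiguous times holds:
assert first.astimezone(timezone.utc) < second.astimezone(timezone.utc)
```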
[Paul Moore]
... I think the following statements are true. If they aren't, I'd appreciate clarification. I'm going to completely ignore leap seconds in the following - I hope that's OK, I don't understand leap seconds *at all* and I don't work in any application areas where they are relevant (to my knowledge) so I feel that for my situation, ignoring them (and being able to) is reasonable.
Guido will never allow any aspect of "leap seconds" into the core, although it's fine by him if someone wants to write their own tzinfo class to try to model them.
Note that I'm not talking about internal representations - this is purely about user-visible semantics.
1. "Naive" datetime arithmetic means treating a day as 24 hours, an hour as 60 minutes, etc. Basically base-24/60/60 arithmetic.
It also means that the tzinfo(s) member (if any) is(are) ignored. So not only leap seconds are ignored:
1. Possible DST transitions are ignored.
2. Possible changes to the base UTC offset are ignored.
3. Possible changes to the name of the time zone (even if "the rules" don't change) are ignored.
4. Everything else whatsoever that could be learned from the tzinfo member is ignored.
Note that in "aware" arithmetic, the current fromutc() implementation is only strong enough to account reliably for #1.
2. If you're only working in a single timezone that's defined as UTC or a fixed offset from UTC, naive arithmetic is basically all there is.
Yup!
3. Converting between (fixed offset) timezones is a separate issue from calculation - but it's nothing more than applying the relevant offsets.
Yup! Although that can't be exploited by Python: there's nothing in a tzinfo instance Python can query to discover the rules it implements.
4. Calculations involving 2 different timezones (fixed-offset ones as above) is like any other exercise involving values on different scales. Convert both values to a common scale (in this case, a common timezone) and do the calculation there. Simple enough.
Yup.
5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).
Yup.
Are we OK to this point? This much comprises what I would class as a "naive" (i.e. 99% of the population ;-)) understanding of datetimes.
The stdlib datetime module handles naive datetime values, and fixed-offset timezones, fine, as far as I can see.
It ignores the possibility called #3 above (that some bureaucrat changed the name of a fixed-offset time zone despite that the offset didn't change). Everyone ignores #4, and always will ;-)
(I'm not sure that the original implementation included fixed-offset tzinfo objects, but the 3.4 docs say they are there now, so that's fine).
The original implementation supplied no tzinfo objects, only an abstract tzinfo base class.
Looking at the complicated cases, the only ones I'm actually aware of in practice are the ones that switch to DST and back, so typically have two offsets that differ by an hour,
Some number of minutes, anyway (not all DST transitions move by whole hours).
switching between the two at some essentially arbitrary points. If there are other more complex forms of timezone, I'd like to never need to know about them, please ;-)
#2 above is common enough, although there's not a _lot_ of base-offset-changing going on in current times.
The timezones we're talking about here are things like "Europe/London", not "GMT" or "BST" (the latter two are fixed-offset).
There are two independent issues with complex timezones:
1. Converting to and from them. That's messy because the conversion to UTC needs more information than just the date & time (typically, for example, there is a day when 01:45:00 maps to 2 distinct UTC times). This is basically the "is_dst" bit that Tim discussed in an earlier post. The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used).
"Flatly impossible" is more on target than "hard". In the case of text encoding, it's often possible to guess correctly by statistical analysis of the bytes. 01:45:00 in isolation gives no clue at all about whether standard or daylight time was intended. A similar point applies to some ambiguous cases when the base ("standard") UTC offset changes.
Once we have the extra information, though, doing conversions is just a matter of applying a set of rules.
Yup, and it's easy.
2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations. So you have 2 mutually incompatible semantic options - 1 day after 4pm is 3pm the following day, or adding 1 day adds 25 hours - either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode.
Stick to naive time, and your head won't even hurt ;-) There is no "right" or "wrong" answer to this one: different apps can _need_ different behaviors for this. Python picked one to make dead easy ("naive"), and intended to make the other _possible_ via longer-winded (but conceptually straightforward) code.
It seems to me that the problem is that for this latter issue, it's the *timedelta* object that's not rich enough. You can't say "add 1 day, and by 1 day I mean keep the same time tomorrow" as opposed to "add 1 day, and by that I mean 24 hours"[1]. In some ways, it's actually no different from the issue of adding 1 month to a date (which is equally ill-defined, but people "know what they mean" to just as great an extent). Python bypasses the latter by not having a timedelta for "a month". C (and the time module) bypasses the former by limiting all time offsets to numbers of seconds - datetime gave us a richer timedelta object and hence has extra problems.
There's more to it than that. "Naive time" also wants, e.g., "01:45:00 tomorrow minus 01:45:00 today" to return 24 hours. Maybe the same thing in disguise, though.
I don't have any solutions to this final issue. But hopefully the above analysis (assuming it's accurate!) helps clarify what the actual debate is about, for those bystanders like me who are interested in following the discussion. With luck, maybe it also gives the experts an alternative perspective from which to think about the problem - who knows?
Paul
[1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones.
While perhaps that _could_ have been said at the start, it's a decade too late to say that now ;-)
Is that what Lennart has been trying to say in his posts?
Have to leave that to him to say. Various date-and-time implementations have all sorts of gimmicks. Possibilities raised in this thread so far kind of scratch the surface :-(
On 27 July 2015 at 22:10, Tim Peters <tim.peters@gmail.com> wrote:
1. Converting to and from them. That's messy because the conversion to UTC needs more information than just the date & time (typically, for example, there is a day when 01:45:00 maps to 2 distinct UTC times). This is basically the "is_dst" bit that Tim discussed in an earlier post. The semantic issue here is that users typically say "01:45" and it never occurs to them to even think about *which* 01:45 they mean. So recovering that extra information is hard (it's like dealing with byte streams where the user didn't provide details of the text encoding used).
"Flatly impossible" is more on target than "hard". In the case of text encoding, it's often possible to guess correctly by statistical analysis of the bytes. 01:45:00 in isolation gives no clue at all about whether standard or daylight time was intended. A similar point applies to some ambiguous cases when the base ("standard") UTC offset changes.
By "hard", what I meant was that you'd have to explain what you need to the user, and accept their answer, in the user interface to your application. Explaining why you need to know in a way that isn't totally confusing is what I classed as "hard". I wouldn't even consider trying to guess the user's intent. Although "if you don't say, I'll use naive datetimes" seems to me a plausible position to take if you want to allow the user not to care. Strange that this is how Python works... ;-) Paul
On Mon, Jul 27, 2015 at 2:10 PM, Tim Peters <tim.peters@gmail.com> wrote:
Guido will never allow any aspect of "leap seconds" into the core,
really? that is a shame (and odd) -- it's tricky, because we don't know what leap seconds will be needed in the future, but other than that, it's not really any different than leap years, and required for "proper" conversion to/from calendar description of time (at least for UTC, the GPS folks have their own ideas about all this). But leap seconds are the big red herring -- darn few people need them! It's pretty rare, indeed, to be expressing your time in Gregorian dates, and also care about accuracy down to the second over centuries....
2. If you're only working in a single timezone that's defined as UTC
or a fixed offset from UTC, naive arithmetic is basically all there is.
Yup!
and remarkably useful!
5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).
Yup.
which is a good reason to "store" your datetime in UTC, and do all the math there.
Are we OK to this point? This much comprises what I would class as a "naive" (i.e. 99% of the population ;-)) understanding of datetimes.
The stdlib datetime module handles naive datetime values, and fixed-offset timezones, fine, as far as I can see.
It ignores the possibility called #3 above (that some bureaucrat changed the name of a fixed-offset time zone despite that the offset didn't change).
Should the code ever care about a time zone's name? it seems that two tzinfo objects should only be considered the same if they are the same instance, period. so not sure what the real issue is with (3)
2. Arithmetic within a complex timezone. Theoretically, this is simple
enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations.
what reasonable expectation does this not match?
So you have 2 mutually incompatible semantic options - 1 day after 4pm is 3pm the following day, or adding 1 day adds 25 hours - either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode.
This is what I've been calling (is there a standard name for it?) a Calendar operation: "this time the next day" -- that could be 23, 24, or 25 hours, if you are bridging a DST transition -- but that kind of operation should not be in the stdlib -- unless, of course, an entire "work with calendar time" lib is added -- but that's a whole other story. for a datetime.timedelta -- a day is 24 hours is a day. period, always. So you can compute "one day from now", which is the same as "24 hours from now" or 1440 minutes from now, or ...., but you can't compute "this time tomorrow" -- not with a timedelta, anyway.
Python picked one to make dead easy
("naive"), and intended to make the other _possible_ via longer-winded (but conceptually straightforward) code.
exactly -- you can extract the year, month, day -- add one to the day, and then go back. But then you'll need to do all the "30 days hath September" stuff - and leap years, and ... which is why all that is another ball of wax. And by the way -- doesn't dateutil have all that?
datetime gave us
a richer timedelta object and hence has extra problems.
it's only a little richer, and doesn't really add problems, just doesn't solve some common problems...
There's more to it than that. "Naive time" also wants, e.g.,
"01:45:00 tomorrow minus 01:45:00 today" to return 24 hours. Maybe the same thing in disguise, though.
I think so -- but it's not a "problem" because the datetime module doesn't have any way to express "tomorrow" anyway.
[1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones.
they always give the same result -- even with complex time zones. I don't think timedelta ever needs to know about timezones at all. timedelta is actually really, really simple, all it needs to know is how to translate various units into its internal representation (days, seconds, microseconds ?)
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
[Paul Moore]
[Tim]
Guido will never allow any aspect of "leap seconds" into the core,
[Chris Barker <chris.barker@noaa.gov]
really? that is a shame (and odd) -- it's a trick, because we don't know what leap seconds will be needed in the future, but other than that, it's not really any different than leap years,
It couldn't be _more_ different in three key respects:
1) The leap year rules are algorithmic, while leap seconds are added at unpredictable times by decree. That means any attempt to support them across dozens of platforms also needs to find a way to distribute & synchronize leap-second info across all the platforms too, over & over & over again. Few want to volunteer for yet another never-ending new task.
2) Leap years visibly affect everyone on the planet who uses a calendar. Leap seconds don't. To the contrary, civil authorities do all they can to keep leap seconds "out of sight, out of mind".
3) If adding leap seconds had any visible semantic effect in Python, 99.99% of Python users would complain about being bothered with them. In contrast, removing leap-year support would cause 32.6% of Python users to complain - eventually ;-)
... But leap seconds are the big red herring -- darn few people need them!
Do you? If not, put your energy instead into something you _do_ need ;-) But if you do need them, write a patch and try to sneak it by Guido.
It's pretty rare, indeed, to be expressing your time in gregorian dates, and also care about accuracy down to the seconds over centuries....
Python's datetime supports microsecond precision. Mere seconds are for wimps ;-) ...
5. The problems all arise *only* with timezones whose UTC offset varies depending on the actual time (e.g., timezones that include the transition to DST and back).
which is a good reason to "store" your datetime in UTC, and do all the math there.
As mentioned earlier, uniform patterns in local time can be - typically because of DST transitions - "lumpy" in UTC. Like "every Friday at 3pm in time zone T" is trivial to deal with _in_ T using "naive" datetime arithmetic, but the corresponding datetimes in UTC are irregular (the number of hours between successive corresponding UTC times can slobber around between 167 and 169). There are no one-size-fits-all solutions here. ...
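[Editor's sketch of the easy half of this: the weekly pattern really is trivial in naive local wall time; only its image in UTC is lumpy. The helper name is hypothetical, not from any library.]

```python
from datetime import datetime, timedelta

def weekly_at(start, hour, weekday, count):
    """Yield `count` occurrences of `weekday` at `hour`:00 local wall time,
    using naive arithmetic -- always exactly 168 wall-clock hours apart."""
    dt = start.replace(hour=hour, minute=0, second=0, microsecond=0)
    while dt.weekday() != weekday:
        dt += timedelta(days=1)
    for _ in range(count):
        yield dt
        dt += timedelta(weeks=1)

fridays = list(weekly_at(datetime(2015, 7, 27), 15, 4, 3))  # 4 == Friday
assert fridays[1] - fridays[0] == timedelta(weeks=1)
```

Mapped to UTC across a DST transition, two consecutive occurrences would instead be 167 or 169 real hours apart.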
It ignores the possibility called #3 above (that some bureaucrat changed the name of a fixed-offset time zone despite that the offset didn't change).
Should the code ever care about a time zone's name? it seems that two tzinfo objects should only be considered the same if they ar the same instance. period. so not sure what the real issue is with (3)
_Users_ care. That's why the tzname() method exists. When they display it, it's at best confusing if the name they see doesn't match the name they expect. But I'm not sure _how_ the name can get out of synch. Upon reflection, the specific case I had in mind was actually caused by incorrect coding of a tzinfo subclass. Maybe that's the only way a wrong name can be returned (apart from incorrect names in the base zoneinfo data files).
2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations.
what reasonable expectation does this not match?
The "every Friday at 3pm in time zone T" example comes to mind: trying to do _that_ arithmetic in UTC is an irregular mess. More generally, as mentioned before, the current fromutc() implementation can't deal reliably with a time zone changing its standard UTC offset (if it isn't clear, fromutc() is used in astimezone(), which latter is used for time zone conversions). Paul, do you have something else in mind for this one?
... " this time the next day" -- that could be 23, 24, or 25 hours, if you are bridging a DST transition -- but that kind of operation should not be in the stdlib -- unless, of course, an entire "work with calendar time" lib is added -- but that's a whole other story.
for a datetime.timedelta -- a day is 24 hours is a day. period, always.
But in "naive time", the difference between 3pm one day and 3pm the next day is also always 24 hours, period, always.
So you can compute "one day from now", which is the same as "24 hours from now" or 1440 minutes from now, or ...., but you can't compute "this time tomorrow" -- not with a timedelta, anyway.
To the contrary, for a dozen years "+ timedelta(days=1)" HAS computed "this time tomorrow", and exactly the same as "+ timedelta(hours=24)". Those have never done anything other in Python.
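[Editor's note: this is quick to confirm in any Python 3 interpreter.]

```python
from datetime import datetime, timedelta

# timedelta(days=1) and timedelta(hours=24) are one and the same value...
assert timedelta(days=1) == timedelta(hours=24)

# ...so under naive arithmetic, "this time tomorrow" and
# "24 hours from now" necessarily coincide.
dt = datetime(2015, 7, 27, 15, 0)
assert dt + timedelta(days=1) == dt + timedelta(hours=24) == datetime(2015, 7, 28, 15, 0)
```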
Python picked one to make dead easy ("naive"), and intended to make the other _possible_ via longer-winded (but conceptually straightforward) code.
exactly -- you can extract the year, month, day -- add one to the day, and then go back. But then you'll need to do all the "30 days hath September" stuff - and leap years, and ...
It sounds like you believe Python made the choice it didn't make: It made the "same time tomorrow" choice. The conceptually straightforward way to implement the other choice is to convert to UTC, do arithmetic in UTC, then convert back to the original time zone. Or, better when it applies, stick to UTC all the time, only converting to some other time zone for display purposes (if needed at all).
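[Editor's sketch of the "other choice" described above; the helper names are hypothetical. With a fixed-offset zone the two styles coincide; they diverge only when the UTC offset changes between the two endpoints.]

```python
from datetime import datetime, timedelta, timezone

def add_wall(dt, delta):
    """Python's default: naive, same-wall-clock arithmetic."""
    return dt + delta

def add_elapsed(dt, delta):
    """Elapsed-time arithmetic: detour through UTC and convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)

tz = timezone(timedelta(hours=-6))  # fixed offset: both styles agree
dt = datetime(2015, 7, 27, 15, 0, tzinfo=tz)
assert add_wall(dt, timedelta(days=1)) == add_elapsed(dt, timedelta(days=1))
```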
datetime gave us a richer timedelta object and hence has extra problems.
it's only a little richer, and doesn't really add problems, just doesn't solve some common problems...
Indeed, all timedelta really does is represent an integer number of microseconds, with a bunch of ways in the constructor to spare the user from having to remember how to convert from other duration units. That's just, in effect, syntactic sugar. Under the covers it's just a big integer (spelled in a weird mixed-radix system).
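[Editor's illustration of the point:]

```python
from datetime import timedelta

td = timedelta(weeks=1, days=2, hours=3, minutes=4, seconds=5, microseconds=6)

# Every constructor unit is normalized to (days, seconds, microseconds):
assert (td.days, td.seconds, td.microseconds) == (9, 11045, 6)

# ...which is just one big integer count of microseconds in disguise:
assert td // timedelta(microseconds=1) == (9 * 86400 + 11045) * 10**6 + 6
```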
There's more to it than that. "Naive time" also wants, e.g., "01:45:00 tomorrow minus 01:45:00 today" to return 24 hours. Maybe the same thing in disguise, though.
I think so -- but it's not a "problem" because the datetime module doesn't have any way to express "tomorrow" anyway.
See above. Expressing "tomorrow" _is_ built in. Ditto "same time in N weeks", "one hour beyond the same time 113 days earlier", etc.
... even with complex time zones. I don't think timedelta ever needs to know about time zones at all. timedelta is actually really, really simple: all it needs to know is how to translate various units into its internal representation (days, seconds, microseconds).
timedelta indeed knows nothing about time zones. People who hate Python's "naive single-zone arithmetic" should realize that's entirely about how datetime's arithmetic operators are implemented, nothing about how timedelta is implemented.
On 28/07/2015 05:26, Tim Peters wrote:
Python's datetime supports microsecond precision. Mere seconds are for wimps ;-)
Microseconds are for wimps https://bugs.python.org/issue22117 :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
Responses to several partial messages follow. [Lennart Regebro]
Then we can't implement timezones in a reasonable way with the current API, but have to have something like pytz's normalize() function or similar.
I'm sorry I've wasted everyone's time with this PEP.
[ijs] I think that integrating pytz into the stdlib, which is what the PEP proposes, would be valuable even without changing datetime arithmetic. But I see ways to accomplish the latter without breaking backward compatibility. The dream ain't dead! See below. [Paul Moore]
2. Arithmetic within a complex timezone. Theoretically, this is simple enough (convert to UTC, do the calculation naively, and convert back). But in practice, that approach doesn't always match user expectations. So you have 2 mutually incompatible semantic options - 1 day after 4pm is 3pm the following day, or adding 1 day adds 25 hours - either is a viable choice, and either will confuse *some* set of users. This, I think, is the one where all the debate is occurring, and the one that makes my head explode.
It seems to me that the problem is that for this latter issue, it's the *timedelta* object that's not rich enough.
[ijs] Yes! This is the heart of the matter. We can solve *almost* all the problems by having multiple competing timedelta classes-- which we already have. Do you care about what will happen after a fixed amount of elapsed time? Use `numpy.timedelta64` or `pandas.Timedelta`. Want this time tomorrow, come hell, high water, or DST transition? Use `dateutil.relativedelta.relativedelta` or `mx.DateTime.RelativeDateTime`. As long as the timedelta objects we're using are rich enough, we can make `dt + delta` say what we mean. There's no reason we can't have both naive and aware arithmetic in the stdlib at once. All the stdlib needs is an added timedelta class that represents elapsed atomic clock time, and voila! The biggest problem that this can't solve is subtraction. Which timedelta type do you get by subtracting two datetimes? Sure, you can have a `datetime.elapsed_since(self, other: datetime, **kwargs) -> some_timedelta_type` that determines what you want from the kwargs, but `datetime.__sub__` doesn't have that luxury. I think the right answer is that subtraction should yield the elapsed atomic clock time, but that would be a backward-incompatible change so I don't put a high probability on it happening any time soon. See the last message (below) for more on this.
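The subtraction asymmetry described above can be observed today with `zoneinfo` (Python 3.9+, assumes system tzdata). The `elapsed_since` helper below is the hypothetical method from the text, sketched here as a free function:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; assumes tzdata is installed

def elapsed_since(dt, other):
    # Hypothetical helper from the text: absolute elapsed (atomic clock)
    # time between two aware datetimes, computed by converting through UTC.
    return dt.astimezone(timezone.utc) - other.astimezone(timezone.utc)

chi = ZoneInfo("America/Chicago")
a = datetime(2013, 11, 3, 0, 30, tzinfo=chi)  # before the fall-back
b = datetime(2013, 11, 3, 3, 30, tzinfo=chi)  # after the fall-back

# Same-zone subtraction is naive: 3 wall-clock hours apart...
assert b - a == timedelta(hours=3)

# ...but 4 hours actually elapsed, because the clock was set back an
# hour during that night.
assert elapsed_since(b, a) == timedelta(hours=4)
```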
You can't say "add 1 day, and by 1 day I mean keep the same time tomorrow" as opposed to "add 1 day, and by that I mean 24 hours"[1]. In some ways, it's actually no different from the issue of adding 1 month to a date (which is equally ill-defined, but people "know what they mean" to just as great an extent). Python bypasses the latter by not having a timedelta for "a month". C (and the time module) bypasses the former by limiting all time offsets to numbers of seconds - datetime gave us a richer timedelta object and hence has extra problems.
Because of the limits on the values of its members, `datetime.timedelta` is effectively just a counter of microseconds. It can't distinguish between 1 day, 24 hours, 1440 minutes or 86400 seconds. They're all normalized to the same value. So it's not actually richer; it only appears so.
I don't have any solutions to this final issue. But hopefully the above analysis (assuming it's accurate!) helps clarify what the actual debate is about, for those bystanders like me who are interested in following the discussion. With luck, maybe it also gives the experts an alternative perspective from which to think about the problem - who knows?
Paul
[1] Well, you can, actually - you say that a timedelta of "1 day" means "the same time tomorrow" and if you want 24 hours, you say "24 hours" not "1 day". So timedelta(days=1) != timedelta(hours=24) even though they give the same result for every case except arithmetic involving complex timezones. Is that what Lennart has been trying to say in his posts?
I thought for a long time that this would be sufficient, and I still think it's a good spelling that makes it clear what the user wants most of the time, but I have wanted things like "the first time the clock shows 1 hour later than it shows right now" enough times that I no longer think this is quite sufficient. (I *think* you can do that with `dt + dateutil.relativedelta.relativedelta(hour=dt.hour+1, minute=0, second=0, microsecond=0)`, but I'm not sure.) [Tim Peters]
Ah, but it already happens that way - because the builtin datetime arithmetic is "naive". The docs have always promised this:
""" datetime2 = datetime1 + timedelta (1) datetime2 = datetime1 - timedelta (2)
1) datetime2 is a duration of timedelta removed from datetime1, moving forward in time if timedelta.days > 0, or backward if timedelta.days < 0. The result has the same tzinfo attribute as the input datetime, and datetime2 - datetime1 == timedelta after. OverflowError is raised if datetime2.year would be smaller than MINYEAR or larger than MAXYEAR. Note that no time zone adjustments are done even if the input is an aware object.
2) Computes the datetime2 such that datetime2 + timedelta == datetime1. As for addition, the result has the same tzinfo attribute as the input datetime, and no time zone adjustments are done even if the input is aware. This isn’t quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation can overflow in cases where datetime1 - timedelta does not. """
[ijs] Once we add the is_dst bit, this becomes a problem. You can't have this and have equality be a congruence (i.e., dt1 == dt2 implies dt1+td == dt2+td) unless you're willing to have the is_dst bit always be significant to equality, even when a time isn't ambiguous. Practically, this means that equality stops being a congruence, but failing to obey that invariant causes a lot of trouble. I have been remiss in not pointing this out, but it's wrong to assume that scientists use exclusively UTC. I got dragged into this mess because I was writing a piece of software to analyze circadian patterns of physical activity in our research subjects, which meant that in several cases we had a continuous record of data that crossed a DST boundary and we needed absolute durations between different times while caring about the local times between which those durations arose. The program started in ruby using ActiveSupport/Time (Rails's time bits) and got ported into python because ruby didn't have good enough support for scientific applications. I was able to get the program working using pandas's Timestamp class, which I think is more or less what Lennart wants to implement (minus all the cruft where it tries to interoperate with both datetime.datetime and numpy.datetime64), and which AFAICT seems to be the de facto standard for people in the science and finance worlds who need to deal with local times, absolute durations and relative durations all at the same time. ijs
On 7/27/2015 3:09 AM, Tim Peters wrote:
[Paul Moore <p.f.moore@gmail.com>]
.... As an example, consider an alarm clock. I want it to go off at 7am each morning. I'd feel completely justified in writing tomorrows_alarm = todays_alarm + timedelta(days=1).
[Lennart Regebro <regebro@gmail.com>]
That's a calendar operation made with a timedelta.
It's an instance of single-timezone datetime arithmetic, of the datetime + timedelta form. Your examples have been of the same form. Note that after Paul's
tomorrows_alarm = todays_alarm + timedelta(days=1)
it's guaranteed that
assert tomorrows_alarm - todays_alarm == timedelta(days=1)
will succeed too.
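Both guarantees can be checked across a real transition (Europe/London, BST starting 2015-03-29; requires `zoneinfo`, Python 3.9+, with system tzdata):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+; assumes tzdata is installed

tz = ZoneInfo("Europe/London")
todays_alarm = datetime(2015, 3, 28, 7, 0, tzinfo=tz)  # day before BST starts
tomorrows_alarm = todays_alarm + timedelta(days=1)

# Naive single-zone arithmetic keeps the wall-clock time...
assert tomorrows_alarm.hour == 7
# ...and the round-trip invariant from the text holds.
assert tomorrows_alarm - todays_alarm == timedelta(days=1)
```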
The "days" attribute here is indeed confusing as it doesn't mean 1 day, it means 24 hours.
Which, in naive arithmetic, are exactly the same thing.
I think using the word 'naive' is both inaccurate and a mistake. The issue is civil or legal time versus STEM time, where the latter includes applications like baking cakes. It could also be called calendar time versus elapsed time. (Financial/legal arithmetic versus STEM arithmetic is a somewhat similar contrast.) The idea that an hour can be sliced out of a somewhat random March day and inserted into a somewhat random October day is rather sophisticated. It came from the minds of government bureaucrats. It might be smart, dumb, or just a cunning way for civil authorities to show who is in charge by making us all jump. But not 'naive'. 'Naive' means simple, primitive, or deficient in informed judgement. It is easy to take it as connoting 'wrong'. Tim, you have been arguing that civil/legal time arithmetic is not naive. Calling civil time naive undercuts this claim. -- Terry Jan Reedy
....
The "days" attribute here is indeed confusing as it doesn't mean 1 day, it means 24 hours.
Which, in naive arithmetic, are exactly the same thing.
[Terry Reedy]
I think using the word 'naive' is both inaccurate and a mistake. The issue is civil or legal time versus STEM time, where the latter includes applications like baking cakes.
Sorry, never heard of "STEM time" before - & a quick Google search didn't help.
It could also be called calendar time versus elapsed time. (Financial/legal arithmetic versus STEM arithmetic is a somewhat similar contrast.)
And I am, alas, equally unclear on what any of those others mean (exactly) to you.
The idea that an hour can be sliced out of a somewhat random March day and inserted into a somewhat random October day is rather sophisticated. It came from the minds of government bureaucrats. It might be smart, dumb, or just a cunning way for civil authorities to show who is in charge by making us all jump. But not 'naive'.
I agree. Python's "naive time" single-timezone arithmetic intentionally ignores all that: it ignores leap seconds, it ignores DST transition points, it ignores governments deciding to change the base UTC offset within a pre-existing time zone, ... It's time soooo naive that it thinks 24 hours is the same thing as a day ;-)
'Naive' means simple, primitive, or deficient in informed judgement. It is easy to take it as connoting 'wrong'.
While some people in this thread seem convinced Python's naive time _is_ "wrong", it's not because it's called "naive". In any case, Guido decided to call it "naive" over 13 years ago, starting here, and that terminology has been in use ever since: https://mail.python.org/pipermail/python-dev/2002-March/020648.html
Tim, you have been arguing that civil/legal time arithmetic is not naive.
Yes. But that's not "an argument", it's a plain fact that Python's "naive time" (note that "naive" here is technical term, used widely in the datetime docs) is not civil/legal time (assuming I understand what you mean by that phrase).
Calling civil time naive undercuts this claim.
I don't see that I ever said civil time is naive. Adding a day is _not_ always the same as adding 24 hours in (at least Lennart's beliefs about) civil time. They _are_ always the same in Python's ("naive") datetime arithmetic. And the latter is all I said in the quote at the top of this msg. What am I missing? It's always something ;-)
On 7/27/2015 3:14 PM, Tim Peters wrote:
[Terry Reedy]
I think using the word 'naive' is both inaccurate and a mistake. The issue is civil or legal time versus STEM time, where the latter includes applications like baking cakes.
Sorry, never heard of "STEM time" before - & a quick Google search didn't help.
Searching for 'STEM' to discover the meaning of the acronym displays, for me, as the second result: "STEM is an acronym referring to the academic disciplines of science, technology, engineering and mathematics." STEM time is the time used in science, technology, engineering and mathematics, with the added note indicating that I mean for technology and engineering to be taken broadly, to include all uses of actual (natural) elapsed time, as opposed to occasionally artificial government time.
The idea that an hour can be sliced out of a somewhat random March day and inserted into a somewhat random October day is rather sophisticated. It came from the minds of government bureaucrats. It might be smart, dumb, or just a cunning way for civil authorities to show who is in charge by making us all jump. But not 'naive'.
I agree. Python's "naive time" single-timezone arithmetic intentionally ignores all that: it ignores leap seconds, it ignores DST transition points, it ignores governments deciding to change the base UTC offset within a pre-existing time zone, ... It's time soooo naive that it thinks 24 hours is the same thing as a day ;-)
To me, having 1 day be 23 or 25 hours of elapsed time on the DST transition days, as in Paul's alarm example, hardly ignores the transition point. -- Terry Jan Reedy
[Terry Reedy <tjreedy@udel.edu>]
To me, having 1 day be 23 or 25 hours of elapsed time on the DST transition days, as in Paul's alarm example, hardly ignores the transition point.
It's 2:56PM. What time will it be 24 hours from now? If your answer is "not enough information to say, but it will be some minute between 1:56PM and 3:56PM inclusive", you want to call _that_ "naive"? I sure don't. You can only give such an answer if you're acutely aware of (for example) DST transitions. If you're truly naive, utterly unaware of the possibility of occasional time zone adjustments, then you give the obvious answer: 2:56PM. That's what Python's datetime arithmetic gives. That's naive in both technical and colloquial senses. You're only aware that "2:56PM tomorrow" may be anywhere between 23 and 25 hours away from "2:56PM today" because you're _not_ ignoring possible transitions. So, sure, I agree that your pointing it out "hardly ignores the transition point". But I wasn't talking about you ;-) I was talking about the arithmetic, which does thoroughly ignore it.
On Mon, Jul 27, 2015 at 11:55 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I think using the word 'naive' is both inaccurate and a mistake.
<snip>
'Naive' means simple, primitive, or deficient in informed judgement. It is easy to take it as connoting 'wrong'.
In this context "naive" means "having no knowledge of timezone". And it does make some sense as a word to use in that case. I don't like it much, but it's the term used in the datetime module docs, so there you go. And in fact, everything Tim said can also apply to UTC time. We've had a lot of discussion on the numpy list about the difference between UTC and "naive" times, but for practical purposes, they are exactly the same -- until you try to convert to a known time zone, anyway. But really, the points Tim was making are not about timezones at all -- but about two concepts. Time arithmetic with: 1) Time spans (timedeltas) -- this is an "amount" of time, and can be added, subtracted, etc. to a datetime. Such time spans have various appropriate units, like seconds, days, weeks -- but, as Tim pointed out, "years" is NOT an appropriate unit of timedeltas, and should not be allowed in any lib that uses them. 2) Calendar time arithmetic: this is things like "next year", "next week", "two years from now" -- these are quite tricky, and in some special cases have no obvious clear definition (leap years, etc...). Calendar manipulations like (2) should be kept completely separate from time span manipulation. Is anyone suggesting adding that to the standard lib? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
[Chris Barker]
... And in fact, everything Tim said can also apply to UTC time. We've had a lot of discussion on the numpy list about the difference between UTC and "naive" times, but for practical purposes, they are exactly the same -- until you try to convert to a known time zone, anyway.
Yes, "naive arithmetic" is "correct" (by everyone's definition) in any time zone that has a fixed-for-all-eternity offset from UTC. UTC is the simplest case of that (with offset 0). So for all practical purposes you can think of a naive datetime as being "the time" in any eternally-fixed-offset time zone you like - or as in no time zone at all (the time zone _concept_ isn't necessary to grasp naive time - it only takes effort to _forget_ it). But the paranoid should consider that nothing can stop governments from changing the definition of UTC (or any other time zone). They'll have to pry your naive datetimes out of your computer's cold, dead disk drives though ;-)
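A sketch of why fixed offsets are the easy case: in `datetime.timezone.utc` (offset 0 forever), the convert-to-UTC dance is a no-op, so naive and "aware" arithmetic cannot disagree:

```python
from datetime import datetime, timedelta, timezone

dt = datetime(2015, 7, 27, 14, 56, tzinfo=timezone.utc)

# Naive single-zone arithmetic.
naive = dt + timedelta(hours=24)

# "Aware" arithmetic: convert to UTC, add, convert back. Since the
# offset never changes, this gives the identical result.
aware = (dt.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(timezone.utc)
assert naive == aware
```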
... 2) Calendar time arithmetic: this is things like "next year", "next week", "two years from now" -- these are quite tricky, and in some special cases have no obvious clear definition (leap years, etc...).
Calendar manipulations like (2) should be kept completely separate from time span manipulation. Is anyone suggesting adding that to the standard lib?
It comes up, and would be useful to many. But it's the kind of thing waiting for an extension module to take the world by storm. If people think the bikeshedding in _this_ thread is excessive ... ;-)
[Tim]
The Python docs also are quite clear about that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back.
[Lennart]
We can't explicitly implement incorrect timezone aware arithmetic and then expect people to not use it.
Python didn't implement timezone-aware arithmetic at all within a single time zone. Read what I wrote just above. It implements naive arithmetic within a single time zone.
We can make the arithmetic correct,
The naive arithmetic within a timezone is already correct, by its own internal criteria. It's also useful (see the original discussions, or Paul Moore's recent brief account). That it's not the arithmetic you want doesn't make it "incorrect", it makes it different from what you want. That's fine - you're allowed to want anything ;-) But it's a dozen years too late to change that decision. Maybe for Python 4.
and we can raise an error when doing tz-aware arithmetic in a non-fixed timezone.
Sorry, I don't know what that means. Under any plausible interpretation, I don't see any need to raise an exception.
But having an implementation we know is incorrect
You really have to get over insisting it's incorrect. It's functioning exactly the way it was intended to function. It's _different_ from what you favor. Note that I'm not calling what you favor "incorrect". It's different. Both kinds of arithmetic are useful for different purposes, although I still agree with Guido's original belief that the current arithmetic is most useful most often for most programmers.
and telling people "don't do that" doesn't seem like a good solution here.
We don't tell people "don't do that". It's perfectly usable exactly as-is for many applications. Not all. For those applications needing the other kind of arithmetic, the convert-to/from-UTC dance was the intended solution.
Why do we even have timezone aware datetimes if we don't intend them for usage?
They are intended for usage. But a single way of using them is not suitable for all possible applications.
... Python's datetime never intended to support that directly.
I think it should.
Ya, I picked that up ;-) I don't, but it's too late to break backward compatibility regardless.
It's expected that it supports it,
By some people, yes. Not by all.
and there is no real reason not to support it.
Backward compatibility is a gigantic reason to continue with the status quo. See Paul Moore's post for a start on why naive arithmetic was picked to begin with.
The timezone handling becomes complicated if you base yourself on localtime, and simple if you base yourself on UTC.
That's an implementation detail unrelated (in principle) to how arithmetic works. Although as a practical matter it cuts both ways: naive local-time arithmetic is complicated if the internal time is stored in UTC, but simple if stored in local time.
As you agree, we recommend to people to use UTC at all times,
I recommend people don't use tzinfo at all if they can avoid it. Beyond that, there are many attractions to using UTC, and to explicitly use UTC. Not all applications need to care, though.
and only use timezones for input and output. Well, what I'm now proposing is to take that recommendation to heart, and change datetime's implementation so it does exactly that.
Suppose I'm correct in my belief that there's scant chance of getting approval for changing the default datetime arithmetic in Python 3 (or Python 2). Would you still be keen to replace the internals with UTC format? Note that there are many consequences to that implementation detail. For example, it was an explicit requirement of the datetime design that the month, day, hour, minute and second components be very cheap to extract. If you have to do conversion every time one is accessed, it's much slower; if you cache the "local time" components separately, the memory burden increases. Etc.
I saw the previous mention of "pure" vs "practical", and that is often a concern. Here it clearly is not. This is a choice between impure, complicated and impractical, and pure, simple and practical.
There is nothing in the datetime world simpler than naive arithmetic ;-) "Practical" is relevant to a specific application's specific needs, and neither kind of arithmetic is "practical" for all applications. Guido believed naive arithmetic is most practical overall. But even believing that too, datetime certainly "should be" beefed up to solve the _other_ problems: like resolving ambiguous times, and supporting the full range of zoneinfo possibilities.
Is it the case that pytz also "fails" in the cases your attempts "fail"?
No, that is not the case. And if you wonder why I just don't do it like pytz does it, it's because that leads to infinite recursion, much as discussions on this mailing list do. ;-) And this is because we need to normalize the datetime after arithmetic, but normalizing is arithmetic.
If I could talk you out of trying to "fix" the arithmetic, all those headaches would go away and you could make swift progress on what remains. But I can't, so I won't try again ;-)
Ah, but it already happens that way
No, in fact it does not.
Yes, it does. But rather than repeat it all again, go back to the original message. I quoted the relevant Python docs in full, and they're telling the truth. They're talking about what happens _in a single time zone_. You go on to mix time zones again, and that's _not_ what the docs are talking about. Obviously, two times "an hour apart" on a local clock may or may not be an hour apart in UTC, and vice versa. The real problem here, I suspect, is that I keep talking about naive arithmetic, and you keep reading it as "an utterly broken implementation of what I really want". But neither what Python does, _nor_ what you want, is "correct" or "incorrect". They're different. They're each consistent within their own view of the world.
... It's not a question of changing datetime arithmetic per se. The PEP does indeed mean it has to be changed, but only to support ambiguous and non-existent times.
The latter have nothing to do with arithmetic. For example, in US Eastern 2:37 AM doesn't exist on the local clock on the day DST begins, and 1:48 AM is ambiguous on the day DST ends. See? Not one word about arithmetic. Those are just plain facts about how US Eastern works, so datetime could address them even if no user-visible datetime arithmetic of any kind were supported. Naive arithmetic doesn't care about them, but they're still potential issues when converting to/from other time zones (and language-supplied conversions have nothing to do with user-visible arithmetic semantics either).
It's helpful to me to understand, which I hadn't done before, that this was never intended to work. That helps me argue for changing datetime's internal implementation, once I get time to do that. (I'm currently moving, renovating a new house, trying to fix up a garden that has been neglected for years, and, insanely, writing my own code editor, all at the same time, so it won't be anytime soon).
If I were you, I'd work on the code editor. That sounds like the most fun, and moving sounds like the least fun ;-)
The "changing arithmetic" discussion is a red herring.
I'm afraid it's crucial: the prospect of breaking programs (by changing arithmetic) that have worked for a dozen+ years is a major problem. If, e.g., some program "adds a week" to a datetime to schedule a new meeting "at the same time" next week, and that's suddenly off by an hour (one way or the other) because a DST transition just happened to occur across that week, people would rightfully scream bloody murder. Many cases like that "just work" _because_ single-zone arithmetic is naive. With "your kind" of arithmetic, conceptually trivial cases like that become a real pain in the ass to program correctly (not a coincidence: they're conceptually trivial because "adding a week" is, to virtually everyone's mind, an instance of naive datetime arithmetic - they're certainly not thinking anything akin to "what I really want is to convert to UTC, add 168 hours, and convert back again", and they don't want the _result_ of doing that either). So if you're determined to change arithmetic, it will have to be done in a backward-compatible way (and I briefly sketched such a way in an earlier message today).
Now my wife insists I help her pack, so this is the end of this discussion for me. If I continue, it will be only as part of discussing how we change how datetime works internally.
OK. But do try to get some sleep too! Sleep is good :-)
On Mon, Jul 27, 2015 at 01:04:03AM -0500, Tim Peters wrote:
[Tim]
The Python docs also are quite clear about that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back.
[Lennart]
We can't explicitly implement incorrect timezone aware arithmetic and then expect people to not use it.
Python didn't implement timezone-aware arithmetic at all within a single time zone. Read what I wrote just above. It implements naive arithmetic within a single time zone.
This usage of "time zone" is confusing. As far as I can tell, you really mean "UTC offset". A time zone would be something like "Europe/London", which has two different UTC offsets throughout the year (not to mention other historical weirdnesses), whereas arithmetic on a "timezone-aware" datetime is only going to work so long as you don't cross any of the boundaries where the UTC offset changes. I agree with you about pretty much everything else about datetime, just I find the terminology misleading. The only other thing I found really weird about datetime is how Python 2 had no implementation of a UTC tzinfo class, despite this being utterly trivial - but it's too late to do anything about that now, of course.
[Tim]
Python didn't implement timezone-aware arithmetic at all within a single time zone. Read what I wrote just above. It implements naive arithmetic within a single time zone.
[Jon Ribbens <jon+python-dev@unequivocal.co.uk>]
This usage of "time zone" is confusing.
Ha! _All_ usages of "time zone" are confusing ;-) This specific use pointed at something pretty trivial to state but technical: any instance of Python datetime arithmetic in which the datetime input(s) and, possibly also the output, share the same tzinfo member(s). Specifically: datetime + timedelta datetime - timedelta datetime - datetime Maybe another, but you get the idea. Those all do "naive datetime arithmetic", and in the last case the "same tzinfo member" part is crucial (if the input datetimes have different tzinfo members in subtraction, we're no longer "within a single time zone", and time zone adjustments _are_ made: it acts like both inputs are converted to UTC before subtracting; but not so if both inputs share their tzinfo members).
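The identity check on tzinfo members described above is observable with the stdlib's fixed-offset zones (the -5 offset here is just an arbitrary stand-in for some zone):

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
est = timezone(timedelta(hours=-5))  # arbitrary fixed-offset stand-in

a = datetime(2015, 7, 27, 12, 0, tzinfo=utc)
b = datetime(2015, 7, 27, 12, 0, tzinfo=est)  # same wall clock, other zone

# Different tzinfo members: subtraction adjusts, as if both operands
# were first converted to UTC.
assert b - a == timedelta(hours=5)

# Same tzinfo member: subtraction is the naive wall-clock difference.
assert b - datetime(2015, 7, 27, 7, 0, tzinfo=est) == timedelta(hours=5)
```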
As far as I can tell, you really mean "UTC offset". A time zone would be something like "Europe/London", which has two different UTC offsets throughout the year (not to mention other historical weirdnesses), whereas arithmetic on a "timezone-aware" datetime : is only going to work so long as you don't cross any of the boundaries where the UTC offset changes.
In this context I only had in mind tzinfo members. They may represent fixed-offset or multiple-offset "time zones" (or anything else a programmer dreams up), or may even be None. The datetime implementation has no idea what they represent: the implementation can only judge whether two given tzinfo objects are or aren't the same object. So "within a single time zone" here just means there's only one tzinfo object in play.
I agree with you about pretty much everything else about datetime, just I find the terminology misleading. The only other thing I found really weird about datetime is how Python 2 had no implementation of a UTC tzinfo class, despite this being utterly trivial - but it's too late to do anything about that now, of course.
At the time, Guido ran his time machine forward, and saw that Stuart Bishop would soon enough supply all the time zones known to mankind ;-)
The only other thing I found
really weird about datetime is how Python 2 had no implementation of a UTC tzinfo class, despite this being utterly trivial -
Huh? It is either so trivial that there is no point -- simply say that your datetimes are UTC, and you are done. Or it's not the least bit trivial -- the only difference between a UTC datetime and a "naive" datetime is that one can be converted to (or interact with) other time zones. Except that, as we know from this conversation, that is very, very non-trivial! (Also, technically, UTC would use leap seconds...) -Chris
On Mon, Jul 27, 2015 at 04:28:48PM -0700, Chris Barker wrote:
The only other thing I found really weird about datetime is how Python 2 had no implementation of a UTC tzinfo class, despite this being utterly trivial -
Huh? It is either so trivial that there is no point -- simply say that your datetimes are UTC, and you are done. Or it's not the least bit trivial -- the only difference between a UTC datetime and a "naive" datetime is that one can be converted to (or interact with) other time zones. Except that, as we know from this conversation, that is very, very non-trivial!
No, it has nothing to do with conversions. The difference between a naive timezone and a UTC one is that the UTC one explicitly specifies that it's UTC and not "local time" or some other assumed or unknown timezone. This can make a big difference when passing datetime objects to third-party libraries, such as database interfaces.
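For reference, the "utterly trivial" class under discussion looks something like the sketch below. Python 3 ships this as `datetime.timezone.utc`; this is the kind of thing Python 2 users had to write for themselves:

```python
from datetime import datetime, timedelta, tzinfo

class UTC(tzinfo):
    """A trivial concrete tzinfo for UTC: fixed zero offset, no DST."""
    def utcoffset(self, dt):
        return timedelta(0)
    def tzname(self, dt):
        return "UTC"
    def dst(self, dt):
        return timedelta(0)

# An explicitly-UTC datetime, distinguishable from a naive one when
# handed to third-party libraries such as database interfaces.
stamped = datetime(2015, 7, 27, 12, 0, tzinfo=UTC())
assert stamped.utcoffset() == timedelta(0)
assert stamped.tzname() == "UTC"
```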
On 07/27/2015 02:04 AM, Tim Peters wrote:
The naive arithmetic within a timezone is already correct, by its own internal criteria. It's also useful (see the original discussions, or Paul Moore's recent brief account).
"Naive" alarm clocks (those which don't know from timezones) break human expectations twice a year, because their users have to be awake to fix them (or make the clock itself out-of-whack with real civil time for the hours between fixing and the actual transition). For confirmation, ask your local priest / pastor about the twice-yearly mixups among congregants whose clocks don't self-adjust for the DST boundary: some show up late / early for church *every* time the local zone changes. I'd be in favor of dropping "days" from timedelta, myself, for just this reason: if "1 day" is the same for your use case as "24 hours", then just do the math yourself. Tres. -- Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com
[Tres Seaver <tseaver@palladion.com>]
"Naive" alarm clocks (those which don't know from timezones) break human expectations twice a year, because their users have to be awake to fix them (or make the clock itself out-of-whack with real civil time for the hours between fixing and the actual transition). For confirmation, ask your local priest / pastor about the twice-yearly mixups among congregants whose clocks don't self-adjust for the DST boundary: some show up late / early for church *every* time the local zone changes.
Sure. I don't see how this applies to Python's arithmetic, though. For a start, you're talking about alarm clocks ;-) Note that "naive" is a technical term in the datetime context, used all over the datetime docs. However, I'm using "naive arithmetic" as a shorthand for what would otherwise be a wall of text. That's my own usage; the docs only apply "naive" and "aware" to date, time and datetime objects (not to operations on such objects).
I'd be in favor of dropping "days" from timedelta, myself, for just this reason: if "1 day" is the same for your use case as "24 hours", then just do the math yourself.
timedelta objects only store days, seconds, and microseconds, which is advertised. It would be bizarre not to allow setting them directly. They're the only timedelta components for which instance attributes exist. In the timedelta constructor, it's the seconds, milliseconds, minutes, hours, and weeks arguments that exist solely for convenience. You could "do the math yourself" for all those too - but why would you want to make anyone do that for any of them? All the world's alarm clocks would remain just as broken regardless ;-) Even if days weren't a distinguished unit for timedelta, I'd still much rather write, e.g., timedelta(days=5, hours=3) than timedelta(hours=123) or timedelta(hours=5*24 + 3) etc. The intent of the first spelling is obvious at a glance.
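The normalization Tim describes can be seen directly: only days, seconds, and microseconds survive as attributes, and every other constructor argument is folded into them. A small sketch (the values are illustrative):

```python
from datetime import timedelta

d = timedelta(days=5, hours=3)

# Only three components are stored as instance attributes;
# hours=3 has been folded into seconds (3 * 3600 == 10800).
assert (d.days, d.seconds, d.microseconds) == (5, 10800, 0)

# The convenience spellings all denote the same duration.
assert d == timedelta(hours=123) == timedelta(hours=5 * 24 + 3)
```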
On 07/27/2015 06:03 PM, Tim Peters wrote:
Even if days weren't a distinguished unit for timedelta, I'd still much rather write, e.g.,
timedelta(days=5, hours=3)
than
timedelta(hours=123)
or
timedelta(hours=5*24 + 3)
etc. The intent of the first spelling is obvious at a glance.
From a human's perspective, "a day from now" is always potentially ambiguous, just like "a month from now" or "a year from now", whereas "24 hours from now" never is. In a given application, a user who doesn't care can always write a helper function to generate hours; in an application whose developer *does* care, the 'days' argument to timedelta in its current form does *not* help achieve her goal: it is an attractive nuisance she will have to learn to avoid. Tres.
On 28/07/2015 01:58, Tres Seaver wrote:
On 07/27/2015 06:03 PM, Tim Peters wrote:
Even if days weren't a distinguished unit for timedelta, I'd still much rather write, e.g.,
timedelta(days=5, hours=3)
than
timedelta(hours=123)
or
timedelta(hours=5*24 + 3)
etc. The intent of the first spelling is obvious at a glance.
From a human's perspective, "a day from now" is always potentially ambiguous, just like "a month from now" or "a year from now", whereas "24 hours from now" never is. In a given application, a user who doesn't care can always write a helper function to generate hours; in an application whose developer *does* care, the 'days' argument to timedelta in its current form does *not* help achieve her goal: it is an attractive nuisance she will have to learn to avoid.
To me a day is precisely 24 hours, no more, no less. I have no interest in messing about with daylight savings of 30 minutes, one hour, two hours or any other variant that I've not heard about. In my mission critical code, which I use to predict my cashflow, I use code such as:

timedelta(days=14)

Is somebody now going to tell me that this isn't actually two weeks? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence
[Mark Lawrence <breamoreboy@yahoo.co.uk>]
To me a day is precisely 24 hours, no more, no less. I have no interest in messing about with daylight savings of 30 minutes, one hour, two hours or any other variant that I've not heard about.
In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Precisely define what "two weeks" means, and then someone can answer. The timedelta in question represents precisely 14 24-hour days, and ignores the possibility that some day in there may suffer a leap second. If you add that timedelta to a datetime object, the result may not be exactly 14*24 hours in the future as measured by civil time (which includes things like DST transitions). The result will have the same local time on the same day of the week two weeks forward. For example, if you started with Monday the 6th at 3:45pm, the result will be Monday the 20th at 3:45pm. Period. The time zone (if any is attached) is wholly ignored throughout. If a DST transition occurs across that period, then it's impossible to say how far removed (as measured by, say, an independent stopwatch) Monday the 20th at 3:45pm is from Monday the 6th at 3:45pm without also knowing the month, the year, and the exact local time zone rules in effect across the period. It remains unclear to me which of those outcomes _you_ consider to be "actually 14 days". But my bet is that you like what Python already does here (because "tz-naive arithmetic" is exactly what _I_ want in all my financial code).
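Tim's "same local time, two weeks forward" behavior can be sketched in a few lines. The dates are illustrative (July 6, 2015 happens to be a Monday), and since the arithmetic ignores any attached tzinfo, a naive datetime shows it just as well:

```python
from datetime import datetime, timedelta

# Monday, July 6 at 3:45pm -- naive, no tzinfo attached.
start = datetime(2015, 7, 6, 15, 45)
later = start + timedelta(days=14)

# Same weekday, same wall-clock time, exactly 14 calendar days on.
assert later == datetime(2015, 7, 20, 15, 45)
assert later.weekday() == start.weekday() == 0   # both Mondays
```

How many elapsed stopwatch hours separate the two instants is exactly the question the arithmetic declines to answer.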
On 28/07/2015 03:15, Tim Peters wrote:
[Mark Lawrence <breamoreboy@yahoo.co.uk>]
To me a day is precisely 24 hours, no more, no less. I have no interest in messing about with daylight savings of 30 minutes, one hour, two hours or any other variant that I've not heard about.
In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Precisely define what "two weeks" means, and then someone can answer.
One week == 7 days == 7 * 24 hours
Two weeks == 2 * (one week)
The timedelta in question represents precisely 14 24-hour days, and ignores the possibility that some day in there may suffer a leap second.
As I've said elsewhere I've no interest in DST, at least right here, right now, let alone leap seconds. When I run my cashflow forecast the balance in my bank account one year from today isn't going to be influenced by UK clocks falling back to GMT at the end of October and on to BST at the end of next March.
It remains unclear to me which of those outcomes _you_ consider to be "actually 14 days". But my bet is that you like what Python already does here (because "tz-naive arithmetic" is exactly what _I_ want in all my financial code).
Correct. What I would like to know is how many people are in my position, how many people are in the situation of needing every possible combination of dates, times, daylight saving, local time zone rules and anything else you can think of under the sun, and how many are on the scale somewhere in between these two extremes. -- Mark Lawrence
On Tue, Jul 28, 2015 at 1:55 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
One week == 7 days == 7 * 24 hours
Two weeks == 2 * (one week)
Right, and that of course is not true in actual reality. I know you are not interested in DST, but with a timezone that has DST, twice a year the above statement is wrong.
As I've said elsewhere I've no interest in DST, at least right here, right now, let alone leap seconds. When I run my cashflow forecast the balance in my bank account one year from today isn't going to be influenced by UK clocks falling back to GMT at the end of October and on to BST at the end of next March.
And then you should not use timezone-aware datetimes, but use naive ones. If you don't care about the timezone, then don't use it. Problem solved. It should be noted here that Python is one of the few languages that actually lets you do that. It's not very common to support time zone naive datetimes.
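A sketch of that advice: when time zones are irrelevant to the application, naive datetimes give exactly the "a day is a day" arithmetic Mark wants (the dates below are illustrative):

```python
from datetime import datetime, timedelta

# No tzinfo attached: arithmetic is pure wall-clock/calendar math.
balance_date = datetime(2015, 10, 20, 9, 0)
assert balance_date.tzinfo is None

# 14 "days" later is exactly 14 calendar days later,
# whether or not the UK clocks fall back in between.
assert balance_date + timedelta(days=14) == datetime(2015, 11, 3, 9, 0)
```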
Correct. What I would like to know is how many people are in my position, how many people are in the situation of needing every possible combination of dates, times, daylight saving, local time zone rules and anything else you can think of under the sun, and how many are on the scale somewhere in between these two extremes.
There are a few positions.

1. Not caring. datetime supports that as of today. This is probably the most common case. That certainly is the case for me most of the time I need to do something with datetimes. It's usually measuring a few seconds of time or calculating dates.

2. Caring about time zones including DSTs. IMO, this is the most common case once you care about time zones. You have several time zones, and you want conversion between them to work, and if you say one hour, you mean one hour. Datetime as of today does not support this, and Tim has declared that it never will, at least not before Python 4 (which amounts to much the same thing).

3. The position of Tim and Guido, which is "I want my time zone aware datetimes to ignore the time zone, except when converting to other time zones". I have yet to see a use case for that, and hence I am still not convinced that this position is useful; I think it is only based on misunderstanding.

4. ? Are there more positions, something I have missed?

//Lennart
On 28/07/2015 13:35, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 1:55 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
One week == 7 days == 7 * 24 hours
Two weeks == 2 * (one week)
Right, and that of course is not true in actual reality. I know you are not interested in DST, but with a timezone that has DST, two times a year, the above statement is wrong.
Tim asked for my definition of two weeks so I've given it. With respect to that in reality this is true, for me, with my application, making my statement above correct. For my application we could go from GMT to BST and back on successive days throughout the year and it wouldn't make any difference. -- Mark Lawrence
On Tue, Jul 28, 2015 at 3:17 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
Tim asked for my definition of two weeks so I've given it. With respect to that in reality this is true, for me, with my application, making my statement above correct. For my application we could go from GMT to BST and back on successive days throughout the year and it wouldn't make any difference.
Right. You want a timezone naive datetime. Your usecase is covered, no problemo. //Lennart
On Tue, Jul 28, 2015 at 1:55 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
Correct. What I would like to know is how many people are in my position, how many people are in the situation of needing every possible combination of dates, times, daylight saving, local time zone rules and anything else you can think of under the sun, and how many are on the scale somewhere in between these two extremes.
I should also point out that this is not about supporting "everything under the sun" in any form at all. It's about whether the arithmetic on a *time zone aware* date time should use that time zone information, or if the time zone aware datetime should ignore the attached time zone. //Lennart
On Tue, Jul 28, 2015 at 3:22 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
To me a day is precisely 24 hours, no more, no less.
OK.
In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Yes, I'm telling you that, now. The two claims "One day is always precisely 24 hours" and "14 days is two weeks" are not both true. You have to choose one. //Lennart
On 28/07/2015 06:21, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 3:22 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
To me a day is precisely 24 hours, no more, no less.
OK.
In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Yes, I'm telling you that, now.
The two claims "One day is always precisely 24 hours" and "14 days is two weeks" are not both true. You have to choose one.
//Lennart
You can tell me, but as far as I'm concerned in my application both are true, so I don't have to choose one. -- Mark Lawrence
On Tue, Jul 28, 2015 at 10:06 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
On 28/07/2015 06:21, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 3:22 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
To me a day is precisely 24 hours, no more, no less. In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Yes, I'm telling you that, now.
The two claims "One day is always precisely 24 hours" and "14 days is two weeks" are not both true. You have to choose one.
You can tell me, but as far as I'm concerned in my application both are true, so I don't have to choose one. (and subsequently) Tim asked for my definition of two weeks so I've given it. With respect to that in reality this is true, for me, with my application, making my statement above correct. For my application we could go from GMT to BST and back on successive days throughout the year and it wouldn't make any difference.
When your clocks go from winter time to summer time, there are two possibilities:

1) Your application says "days=14" and actually gets 167 or 169 hours
2) Your application says "days=14" and ends up with the wall-clock time changing

(Or equivalently if you say "days=1" or "hours=24" or whatever.)

A naive declaration of "two weeks later" could conceivably mean either. When I schedule my weekly Dungeons & Dragons sessions, they are officially based on UTC [1], which means that one session starts 168 hours after the previous one. Currently, they happen when my local clock reads noon; in summer, my local clock will read 1PM. Was it still "a week later" when it was noon once and 1PM the next time?

Conversely, my (also weekly) Thinkful open sessions are scheduled every week at 8AM US Eastern time (America/New_York). For anyone on the Atlantic coast of the US, they will occur every Wednesday and the clock will read 08:00 every time. Sometimes, one will happen 167 hours after the previous one, or 169 hours afterwards. Is that "a week later"?

Your application has to make a choice between these two interpretations. This is a fundamental choice that MUST be made. Trying to pretend that your application doesn't care is like trying to say that Code Page 437 is good enough for all your work, and you can safely assume that one byte is one character is one byte.

ChrisA

[1] Leap seconds aren't significant, as people are often several minutes early or late, so UTC/UT1/GMT/TAI are all effectively equivalent.
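The two interpretations can be sketched with the stdlib alone. A real application would use a tz database (pytz then, zoneinfo now); here two fixed offsets stand in for GMT and BST around the UK's 2015 spring transition (March 29), which is an assumption of the sketch:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the UK's winter and summer time.
GMT = timezone(timedelta(hours=0), "GMT")
BST = timezone(timedelta(hours=1), "BST")

# A session at noon local time, before the clocks spring forward.
before = datetime(2015, 3, 25, 12, 0, tzinfo=GMT)

# Interpretation 1: exactly 168 elapsed hours later.
# After the transition the local clock reads 1PM, not noon.
elapsed = (before + timedelta(hours=168)).astimezone(BST)
assert elapsed.hour == 13

# Interpretation 2: noon on the same weekday the next week.
# Only 167 real hours have passed.
wall = datetime(2015, 4, 1, 12, 0, tzinfo=BST)
assert wall - before == timedelta(hours=167)
```

Neither answer is wrong; they answer different questions, which is exactly the choice ChrisA says an application must make.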
On 28/07/2015 16:47, Chris Angelico wrote:
On Tue, Jul 28, 2015 at 10:06 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
On 28/07/2015 06:21, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 3:22 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
To me a day is precisely 24 hours, no more, no less. In my mission critical code, which I use to predict my cashflow, I use code such as:
timedelta(days=14)
Is somebody now going to tell me that this isn't actually two weeks?
Yes, I'm telling you that, now.
The two claims "One day is always precisely 24 hours" and "14 days is two weeks" are not both true. You have to choose one.
You can tell me, but as far as I'm concerned in my application both are true, so I don't have to choose one. (and subsequently) Tim asked for my definition of two weeks so I've given it. With respect to that in reality this is true, for me, with my application, making my statement above correct. For my application we could go from GMT to BST and back on successive days throughout the year and it wouldn't make any difference.
When your clocks go from winter time to summer time, there are two possibilities:
1) Your application says "days=14" and actually gets 167 or 169 hours
2) Your application says "days=14" and ends up with the time changing
My cashflow forecast doesn't give two hoots how many hours there are in two weeks, which I've defined elsewhere. It doesn't care if the time changes. Neither does it care how many days there are in a month for that matter. It can even cater for plotting data with a tick on the 29th of each month when we have a leap year and February is included in the plot, thanks to dateutil's rrule.
(Or equivalently if you say "days=1" or "hours=24" or whatever.)
A naive declaration of "two weeks later" could conceivably mean either. When I schedule my weekly Dungeons & Dragons sessions, they are officially based on UTC [1], which means that one session starts 168 hours after the previous one. Currently, they happen when my local clock reads noon; in summer, my local clock will read 1PM. Was it still "a week later" when it was noon once and 1PM the next time?
Don't know and don't care, your application is not working in the same way that mine does.
Conversely, my (also weekly) Thinkful open sessions are scheduled every week at 8AM US Eastern time (America/New_York). For anyone on the Atlantic coast of the US, they will occur every Wednesday and the clock will read 08:00 every time. Sometimes, one will happen 167 hours after the previous one, or 169 hours afterwards. Is that "a week later"?
Ditto my above remark.
Your application has to make a choice between these two interpretations. This is a fundamental choice that MUST be made. Trying to pretend that your application doesn't care is like trying to say that Code Page 437 is good enough for all your work, and you can safely assume that one byte is one character is one byte.
No.
ChrisA
[1] Leap seconds aren't significant, as people are often several minutes early or late, so UTC/UT1/GMT/TAI are all effectively equivalent.
Precisely my point. For me hours are not significant, days are. -- Mark Lawrence
On Tue, Jul 28, 2015 at 3:22 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote: To me a day is precisely 24 hours, no more, no less.
Start with this line. Then proceed: On Wed, Jul 29, 2015 at 3:01 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
My cashflow forecast doesn't give two hoots how many hours there are in two weeks, which I've defined elsewhere. It doesn't care if the time changes. Neither does it care how many days there are in a month for that matter. It can even cater for plotting data with a tick on the 29th of each month when we have a leap year and February is included in the plot, thanks to dateutil's rrule.
Okay. So you do *not* care that a day be, or not be, 24 hours. Your code cares about days, and does not care if one of them happens to be 23 or 25 hours long. That's what's going on. To you, a day is *one day*, and has no correlation to 24 hours, 86400 seconds, 86,400,000 milliseconds, or the radiation period of caesium. That's a perfectly acceptable standpoint, but you MUST acknowledge that this is incompatible with the equally-acceptable standpoint that "1 day" == "24 hours". You cannot have both. ChrisA
Tres Seaver writes:
- From a human's perspective, "a day from now" is always potentially unambigous, just like "a month from now" or "a year from now", whereas "24 hours from now" is never so.
I gather you've never been a prof who told a student with aggravated "writer's block" she had 24 hours to produce a draft, and have her walk in 45 minutes early, apologizing profusely for being 15 minutes late! Humans always have a use case in mind. In *my* mind, I *meant* 24 hours or 86400 seconds (estimating process allocations of 2 hours inadvertent sleeping, 3 hours writing, and 19 hours fussing and/or panicking ;-), while the *student* *interpreted* it as "be here with a stack of paper at the same time tomorrow". You can say "that's just wordplay" (or more precisely, "that's a communication problem"). AFAICS, one way to view Tim's point (or Guido's point in the original decision) is that it's *always* a communication problem, and that Python should refuse to guess. Since communicating sufficiently accurate information about the mapping from any local time to time in any civil system is always difficult (and impossible in the case of civil times one legislative session or more in the future), Python chose naive time arithmetic and naive time classes to represent it (FVO "naive" equivalent to "what Tim said"). In other words, datetime and timedelta implement the only calculations it was feasible to "just get right" at the time (and I would say that because of the communication problem the alternative use case is *still* an application problem, not a library problem). Steve
I was going to jump in and explain the rationale for the original design and why we shouldn't change it, but I just realized that Tim Peters has been explaining this position already, and instead I am going to mute this thread. Please switch to python-ideas or to the new datetime-specific list (if it's ever created -- personally I think it's a waste) and change the subject if you need me to chime in. -- --Guido van Rossum (python.org/~guido)
On Tue, Jul 28, 2015 at 12:03 AM, Tim Peters <tim.peters@gmail.com> wrote:
timedelta objects only store days, seconds, and microseconds,
Except that they don't actually store days. They store 24-hour periods, which, because of timezones changing, is not the same thing. This is also clearly intended; for example, timedelta allows floats, and will convert the fraction into seconds. And as you have repeated many times now, the datetime module's arithmetic is "naive", i.e., it assumes that one day is always 24 hours. The problem with that assumption is that it isn't true.
[Tim]
timedelta objects only store days, seconds, and microseconds,
[Lennart Regebro <regebro@gmail.com>]
Except that they don't actually store days. They store 24 hour periods,
Not really. A timedelta is truly an integer number of microseconds, and that's all. The internal division into days, seconds and microseconds is a mixed-radix scheme designed to make extracting some common units of duration more efficient than by using division on a single long integer all the time. That's an implementation detail, although one that is exposed.
which, because of timezones changing, is not the same thing.
24 hours is 24 hours at any time in _any_ time zone, ignoring leap seconds. timedeltas are durations, not points in time. "time zones" make no sense applied to durations.
This is also clearly intended, for example timedelta allows floats, and will convert the fraction into seconds.
I suspect I'm missing your meaning here. Have a specific example to illustrate it? For example, are you talking about things like this?
>>> timedelta(weeks=0.5)
datetime.timedelta(3, 43200)
If so, what of it? A week is _defined_ to be 7 days in timedelta, where a day is in turn defined to be 24 hours, where ... until you finally get down to microseconds. None of this has anything to do with points in time or time zones. It's entirely about duration. In the example, a week turns out to be 604800000000 microseconds. Half of that is 302400000000 microseconds. Convert that into mixed-radix days-seconds-microseconds representation, and you're left with 3 days and 43200 seconds (with 0 microseconds left over). I don't see anything remarkable about any of that - perhaps you just object to the use of the _word_ "day" in this context? It's just a word, and there's nothing remarkable either about viewing a duration of "a day" as being a duration of "24 hours". It's a timedelta - a duration. UTC offsets of any kind have nothing to do with pure durations, they only have to do with points in time. Calling "a day" 24 hours _in the context_ of a timedelta is not only unobjectionable, calling a day anything else in this entirely zone-free context would be insane ;-)
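The mixed-radix conversion Tim walks through can be checked directly:

```python
from datetime import timedelta

half_week = timedelta(weeks=0.5)

# A week is defined as 604,800,000,000 microseconds; half of that is
# 302,400,000,000 us, rendered as 3 days + 43200 seconds.
assert half_week == timedelta(days=3, seconds=43200)
assert half_week.total_seconds() == 302400.0

# Underneath, it's still just an integer count of microseconds.
assert half_week // timedelta(microseconds=1) == 302400000000
```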
And as you have repeated many times now, the datetime module's arithmetic is "naive"
But only when staying within a single time zone. For example, if dt1 and dt2 have different tzinfo members, dt1 - dt2 acts as if both were converted to UTC first before doing subtraction. "Naive time" doesn't come into play _across_ time zones, only _within_ a time zone. When mixing time zones, there's no plausible justification for ignoring either of 'em. So they're not ignored then.
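That cross-zone behavior is easy to see with two fixed-offset zones (using datetime.timezone, added in Python 3.2; the offsets are illustrative):

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), "EST")
cet = timezone(timedelta(hours=+1), "CET")

# The same wall-clock reading in two different zones.
a = datetime(2015, 7, 28, 12, 0, tzinfo=est)  # 17:00 UTC
b = datetime(2015, 7, 28, 12, 0, tzinfo=cet)  # 11:00 UTC

# Mixed-zone subtraction acts as if both were converted to UTC first.
assert a - b == timedelta(hours=6)
```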
ie, it assumes that one day is always 24 hours.
That's really not what it's doing, although the analogy is sometimes used in explanations. What somedatetime+timedelta really does is simpler than that: it adds the number of microseconds represented by the timedelta to somedatetime, being careful with carries across the assorted time and date components. That's all. There are no assumptions about what any of it "means". What it _doesn't_ do is consult the tzinfo member about anything, and _that's_ the true source of your complaints. It so happens that, yes, naive datetime arithmetic does always treat "a day" as 24 hours (and "a week" as 7 days, and "a minute" as 60 seconds, and so on), but not really because it's assuming anything about what days, weeks, etc "mean". It's working with microseconds, and it's giving the result you'd get from working on somedatetime.replace(tzinfo=None) instead, except it doesn't actually remove the tzinfo member.
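A sketch of that equivalence: adding a timedelta to an aware datetime gives the same answer as doing the arithmetic on its naive twin and re-attaching the tzinfo afterward (the fixed-offset zone and dates here are stand-ins):

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=-5), "EST")   # any tzinfo will do
dt = datetime(2015, 7, 6, 15, 45, tzinfo=tz)
step = timedelta(days=14)

# The tzinfo member is carried along but never consulted.
direct = dt + step
via_naive = (dt.replace(tzinfo=None) + step).replace(tzinfo=tz)

assert direct == via_naive
assert direct.tzinfo is tz
```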
The problem with that assumption is that it isn't true.
There isn't really an assumption here. "Naive time" has no concept of "time zone", which isn't "an assumption" so much as a _requirement_ of the design. You can legitimately object that this requirement is at odds with reality in some cases. And that's true: it is. But that's also been known since the start. It's _intentionally_ at odds with reality in some cases, because it was believed that a simpler approximation to reality would be most useful most often to most applications and programmers. And you've heard from some of them here. Note that the same principle is at work in other aspects of datetime's design. For example, the proleptic Gregorian calendar is itself a simplified approximation to reality. In historical terms it's a relatively recent invention, and even now it's not used in much of the world. So what? It does little harm to most applications to pretend that, e.g., 3 March 1012 is a valid Gregorian date, but simplifies their lives, although some obsessed calendar wonk may be outraged by such a bold fiction ;-) It's all a "practicality beats purity" thing, but weaker than many such things, because in this case _sometimes_ naive arithmetic is _not_ the most practical thing. It has been in every dateime application I ever wrote, but I recognize that's not the universal experience.
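The calendar simplification can be poked at directly: the module happily accepts proleptic Gregorian dates from centuries before the 1582 reform:

```python
from datetime import date

# 3 March 1012 predates the Gregorian calendar by over five centuries,
# but the proleptic Gregorian calendar extends backwards regardless.
d = date(1012, 3, 3)
assert d.isoformat() == "1012-03-03"

# The supported calendar runs from year 1 onward.
assert date.min == date(1, 1, 1)
```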
On Tue, Jul 28, 2015 at 8:11 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
timedelta objects only store days, seconds, and microseconds,
[Lennart Regebro <regebro@gmail.com>]
Except that they don't actually store days. They store 24 hour periods,
Not really. A timedelta is truly an integer number of microseconds, and that's all.
That's what I said. Timedeltas internally assume that 1 day is 24 hours, or 86,400,000,000 microseconds. That's the assumption internally in the timedelta object. The problem with that being that in the real world that's not true.
24 hours is 24 hours at any time in _any_ time zone, ignoring leap seconds. timedeltas are durations, not points in time. "time zones" make no sense applied to durations.
My point exactly. And should not then adding 86,400,000,000 microseconds to a datetime actually result in a datetime that happens 86,400,000,000 microseconds later?
ie, it assumes that one day is always 24 hours.
That's really not what it's doing
That is really exactly what the timedelta is doing, as you yourself say just a few lines above.
used in explanations. What somedatetime+timedelta really does is simpler than that: it adds the number of microseconds represented by the timedelta to somedatetime,
No it doesn't.
[delightful new insight elided, all summarized by what remains ;-) ] [Tim]
What somedatetime+timedelta really does is simpler than that: it adds the number of microseconds represented by the timedelta to somedatetime,
[Lennart]
No it doesn't.
Lennart, I wrote the code. Both the Python and C datetime implementations. I know exactly what it does, and these repetitive denials can't change that. Well, maybe they can. I really am just assuming they can't ;-) Here's, e.g., the Python code for datetime.__add__:

    def __add__(self, other):
        "Add a datetime and a timedelta."
        if not isinstance(other, timedelta):
            return NotImplemented
        delta = timedelta(self.toordinal(),
                          hours=self._hour,
                          minutes=self._minute,
                          seconds=self._second,
                          microseconds=self._microsecond)
        delta += other
        hour, rem = divmod(delta.seconds, 3600)
        minute, second = divmod(rem, 60)
        if 0 < delta.days <= _MAXORDINAL:
            return datetime.combine(date.fromordinal(delta.days),
                                    time(hour, minute, second,
                                         delta.microseconds,
                                         tzinfo=self._tzinfo))
        raise OverflowError("result out of range")

There's not much to it. Apart from the year, month and date parts of `self` (which are converted to an integer number of days), all the rest is just adding integer microseconds expressed in two distinct mixed-radix systems. What part of that inspires your "No it doesn't"? It quite obviously does, if you understand the code. Your real objection (whether you realize it or not) is that it's not converting to UTC at the start and back to self._tzinfo after. But what does it matter? I'm done with this line. You can get what you want, but it has to (according to me) be done in a backward-compatible way (and see other msgs for ideas in that direction).
On 28/07/2015 07:54, Lennart Regebro wrote:
On Tue, Jul 28, 2015 at 8:11 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
timedelta objects only store days, seconds, and microseconds,
[Lennart Regebro <regebro@gmail.com>]
Except that they don't actually store days. They store 24 hour periods,
Not really. A timedelta is truly an integer number of microseconds, and that's all.
That's what I said. Timedeltas internally assume that 1 day is 24 hours. Or 86,400,000,000 microseconds. That's the assumption internally in the timedelta object.
The problem with that being that in the real world that's not true.
In my real world it is. We clearly have parallel worlds.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
[Tim]
Sure. But, honestly, who cares? Riyadh Solar Time was so off-the-wall that even the Saudis gave up on it 25 years ago (after a miserable 3-year experiment with it). "Practicality beats purity".
Heh. It's even sillier than that - the Saudis never used "Riyadh Solar Time", and it's been removed from release 2015e of the tz database: https://www.ietf.org/timezones/data/NEWS Release 2015e - 2015-06-13 10:56:02 -0700 ... The files solar87, solar88, and solar89 are no longer distributed. They were a negative experiment - that is, a demonstration that tz data can represent solar time only with some difficulty and error. Their presence in the distribution caused confusion, as Riyadh civil time was generally not solar time in those years. Looking back, Paul Eggert explained more in 2013, but it took this long for the patch to land: http://comments.gmane.org/gmane.comp.time.tz/7717 > did Saudi Arabia really use this as clock time? Not as far as I know, for civil time. There was some use for religious purposes but it didn't use the approximation in those files. These files probably cause more confusion than they're worth, so I'll propose a couple of patches to remove them, in two followup emails. I haven't pushed these patches to the experimental github version. The position of the sun is vital to establishing prayer times in Islam, but that's got little to do with civil time in Islamic countries. And Olson didn't take his "Riyadh Solar Time" rules from the Saudis, he made up the times himself: "Times were computed using formulas in the U.S. Naval Observatory's Almanac for Computers 1987[89]". The formulas only produced approximations, and then rounded to 5-second boundaries because the tz data format didn't have enough bits. So, as a motivating example, it's hard to get less compelling: Riyadh Solar is a wholly artificial "time zone" made up by a time zone wonk to demonstrate some limitations of the tz database he maintained. Although I expect he could have done so just as effectively by writing a brief note about it ;-)
The formulas only produced approximations, and then rounded to 5-second boundaries because the tz data format didn't have enough bits.
Little known fact: if you have a sub-minute-resolution UTC offset when a leap second hits, it rips open a hole in the space-time continuum and you find yourself in New Netherlands.

ijs

________________________________________
From: Tim Peters <tim.peters@gmail.com>
Sent: Saturday, July 25, 2015 00:07
To: ISAAC J SCHWABACHER
Cc: Alexander Belopolsky; Lennart Regebro; Python-Dev
Subject: Re: [Python-Dev] Status on PEP-431 Timezones
[Tim]
The formulas only produced approximations, and then rounded to 5-second boundaries because the tz data format didn't have enough bits.
[ISAAC J SCHWABACHER <ischwabacher@wisc.edu>]
Little known fact: if you have a sub-minute-resolution UTC offset when a leap second hits, it rips open a hole in the space-time continuum and you find yourself in New Netherlands.
Tell me about it! Last time that happened I had to grow stinking tulips for 3 years to get enough money to sail back home. I'll never use a sub-minute-resolution UTC offset again ;-)
From: Tim Peters <tim.peters@gmail.com>
Sent: Saturday, July 25, 2015 00:14
To: ISAAC J SCHWABACHER
Cc: Alexander Belopolsky; Lennart Regebro; Python-Dev
Subject: Re: [Python-Dev] Status on PEP-431 Timezones

[Tim]
Tell me about it! Last time that happened I had to grow stinking tulips for 3 years to get enough money to sail back home. I'll never use a sub-minute-resolution UTC offset again ;-)
I meant this one: https://what-if.xkcd.com/54/ :) ijs
From: Tim Peters <tim.peters@gmail.com> Sent: Friday, July 24, 2015 20:39 To: ISAAC J SCHWABACHER Cc: Alexander Belopolsky; Lennart Regebro; Python-Dev Subject: Re: [Python-Dev] Status on PEP-431 Timezones
[ISAAC J SCHWABACHER <ischwabacher@wisc.edu>]
... I disagree with the view Tim had of time zones when he wrote that comment (and that code). It sounds like he views US/Eastern and US/Central as time zones (which they are), but thinks of the various America/Indiana zones as switching back and forth between them, rather than being time zones in their own right
You can think of them any way you like. The point of the code was to provide a simple & efficient way to convert from UTC to local time in all "time zones" in known actual use at the time; the point of the comment was to explain the limitations of the code. Although, as Alexander noted, the stated assumptions are stronger than needed.
I think the right perspective is that a time zone *is* the function that its `fromutc()` method implements,
Fine by me ;-)
My issue is that you're computing `fromutc()`, which is a function, in terms of `dst()` and `utcoffset()`, which aren't. I think this is backwards; `dst()` and `utcoffset()` should be computed from `fromutc()` plus some additional information that has to be present anyway in order to implement `fromutc()`. With the extra bit, `dst()` and `utcoffset()` become partial functions, which makes it *possible* to get the right answer in all cases, but it's still simpler to start with the total function and work from there.
although of course we need additional information in order to actually compute (rather than merely mathematically define) its inverse. Daylight Saving Time is a red herring,
Overstated. DST is in fact the _only_ real complication in 99.99% of time zones (perhaps even 99.9913% ;-) ). As the docs say, if you have some crazy-ass time zone in mind, fine, that's why fromutc() was exposed (so your crazy-ass tzinfo class can override it).
I stand by what I meant by this, even if I did a bad job of expressing the point. Assuming that all time zone discontinuities are due to DST changes breaks many time zones (really almost all of the Olson time zones, though only for a vanishingly small fraction of datetimes), but that's not the point I was making. The point is that it doesn't buy us anything. Though this is probably obscured by all the markup, the more general algorithm I gave is also simpler than the one in the comment in datetime.py, and the reason for that is that it solves an easier problem, but one that serves our practical purposes just as well.
and assumptions 2 and 4
Nitpick: 4 is a consequence of 2, not an independent assumption.
in that exposition are just wrong from this point of view.
As above, there is no particular POV in this code: just a specific fromutc() implementation, comments that explain its limitations, and an invitation in the docs to override it if it's not enough for your case.
I went too far in inferring your viewpoint from your code. I don't find fault with the explanation on its own terms. But adding zoneinfo to the stdlib, as PEP 431 proposes to do, requires making weaker assumptions and asking a different question than the one answered in the comment.
In the worst case, Asia/Riyadh's two years of solar time completely shatter these assumptions.
Sure. But, honestly, who cares? Riyadh Solar Time was so off-the-wall that even the Saudis gave up on it 25 years ago (after a miserable 3-year experiment with it). "Practicality beats purity".
As a mathematician at heart, I have a deep and abiding conviction, which I expect nobody else to share, that purity begets practicality in the long run. At least if you've found the right abstraction.
[eliding a more-general view of what time zones "really" are]
[note for people just joining this conversation: I think the information in the elision is critical to understanding what I'm talking about]
I'm not eliding it because I disagree with it, but because time zones are political constructions. "The math" we make up may or may not be good enough to deal with all future political abominations; for example:
... This assumption would be violated if, for example, some jurisdiction decided to fall back two hours by falling back one hour and then immediately falling back a second hour. I recommend the overthrow of any such jurisdiction and its (annexation by the Netherlands)[3].
That's not objectively any more bizarre than Riyadh Solar Time. Although, if I've lived longer than you, I may be more wary about the creative stupidity of political schemes ;-)
It's not, you have, and you probably are. :) But these assumptions didn't come out of nowhere. They're the assumptions behind zoneinfo, weakened as much as possible without making the problem any harder. It's hard to weaken them further and still have anything to work with. (See? I *do* still have a sense of practicality!) Nobody wants to read me discussing this at great length, but I'll say that I don't expect any legislative body to have the collective mathematical sophistication necessary to violate piecewise continuity or computability. If you really want to troll me, I invite you to take over a government and institute a time zone based on the (Weierstrass function)[1].
... (Lennart, I think this third assumption is the important part of your "no changes within 48 hours of each other" assumption,
The "48 hours" bit came from Alexander. I'm personally unclear on what Lennart's problems are.
Whoops!
... All of these computations can be accomplished by searches of ordered lists and applications of $fromlocal_i$.
Do you have real-world use cases in mind beyond supporting long-abandoned Riyadh Solar time?
Parts of five US states (Alaska, North Dakota, Indiana, Kentucky and Michigan) have changed their standard time since 1970. But I'll admit that mentioning Riyadh was a low blow.
... With this perspective, arithmetic becomes "translate to UTC, operate, translate back", which is as it should be.
There _was_ a POV in the datetime design about that: no, that's not how it should be. Blame Guido ;-) If I add, say, 24 hours to noon today, I want to get noon tomorrow, and couldn't care less whether DST started or stopped (or any other political adjustment was made) in between. For that reason, it was wholly intentional that datetime + timedelta treats datetime as "naive". If that's not what someone wants, fine, but then they don't want Python's datetime arithmetic. BTW, there's no implication that they're "wrong" for wanting something different; what would be wrong is insisting that datetime's POV is "wrong". Both views are valid and useful, depending on the needs of the application. One had to be picked as the built-in behavior, and "naive" won.
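The two arithmetics can be put side by side. The sketch below (an editorial addition) uses a toy tzinfo with hard-coded, illustrative 2015 US Central transitions, not a production zone implementation. Note that subtracting two datetimes that share the same tzinfo is also naive, so the elapsed-time checks go through UTC:

```python
from datetime import datetime, timedelta, timezone, tzinfo

class Central2015(tzinfo):
    """Toy US Central zone, valid only for 2015 (illustrative data)."""
    DST_START = datetime(2015, 3, 8, 2, 0)  # wall clock, spring forward
    DST_END = datetime(2015, 11, 1, 1, 0)   # wall clock, standard time
    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.DST_START <= naive < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)
    def utcoffset(self, dt):
        return timedelta(hours=-6) + self.dst(dt)
    def tzname(self, dt):
        return "CDT" if self.dst(dt) else "CST"

tz = Central2015()
dt = datetime(2015, 10, 31, 12, 0, tzinfo=tz)  # noon CDT, day before fall-back

# Python's built-in arithmetic: "naive" -- same wall clock next day,
# even though 25 real hours elapsed across the transition.
naive = dt + timedelta(days=1)
assert naive.replace(tzinfo=None) == datetime(2015, 11, 1, 12, 0)
assert (naive.astimezone(timezone.utc)
        - dt.astimezone(timezone.utc)) == timedelta(hours=25)

# "Translate to UTC, operate, translate back": exactly 24 elapsed hours,
# landing at 11:00 local because the offset changed underneath.
utc_based = (dt.astimezone(timezone.utc) + timedelta(days=1)).astimezone(tz)
assert utc_based.replace(tzinfo=None) == datetime(2015, 11, 1, 11, 0)
assert (utc_based.astimezone(timezone.utc)
        - dt.astimezone(timezone.utc)) == timedelta(hours=24)
```

Neither result is "wrong"; they answer different questions ("same wall clock tomorrow" versus "24 elapsed hours later"), which is the crux of the disagreement above.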
Sigh. This offends my sensibilities so much, but I've said my bit on this elsewhere on this list, and I don't think I have the right abstraction to cut this Gordian knot. Point conceded.
... But IIUC what Lennart is complaining about
I don't, and I wish he would be more explicit about what "the problem(s)" is(are).
is the fact that the DST flag isn't part of and can't be embedded into a local time, so it's impossible to fold the second parameter to $fromlocal$ into $t$. Without that, a local time isn't rich enough to designate a single point in time and the whole edifice breaks.
You can blame Guido for that too ;-) , but in this case I disagree(d) with him: Guido was overly (IMO) annoyed that the only apparent purpose for a struct tm's tm_isdst flag was to disambiguate local times in a relative handful of cases. His thought: an entire bit just for that?! My thought: get over it, it's one measly bit.
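As a historical footnote, PEP 495 (Python 3.6) eventually added exactly this bit as the `fold` attribute. A sketch with a toy fold-aware tzinfo (illustrative 2015-only US Central data, not a real zone) shows the one bit resolving the repeated hour:

```python
from datetime import datetime, timedelta, timezone, tzinfo

class Central2015(tzinfo):
    """Toy fold-aware US Central zone, valid only for 2015 (illustrative)."""
    DST_START = datetime(2015, 3, 8, 2, 0)    # wall clock
    FOLD_START = datetime(2015, 11, 1, 1, 0)  # 1:00-2:00 occurs twice
    FOLD_END = datetime(2015, 11, 1, 2, 0)
    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.FOLD_START <= naive < self.FOLD_END:
            # Ambiguous hour: the "one measly bit" picks the answer.
            return timedelta(hours=-5) if dt.fold == 0 else timedelta(hours=-6)
        if self.DST_START <= naive < self.FOLD_START:
            return timedelta(hours=-5)  # daylight time
        return timedelta(hours=-6)      # standard time
    def dst(self, dt):
        return self.utcoffset(dt) - timedelta(hours=-6)
    def tzname(self, dt):
        return "CDT" if self.dst(dt) else "CST"

tz = Central2015()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)           # fold=0: still CDT
second = datetime(2015, 11, 1, 1, 30, tzinfo=tz, fold=1)  # fold=1: now CST
assert first.utcoffset() == timedelta(hours=-5)
assert second.utcoffset() == timedelta(hours=-6)
# Same wall clock, one real hour apart:
assert (second.astimezone(timezone.utc)
        - first.astimezone(timezone.utc)) == timedelta(hours=1)
```

One bit, consumed only during the fold, is enough to make every local time designate a single instant.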
I could have sworn that the last thing I saw Guido post about the previous point was something along the lines of "oops", but I bet it was really about this. ijs [1]: https://en.wikipedia.org/wiki/Weierstrass_function
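Isaac's partial-function framing above can be made concrete. The transition table below is made up (instants loosely modeled on a US zone in 2015, not real tz data): built from it, `fromutc` is total, while the local-to-UTC map is partial and sometimes multivalued:

```python
from datetime import datetime, timedelta

# Toy transition table: (UTC instant, UTC offset in effect from then on).
TRANSITIONS = [
    (datetime.min, timedelta(hours=-6)),
    (datetime(2015, 3, 8, 8, 0), timedelta(hours=-5)),   # spring forward
    (datetime(2015, 11, 1, 7, 0), timedelta(hours=-6)),  # fall back
]

def offset_at(utc):
    """Offset in effect at a given (naive) UTC instant."""
    return [off for t, off in TRANSITIONS if t <= utc][-1]

def fromutc(utc):
    """Total function: every UTC instant has exactly one local time."""
    return utc + offset_at(utc)

def fromlocal(local):
    """Partial and multivalued: all UTC instants whose local time is `local`.

    Two candidates in a fold, none in a gap, one otherwise -- which is why
    an extra bit (is_dst / fold) is needed to make this a function.
    """
    candidates = []
    for off in {o for _, o in TRANSITIONS}:
        utc = local - off
        if offset_at(utc) == off:  # consistent: this offset really applies
            candidates.append(utc)
    return sorted(candidates)

assert fromutc(datetime(2015, 6, 1, 17, 0)) == datetime(2015, 6, 1, 12, 0)
assert len(fromlocal(datetime(2015, 6, 1, 12, 0))) == 1   # unambiguous
assert len(fromlocal(datetime(2015, 11, 1, 1, 30))) == 2  # fold: two answers
assert fromlocal(datetime(2015, 3, 8, 2, 30)) == []       # gap: no answer
```

Everything zoneinfo-style data can express reduces to searches of such an ordered list, which is the "additional information" the thread keeps circling around.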
[ISAAC J SCHWABACHER <ischwabacher@wisc.edu>]
... I think the right perspective is that a time zone *is* the function that its `fromutc()` method implements,
[Tim]
Fine by me ;-)
[Isaac]
My issue is that you're computing `fromutc()`, which is a function, in terms of `dst()` and `utcoffset()`, which aren't.
I know. That's not "an issue" that will gain traction, though ;-)
I think this is backwards; `dst()` and `utcoffset()` should be computed from `fromutc()` plus some additional information that has to be present anyway in order to implement `fromutc()`.
Memory lane: that additional information doesn't exist now. I think it "should have", but at the time, as I recall there was fatal opposition to storing an `isdst` flag because it would consume an extra byte in the pickle format. That was enough to kill it: datetime development was paid for by a company very concerned about pickle sizes ;-)
With the extra bit, `dst()` and `utcoffset()` become partial functions, which makes it *possible* to get the right answer in all cases, but it's still simpler to start with the total function and work from there.
Well, maybe simpler for you, but I think not in general. At the time, all aspects of datetime's development were vigorously debated, but mostly on Zope Corp (the company paying for it) wikis and mailing lists. While some people didn't care about time zones at all, most did. Of the latter:

- All were keenly aware of the need to incorporate UTC offsets.
- All were keenly aware of the need to accommodate "daylight time" schemes.
- None gave a fig about anything else.

Very late in the game, examples were given of localities that had in fact changed their UTC offsets from time to time, but as curiosities rather than as "issues". That's when I created fromutc() - it was a last-second addition. I cared enough to make it _possible_ to accommodate such cases, but there was no interest (or time) to cater to them directly. Instead fromutc() was written to use only the already-existing utcoffset() and dst(). Everyone already knew how to use the latter: they directly corresponded to the two things everyone cared about keenly from the start.

That doesn't _preclude_ anyone from writing a more-general fromutc(), and I encourage, for example, you to do so ;-) I agree it's the most fundamental thing from an abstract mathematical view, but "UTC offset" and "DST offset" fit most peoples' brains a hell of a lot better than "collection of piecewise continuous monotonically increasing functions whose images don't overlap too much" ;-)
.... Daylight Saving Time is a red herring,
Overstated ....
I stand by what I meant by this, even if I did a bad job of expressing the point. Assuming that all time zone discontinuities are due to DST changes breaks many time zones (really almost all of the Olson time zones, though only for a vanishingly small fraction of datetimes),
It's use cases that are missing here: who needs to convert historic times to/from UTC, when the "problem times" are generally arranged by politicians to occur while most people are sleeping? That's why nobody really cared about offset-changing zones at the start. Yes, such zones exist, but times recorded in such zones are in yesterday's databases we don't care about anymore, except maybe to display the values.
but that's not the point I was making. The point is that it doesn't buy us anything.
Au contraire: as above, it bought datetime a model people thought they understood at once, since almost everyone has wrestled with UTC offsets and daylight-time switches in ordinary daily life. Implement utcoffset() and dst(), and you're done. Even if you're really not, you _think_ you are, so you slumber peacefully then ;-)
Though this is probably obscured by all the markup, the more general algorithm I gave is also simpler than the one in the comment in datetime.py, and the reason for that is that it solves an easier problem, but one that serves our practical purposes just as well.
It's heavily obscured by the heavy markup. Write some Python code instead? I expect few people will try to untangle the meaning otherwise. As for whether it's simpler - eh, don't know. Here's the actual code, stripped of error-checking:

    def fromutc(self, dt):
        dtoff = dt.utcoffset()
        dtdst = dt.dst()
        delta = dtoff - dtdst
        if delta:
            dt += delta
            dtdst = dt.dst()
        return dt + dtdst

Will your code run faster? Have fewer conditionals? Fewer lines? Shorter lines? Less nesting? Fewer operations? Important to me, though, is that your code should be far more self-evidently _correct_, provided the reader understands the math underlying it (which will require - as this code does - referring to a relatively massive wall of text to explain it).
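The stripped-down algorithm above can be exercised directly. The sketch below (an editorial addition) runs it as a free function over a toy zone with illustrative 2015-only US Central data and checks it against the inherited default `tzinfo.fromutc` around the fall-back instant:

```python
from datetime import datetime, timedelta, tzinfo

class Central2015(tzinfo):
    """Toy US Central zone, valid only for 2015 (illustrative data)."""
    DST_START = datetime(2015, 3, 8, 2, 0)  # wall clock
    DST_END = datetime(2015, 11, 1, 1, 0)   # wall clock, standard time
    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.DST_START <= naive < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)
    def utcoffset(self, dt):
        return timedelta(hours=-6) + self.dst(dt)
    def tzname(self, dt):
        return "CDT" if self.dst(dt) else "CST"

def fromutc(dt):
    # Tim's algorithm, minus error checking. dt's fields hold a UTC
    # time, already labeled with the target tzinfo.
    dtoff = dt.utcoffset()
    dtdst = dt.dst()
    delta = dtoff - dtdst
    if delta:
        dt += delta
        dtdst = dt.dst()
    return dt + dtdst

tz = Central2015()
for utc_fields, local_fields in [
    ((2015, 6, 1, 17, 0), (2015, 6, 1, 12, 0)),    # mid-summer: CDT
    ((2015, 11, 1, 6, 59), (2015, 11, 1, 1, 59)),  # last minute of CDT
    ((2015, 11, 1, 7, 0), (2015, 11, 1, 1, 0)),    # first minute of CST
]:
    u = datetime(*utc_fields, tzinfo=tz)
    expected = datetime(*local_fields)
    assert fromutc(u).replace(tzinfo=None) == expected
    # The stdlib's inherited default agrees:
    assert tz.fromutc(u).replace(tzinfo=None) == expected
```

The first standard-to-daylight correction plus one re-query of dst() is all the default implementation needs, provided the zone fits the stated assumptions.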
... I went too far in inferring your viewpoint from your code. I don't find fault with the explanation on its own terms. But adding zoneinfo to the stdlib, as PEP 431 proposes to do, requires making weaker assumptions and asking a different question than the one answered in the comment.
pytz is already in wide use, yes? How many complaints are there about non-functioning cases? I have no idea.
As a mathematician at heart, I have a deep and abiding conviction, which I expect nobody else to share, that purity begets practicality in the long run. At least if you've found the right abstraction.
Guido is a mathematician by training, yet has an opposing view in this case. So: resolved, there's no point in asking mathematicians about anything, since they never agree ;-)
[eliding a more-general view of what time zones "really" are]
[note for people just joining this conversation: I think the information in the elision is critical to understanding what I'm talking about]
By all means, yes! Note that I found it helpful to paste it into a web-based LaTeX renderer first, because it was close to impossible to read otherwise.
... But these assumptions didn't come out of nowhere. They're the assumptions behind zoneinfo, weakened as much as possible without making the problem any harder. It's hard to weaken them further and still have anything to work with. (See? I *do* still have a sense of practicality!)
Good to hear ;-)
Nobody wants to read me discussing this at great length, but I'll say that I don't expect any legislative body to have the collective mathematical sophistication necessary to violate piecewise continuity or computability. If you really want to troll me, I invite you to take over a government and institute a time zone based on the (Weierstrass function)[1].
No, politicians don't surprise with their sophistication, but with their stupidity :-) For example, when Mike Huckabee becomes US President in 2016, I expect he'll put the US on Apocalypse Time. This will count time _backwards_ starting from now and ending with 0:00:00.000000 at the stroke of (what we now think of as) midnight at the end of (what we now think of as) 2099. And there goes "strictly increasing" ;-)
Parts of five US states (Alaska, North Dakota, Indiana, Kentucky and Michigan) have changed their standard time since 1970.
For example, North Dakota's Mercer County switched from US Mountain to US Central on or about 7 Nov 2010. That's the kind of thing you're talking about? Eh. I don't believe anyone anywhere had a "time zone" designed specifically for Mercer County before - and probably still doesn't. But, sure, if somebody really wants to create such a beast, they should be able to. Everyone else in the world will use US Central for it now, and never know the difference ;-)
But I'll admit that mentioning Riyadh was a low blow.
It was fun! I just wish all examples were that engagingly bizarre :-)
... For that reason, it was wholly intentional that datetime + timedelta treats datetime as "naive". ...
Sigh. This offends my sensibilities so much, but I've said my bit on this elsewhere on this list, and I don't think I have the right abstraction to cut this Gordian knot. Point conceded.
I should point out that I haven't paid any attention to datetime for some years now (other than being a happy casual user), so don't know what you previously said, and can't even say whether Guido is still of the same opinion. However, if he had changed his mind, he would have used his time machine to change all he previously wrote about it, and I didn't see any evidence of that.