[Datetime-SIG] Timeline arithmetic?
Tim Peters
tim.peters at gmail.com
Sun Sep 13 19:25:35 CEST 2015
[Carl Meyer]
> Well, sure. Of course it is possible to use "arithmetic on POSIX
> timestamps" within an implementation of either kind of arithmetic, if
> you try hard enough; I've never said anything to the contrary (that
> would be a provably silly thing to say).
"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""
"""
>> Translation: "I refuse to countenance the possibility of Model A."
"""
And for "try hard enough" here, "hard enough" amounted to "trivial" ;-)
> What your code does make clear is that if you convert from a DST-using
> timezone to a POSIX timestamp, do "arithmetic on POSIX timestamps" and
> then do a normal (what you would in any other context call a "correct")
> conversion back to the first timezone afterwards, the result you get is
> timeline arithmetic.
How else can you do timeline arithmetic? Zones are _defined_ as
offsets from UTC now.
> Sure, if you do a specific sort of weird (what you would in any other
> context call "wrong") conversion from the POSIX timestamp
There are only two contexts: Model A and Model B. So your "any other
context" means simply "Model A", and, yes, a Model B conversion looks
"wrong" to your Model A eyes. It's equally true that a Model A
conversion looks "wrong" to Model B eyes. The code shows concretely
how arbitrary this choice is. It's just a difference in how POSIX
timestamps are _labelled_. It has nothing to do with the low-level
arithmetic itself.
> back to the other timezone afterward, then you can get classic
> arithmetic instead. I'm not sure what you think that demonstrates
"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""
"""
>> Translation: "I refuse to countenance the possibility of Model A."
"""
> I think it demonstrates that both timeline and classic arithmetic _can_ be
> described in terms that include "arithmetic on POSIX timestamps," but
> timeline arithmetic is much more naturally seen that way.
To you, obviously. But _on its own_ (devoid of any imposed
labellings), POSIX timestamp arithmetic is _solely_ arithmetic on
seconds-counts in UTC. There is no distinction between classic and
timeline arithmetic in UTC (or in any other fixed-offset zone).
Classic arithmetic is no more than "let's just pretend our clock is
already showing UTC, do the arithmetic, then stop pretending". By
Occam's Razor, that's as "natural" as anything gets ;-)
> Your original assertion was that "Classic arithmetic is equivalent to
> doing integer arithmetic on integer POSIX timestamps"
It is. So is timeline arithmetic. The difference is in labeling, not
in the arithmetic.
> as a justification for why datetime chose classic arithmetic,
Sorry, I don't recall trying to "justify" that choice beyond noting
that there _was_ a choice, and one was overwhelmingly better suited to
Guido's novel "naive time" model, while best practice for the other
was already established in C via converting to UTC and back (whether
spelled via a UTC tzinfo or via POSIX timestamps). There was no
agonizing over that decision: the best way to proceed was obvious
_given that_ "naive time" was the primary model in mind.
> implying that classic arithmetic is somehow _more_ or _more naturally_
> seen as "equivalent to integer arithmetic on integer POSIX timestamps" than
> timeline arithmetic. I found that assertion puzzling, and I still do.
To me, it's dead easy to implement either kind of higher-level
arithmetic via POSIX timestamp arithmetic, although it's easi_est_ to
implement classic via the "just pretend at both ends" trick - no
conversions are actually needed on either end.
> I'd still conclude the same thing I already said in an earlier reply:
>
> """
> So, "timeline arithmetic is just arithmetic on POSIX timestamps" means
> viewing all aware datetimes as isomorphic to POSIX timestamps.
You're missing here that there isn't a _unique_ isomorphism. The code
concretely showed that, at the higher level of datetime arithmetic,
you can get either timeline or classic arithmetic depending on _which_
isomorphism you pick. The isomorphism is about the labeling, not
about the POSIX timestamp arithmetic.
> "Classic arithmetic is just arithmetic on POSIX timestamps" means
> viewing aware datetimes as naive datetimes which one can pretend are in
> a hypothetical (maybe UTC, if you like) fixed-offset timezone which is
> isomorphic to actual POSIX timestamps (even though their actual timezone
> may not be fixed-offset).
That's why I wanted to show code ;-) The entire distinction is in
the single if/else clause at the end. It doesn't require piles of
words.
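For readers without the earlier message at hand, a function of that
shape might look like the minimal sketch below. It is a reconstruction
under assumptions, not the code originally posted: ToyZone is a
hypothetical zone with a single transition, and the names add, EPOCH
and timeline are illustrative. Sub-second precision is dropped to
stick to integer POSIX timestamps.

```python
from datetime import datetime, timedelta, tzinfo

EPOCH = datetime(1970, 1, 1)      # naive, read as UTC
ONE_SECOND = timedelta(seconds=1)

class ToyZone(tzinfo):
    """Hypothetical zone: UTC-5 until 2015-03-08 07:00 UTC, then UTC-4."""
    SWITCH_UTC = datetime(2015, 3, 8, 7)
    STD = timedelta(hours=-5)
    DST = timedelta(hours=-4)

    def utcoffset(self, dt):
        # Decide by local wall-clock time (02:00 standard is the switch).
        local_switch = self.SWITCH_UTC + self.STD
        return self.STD if dt.replace(tzinfo=None) < local_switch else self.DST

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "TOY"

    def fromutc(self, dt):
        # dt carries tzinfo=self, but its fields hold a UTC time.
        naive = dt.replace(tzinfo=None)
        ofs = self.STD if naive < self.SWITCH_UTC else self.DST
        return (naive + ofs).replace(tzinfo=self)

def add(dt, td, timeline=False):
    """Add td to aware dt using only integer POSIX timestamp arithmetic."""
    tz = dt.tzinfo
    ofs = dt.utcoffset()
    as_utc = dt.replace(tzinfo=None)
    as_utc -= ofs                         # local -> UTC
    ts = (as_utc - EPOCH) // ONE_SECOND   # integer POSIX timestamp
    ts += td // ONE_SECOND                # the arithmetic itself
    new_utc = EPOCH + timedelta(seconds=ts)
    if timeline:
        # Relabel with whatever offset is in effect at the new instant.
        return tz.fromutc(new_utc.replace(tzinfo=tz))
    else:
        # Relabel with the offset we started from.
        return (new_utc + ofs).replace(tzinfo=tz)

tz = ToyZone()
start = datetime(2015, 3, 7, 12, tzinfo=tz)
day = timedelta(hours=24)
print(add(start, day))                 # classic:  2015-03-08 12:00 local
print(add(start, day, timeline=True))  # timeline: 2015-03-08 13:00 local
```

In this sketch the timestamp arithmetic is identical on both paths;
only the final relabeling differs, and the classic path gives the same
answer as a plain dt + td.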
> I accept that those are both true and useful in the implementation of
> their respective model. I just don't think either one is inherently
> obvious or useful as a justification of their respective mental models;
> rather, which one you find "obvious" just reveals your preferred mental
> model.
> """
I'm not trying to "justify" anything. I'm trying to say that "POSIX
timestamp arithmetic" on its own says nothing about which kind of
higher-level arithmetic one sees. That's in the labeling. Which
labeling you need _becomes_ obvious only after you identify the
higher-level model you want.
>> That adds an aware datetime to a timedelta, doing either classic or
>> timeline arithmetic depending on the optional flag. If you want to
>> claim this doesn't do either kind of arithmetic correctly, prove it
>> with a specific example
> I'm not sure why you'd think I'd have any issue with that code, or any
> desire to prove it wrong.
"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""
"""
>> Translation: "I refuse to countenance the possibility of Model A."
"""
[...]
>> I believe you have _pictured_ the POSIX timestamp number line
>> annotated with local calendar notations in your head, but those labels
>> have nothing to do with the timestamp arithmetic.
> It would be more accurate to say that a Model A view pictures only a
> single timeline, which is physical (Newtonian) time. A point on that
> timeline is an instant. Any given instant is annotated with any number
> of labels, each one a unique and unambiguous description of that instant
> in some labeling system. A labeling system can be very simple (e.g.
> POSIX timestamps), less simple (proleptic Gregorian in UTC, or to a
> lesser extent any fixed-offset timezone), or slightly ridiculous
> (timezones with folds and gaps, where now we need a `fold` attribute or
> an explicit offset at each instant or something similar to keep each
> label unique and unambiguous). This mental model implies (and requires)
> that all of these labeling systems are isomorphic to each other and to
> the physical-time timeline, and that arithmetic in any of them is
> isomorphic to arithmetic in any other (and is thus obviously timeline
> arithmetic).
Regardless, the "labels have nothing to do with the timestamp arithmetic".
> Really my only point in this entire thread has been that this model
> (contrary to some of the denigration of it on this mailing list) is
> actually quite intuitive, not difficult to teach, and possible to do all
> sorts of useful work in (_even_ when you have to also teach pytz's
> unfortunate API for it). If you can agree with that - great, we're done
> here. If you don't agree with that, we may as well still be done,
> because I have too much personal experience suggesting it to be true for
> you to be likely able to convince me otherwise :-)
If that was indeed your only point, then yes - there was again no need
for any of this ;-)
> I've also come to recognize, through this thread, that Model B (where
> the "local clock time in a given timezone" "timeline" is elevated to
> sort-of-equal status with the physical timeline, rather than just
> considered a weird complex labeling system for physical time) is also
> useful (more useful for some tasks) and makes intuitive sense too.
It does suffer the drawback of not matching how clocks in the real
world actually behave ;-)
[...]
>> 1. The "as_utc -= ofs" line is theoretically impure, because it's
>> treating a local time _as if_ it were a UTC time. There's no real way
>> around that. We have to convert from local to UTC _somehow_, and
>> POSIX dodges the issue by providing mktime() to do that "by magic".
>> Here we're _inside_ the sausage factory, doing it ourselves. Some rat
>> guts are visible at this level. If you look inside a C mktime()
>> implementation, you'll find rat guts all over that too.
> This seems like a really hand-wavy rationalization of an operation that
> can only really be described as an incorrect timezone conversion.
Perhaps you missed that "as_utc -= ofs" is _also_ needed to implement
timeline arithmetic? In fact, it's not _necessary_ to get the effect
of classic arithmetic. It is necessary to implement timeline
arithmetic: zones are defined as offsets from UTC, and doing POSIX
timestamp arithmetic _requires_ converting to UTC first. How else are
you going to do that, other than by subtracting the zone's UTC offset
to convert to UTC?
> Of course that incorrect timezone conversion operation is useful for
> implementing classic arithmetic in the way you've implemented it, but
> taken out of that context it's just an incorrect conversion.
Nonsense: it is exactly the conversion "you" need at the start to
correctly convert to UTC in Model A. Unless you do that first, you
can't use "POSIX timestamp arithmetic" at all.
> The reason you _need_ that incorrect conversion is because for some
> reason you're really wanting to do your arithmetic in terms of POSIX
> timestamps
I needed it for two reasons. First, to implement timeline arithmetic
using POSIX timestamps (a problem you seem to wish away by viewing the
labels you want as being _inherently_ attached to the POSIX timestamp
number line - but they're not - the only labels defined by POSIX are
to and from the proleptic Gregorian calendar viewed in UTC). Second,
to address your:
"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""
> (which are defined as being in UTC), but you don't _really_
> want correct conversion to UTC and back (because if you do that, you'll
> get timeline arithmetic).
As above, it's really Model A that needs that conversion. Model B can
live without it (and, in the actual Python implementation of classic
arithmetic, doesn't bother with conversion on either end).
As to "correct" conversion, that depends on which model you intend to
implement. The "right" conversion at the end is "wrong" for the other
model.
>> But it's no problem for Guido ;-) We just set the hands on a UTC
>> clock to match the local clock, then move the hands on the UTC clock
>> by the amount the local clock is "ahead of" or "behind" UTC. In that
>> way you can indeed picture the operation as being entirely "in UTC".
> Sure, you can, if you're motivated enough :-)
>> 2. This would be a foolish _implementation_ of classic arithmetic, but
>> not for semantic reasons. It's just grossly inefficient. Stare at
>> the code, and in the classic case it subtracts the UTC offset at first
>> only to add the same offset back later. Those cancel out, so there's
>> no _semantic_ need to do either. It's only excessive concern for
>> theoretical purity that could stop one from spelling it as
>>
>> return dt + td
>>
>> from the start. That's technically absurd, since it's doing POSIX
>> timestamp arithmetic on a timestamp that's _not_ a UTC seconds count.
>> Its only virtue is that it gets the same answer far faster ;-)
> I actually think this implementation would be _less_ technically absurd.
> I'm not sure why you'd insist that any arithmetic on a count of seconds
> must be "POSIX timestamp arithmetic."
Because I was addressing _your_ claims about POSIX timestamp arithmetic, like:
"""
>>>> Classic arithmetic may be many things, but the one thing it definitively is
>>>> _not_ is "arithmetic on POSIX timestamps."
"""
To address that specific claim, I stuck solely to "arithmetic on POSIX
timestamps".
> In this case you're just doing integer arithmetic on a naive count of seconds
> since some point in the local timezone clock, rather than on a count of
> seconds in UTC. That's a much more natural way to view classic arithmetic,
> and also happens to be the way datetime actually does it (where "some
> point" is datetime(1, 1, 1)).
It can be viewed either way. A count of microseconds since 0001-01-01
00:00:00 0.0 is certainly more natural given knowledge of Python
internals, but it's just a linear transformation between that notion
and viewing it as a POSIX timestamp instead. As shown before, that's
why "by hand" code to convert a UTC datetime to or from a POSIX
timestamp (either integer or floating) is so trivial to write.
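As a rough illustration of how little is involved, here is a minimal
sketch of such by-hand conversions, assuming naive datetimes that are
read as UTC. The function names are made up for this example and are
not part of the datetime module.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)   # naive, read as UTC
YEAR_ONE = datetime(1, 1, 1)   # origin of the microsecond count

def utc_to_posix(dt):
    """Integer POSIX timestamp for a naive datetime read as UTC."""
    return (dt - EPOCH) // timedelta(seconds=1)

def posix_to_utc(ts):
    """Naive UTC datetime for an integer POSIX timestamp."""
    return EPOCH + timedelta(seconds=ts)

def micros_since_year_one(dt):
    """Count of microseconds since 0001-01-01 00:00:00."""
    return (dt - YEAR_ONE) // timedelta(microseconds=1)

# The two counts differ only by a shift and a scale factor.
SHIFT = (EPOCH - YEAR_ONE) // timedelta(seconds=1)

dt = datetime(2015, 9, 13, 17, 25, 35)
ts = utc_to_posix(dt)
assert posix_to_utc(ts) == dt
assert micros_since_year_one(dt) == (ts + SHIFT) * 10**6
```

The last assertion spells out the linear transformation: add a
constant number of seconds, then multiply by a million.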