Re: [Datetime-SIG] PEP-431/495
On 26 August 2015 at 15:16, Tim Peters <tim.peters@gmail.com> wrote:
In this case, I'm not fussed if the datetime instance has a 2 state or 3 state flag. This is different to the various constructors which I think need a 3 state flag in their arguments. True, False, None.
As things seem to have progressed later, mapping pytz's explicit time checking into a more magical scheme sprayed all over the datetime internals is not straightforward. So, as I concluded elsewhere, that may or may not be done someday, but it's out of scope for PEP 495. I'm a fan of making progress ("now is better than never", where the latter was PEP 431's fate waiting for perfection on all counts).
Yup.
Classic behaviour as you describe it is a bug.
Believe me, you won't get anywhere with that approach.
- Classic arithmetic is the only kind that makes good sense in the "naive time" model, which _is_ datetime's model.
[... sensible stuff trimmed ...]
That said, I would have preferred it if Python's datetime had used classic arithmetic only for naive datetimes. I feared it might be endlessly confusing if an aware datetime used classic arithmetic too. I'm not sure about "endlessly" now, but it has come up more than once ;-) Far too late to change now, though.
I'm wondering if it is worth formalizing this (post-PEP-495, or maybe some choice wording changes made in the docs). Would it work if we introduced a new type, datetimetz? We would have a time, with a tzinfo because it might be useful later, a naive time, with a tzinfo because it is useful for rendering and conversions, and a datetimetz with all the complexities and slowdowns of timeline arithmetic. While not changing the behaviour of datetime at all, we could get cats and dogs living together by just clarifying what it actually is.
If you use pytz tzinfo instances, adding 1 second always adds one second
But only in the _model_ you have in mind: real-life clocks showing real-life civil time suffer from leap seconds too. You can laugh that off in _your_ apps (and I can too ;-) ), but for other apps it's dead serious.
If our underlying platforms that we needed to work with supported it, I'd probably be in favour of leap seconds. I doubt that would ever happen - there are more palatable workarounds.
and adding 1 day always adds 24 hours.
That's also true of classic arithmetic. The meanings of "day" and "24 hours" also depend on the model in use.
I think in my view, as soon as you go to the bother of adding a tzinfo instance to the datetime you are making a statement about the expected behaviour; that the simpler classic arithmetic no longer applies and the more complex model needs to be used.
def dt_add(dt, td):
    return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))
There you go: "timeline" datetime + timedelta arithmetic about as efficiently as possible in pure Python. Note that _if_ the default changed to timeline arithmetic, this code would no longer work. The "+" there requires classic arithmetic to get the right result. Change the default, this code would break too. I find it hard to imagine I'm the only person in the world who has code similarly taking advantage of what Python actually does.
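To make the difference concrete, here is a minimal sketch of dt_add() next to plain "+" across a DST start. The Eastern2015 class is a hypothetical toy tzinfo with US Eastern rules hard-coded for 2015 only; it is not pytz and not anything proposed in this thread, and it relies on the default tzinfo.fromutc() implementation.

```python
from datetime import datetime, timedelta, tzinfo


class Eastern2015(tzinfo):
    """Toy tzinfo (hypothetical): US Eastern rules hard-coded for 2015."""

    DST_START = datetime(2015, 3, 8, 2)  # wall time at which clocks jump ahead
    DST_END = datetime(2015, 11, 1, 1)   # ambiguous 01:00-01:59 counts as standard

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.DST_START <= naive < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def dt_add(dt, td):
    # Shift to UTC using classic "+", then let the tzinfo map back to local time.
    return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))


tz = Eastern2015()
start = datetime(2015, 3, 7, 12, 0, tzinfo=tz)  # noon EST, the day before DST starts
classic = start + timedelta(hours=24)           # same wall-clock time the next day
timeline = dt_add(start, timedelta(hours=24))   # 24 elapsed hours later

print(classic.replace(tzinfo=None), classic.tzname())
print(timeline.replace(tzinfo=None), timeline.tzname())
```

Classic arithmetic lands on noon the next day, while dt_add() lands on 13:00 EDT, because only 23 wall-clock hours separate the two noons.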
Example:
I see. What I don't like about this approach is that developers need to be aware that they need to call it, and that dt + timedelta(hours=24) may not work. Of course, developers will not be aware or have done more than skim the docs until after their guests have all died of salmonella poisoning from the undercooked turkey. It's one of the reasons I'm wondering if something more in your face like the datetimetz proposal above would be an improvement. Stop making me hungry dammit.
However... this also means the new flag on the datetime instances is largely irrelevant to pytz. pytz's API will need to remain the same.
My hope was that 495 alone would at least spare pytz's users from needing to do a `.normalize()` dance after `.astimezone()` anymore. Although I'm not clear on why it's needed even now.
Instead of one tzinfo instance, there are dozens for your timezone. The datetime implementation does not give pytz the opportunity to choose which one is used when constructing the datetime, so localize is needed to sort that. Similarly, arithmetic does not always give pytz the opportunity to choose which one is used after crossing a timezone boundary, so normalize is needed to sort that out. While the results of the timeline arithmetic are unambiguous and obvious, they are arguably incorrect until normalize puts things right.
See the "PEP-495 - Strict Invalid Time Checking" thread for more. There seems to be increasing "feature creep" here. Rewriting vast swaths of datetime internals to cater to this is at best impractical, especially compared to supplying a "check this datetime" function users who care can call when they care. Nevertheless, it's a suitable subject for a different PEP. I don't want to bog 495 down with it. If it had _stopped_ with asking for an optional check in the datetime constructor, it may have been implemented already ;-)
Yup. I think I'm after hooks to replace localize on construction and normalize after arithmetic, so users don't have to be relied on to do this explicitly. This doesn't need to happen now, and I fully understand this could be considered fast path and the overhead unacceptable.
The important bit here for pytz is that tzinfo.fromutc() may return a datetime with a different tzinfo instance.
Sorry, didn't follow that. Of course you can write your .fromutc() to return anything you want.
I can't find what concerned me any more. I think there was some wording along the lines of 'the result will be used to initialize the first flag'. What I'm reading now on fromutc() though looks fine, so I think I was mixed up.
- My argument in favour of 'is_dst' over 'first' is that this is what we have in the data we are trying to load. You commonly have a timestamp with a timezone abbreviation and/or offset. This can easily be converted to an is_dst flag.
You mean by using platform C library functions (albeit perhaps wrapped by Python)?
I really missed an answer to that ;-)
I think all the data we have access to, including from platform C library functions, uses the is_dst flag or is simpler to map to the is_dst flag. The C library as exposed by time.struct_time gives you is_dst. Mapping that to first/fold means first doing two conversions and determining which one comes first. Similarly, when loading your JSON file or examining email headers you need to load in a string like '2004-04-04 02:30:00 EDT-05:00'. It's simple to use a lookup table to map the abbreviation + offset to an is_dst flag. It's harder to map it to first/fold because they are swapped around in April and October. And there can be more than two transitions in a year, so if you need to support that you're going to need to do the lookup, construct a couple of instances, and compare to work out if EDT or EST comes first that month in that year. But, really, I hate all the options for the flag name. I lean towards is_dst mainly because people are used to it.
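A rough sketch of that asymmetry, under stated assumptions: the IS_DST table, the STAMP pattern and the helper names are all made up for illustration, and "fold" is taken to mean 0 for the first occurrence of an ambiguous wall time and 1 for the second. The is_dst lookup needs only the string; the fold needs the other candidate offset for the same wall time as well.

```python
import re
from datetime import datetime

# Hypothetical lookup table: (abbreviation, UTC offset in hours) -> is_dst
IS_DST = {("EST", -5): False, ("EDT", -4): True}

# Hypothetical pattern for stamps like '2015-11-01 01:30:00 EDT-04:00'
STAMP = re.compile(r"(\S+ \S+) ([A-Z]+)([+-]\d\d):\d\d$")


def parse_stamp(stamp):
    """Return (naive datetime, abbreviation, offset in hours)."""
    when, abbr, hours = STAMP.match(stamp).groups()
    return datetime.strptime(when, "%Y-%m-%d %H:%M:%S"), abbr, int(hours)


def is_dst_for(abbr, hours):
    # One dictionary lookup, independent of the date.
    return IS_DST[(abbr, hours)]


def fold_for(hours, other_hours):
    # Needs the other candidate offset for this wall time: clocks move
    # back across a fold, so the first occurrence has the larger offset.
    return 0 if hours > other_hours else 1


naive, abbr, hours = parse_stamp("2015-11-01 01:30:00 EDT-04:00")
print(naive, is_dst_for(abbr, hours), fold_for(hours, -5))
```

In a real zone the second offset passed to fold_for() would have to come from the timezone database for that particular date, which is the extra work described above.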
But there's every reason to be optimistic: even someone as old and in-the-way as me doesn't find any of this particularly confusing ;-)
I may be old, but at least I'm not as old as Tim ;)

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/