[Datetime-SIG] PEP-431/495

Stuart Bishop stuart at stuartbishop.net
Thu Aug 27 15:12:33 CEST 2015


On 26 August 2015 at 15:16, Tim Peters <tim.peters at gmail.com> wrote:

>> In this case, I'm not fussed if the datetime instance has a 2 state or
>> 3 state flag. This is different to the various constructors which I
>> think need a 3 state flag in their arguments. True, False, None.
>
> As things seem to have progressed later, mapping pytz's explicit time
> checking into a more magical scheme sprayed all over the datetime
> internals is not straightforward.  So, as I concluded elsewhere, that
> may or may not be done someday, but it's out of scope for PEP 495.
> I'm a fan of making progress ("now is better than never", where the
> latter was PEP 431's fate waiting for perfection on all counts).

Yup.


>> Classic behaviour as you describe it is a bug.
>
> Believe me, you won't get anywhere with that approach.
>
> - Classic arithmetic is the only kind that makes good sense in the
> "naive time" model, which _is_ datetime's model.

[... sensible stuff trimmed ...]

> That said, I would have preferred it if Python's datetime had used
> classic arithmetic only for naive datetimes.  I feared it might be
> endlessly confusing if an aware datetime used classic arithmetic too.
> I'm not sure about "endlessly" now, but it has come up more than once
> ;-)  Far too late to change now, though.

I'm wondering if it is worth formalizing this (post-PEP-495, or maybe
some choice wording changes made in the docs). Would it work if we
introduced a new type, datetimetz? We would have a time, with a tzinfo
because it might be useful later, a naive time, with a tzinfo because
it is useful for rendering and conversions, and a datetimetz with all
the complexities and slowdowns of timeline arithmetic. While not
changing the behaviour of datetime at all, we could get cats and dogs
living together by just clarifying what it actually is.


>> If you use pytz tzinfo instances, adding 1 second always adds one second
>
> But only in the _model_ you have in mind:  real-life clocks showing
> real-life civil time suffer from leap seconds too.  You can laugh that
> off in _your_ apps (and I can too ;-) ), but for other apps it's dead
> serious.

If our underlying platforms that we needed to work with supported it,
I'd probably be in favour of leap seconds. I doubt that would ever
happen - there are more palatable workarounds.

>> and adding 1 day always adds 24 hours.
>
> That's also true of classic arithmetic.  The meanings of "day" and "24
> hours" also depend on the model in use.

In my view, as soon as you go to the bother of adding a tzinfo
instance to the datetime you are making a statement about the expected
behaviour: that the simpler classic arithmetic no longer applies and
the more complex model needs to be used.



>     def dt_add(dt, td):
>         return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))
>
> There you go:  "timeline" datetime + timedelta arithmetic about as
> efficiently as possible in pure Python.  Note that _if_ the default
> changed to timeline arithmetic, this code would no longer work.  The
> "+" there requires classic arithmetic to get the right result.  Change
> the default, this code would break too.  I find it hard to imagine I'm
> the only person in the world who has code similarly taking advantage
> of what Python actually does.
>
> Example:

I see.

What I don't like about this approach is that developers need to be
aware that they need to call it, and that dt + timedelta(hours=24) may
not work. Of course, developers will not be aware or have done more
than skim the docs until after their guests have all died of
salmonella poisoning from the undercooked turkey. It's one of the
reasons I'm wondering if something more in your face like the
datetimetz proposal above would be an improvement.
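
To make the difference concrete, here is a minimal sketch that puts
plain "+" next to Tim's dt_add() across a spring-forward transition.
ToyEastern and elapsed() are hypothetical names made up for this
illustration; the zone has a single hard-coded transition and is not
real timezone data.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone with one transition: UTC-5 until
    2015-03-08 07:00 UTC, UTC-4 afterwards."""

    STD = timedelta(hours=-5)
    DST = timedelta(hours=-4)
    SWITCH_UTC = datetime(2015, 3, 8, 7, 0)    # naive, in UTC
    SWITCH_WALL = datetime(2015, 3, 8, 3, 0)   # first wall time on UTC-4

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        return self.DST if wall >= self.SWITCH_WALL else self.STD

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC clock values with tzinfo=self attached.
        utc = dt.replace(tzinfo=None)
        offset = self.DST if utc >= self.SWITCH_UTC else self.STD
        return (utc + offset).replace(tzinfo=self)


def dt_add(dt, td):
    # Tim's helper, unchanged: the "+" relies on classic arithmetic.
    return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))


def elapsed(a, b):
    # Real elapsed time. Subtracting two datetimes that share a tzinfo
    # ignores the offsets, so convert both to UTC first.
    return b.astimezone(timezone.utc) - a.astimezone(timezone.utc)


tz = ToyEastern()
start = datetime(2015, 3, 7, 12, 0, tzinfo=tz)   # noon, day before the switch

classic = start + timedelta(hours=24)            # same wall clock, next day
timeline = dt_add(start, timedelta(hours=24))    # 24 real hours later

print(classic, elapsed(start, classic))
print(timeline, elapsed(start, timeline))
```

With this toy zone, the classic result lands on noon the next day,
only 23 elapsed hours later, while dt_add() lands on 13:00 after a
full 24 hours. The turkey only gets the cooking time it needs in the
second case.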

Stop making me hungry dammit.



>> However... this also means the new flag on the datetime instances is
>> largely irrelevant to pytz.  pytz' API will need to remain the same.
>
> My hope was that 495 alone would at least spare pytz's users from
> needing to do a `.normalize()` dance after `.astimezone()` anymore.
> Although I'm not clear on why it's needed even now.

Instead of one tzinfo instance, there are dozens for your timezone.
The datetime implementation does not give pytz the opportunity to
choose which one is used when constructing the datetime, so localize
is needed to sort that. Similarly, arithmetic does not always give
pytz the opportunity to choose which one is used after crossing a
timezone boundary, so normalize is needed to sort that out. While the
results of the timeline arithmetic are unambiguous and obvious, they
are arguably incorrect until normalize puts things right.
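
A rough stdlib-only model of that idea, assuming a zone with just two
fixed-offset instances and one transition. The localize() and
normalize() functions here are hypothetical stand-ins written for
this sketch; they mimic the role of the pytz methods and are not
pytz's implementation.

```python
from datetime import datetime, timedelta, timezone

# One fixed-offset tzinfo instance per period of the zone's history.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# Hypothetical single transition for this sketch.
TRANSITION_UTC = datetime(2015, 3, 8, 7, 0, tzinfo=timezone.utc)


def localize(naive, is_dst):
    """Attach the fixed-offset instance the caller asked for."""
    return naive.replace(tzinfo=EDT if is_dst else EST)


def normalize(dt):
    """Swap in the instance that is really in force at dt's instant."""
    tz = EDT if dt >= TRANSITION_UTC else EST
    return dt.astimezone(tz)


before = localize(datetime(2015, 3, 8, 1, 30), is_dst=False)  # 01:30 EST

# Arithmetic keeps the EST instance, so the instant is right but the
# label is stale: 02:30 EST is a wall time nobody's clock showed.
raw = before + timedelta(hours=1)

fixed = normalize(raw)   # same instant, relabelled as 03:30 EDT

print(raw, raw.tzname())
print(fixed, fixed.tzname())
```

Because each instance has a fixed offset, the addition is already
timeline arithmetic; normalize() changes only which instance (and so
which wall-clock reading) is attached, never the instant itself.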


> See the "PEP-495 - Strict Invalid Time Checking" thread for more.
> There seems to be increasing "feature creep" here.  Rewriting vast
> swaths of datetime internals to cater to this is at best impractical,
> especially compared to supplying a "check this datetime" function
> users who care can call when they care.  Nevertheless, it's a suitable
> subject for a different PEP.  I don't want to bog 495 down with it.
> If it had _stopped_ with asking for an optional check in the datetime
> constructor, it may have been implemented already ;-)

Yup.

I think I'm after hooks to replace localize on construction and
normalize after arithmetic, so users don't have to be relied on to do
this explicitly. This doesn't need to happen now, and I fully
understand this could be considered fast path and the overhead
unacceptable.



>> The important bit here for pytz is that tzinfo.fromutc() may return a
>> datetime with a different tzinfo instance.
>
> Sorry, didn't follow that.  Of course you can write your .fromutc() to
> return anything you want.

I can't find what concerned me any more. I think there was some
wording along the lines of 'the result will be used to initialize the
first flag'. What I'm reading now on fromutc() though looks fine, so I
think I was mixed up.



>>>> - My argument in favour of 'is_dst' over 'first' is that this is what
>>>> we have in the data we are trying to load.  You commonly have
>>>> a timestamp with a timezone abbreviation and/or offset. This can
>>>> easily be converted to an is_dst flag.
>
>>> You mean by using platform C library functions (albeit perhaps wrapped
>>> by Python)?
>
> I really missed an answer to that ;-)

I think all the data we have access to, including from platform C
library functions, uses the is_dst flag or is simpler to map to the
is_dst flag.

The C library as exposed by time.struct_time gives you is_dst (the
tm_isdst field). Mapping that to first/fold means first doing two
conversions and determining which one comes first.
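
For reference, a tiny sketch of where that flag shows up in the
standard library. The variable names are only illustrative, and the
local value depends on the machine's timezone settings.

```python
import time

# struct_time carries the C library's DST flag as tm_isdst:
# 1 when DST is in effect, 0 when it is not, -1 when unknown.
now = time.localtime()
print(now.tm_isdst)

# gmtime() always reports 0, since UTC has no DST.
utc_epoch = time.gmtime(0)
print(utc_epoch.tm_isdst)
```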

Similarly, when loading your JSON file or examining email headers you
need to load in a string like '2004-04-04 02:30:00 EDT-05:00'. It's
simple to use a lookup table to map the abbreviation + offset to an
is_dst flag. It's harder to map it to first/fold because they are
swapped around in April and October. And there can be more than two
transitions in a year, so if you need to support that you're going to
need to do the lookup, construct a couple of instances, and compare to
work out if EDT or EST comes first that month in that year.
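
A minimal sketch of the lookup-table approach, assuming the
abbreviation and a "+HH:MM"/"-HH:MM" offset always trail the
timestamp after a space. The IS_DST table and parse_stamp() are
hypothetical names for this illustration, and the table covers only
two US Eastern entries.

```python
from datetime import datetime

# (abbreviation, offset) -> is_dst. Illustrative entries only.
IS_DST = {
    ("EST", "-05:00"): False,
    ("EDT", "-04:00"): True,
}


def parse_stamp(stamp):
    """Split 'YYYY-MM-DD HH:MM:SS ABBR+HH:MM' into (naive, is_dst)."""
    text, suffix = stamp.rsplit(" ", 1)
    abbr, offset = suffix[:-6], suffix[-6:]
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return naive, IS_DST[(abbr, offset)]


# The same ambiguous wall time, told apart by the trailing label alone.
print(parse_stamp("2004-10-31 01:30:00 EDT-04:00"))
print(parse_stamp("2004-10-31 01:30:00 EST-05:00"))
```

No timezone database and no comparison of candidate datetimes is
needed to get is_dst here; getting first/fold from the same input
would need the extra step of working out which of the two readings
comes earlier.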


But, really, I hate all the options for the flag name. I lean towards
is_dst mainly because people are used to it.


> But there's every reason to be optimistic:  even someone as old and
> in-the-way as me doesn't find any of this particularly confusing ;-)

I may be old, but at least I'm not as old as Tim ;)


-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/

