[Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement

Guido van Rossum guido at python.org
Mon Aug 17 05:18:26 CEST 2015


Thanks for the quick response!

On Sun, Aug 16, 2015 at 2:45 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

> On Sun, Aug 16, 2015 at 3:23 PM, Guido van Rossum <guido at python.org>
> wrote:
> > I think that a courtesy message to python-dev is appropriate, with a link to
> > the PEP and an invitation to discuss its merits on datetime-sig.
>
> Will do.  (Does anyone know how to set Reply-To: header in Gmail?)
>

I think you can set TO: datetime-sig, BCC: python-dev.


> ..
> > - I'm surprised the name of the proposed flag doesn't occur in the abstract.
> >
>
> That's because I wanted people to get to the proposal section before
> starting to bikeshed on the name of the flag.   More on that below.
>

Heh. :-)


> > - The rationale might explicitly mention the two cases we're thinking about:
> > DST transitions and adjustments to the timezone's base offset -- noting that
> > the latter may be an arbitrary interval (not just an hour).
> >
>
> Actually, in either case the adjustment can be a fraction of an hour.
> I'll add this to the rationale.
>
> > - The sidebar doesn't show up as a sidebar, but as somewhat mysterious text,
> > on https://www.python.org/dev/peps/pep-0495/ (it does on legacy.python.org,
> > but we're trying to avoid that site). Maybe you should file a bug with the
> > pydotorg project on GitHub (if you haven't already).
>
> I did: <https://github.com/python/pythondotorg/issues/808>.
>
> > (While I like the artwork, it's a bit un-PEP-like, and maybe not worth it
> > given the problems making the image appear properly.)
>
> If we don't fix the layout issues before the pronouncement, I'll
> remove the graphic.
>
> > - Conversely, on legacy.python.org there are some error messages about
> > "Unknown directive type "code"" (lines 112, 118).
>
> I'll look into this.  I've never had problems with ReStructuredText
> rendering on docs.p.o, but the peps site seems to be more restrictive.
>

FWIW I don't get errors when I do "make pep-0498.html" in the peps repo --
I consider that the ultimate arbiter of who's right. Maybe we have an old
ReST version generating legacy?


> >
> > - "a fold is created in the fabric of time" sounds a bit like
> > science-fiction. I'd just say "a time fold is created", or "a fold is
> > created in time".
> >
>
> Agree.  After all, a "fold" already suggests some kind of fabric.
>

:-) Never thought about it this way. I've always just considered it an
excessively literary phrase. Can't you fold a line though?


> > - Despite having read the section about the naming, I'm still not wild about
> > the name 'first'. This is in part because this requires True as the default,
> > in part because without knowing the background its meaning is somewhat
> > mysterious.
>
> I agree.  My top candidate is "repeated=False", but an invitation to
> bikeshed, <
> https://mail.python.org/pipermail/datetime-sig/2015-August/000241.html>,
> was not met with the usual enthusiasm.


Actually the *usual* enthusiasm is probably expressed by more bikeshedding.
:-) In this case I have to agree that "repeated" doesn't sound right.


> To defend the "True means
> earlier" choice, I would mention that it matches "isdst=1 means
> earlier" in the fold.
>

But nobody would be able to remember that mnemonic -- the vast majority of
people simply don't know whether to move the clock forward or back when DST
begins or ends, they just read it in the newspaper the day before (or rely
on their cell phone) and try to forget about it as soon as they can. At
least, that's how I usually do it (even though I am well capable of
reasoning it through from first principles, it's not worth remembering).


> > I'm not wild about the alternatives either, so perhaps this
> > requires more bikeshedding. :-( (FWIW I agree that the name should not
> > reference DST, since time folds may appear for other reasons.) Hmm... Maybe
> > "fold=True" to select the second occurrence?
>
> I really want something that disambiguates two times based on their
> most natural characteristics: do you want the earlier or the later of
> the two choices?  Anything else, in my view, would require additional
> knowledge.
>

Agreed.

> > - I'm a bit surprised that this flag doesn't have three values (e.g. None,
> > True, False) -- in C, the tm_isdst flag in struct tm can be -1, 0 and 1,
> > where -1 means "figure it out" or "don't care".
>
> With the proposed functionality, one can easily implement any of the
> C-style isdst logic.


Really? The way I interpret the PEP, there's no way to represent the "-1"
case using a datetime alone.


> The problem, however, is that while most C
> libraries agree in their treatment of 0 and 1, the behavior on
> tm_isdst=-1 ranges from bad to absurd.  For example, the value
> returned by mktime in the ambiguous case may depend on the arguments
> passed to the previous call to mktime.
>

The actual behavior and bugs of C libraries don't interest me much. I just
care about "None" meaning "nobody set it, probably because the code was
written before this flag was introduced".


> > The "don't care" case should allow stricter backward compatibility.
>
> I am not sure we want to maintain the behavior described in
> <http://bugs.python.org/issue22627> (Calling timestamp() on a datetime
> object modifies the timestamp of a different datetime object.)
>

I can't quite follow the bug. Does it imply that datetime objects are
mutable? Or is there some global state that's set by the timestamp()
function? What is it that ts1.timestamp() changes that affects
ts2.timestamp()?

Anyway, I'm not saying we should maintain backwards compatibility in that
case (assuming you can convince me it's a bug that should be fixed
regardless of whether we accept this PEP).

> > - "[1] An instance that has first=False in a non-ambiguous case is said to
> > represent an invalid time ..." Could you quickly elaborate here whether such
> > an invalid time is considered an hour later than the valid corresponding
> > time with first=True, given a reasonable timezone with and without DST?
>
> Such an instance is just *invalid* as in "February 29, 2015."  In a
> non-ambiguous case,  first=False means "the second of one", which does
> not make sense.  Such instances should never be produced except for a
> narrow purpose of probing the astimezone() or timestamp() to determine
> whether a given datetime is ambiguous or not.
>

Yeah, but it can still be created -- and if I have one, how does it behave?
(I don't care what it means. :-)
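
One possible answer, as a minimal sketch. It assumes the flag in the form it eventually took in Python 3.6 and later: an integer attribute named `fold`, where `fold=0` plays the role of `first=True` in this thread. That name and polarity are assumptions relative to the draft discussed here. The sketch only shows that a flagged instance in a non-ambiguous case still compares and hashes like its unflagged twin, and that arithmetic drops the flag.

```python
from datetime import datetime, timedelta

# An ordinary, non-ambiguous naive time.
plain = datetime(2015, 6, 1, 12, 0)

# The same wall-clock time with the disambiguation flag set.
# Assumption: Python 3.6+ spelling, fold=1 meaning "second occurrence".
flagged = plain.replace(fold=1)

# Comparison and hashing ignore the flag.
same_value = (plain == flagged)
same_hash = (hash(plain) == hash(flagged))

# Adding a timedelta ignores the flag and returns a result with fold=0.
shifted = flagged + timedelta(hours=1)

print(same_value, same_hash, flagged.fold, shifted.fold)
```

Under those assumptions such an instance is not an hour later than the valid one; it is simply the same naive value carrying a flag that most operations never read.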

> > - "In CPython, a non-boolean value of first will raise a TypeError, but
> > other implementations may allow the value None to behave the same as when
> > first is not given." This is surprisingly lenient. Why allow the second
> > behavior at all?
>
> Because it is currently allowed for the other arguments of replace()
> in the pure python datetime implementation that we ship.  I will be
> happy to change that starting with the "first".
>

OK, seems to make sense to be consistent with the other args -- just
explain that reason in the text then.


> > (Surely all Python implementations can distinguish between
> > a value equal to None and a missing value, even if some kind of hack is
> > needed.) Also, why this clause for replace() but not for other methods?
>
> What other methods?  replace() is fairly unique in its treatment of
> arguments.
>

Well, several other methods also have a first=... argument. How should they
treat first=None compared to its absence?

> > - I'm disappointed that there are now some APIs that explicitly treat a
> > naive datetime as local (using the system timezone). I carefully avoided
> > such interpretation in the original design, since a naive datetime can also
> > be used to represent a point in UTC, or in some timezone that's implicit.
> > But I guess this cat is out of the bag since it's already assumed by
> > timestamp() and fromtimestamp(). :-(
>
> I held that siege as long as I could.
>

And thanks for that! I guess we move on now.

> > - "Conversion from POSIX seconds from EPOCH" I'd move this section before
> > the opposite clause, since it is simpler and the other clause references
> > fromtimestamp(). The behavior of fromtimestamp() may also be considered
> > motivational for having only the values True and False for the flag.
> >
>
> Will do.
>
> > - "New guidelines will be published for implementing concrete timezones with
> > variable UTC offset." Where?
>
> In the official datetime documentation.  I'll clarify that.
>
> > (Is this just a forward reference to the next section? Then I'd drop it.)
>
> No, I expect that section to be incorporated in the official datetime
> library documentation.
>

OK, should definitely be clarified. Note that PEPs rarely say anything
about the docs -- the docs simply follow the specs laid out by the PEP. So
the PEP could just state the guidelines. (After all the guidelines can
always be viewed in the context of the PEP.)

> > - "... must follow these guidelines." Here "must" is very strong (it is the
> > strongest word in "standards speak", stronger than "should", "ought to",
> > "may"). I recommend "should", that's strong enough.
>
> OK.  This is a remnant of the idea to include a first-aware fromutc()
> implementation, which after some private discussions with Tim we
> decided to abandon.  In light of that idea, "must" made sense as in
> "in order for unmodified fromutc() to work correctly with your tzinfo
> implementation, it *must* ..."
>
> ..
> > - "We chose the minute byte to store the "first" bit because this choice
> > preserves the natural ordering." This only works with folds of exactly one
> > hour. Also, is the natural ordering (of the pickles, apparently) used
> > anywhere? I would hope not. Finally, given that two times that differ only
> > in their 'first' flag compare equal, the natural ordering (if relevant :-)
> > would be to store/compare the 'first' bit last.
> >
>
> I'll remove the rationale.  The ordering is a red herring anyways.  I
> needed a place to stick one bit in the 10-byte payload and the minute
> byte looked like a natural place.  I made up the ordering rationale to
> justify an arbitrary choice a posteriori.
>
>
> > - Temporal Arithmetic (probably shouldn't have an "s" at the end):
>
> Wikipedia is of no help here: "Arithmetic or arithmetics (from the
> Greek ἀριθμός arithmos, "number")  ..." I'll check what we use in the
> library docs.  (For some reason, I thought that arithmetic is a branch
> of mathematics while arithmetics is a set of rules.)
>

Maybe it's British vs. American usage? The Brits also say "maths" while
Americans say "math". But I don't think I've ever seen or heard arithmetics
with an 's', and I've seen and heard plenty of 'maths'. Anyway, we tend to
use American spelling in PEPs.


> > this probably needs some motivation. I think it's inevitable (since we don't
> > know the size of the time fold), but it still feels weird.
>
> It's what you say and backward compatibility considerations.  We want
> existing programs to produce the same results even if they
> occasionally encounter first=False instances from say datetime.now().
> I'll add a footnote.
>
> > - "[2] As of Python 3.5, tzinfo is ignored whenever timedelta is added or
> > subtracted ..." I don't see a reason for this footnote to discuss possible
> > future changes to datetime arithmetic; leave that up to the respective PEP.
>
> I'll remove the discussion of the future changes to datetime arithmetic.
>
> > (OTOH you may have a specific follow-up PEP in mind, and it may be better to
> > review this one in the light of the follow-up PEP.)
>
> Yes, there is a PEP-0500, but it is nowhere as ready as this one.
>

I think it's overkill (see my previous message).


> > - "This proposal will have little effect on the programs that do not read
> > the first flag explicitly or use tzinfo implementations that do." This seems
> > ambiguous -- if I use a tzinfo implementation that reads the first flag, am
> > I affected or not? Also, "the programs" should be just "programs", and I'm
> > kind of curious why the hedging of "little effect" (rather than "no effect")
>
> We are changing the behavior of datetime.timestamp on naive instances.
>   This is really what the "hedging" is about.
>

OK. Might be good to be explicit in the text.

But you haven't responded to my complaint that the "or" clause you used is
ambiguous in English -- which of the following does it mean?

- (not "read first flag" or "use tzinfo impls that do")
- not ("read first flag" or "use tzinfo impls that do")
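
To make the difference concrete, here is a small sketch that enumerates both readings over all four combinations. The function and parameter names are made up for illustration; they only encode the two parses listed above.

```python
from itertools import product

def reading_a(reads_flag, uses_flag_tzinfo):
    # (not "read first flag") or ("use tzinfo impls that do")
    return (not reads_flag) or uses_flag_tzinfo

def reading_b(reads_flag, uses_flag_tzinfo):
    # not ("read first flag" or "use tzinfo impls that do")
    return not (reads_flag or uses_flag_tzinfo)

# Collect every combination on which the two readings disagree.
differ = [
    (reads, uses)
    for reads, uses in product([False, True], repeat=2)
    if reading_a(reads, uses) != reading_b(reads, uses)
]

print(differ)
```

The two readings disagree exactly when the program uses a tzinfo implementation that reads the flag, which is the case the question above is about.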


> > is needed. Also, you might give some examples of changes that programs that
> > *do* use the first flag may experience.
>
> I don't understand.  The programs that  *do* use the first flag now
> experience an AttributeError, and that will surely change.  Perhaps
> you want to see some examples of how the programs can start using the
> first flag?
>

I was thinking about what happens in a program that explicitly uses the
flag to create a datetime object and then passes it on to some library code
that doesn't know about the flag. But the change in behavior of
fromtimestamp() makes this a moot point. Better be explicit about the
hedging.


> >
> > - In a reply to this thread, you wrote "The rule for the missing time is the
> > opposite to that for the ambiguous time. This allows a program that probes
> > the TZ database by calling timestamp with two different values of the
> > "first" flag to avoid any additional calls to differentiate between the gap
> > and the fold." Can you clarify this (I'm not sure how this works, though I
> > intuitively agree that the two rules should be each other's opposite) and
> > add it to the PEP?
> >
> >
>
> Yes, I posted something like this before, but will include it in the PEP.
> A first-aware program can do something like the following when it gets
> a naive instance dt that it wants to decorate with a timezone.
>
> dt1 = dt.replace(first=True).astimezone()
> dt2 = dt.replace(first=False).astimezone()
>
> if dt1 == dt2:
>     return dt1
>
> if dt1 < dt2:
>     warn("ambiguous time: picked %s but it could be %s", dt1, dt2)
>     return dt1
>
> if dt1 > dt2:
>     raise ValueError("invalid time", dt, dt1, dt2)
>
>
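
The probing idea quoted above can be sketched in runnable form under some loudly stated assumptions. The flag is spelled as it eventually shipped in Python 3.6+ (`fold`, with `fold=0` corresponding to `first=True` here), not as in this draft. `ToyZone` and `classify` are hypothetical names, and the zone is a made-up one with a single gap and a single fold in 2015, so the example does not depend on the system timezone or on a tz database.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)

class ToyZone(tzinfo):
    """Hypothetical zone: UTC-5 standard, UTC-4 summer, transitions in 2015 only."""
    GAP_START = datetime(2015, 3, 8, 2)    # wall times [02:00, 03:00) never occur
    FOLD_START = datetime(2015, 11, 1, 1)  # wall times [01:00, 02:00) occur twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP_START <= wall < self.GAP_START + HOUR:
            # Missing time: fold=0 uses the pre-transition rule, fold=1 the post one.
            return HOUR if dt.fold else timedelta(0)
        if self.FOLD_START <= wall < self.FOLD_START + HOUR:
            # Ambiguous time: fold=0 is the first (summer) occurrence.
            return timedelta(0) if dt.fold else HOUR
        if self.GAP_START + HOUR <= wall < self.FOLD_START:
            return HOUR
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "TOY"

def classify(naive, tz):
    """Probe tz with both flag values and compare the resulting UTC instants."""
    dt1 = naive.replace(tzinfo=tz, fold=0).astimezone(timezone.utc)
    dt2 = naive.replace(tzinfo=tz, fold=1).astimezone(timezone.utc)
    if dt1 == dt2:
        return "normal"
    # Earlier reading first means a fold; the reversed order means a gap.
    return "ambiguous" if dt1 < dt2 else "missing"

zone = ToyZone()
for wall in (datetime(2015, 6, 1, 12, 0),
             datetime(2015, 11, 1, 1, 30),
             datetime(2015, 3, 8, 2, 30)):
    print(wall, classify(wall, zone))
```

Because the gap rule is the opposite of the fold rule, the sign of the difference between the two probes is enough to tell the three cases apart, with no extra calls.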
> > - Would there be any merit in proposing, together with the idea of a
> > three-value flag, that datetime arithmetic should use "timeline arithmetic"
> > if the flag is defined and a tzinfo is present?
>
> To add a third value, you will need a full additional bit anyways, so
> why not just have a separate flag that controls the choice of
> arithmetic and leave "first" a pure fold disambiguation flag?  I
> consider the problem of local time disambiguation and that of the
> "timeline arithmetic" to be two orthogonal problems.  Yes, "timeline
> arithmetic" can benefit from the first flag, but it is possible
> without it.  Similarly, the problem of round-tripping the times
> between timezones can benefit from "timeline arithmetic", but PEP 495
> solves it without introducing the new arithmetic.
>

Fair enough. The PEP could use some discussion of this topic!


> In my view PEP 495 solves a long-standing problem for which there is
> no adequate workaround within stdlib and third-party workarounds are
> cumbersome.  The alternative datetime arithmetic PEP (PEP-0500)
> enables some nice to have features, but does not enable anything that
> cannot be achieved  by other means.  I would like to avoid mixing the
> two proposals.
>

Agreed, and I am very glad to see PEP 495, as a concrete proposal for the
"first step" that I proposed a while ago. Still, the whole term "first
step" implies there will be more steps, and we should make sure the first
step is roughly in the right direction!

-- 
--Guido van Rossum (python.org/~guido)