[Datetime-SIG] Calendar vs timespan calculations...

Tim Peters tim.peters at gmail.com
Sun Aug 2 11:11:03 CEST 2015


[Carl Meyer <carl at oddbird.net>]

Sorry for the delay - there's just too much here to keep up with.  I
enjoyed and appreciated your essay, and will take time to give you
straight answers about the "most important" question you asked.  Of
course I don't speak for Guido.  Although I often do ;-)

> ...
> In order to defend the current model as coherent, one has to discard one
> of the following points, and (despite having read every message in all
> the related threads), I am still not clear precisely which one of these
> Tim et al consider untrue or expendable:
>
> 1) A datetime with a tzinfo member that carries both a timezone and a
> specific UTC offset within that timezone (e.g. a pytz timezone instance)
> corresponds precisely and unambiguously to a single instant in
> astronomical time (as well as carrying additional information).

datetime had no intent to support "astronomical time" in any way,
shape or form.  It's no coincidence that, in Guido's first message
about "naive time":

    https://mail.python.org/pipermail/python-dev/2002-March/020648.html

he talked about "for most *business* uses of date and time".  datetime
development was suggested & funded by Zope Corporation, which mostly
works to meet other businesses' "content management" needs.  The use
cases collected were overwhelmingly from the commercial business
world.

Astronomical time systems weren't on the table.  In this respect, it's
important to realize that while Python 3.2 finally supplied a concrete
instance (of a tzinfo subclass) as "the standard" UTC timezone object
(datetime.timezone.utc), that's still just an approximation:  it
wholly ignores that real-life UTC suffers from leap seconds added (or,
perhaps some day also removed) at various times.  Subtract two
datetimes in `utc`, and the duration returned may be off from real
life, but whether and by how much can only be determined by looking up
the history of leap second adjustments (made to real-life UTC).
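
A minimal sketch of that point, assuming the leap second inserted at
the end of 2015-06-30 (the variable names are only illustrative):
subtracting two `utc` datetimes that straddle it reports 1 second,
although 2 SI seconds elapsed in real life.

```python
from datetime import datetime, timezone

# A leap second (23:59:60 UTC) was inserted at the end of 2015-06-30.
before = datetime(2015, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2015, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# datetime's utc knows nothing about leap seconds, so the inserted
# second is not counted in the difference.
elapsed = after - before
print(elapsed.total_seconds())  # 1.0
```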

Those who suspect "Noda Time" is what they really want should note
that it ignores leap seconds too.  As they say on their site, "We want
to solve the 99% case.  Noda Time doesn't support leap seconds,
relativity or various other subtleties around time lines."  Although
in the Zope community (which mostly drove Python's datetime
requirements), it was more the 99.997% case ;-)

If an astronomical union had funded the project instead ...


> 2) A timedelta object is clearly a Duration, not a Period, because
> timedelta(days=1), timedelta(hours=24), and timedelta(seconds=86400)
> result in indistinguishable objects. I think this point is
> uncontroversial; Tim has said several times that a timedelta is just a
> complicated representation of an integer number of microseconds. That's
> a Duration.

That's my view, yes.  Although these are "naive time" microseconds
too, with an eternally fixed relation to naive time's seconds,
minutes, hours, days and weeks.  In real-life UTC, you can't even say
how long a minute is in seconds - "it depends".
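
As a small illustration (a sketch, nothing more), the three spellings
from the quoted point compare equal, and the fixed naive-time
relations hold unconditionally:

```python
from datetime import timedelta

a = timedelta(days=1)
b = timedelta(hours=24)
c = timedelta(seconds=86400)

# All three normalize to the same internal value.
same = (a == b == c) and (repr(a) == repr(b) == repr(c))
print(same)

# A timedelta is, in effect, an integer count of microseconds.
microseconds_per_day = a // timedelta(microseconds=1)
print(microseconds_per_day)  # 86400000000

# In naive time a minute is always exactly 60 seconds.
print(timedelta(minutes=1) == timedelta(seconds=60))
```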


> 3) If one has two datetime objects that each unambiguously correspond to
> an instant in real time, and one subtracts them and gets back an object
> which represents a Duration in real microseconds, the only reasonable
> content for that Duration is the elapsed microseconds in real time between
> the two instants.

Since there's no accounting for leap seconds, this cannot always be
true using tzinfo objects approximating real-life UTC, or any timezone
defined as offsetting real-life UTC.  Which is all of 'em ;-)

So what's the hangup with leap seconds?  They're of no use to business
applications, but would introduce irregularities business logic is
ill-prepared to deal with.  Like DST transitions, leap-second
adjustments can create missing and ambiguous times on a local clock.
But unlike DST transitions, which occur in each jurisdiction at a time
picked to be minimally visible there (a wee hour on a weekend),
leap-second adjustments occur at a fixed UTC time, which is usually
"in the middle of the work day" in _some_ jurisdictions.  For
that reason, when a leap second was inserted this year, some major
financial markets across the world - normally open at the adjustment
time! - shut down temporarily rather than risk a cascade of software
disasters:

    http://money.cnn.com/2015/06/29/technology/leap-second/

I'm glad they did.  The order in which trades are executed (based on
timestamps with sub-second resolution) can have legal consequences.
For example, a big customer calls a broker and tells
them to buy a million shares of Apple stock.  The broker thinks "good
idea!".  He tells his software to place the customer buy order, then
wait a millisecond, then send an order to buy a thousand shares for
his own account.  That's legal.  If the orders are placed in the
opposite order, it's illegal and the broker could go to jail ("front
running", placing his order first _knowing_ that a large order will
soon follow; the large order will certainly drive the stock price up,
benefiting the broker who bought before the thoroughly predictable
rise).

Inserting a leap second causes the local clock to "repeat a second" in
its idea of time (just as "inserting an hour" at the end of DST causes
local clocks to repeat an hour) - or to blow up.  A repeated second
could cause the orders in the example above to _appear_ to have
arrived in "the other" order.  Even if the system time services report
a time like 13:59:60.000 (instead of repeating 13:59:59.000), lots of
software never expected to see such a thing.  Who knows what may
happen?
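
The reordering hazard can be sketched with a toy model.  Everything
here is hypothetical: `repeating_clock` stands in for a local clock
that repeats 13:59:59 once when a leap second is inserted, and the two
order times are made up.  The last few lines show that Python's own
datetime is among the software that never expected a second numbered
60.

```python
from datetime import datetime, timedelta

ONE_SECOND = timedelta(seconds=1)
# Hypothetical local clock reading at which the second gets repeated.
LEAP_START = datetime(2015, 6, 30, 13, 59, 59)

def repeating_clock(real_elapsed):
    """Reading of a hypothetical clock that repeats 13:59:59 once.

    real_elapsed is the true time elapsed since the clock first
    showed 13:59:59.000.
    """
    if real_elapsed >= ONE_SECOND:
        real_elapsed -= ONE_SECOND  # the clock steps back one second
    return LEAP_START + real_elapsed

# Customer order goes out first; the broker's own order 1 ms later.
customer_stamp = repeating_clock(timedelta(microseconds=999_500))
broker_stamp = repeating_clock(timedelta(microseconds=1_000_500))

# The later order carries the earlier timestamp.
print(customer_stamp.time(), broker_stamp.time())
print(broker_stamp < customer_stamp)

# datetime itself rejects a clock reading with second == 60.
try:
    datetime(2015, 6, 30, 13, 59, 60)
    accepts_second_60 = True
except ValueError:
    accepts_second_60 = False
print(accepts_second_60)
```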

So I doubt datetime will ever use "real UTC".  It's pretty horrid!
For another example, what will the UTC calendar date and time be 300
million seconds from now?  That's simply impossible to compute for
real UTC, not even in theory.  Saying how many seconds away it will be
is trivial (300 million!), but the physical processes causing leap
second adjustments to UTC are chaotic - nobody can predict how many
leap second adjustments will be made to UTC over the next 300 million
seconds, or when, so there's no way to know what the UTC calendar date
and time will be then.  It _can_ affect the calendar date-and-time
even for times just half a year in the future.  Unless the definition
of UTC is changed yet again (dead serious proposals for which are
pending, supported by most participating countries):

    https://en.wikipedia.org/wiki/Leap_second#Proposal_to_abolish_leap_seconds

That page is also interesting for its account of various software
problems known to have been caused so far by leap-second adjustments.
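
By contrast, the "300 million seconds from now" question is trivial in
naive time.  A minimal sketch, using an arbitrary naive starting point
(the value of `start` is just an assumption for illustration):

```python
from datetime import datetime, timedelta

start = datetime(2015, 8, 2, 11, 11, 3)   # arbitrary naive datetime
span = timedelta(seconds=300_000_000)

# Naive time has no leap seconds, so the calendar answer is fully
# determined by arithmetic alone.
later = start + span
print(later)

# The same span, expressed in naive days, hours and minutes.
print(span == timedelta(days=3472, hours=5, minutes=20))
```

No such computation is possible for real UTC, since the number of leap
seconds inside the span isn't known in advance.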

Anyway, under "real UTC" today, you could get an excellent
approximation of "real time durations" by subtracting, but would have
to accept that there is no fixed mapping between UTC timeline points
and calendar notations except for datetimes no later than about 3
months from now (best I can tell, "they" don't promise to give more
than 3 months' notice before the next leap second adjustment).

Finally, I have to note the irony in asking anything about "real time"
;-)  What does "real time" mean?  The most accurate clocks we have are
atomic clocks, but even when two are made as identically as possible -
even if we made two that kept _perfect_ time forever - they will
_appear_ to run at slightly different rates when placed at different
locations on Earth.  That's at least due to gravitational time
dilation:  relativistic effects matter at currently achievable
resolutions.  As a result, current TAI time (the astonishingly uniform
"atomic time" measure from which today's definition of UTC is derived)
can't be known _as_ it happens:  it's the output of an algorithm
(which consumes time!) that collects "elapsed seconds" from hundreds
of ultra-stable clocks around the globe, and averages them in a way to
make a highly informed, provably excellent guess at what they would
have said had they all been flawless, all at mean sea level altitude,
and all at 0 kelvin.  This computed "TAI time" is out of date
by the time it's known, and typically disagrees (slightly) with most
of the clocks feeding into it.

So the best measure of "real time" we have is a product of human
ingenuity.  The closer to "plain old unadulterated real time as it
exists in nature" you want to get, the more contrived & bogglingly
complex the means needed to achieve it ;-)

Everyone is settling for an approximation, because that's the best
that can be done.  Naive time starts and stops with what most people
"already know".

When UTC started mucking with leap seconds (it didn't always), the
computing world should have embraced TAI internally instead.  TAI
suffers no adjustments of any kind, ever - it's just the running total
of SI seconds since the start of the TAI epoch, as determined by the
best clocks on Earth.  In fact, it's very close to Python's "naive
time"!  TAI uses the proleptic Gregorian calendar too (albeit
starting at a different epoch than year 1), and the TAI "day" is also
defined to be exactly 86400 SI seconds.  The difference is that TAI's
Gregorian calendar will, over time, become unboundedly out of synch
with UTC's Gregorian calendar, as leap seconds pile up in the latter.
So far they're only 36 seconds out of synch.
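
To make the relation concrete, here is a rough sketch of mapping naive
UTC calendar readings to TAI readings.  The helper `utc_to_tai` and the
table are hypothetical and deliberately partial: only the three most
recent offsets are listed, and any real use would need the full,
regularly updated leap-second history.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Partial table (assumption: only recent entries are listed).
# Each row: (naive UTC reading from which the offset applies,
#            TAI - UTC in seconds).
_TAI_MINUS_UTC = [
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
]
_STARTS = [start for start, _ in _TAI_MINUS_UTC]

def utc_to_tai(utc):
    """Map a naive UTC reading to the corresponding TAI reading.

    Readings later than the last table row are only right until the
    next leap second is announced.
    """
    i = bisect_right(_STARTS, utc) - 1
    if i < 0:
        raise ValueError("earlier than this partial table covers")
    return utc + timedelta(seconds=_TAI_MINUS_UTC[i][1])

before = datetime(2015, 6, 30, 23, 59, 59)
after = datetime(2015, 7, 1, 0, 0, 0)

# Naive UTC subtraction says 1 second; on the TAI side it is 2.
print(after - before)
print(utc_to_tai(after) - utc_to_tai(before))
```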

> ...
> To be clear, I'm not arguing that this behavior can now be changed in
> the existing library objects in a backwards-incompatible way. But
> accepting that it is lacking in internal coherence (rather than just
> being an "alternative and equally good model") would be useful in
> clarifying what kind of an implementation we actually want (IMO,
> something very much like JodaTime/NodaTime). And then can we figure out
> how to get there from here.

I mentioned Noda Time before.  Just looked up Joda-Time, and:

    http://joda-time.sourceforge.net/faq.html

"""
Joda-Time does not support leap seconds. Leap seconds can be supported
by writing a new, specialized chronology, or by making a few
enhancements to the existing ZonedChronology class. In either case,
future versions of Joda-Time will not enable leap seconds by default.
Most applications have no need for it, and it might have additional
performance costs.
"""

There's a pattern here:  "almost all" people want nothing to do with
leap seconds, not even time library developers.  That doesn't mean
they're right.  But it doesn't mean they're wrong either ;-)  Without
leap seconds, they're all approximating real-life UTC, and in the same
way Python's `utc` is.

