[Datetime-SIG] How open are we to re-thinking the whole thing?

Chris Barker chris.barker at noaa.gov
Fri Jul 31 19:02:00 CEST 2015


It seems we've all come to the conclusion that there are three types of
arithmetic here -- we can discuss until the cows come home about how useful
"classic" (what is implemented in datetime now) arithmetic is, and exactly
how it behaves in various corner cases, but it is here to stay for backward
compatibility, and if we want either of the other two ("Duration" and
"Period"), then we'll need to write implementations of them.

So the question I have is: do we want to re-use as much of the current API
and existing objects as possible, or should we start "from scratch", and
design what we want/need now, with all the benefits of hindsight?

For example:

Duration arithmetic is actually pretty easy (or am I just being "naive"? ;-) ):

 - convert to UTC
 - do the arithmetic
 - convert back to the time zone desired.

Done. Granted, that's only easy if the "convert" part is done, but once we
work out the nuances of the "is_dst" flag, then that's done.
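
To make the three steps concrete, here is a minimal sketch in Python. ToyEastern is a hypothetical, hard-coded zone (UTC-5, switching to UTC-4 at 2015-03-08 07:00 UTC) that exists only so the example does not depend on a time zone database, and add_duration is an illustrative name, not an existing API.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone for illustration: UTC-5, then UTC-4 after one switch."""

    _switch_utc = datetime(2015, 3, 8, 7, 0)    # naive UTC instant of the switch
    _switch_local = datetime(2015, 3, 8, 3, 0)  # first wall-clock time in DST

    def dst(self, dt):
        # Decide from the wall-clock fields whether DST is in effect.
        in_dst = dt.replace(tzinfo=None) >= self._switch_local
        return timedelta(hours=1) if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC field values with tzinfo=self; pick the offset by instant.
        naive = dt.replace(tzinfo=None)
        hours = -4 if naive >= self._switch_utc else -5
        return (naive + timedelta(hours=hours)).replace(tzinfo=self)


def add_duration(dt, delta):
    """Add elapsed time: convert to UTC, add, convert back to dt's zone."""
    as_utc = dt.astimezone(timezone.utc)
    return (as_utc + delta).astimezone(dt.tzinfo)


EASTERN = ToyEastern()
start = datetime(2015, 3, 7, 12, 0, tzinfo=EASTERN)
elapsed = add_duration(start, timedelta(hours=24))  # 24 real hours later
classic = start + timedelta(hours=24)               # same wall-clock time next day
```

Across the toy DST switch, elapsed lands at 13:00 wall-clock time while classic stays at 12:00, which is the difference between the two kinds of arithmetic.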

Period arithmetic is theoretically easy, but full of corner cases that have
to be resolved, one by one...

I think the real challenge here is how to get these two while keeping
maximum compatibility with the current objects and API. So maybe we
shouldn't worry so much about that.

> "the right
> thing" to do in Python's current state is to stick to UTC for every
> operation apart from conversions for input or output.


Exactly -- and "the right thing to do" to properly support duration
arithmetic is for that to be what happens under the hood.

Which brings up a point: if new code is going to be written, and new APIs
defined, then maybe a re-thinking of the datetime data structure is in
order:

As Tim points out, both datetime and timedelta are really just fancy ways
to encode microseconds. They make it easy to work with the calendar
representation version of these quantities -- i.e. on I/O -- because that's
what humans want to work with. Also, the method of attaching the tzinfo
object makes it easy to create and read and write datetimes in a given time
zone.

But if one were to design a system optimized for a different purpose, i.e.
to support Duration arithmetic, it makes more sense to:

* store datetimes in a "time_unit_since an epoch" representation (e.g.
milliseconds since year 1 in the proleptic gregorian calendar)

* store timedeltas in a "time_unit" (e.g. milliseconds)

and (somewhat independent):

* store timezone aware datetimes in UTC, and do the conversion to/from
calendar representation on I/O.

[that makes Duration addition described above reduce to: "add two numbers"]
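
A rough sketch of that representation, built on the existing datetime module only for the I/O conversions. The names EPOCH, to_ticks and from_ticks are made up for illustration, and the unit chosen here (microseconds, to match datetime's resolution) is an assumption; any fixed unit would work the same way.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch: start of year 1 in the proleptic Gregorian calendar, in UTC.
EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
ONE_MICROSECOND = timedelta(microseconds=1)


def to_ticks(dt):
    """Input side: aware datetime -> integer microseconds since EPOCH (UTC)."""
    return (dt - EPOCH) // ONE_MICROSECOND


def from_ticks(ticks, tz=timezone.utc):
    """Output side: integer ticks -> calendar representation in zone tz."""
    return (EPOCH + timedelta(microseconds=ticks)).astimezone(tz)


# Stored values are plain integers, so Duration addition is integer addition.
posted = to_ticks(datetime(2015, 7, 31, 17, 2, tzinfo=timezone.utc))
ninety_minutes = 90 * 60 * 1_000_000
later = posted + ninety_minutes

pacific_daylight = timezone(timedelta(hours=-7))  # fixed offset, for display only
shown = from_ticks(later, pacific_daylight)
```

The time zone only appears in from_ticks; nothing in the stored value or in the addition knows about it.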

On the other hand, if you want to support "Period Arithmetic" (aka Calendar
Operations), then it's easier to use Calendar encoding internally -- after
all, "moving to the next day", (or next month, or...) is natural and easy
(except for the edge cases!) in Calendar encoding.
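
As an illustration of "easy except for the edge cases", here is a sketch of adding months directly on the calendar fields. add_months is a hypothetical helper, and clamping to the last day of a shorter month is just one possible policy (raising an error or rolling over into the next month are others).

```python
import calendar
from datetime import datetime


def add_months(dt, months):
    """Move dt by whole calendar months, keeping the wall-clock time.

    Assumed policy: if the target month is too short for dt.day,
    clamp to that month's last day.
    """
    # Count months from year 0 so that divmod handles year rollover,
    # including negative offsets.
    total = dt.year * 12 + (dt.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


plain = add_months(datetime(2015, 7, 15, 9, 0), 1)      # no edge case involved
clamped = add_months(datetime(2015, 1, 31, 9, 0), 1)    # February has no 31st
rollover = add_months(datetime(2015, 12, 15, 9, 0), 1)  # crosses the year boundary
```

The arithmetic itself is two lines; everything else in the function is the policy decision for the corner case.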

All this points to having two independent systems for the two types of math:

two datetime objects
two timedelta objects

That makes it easy to decide what "type" of delta to return when
subtracting two dates, and easy for the user to know what type of
arithmetic they are going to get.

Also, it could be in a datetime2 package, and have fewer issues with
backward compatibility and make it clear to users that they are getting
something different than the existing datetime package.

Downsides:

* More code to implement and maintain -- always a bad thing

* Harder for users that need to do both kinds of arithmetic in one
application -- if that's a common use case, then one datetime object with
multiple deltas might be better. And the datetime object would have to be
optimised for one or the other use-case -- so be it.

Another totally side note about API. Particularly for Period arithmetic,
the timedelta API feels a bit "OO-heavy" to me:

If I want "tomorrow", I need to do:

DateTime.now() + PeriodDelta(days=1)

I think I'd rather have something like:

DateTime.now().next_day()

or

DateTime.now().moveforward("1 day")

And if the Period arithmetic is in methods, then we don't have to worry
about what kind of delta to return when subtracting.
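
A sketch of what the method-based style could look like. PeriodDateTime is a hypothetical wrapper around the existing datetime, and moveforward takes a days keyword here instead of parsing a string like "1 day", purely to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PeriodDateTime:
    """Hypothetical wrapper exposing Period operations as methods."""

    value: datetime

    def next_day(self):
        # Classic datetime arithmetic keeps the wall-clock time,
        # which is the behaviour wanted for "tomorrow".
        return PeriodDateTime(self.value + timedelta(days=1))

    def moveforward(self, days=0):
        return PeriodDateTime(self.value + timedelta(days=days))


today = PeriodDateTime(datetime(2015, 7, 31, 19, 2))
tomorrow = today.next_day()
next_week = today.moveforward(days=7)
```

The class deliberately defines no subtraction, so the question of which delta type to return never comes up.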

Just a thought.......


(side question: It seems to me that doing the Period operations one way is
pretty straightforward e.g. "add a day", but is it well defined in the
other direction? e.g. what is the Period between datetime1 and datetime2?)

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov