We had a brief jam on date/time objects at Zope Corp. HQ today. I won't get to writing up the full proposal that came out of this, but I'd like to give at least a summary. (Those who were there: my thoughts have advanced a bit since this afternoon.)
My plan is to create a standard timestamp object in C that can be subclassed. The internal representation will favor extraction of broken-out time fields (year etc.) in local time. It will support comparison, basic time computations, and effbot's minimal API, as well as conversions to and from the two currently most popular time representations used by the time module: POSIX timestamps in UTC and 9-tuples in local time. There will be a C API.
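To make the two conversion targets concrete, here is a rough Python sketch of going from broken-out local fields to a POSIX timestamp and to a 9-tuple. The function names `to_posix` and `to_9tuple` are made up for illustration and are not the proposed API. The sign convention for tzoffset (minutes east of UTC, so local = UTC + tzoffset) is an assumption; the summary above does not fix it.

```python
import calendar
from datetime import date


def to_posix(year, month, day, hour, minute, second, tzoffset):
    """Convert broken-out local fields to seconds since the epoch in UTC.

    ASSUMPTION: tzoffset is minutes east of UTC (local = UTC + tzoffset).
    """
    # timegm() interprets the fields as if they were already UTC.
    as_if_utc = calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))
    # Undo the local offset to get the real UTC timestamp.
    return as_if_utc - tzoffset * 60


def to_9tuple(year, month, day, hour, minute, second):
    """Build a time-module style 9-tuple in local time.

    The DST flag is set to -1 (unknown), since the representation
    folds DST into tzoffset rather than storing a flag.
    datetime.date is used here only to derive weekday and day of year.
    """
    d = date(year, month, day)
    return (year, month, day, hour, minute, second,
            d.weekday(), d.timetuple().tm_yday, -1)
```

Under the assumed sign convention, 01:00 local time on 1970-01-01 with an offset of +60 minutes maps to POSIX timestamp 0.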
Proposal for internal representation (also the basis for an efficient pickle format):
    year      2 bytes, big-endian, unsigned (0 .. 65535)
    month     1 byte
    day       1 byte
    hour      1 byte
    minute    1 byte
    second    1 byte
    usecond   3 bytes, big-endian
    tzoffset  2 bytes, big-endian, signed (in minutes, -1439 .. 1439)

    total     12 bytes
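The layout above can be sketched in Python with the struct module. This is only an illustration of the byte layout, not the planned C implementation; `pack_timestamp` and `unpack_timestamp` are hypothetical names. The 3-byte usecond field has no struct format code, so it is handled separately.

```python
import struct


def pack_timestamp(year, month, day, hour, minute, second, usecond, tzoffset):
    """Pack the fields into the 12-byte big-endian layout (illustrative only)."""
    if not 0 <= year <= 65535:
        raise ValueError("year out of range")
    if not 0 <= usecond < 1000000:
        raise ValueError("usecond out of range")
    if not -1439 <= tzoffset <= 1439:
        raise ValueError("tzoffset out of range")
    # 2-byte unsigned year followed by five 1-byte fields: 7 bytes.
    head = struct.pack(">HBBBBB", year, month, day, hour, minute, second)
    # 3-byte big-endian usecond, then 2-byte signed tzoffset.
    return head + usecond.to_bytes(3, "big") + struct.pack(">h", tzoffset)


def unpack_timestamp(buf):
    """Inverse of pack_timestamp: return the eight fields as a tuple."""
    if len(buf) != 12:
        raise ValueError("expected exactly 12 bytes")
    year, month, day, hour, minute, second = struct.unpack(">HBBBBB", buf[:7])
    usecond = int.from_bytes(buf[7:10], "big")
    (tzoffset,) = struct.unpack(">h", buf[10:12])
    return (year, month, day, hour, minute, second, usecond, tzoffset)
```

Since every field sits on a byte boundary, unpacking is plain slicing with no bit shifting.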
Things this will not address (but which you may address through subclassing):
- leap seconds
- alternate calendars
- years far in the future or BC
- precision of timepoints (e.g. a separate Date type)
- DST flags (DST is accounted for by the tzoffset field)
- Why store a broken-out local time rather than seconds (or microseconds) relative to an epoch in UTC? There are two kinds of operations on times: accessing the broken-out fields (probably in local time), and time computations. The chosen representation favors accessing broken-out fields, which I expect to be more common than time computations.
- Why a big-endian internal representation? So that comparison can be done using a single memcmp() call as long as the tzoffset fields are the same.
- Why not pack the fields closer to save a few bytes? To make the pack and unpack operations more efficient; the object footprint isn't going to make much of a difference.
- Why is the year unsigned? So memcmp() will do the right thing for comparing dates (in the same timezone).
- What's the magic number 1439? One less than 24 * 60. Timezone offsets may be up to 24 hours. (The C99 standard does it this way.)
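The memcmp() argument in the answers above can be checked with a small Python sketch, since comparing bytes objects in Python is the same byte-by-byte lexicographic comparison. The helper names `pack_big` and `compare_same_zone` are made up for this illustration, and the little-endian values at the end are there only as a counterexample.

```python
import struct

MAX_TZOFFSET = 24 * 60 - 1  # the magic number 1439


def pack_big(year, month, day, hour, minute, second, usecond, tzoffset):
    """Pack the fields in the proposed 12-byte big-endian layout."""
    return (struct.pack(">HBBBBB", year, month, day, hour, minute, second)
            + usecond.to_bytes(3, "big")
            + struct.pack(">h", tzoffset))


def compare_same_zone(a, b):
    """Return -1, 0 or 1, like memcmp(), for two packed timestamps.

    Only valid when both carry the same tzoffset (the last two bytes).
    """
    if a[10:] != b[10:]:
        raise ValueError("tzoffset fields differ; bytewise compare is not valid")
    return (a > b) - (a < b)


earlier = pack_big(2001, 12, 31, 23, 59, 59, 999999, -300)
later = pack_big(2002, 1, 1, 0, 0, 0, 0, -300)

# Counterexample: with a little-endian year the low byte comes first,
# so bytewise order no longer matches numeric order.
le_255, le_256 = struct.pack("<H", 255), struct.pack("<H", 256)
be_255, be_256 = struct.pack(">H", 255), struct.pack(">H", 256)
little_endian_misorders = le_255 > le_256   # True: 255 sorts after 256
big_endian_orders = be_255 < be_256         # True: 255 sorts before 256
```

The same reasoning covers the unsigned year: a signed two's-complement year would have its high bit set for negative values and so sort after every positive year in a bytewise compare.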
I'll try to turn this into a proper PEP ASAP.
(Stephan: do I need to CC you or are you reading python-dev?)
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)