[Python-Dev] proposal: add basic time type to the standard library

Wed, 27 Feb 2002 21:11:08 -0500

We had a brief jam on date/time objects at Zope Corp. HQ today.  I
won't get to writing up the full proposal that came out of this, but
I'd like to give at least a summary.  (Those who were there: my
thoughts have advanced a bit since this afternoon.)

My plan is to create a standard timestamp object in C that can be
subclassed.  The internal representation will favor extraction of
broken-out time fields (year etc.) in local time.  It will support
comparison, basic time computations, and effbot's minimal API, as well
as conversions to and from the two currently most popular time
representations used by the time module: POSIX timestamps in UTC and
9-tuples in local time.  There will be a C API.
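
For reference, here is a rough sketch of the two existing
representations mentioned above, using only today's time module.  The
sample value is an assumption chosen for illustration; nothing here is
part of the proposed type.

```python
import time

# A POSIX timestamp: seconds since the epoch, in UTC.
# (Sample value picked for illustration only.)
stamp = 1014862268

# POSIX timestamp -> 9-tuple in local time:
# (year, month, day, hour, minute, second, weekday, yearday, isdst)
local_tuple = time.localtime(stamp)

# 9-tuple in local time -> POSIX timestamp (returned as a float).
round_trip = time.mktime(local_tuple)

print(tuple(local_tuple))
print(round_trip == stamp)
```

The new object would need to accept and produce both forms.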

Proposal for internal representation (also the basis for an efficient
pickle format):

year      2 bytes, big-endian, unsigned (0 .. 65535)
month     1 byte
day       1 byte
hour      1 byte
minute    1 byte
second    1 byte
usecond   3 bytes, big-endian
tzoffset  2 bytes, big-endian, signed (in minutes, -1439 .. 1439)

total     12 bytes
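
A minimal sketch of this layout in Python, to make the byte counts
concrete.  The names pack_timestamp and unpack_timestamp are
hypothetical, and the real object would do this in C; the struct
module has no 3-byte integer code, so usecond is handled separately.

```python
import struct

# Hypothetical helpers; only the field layout comes from the table above.
_HEAD = struct.Struct(">HBBBBB")  # year, month, day, hour, minute, second
_TAIL = struct.Struct(">h")       # tzoffset in minutes, signed


def pack_timestamp(year, month, day, hour, minute, second, usecond, tzoffset):
    """Pack the broken-out fields into the 12-byte big-endian layout."""
    if not 0 <= usecond < 1000000:
        raise ValueError("usecond out of range")
    if not -1439 <= tzoffset <= 1439:
        raise ValueError("tzoffset out of range")
    return (_HEAD.pack(year, month, day, hour, minute, second)
            + usecond.to_bytes(3, "big")
            + _TAIL.pack(tzoffset))


def unpack_timestamp(data):
    """Inverse of pack_timestamp; returns an 8-tuple of ints."""
    if len(data) != 12:
        raise ValueError("expected exactly 12 bytes")
    head = _HEAD.unpack(data[:7])
    usecond = int.from_bytes(data[7:10], "big")
    (tzoffset,) = _TAIL.unpack(data[10:12])
    return head + (usecond, tzoffset)


fields = (2002, 2, 27, 21, 11, 8, 0, -300)
packed = pack_timestamp(*fields)
print(len(packed), packed.hex())
print(unpack_timestamp(packed) == fields)
```

The same 12 bytes could serve directly as the pickle payload.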

Things this will not address (but which you may address through
subclassing):

- leap seconds
- alternate calendars
- years far in the future or BC
- precision of timepoints (e.g. a separate Date type)
- DST flags (DST is accounted for by the tzoffset field)

Mini-FAQ

- Why store a broken-out local time rather than seconds (or
  microseconds) relative to an epoch in UTC?  There are two kinds of
  operations on times: accessing the broken-out fields (probably in
  local time), and time computations.  The chosen representation
  favors accessing broken-out fields, which I expect to be more common
  than time computations.

- Why a big-endian internal representation?  So that comparison can be
  done using a single memcmp() call as long as the tzoffset fields are
  the same.

- Why not pack the fields closer to save a few bytes?  To make the
  pack and unpack operations more efficient; the object footprint
  isn't going to make much of a difference.

- Why is the year unsigned?  So memcmp() will do the right thing for
  comparing dates (in the same timezone).

- What's the magic number 1439?  One less than 24 * 60.  Timezone
  offsets may be up to 24 hours.  (The C99 standard does it this way.)
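
To illustrate the memcmp() argument, here is a small sketch in Python,
where comparing bytes objects is lexicographic on unsigned byte values,
just like memcmp().  The helper names are hypothetical, and the check
on the last two bytes stands in for "the tzoffset fields are the same".

```python
import struct


def pack_fields(year, month, day, hour, minute, second, usecond, tzoffset):
    # Hypothetical helper: the 12-byte big-endian layout proposed above.
    return (struct.pack(">HBBBBB", year, month, day, hour, minute, second)
            + usecond.to_bytes(3, "big")
            + struct.pack(">h", tzoffset))


def compare_same_zone(a, b):
    """memcmp()-style result (-1, 0, 1); only valid for equal tzoffsets."""
    if a[10:] != b[10:]:
        raise ValueError("tzoffset fields differ")
    return (a > b) - (a < b)


earlier = pack_fields(255, 12, 31, 23, 59, 59, 999999, 60)
later = pack_fields(256, 1, 1, 0, 0, 0, 0, 60)

# Big-endian: byte order agrees with chronological order.
print(compare_same_zone(earlier, later))

# Little-endian year bytes would sort the wrong way round:
# 255 packs as ff 00, 256 packs as 00 01.
print(struct.pack("<H", 255) > struct.pack("<H", 256))
```

With differing tzoffsets, both values would first have to be
normalized to a common offset before comparing.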

I'll try to turn this into a proper PEP ASAP.

(Stephan: do I need to CC you or are you reading python-dev?)

--Guido van Rossum (home page: http://www.python.org/~guido/)