[Python-Dev] PEP: New timestamp formats

Victor Stinner victor.stinner at haypocalc.com
Thu Feb 2 02:03:15 CET 2012


Even if I am not really conviced that a PEP helps to design an API,
here is a draft of a PEP to add new timestamp formats to Python 3.3.
Don't see the draft as a final proposition, it is just a document
supposed to help the discussion :-)

---

PEP: xxx
Title: New timestamp formats
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at haypocalc.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-Feburary-2012
Python-Version: 3.3


Abstract
========

Python 3.3 introduced functions supporting nanosecond resolutions. Python 3.3
only supports int or float to store timestamps, but these types cannot be use
to store a timestamp with a nanosecond resolution.


Motivation
==========

Python 2.3 introduced float timestamps to support subsecond resolutions,
os.stat() uses float timestamps by default since Python 2.5. Python 3.3
introduced functions supporting nanosecond resolutions:

 * os.stat()
 * os.utimensat()
 * os.futimens()
 * time.clock_gettime()
 * time.clock_getres()
 * time.wallclock() (reuse time.clock_gettime(time.CLOCK_MONOTONIC))

The problem is that floats of 64 bits are unable to store nanoseconds (10^-9)
for timestamps bigger than 2^24 seconds (194 days 4 hours: 1970-07-14 for an
Epoch timestamp) without loosing precision.

.. note::
   64 bits float starts to loose precision with microsecond (10^-6) resolution
   for timestamp bigger than 2^33 seconds (272 years: 2242-03-16 for an Epoch
   timestamp).


Timestamp formats
=================

Choose a new format for nanosecond resolution
---------------------------------------------

To support nanosecond resolution, four formats were considered:

 * 128 bits float
 * decimal.Decimal
 * datetime.datetime
 * tuple of integers

Criteria
--------

It should be possible to do arithmetic, for example::

    t1 = time.time()
    # ...
    t2 = time.time()
    dt = t2 - t1

Two timestamps should be comparable (t2 > t1).

The format should have a resolution of a least 1 nanosecond (without loosing
precision). It is better if the format can have an arbitrary resolution.

128 bits float
--------------

Add a new IEEE 754-2008 quad-precision float type. The IEEE 754-2008 quad
precision float has 1 sign bit, 15 bits of exponent and 112 bits of mantissa.

128 bits float is supported by GCC (4.3), Clang and ICC. The problem is that
Visual C++ 2008 doesn't support it. Python must be portable and so cannot rely
on a type only available on some platforms. Another example: GCC 4.3 does not
support __float128 in 32-bit mode on x86 (but gcc 4.4 does).

Intel CPUs have FPU supporting 80-bit floats, but not using SSE intructions.
Other CPU vendors don't support this float size.

There is also a license issue: GCC uses the MPFR library which is distributed
under the GNU LGPL license. This license is incompatible with the Python
Software License.

datetime.datetime
-----------------

datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.

datetime.datetime has issues:

- there is no easy way to convert it into "seconds since the epoch"
- any broken-down time has issues of time stamp ordering in the
  duplicate hour of switching from DST to normal time
- time zone support is flaky-to-nonexistent in the datetime module

decimal.Decimal
---------------

The decimal module is implemented in Python and is not really fast.

Using Decimal by default would cause bootstrap issue because the module is
implemented in Python.

Decimal can store a timestamp with any resolution, not only nanosecond, the
resolution is configurable at runtime.

Decimal objects support all arithmetics operations and are compatible with int
and float.

The decimal module is slow, but there is a C reimplementation of the decimal
module which is almost ready for inclusion.

tuple
-----

Various kind of tuples have been proposed. All propositions only use integers:

 * a) (sec, nsec): C timespec structure, useful for os.futimens() for example
 * b) (sec, floatpart, exponent): value = sec + floatpart * 10**exponent
 * c) (sec, floatpart, divisor): value = sec + floatpart / divisor

The format (a) only supports nanosecond resolution.

The format (a) and (b) may loose precision if the clock divisor is not a
power of 10.

For format (c) should be enough for most cases.

Creating a tuple of integers is fast.

Arithmetic operations cannot be done directly on tuple: t2-t1 doesn't work for
example.

Final formats
-------------

The PEP proposes to provide 5 different timestamp formats:

 * numbers:

   * int
   * float
   * decimal.Decimal
   * datetime.timedelta

 * broken-down time:

   * datetime.datetime


API design
==========

Change the default result type
------------------------------

Python 2.3 introduced os.stat_float_times(). The problem is that this flag
is global, and so may break libraries if the application changes the type.

Changing the default result type would break backward compatibility.

Callback and creating a new module to convert timestamps
--------------------------------------------------------

Use a callback taking integers to create a timestamp. Example with float:

    def timestamp_to_float(seconds, floatpart, divisor):
        return seconds + floatpart / divisor

The time module can provide some builtin converters, and other module, like
datetime, can provide their own converters. Users can define their own types.

An alternative is to add new module for all functions converting timestamps.

The problem is that we have to design the API of the callback and we cannot
change it later. We may need more information for future needs later.

os.stat: add new fields
-----------------------

It was proposed to add 3 fields to os.stat() structure to get nanoseconds of
timestamps.

Add an argument to change the result type
-----------------------------------------

Add a argument to all functions creating timestamps, like time.time(), to
change their result type. It was first proposed to use a string argument,
e.g. time.time(format="decimal"). The problem is that the function has
to import internally a module. Then it was decided to pass directly the
type, e.g. time.time(format=decimal.Decimal). Using a type, the user has
first to import the module. There is no direct link between a type and the
function used to create the timestamp.

By default, the float type is used to keep backward compatibility. For stat
functions like os.stat(), the default type depends on os.stat_float_times().

Add new functions
-----------------

Add new functions for each type, examples:

 * time.time_decimal()
 * os.stat_decimal()
 * os.stat_datetime()
 * etc.


Changes
=======

 * Add *format* optional argument to time.clock(), time.clock_gettime(),
   time.clock_getres(), time.time() and time.wallclock().
 * Add *timestamp* optional argument to os.fstat(), os.fstatat(), os.lstat()
   and os.stat().

Functions accepting timestamp as input should support decimal.Decimal objects
without an internal conversion to float which may loose precision:

 * datetime.datetime.fromtimestamp()
 * time.localtime()
 * time.gmtime()

TODO:

 * Change os.utimensat() and os.futimens() to accept Decimal
 * Change os.utimensat() and os.futimens() to not accept tuple anymore
 * Drop os.utimensat() and os.futimens() and patch os.utimeat() instead?
 * datetime should maybe support nanosecond?


Backwards Compatibility
=======================

Changes only add an new optional argument. The default type is unchanged and
there is no impact on performances.


Links
=====

 * `Issue #11457: os.stat(): add new fields to get timestamps as
Decimal objects with nanosecond resolution
<http://bugs.python.org/issue11457>`_
 * `Issue #13882: Add format argument for time.time(), time.clock(),
... to get a timestamp as a Decimal object
<http://bugs.python.org/issue13882>`_
 * `[Python-Dev] Store timestamps as decimal.Decimal objects
<http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_


Copyright
=========

This document has been placed in the public domain.


More information about the Python-Dev mailing list