[Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review

Victor Stinner victor.stinner at gmail.com
Fri Feb 17 01:04:47 CET 2012


>> The problem is that shutil.copy2() produces sometimes *older*
>> timestamp :-/ (...)
>
> Have you been able to reproduce this with an actual Makefile? What's
> the scenario?

Hum. I asked the Internet who use shutil.copy2() and I found an "old"
issue (Decimal('43462967.173053') seconds ago):

Python issue #10148: st_mtime differs after shutil.copy2 (october 2010)
"When copying a file with shutil.copy2() between two ext4 filesystems
on 64-bit Linux, the mtime of the destination file is different after
the copy. It appears as if the resolution is slightly different, so
the mtime is truncated slightly. (...)"

I don't know if it is a "theorical" or "practical" issue. Then I found:

Python issue #11941: Support st_atim, st_mtim and st_ctim attributes
in os.stat_result
"They would expose relevant functionality from libc's stat() and
provide better precision than floating-point-based st_atime, st_mtime
and st_ctime attributes."

Which is connected the issue that motivated me to write the PEP:

Python issue #11457: os.stat(): add new fields to get timestamps as
Decimal objects with nanosecond resolution
"Support for such precision is available at the least on 2.6 Linux kernels."
"This is important for example with the tarfile module with the pax
tar format. The POSIX tar standard[3] mandates storing the mtime in
the extended header (if it is not an integer) with as much precision
as is available in the underlying file system, and likewise to restore
this time properly upon extraction. Currently this is not possible."
"The mailbox module would benefit from having this precision available."

For the tarfile use case, we need at least a way to get the
modification time with a nanosecond resolution *and* to set the
modification time with a nanosecond resolution. We just need to decide
which type is the best for this usecase, which is the purpose of the
PEP 410 :-)

Another use case of nanosecond timestamps are profilers (and maybe
benchmark tools). The profiler itself may be implemented in a
different language than Python. For example, DTrace uses nanosecond
timestamps.

--

Other examples.

Debian bug #627460: (gcp) Expose nanoseconds in python (15 May 2011)
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=627460
Debian bug #626787: (gcp) gcp: timestamp is not always copied exact
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626787
"When copying a (large) file from HDD to USB the files timestamp is
not copied exact. It seems to work fine with smaller files (up to
1Gig), I couldn't spot the time-diff on these files."
("gcp is a grid enabled version of the scp copy command.")

fuse-python supports nanosecond resolution: they chose to mimick the C
API using:

class Timespec(FuseStruct):
    """
    Cf. struct timespec in time.h:
    http://www.opengroup.org/onlinepubs/009695399/basedefs/time.h.html
    """
    def __init__(self, name=None, **kw):
        self.tv_sec  = None
        self.tv_nsec = None
        kw['name'] = name
        FuseStruct.__init__(self, **kw)

Python issue #9079: Make gettimeofday available in time module
"... exposes gettimeofday as time.gettimeofday() returning (sec, usec) pair"

The Oracle database supports timestamps with a nanosecond resolution.
A related article about Ruby:
http://marcricblog.blogspot.com/2010/04/who-cares-about-nanosecond.html
"Files are uploaded in groups (fifteen maximum). It was important to
know the order on which files have been upload. Depending on the size
of the files and users’ internet broadband capacity, some files could
be uploaded in the same second."

And a last one for the fun:

"This Week in Python Stupidity: os.stat, os.utime and Sub-Second
Timestamps" (November 15, 2009)
http://ciaranm.wordpress.com/2009/11/15/this-week-in-python-stupidity-os-stat-os-utime-and-sub-second-timestamps/
"Yup, that’s right, Python’s underlying type for floats is an IEEE 754
double, which is only good for about sixteen decimal digits. With ten
digits before the decimal point, that leaves six for sub-second
resolutions, which is three short of the range required to preserve
POSIX nanosecond-resolution timestamps. With dates after the year 2300
or so, that leaves only five accurate digits, which isn’t even enough
to deal with microseconds correctly. Brilliant."
"Python does have a half-assed fixed point type. Not sure why they
don’t use it more."

Victor


More information about the Python-Dev mailing list