I proposed to use nanoseconds because UNIX has 1 ns resolution in timespec,
the most recent API, and Windows has 100 ns.
Using picoseconds would confuse users who may expect sub-nanosecond
resolution, whereas no OS supports it currently.
Moreover, nanoseconds as int already landed in os.stat and os.utime.
Last but not least, I already struggle in pytime.c to prevent integer
overflow with 1 ns resolution. It can quickly become much more complex if
there is no native C int type supporting a range large enough to make 1
picosecond resolution usable. I really like using int64_t for _PyTime_t,
it's well supported, very easy to use (ex: "t = t2 - t1"). A 64-bit int
supports years after 2200 for a delta since 1970.
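To put rough numbers on the range argument, here is a small sketch in plain Python arithmetic (it is not code from pytime.c, and the helper name int64_limit is made up for illustration). It shows how far a signed 64-bit tick counter reaches from the 1970 epoch at nanosecond and at picosecond resolution:

```python
import datetime

INT64_MAX = 2**63 - 1
EPOCH = datetime.datetime(1970, 1, 1)


def int64_limit(units_per_second):
    """Latest date reachable from 1970 with a signed 64-bit tick counter.

    Hypothetical helper, for illustration only.
    """
    whole_seconds = INT64_MAX // units_per_second
    return EPOCH + datetime.timedelta(seconds=whole_seconds)


ns_limit = int64_limit(10**9)    # one tick = 1 nanosecond
ps_limit = int64_limit(10**12)   # one tick = 1 picosecond

print("nanoseconds :", ns_limit)
print("picoseconds :", ps_limit)
```

With nanosecond ticks the counter lasts into the 23rd century; with picosecond ticks it would overflow within a few months of the epoch, so a type wider than int64_t would be needed.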
Note: Ruby is the only language I know of which chose picoseconds.
Victor
On 15 Oct. 2017 19:18, "Antoine Pitrou" wrote:
Since new APIs are expensive and we'd like to be future-proof, why not move to picoseconds? That would be safe until clocks reach the THz barrier, which is quite far away from us.
Regards
Antoine.
On Fri, 13 Oct 2017 16:12:39 +0200 Victor Stinner wrote:
Hi,
I would like to add new functions to return time as a number of nanoseconds (Python int), especially time.time_ns().
It would enhance the time.time() clock resolution. In my experience, it decreases the minimum non-zero delta between two clock reads by about 3 times, new "ns" clock versus current clock: 84 ns (2.8x better) vs 239 ns on Linux, and 318 us (2.8x better) vs 894 us on Windows, measured in Python.
The question of this long email is whether it's worth adding more "_ns" time functions than just time.time_ns().
I would like to add:
* time.time_ns()
* time.monotonic_ns()
* time.perf_counter_ns()
* time.clock_gettime_ns()
* time.clock_settime_ns()
time(), monotonic() and perf_counter() clocks are the 3 most common clocks and users use them to get the best available clock resolution. clock_gettime/settime() are the generic UNIX API to access these clocks and so should also be enhanced to get nanosecond resolution.
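As an illustration of how the proposed functions would be used, here is a minimal sketch. It assumes an interpreter where time.perf_counter_ns() is available (it is part of this proposal, not of Python 3.6), and elapsed_ns is a hypothetical helper name:

```python
import time


def elapsed_ns(func):
    """Run func() once and return the elapsed time as an int of nanoseconds.

    Assumes time.perf_counter_ns() exists (the function proposed here).
    """
    t1 = time.perf_counter_ns()
    func()
    t2 = time.perf_counter_ns()
    # Plain integer subtraction: no float rounding is involved.
    return t2 - t1


duration = elapsed_ns(lambda: sum(range(1000)))
print(duration, "ns")
```

The point of the sketch is that the delta is computed on Python ints, so it does not depend on how large the clock value itself is.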
== Nanosecond resolution ==
More and more clocks have a frequency in MHz, up to GHz for the "TSC" CPU clock, and so clock resolution is getting closer to 1 nanosecond (or even better than 1 ns for the TSC clock!).
The problem is that Python returns time as a floating point number which is usually a 64-bit binary floating point number (in the IEEE 754 format). This type starts to lose nanoseconds after 104 days. Conversion from nanoseconds (int) to seconds (float) and then back to nanoseconds (int) to check if conversions lose precision:
# no precision loss
>>> x=2**52+1; int(float(x * 1e-9) * 1e9) - x
0
# precision loss! (1 nanosecond)
>>> x=2**53+1; int(float(x * 1e-9) * 1e9) - x
-1
>>> print(datetime.timedelta(seconds=2**53 / 1e9))
104 days, 5:59:59.254741
While a system administrator can be proud to have an uptime longer than 104 days, the problem also exists for the time.time() clock which returns the number of seconds since the UNIX epoch (1970-01-01). This clock started to lose nanoseconds in mid-April 1970 (47 years ago):
>>> import datetime
>>> print(datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=2**53 / 1e9))
1970-04-15 05:59:59.254741
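Another way to look at the same limitation is the gap between two adjacent doubles at a given timestamp. The sketch below is only an illustration: float_spacing_ns is a hypothetical helper, and 1.5e9 is used as a round stand-in for a 2017 timestamp in seconds since the epoch.

```python
import math


def float_spacing_ns(seconds):
    """Gap between `seconds` and the next larger double, in nanoseconds.

    Hypothetical helper for a positive, normal IEEE 754 double.
    """
    # frexp() gives seconds == m * 2**exponent with 0.5 <= m < 1, and a
    # double has a 53-bit significand, so one step is 2**(exponent - 53).
    _, exponent = math.frexp(seconds)
    return math.ldexp(1.0, exponent - 53) * 1e9


# Round stand-in for "now" in 2017, in seconds since 1970 (an assumption).
spacing_now = float_spacing_ns(1.5e9)
print("spacing near 1.5e9 s: %.1f ns" % spacing_now)
```

The spacing near 1.5e9 seconds is 2**-22 s, a bit more than 238 ns, so time.time() as a float cannot distinguish two instants closer than that, whatever the resolution of the underlying OS clock.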
== PEP 410 ==
Five years ago, I proposed a large and complex change in all Python functions returning time to support nanosecond resolution using the decimal.Decimal type:
https://www.python.org/dev/peps/pep-0410/
The PEP was rejected for different reasons:
* it wasn't clear if hardware clocks really had a resolution of 1 nanosecond, especially when the clock is read from Python, since reading a clock in Python also takes time...
* Guido van Rossum rejected the idea of adding a new optional parameter to change the result type: it's an uncommon programming practice (bad design in Python)
* decimal.Decimal is not widely used, users might be surprised to get such a type
== CPython enhancements of the last 5 years ==
Since this PEP was rejected:
* the os.stat_result got 3 fields for timestamps as nanoseconds (Python int): st_atime_ns, st_ctime_ns, st_mtime_ns
* Python 3.3 got 3 new clocks: time.monotonic(), time.perf_counter() and time.process_time()
* I enhanced the private C API of Python handling time (API called "pytime") to store all timings as the new _PyTime_t type which is a simple 64-bit signed integer. The unit of _PyTime_t is not part of the API, it's an implementation detail. The unit is currently 1 nanosecond.
This week, I converted one of the last clocks to the new _PyTime_t format: time.perf_counter() now has internally a resolution of 1 nanosecond, instead of using the C double type.
XXX technically https://github.com/python/cpython/pull/3983 is not merged yet :-)
== Clocks resolution in Python ==
I implemented time.time_ns(), time.monotonic_ns() and time.perf_counter_ns() which are similar to the functions without the "_ns" suffix, but return time as nanoseconds (Python int).
I computed the smallest difference between two clock reads (ignoring differences of zero):
Linux:
* time_ns(): 84 ns <=== !!!
* time(): 239 ns <=== !!!
* perf_counter_ns(): 84 ns
* perf_counter(): 82 ns
* monotonic_ns(): 84 ns
* monotonic(): 81 ns
Windows:
* time_ns(): 318000 ns <=== !!!
* time(): 894070 ns <=== !!!
* perf_counter_ns(): 100 ns
* perf_counter(): 100 ns
* monotonic_ns(): 15000000 ns
* monotonic(): 15000000 ns
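For readers who want to reproduce this kind of measurement, here is a minimal sketch of one way to compute the smallest non-zero difference between consecutive clock reads. It is not the script used for the numbers above; min_nonzero_delta and its samples parameter are names made up for this example.

```python
import time


def min_nonzero_delta(clock, samples=100000):
    """Smallest strictly positive difference between consecutive reads.

    Hypothetical helper; returns None if the clock never advanced.
    """
    best = None
    previous = clock()
    for _ in range(samples):
        current = clock()
        delta = current - previous
        # Ignore differences of zero (two reads returning the same value).
        if delta > 0 and (best is None or delta < best):
            best = delta
        previous = current
    return best


# time.time() returns seconds as a float, so convert to ns for display.
delta = min_nonzero_delta(time.time)
if delta is not None:
    print("time(): %.0f ns" % (delta * 1e9))
```

Passing a clock that returns nanoseconds as int gives the result directly in nanoseconds; the result depends on the machine, the OS and the Python build.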
The difference on time.time() is significant: 84 ns (2.8x better) vs 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows. The difference will grow in the coming years since every day adds 86,400,000,000,000 nanoseconds to the system clock :-) (please don't bug me with leap seconds! you got my point)
The difference on perf_counter and monotonic clocks is not visible in this quick script since my script runs less than 1 minute, my computer uptime is smaller than 1 week, ... and Python internally starts these clocks at zero *to reduce the precision loss*! Using an uptime larger than 104 days, you would probably see a significant difference (at least +/- 1 nanosecond) between the regular (seconds as double) and the "_ns" (nanoseconds as int) clocks.
== How many new nanosecond clocks? ==
PEP 410 proposed to modify the following functions:
* os module: fstat(), fstatat(), lstat(), stat() (st_atime, st_ctime and st_mtime fields of the stat structure), sched_rr_get_interval(), times(), wait3() and wait4()
* resource module: ru_utime and ru_stime fields of getrusage()
* signal module: getitimer(), setitimer()
* time module: clock(), clock_gettime(), clock_getres(), monotonic(), time() and wallclock() ("wallclock()" was finally called "monotonic", see PEP 418)
According to my tests of the previous section, the precision loss starts after 104 days (stored in nanoseconds). I don't know if it's worth it to modify functions which return "CPU time" or "process time" of processes, since most processes live shorter than 104 days. Do you care about a resolution of 1 nanosecond for the CPU and process time?
Maybe we need 1 nanosecond resolution for profiling and benchmarks. But in that case, you might want to implement your profiler in C rather than in Python, like the hotshot module, no? The "pytime" private API of CPython gives you clocks with a resolution of 1 nanosecond.
== Annex: clock performance ==
To get an idea of how the cost of reading a clock affects the clock resolution seen from Python, I also ran a microbenchmark on *reading* a clock. Example:
$ ./python -m perf timeit --duplicate 1024 -s 'import time; t=time.time' 't()'
Linux (Mean +- std dev):
* time.time(): 45.4 ns +- 0.5 ns
* time.time_ns(): 47.8 ns +- 0.8 ns
* time.perf_counter(): 46.8 ns +- 0.7 ns
* time.perf_counter_ns(): 46.0 ns +- 0.6 ns
Windows (Mean +- std dev):
* time.time(): 42.2 ns +- 0.8 ns
* time.time_ns(): 49.3 ns +- 0.8 ns
* time.perf_counter(): 136 ns +- 2 ns <===
* time.perf_counter_ns(): 143 ns +- 4 ns <===
* time.monotonic(): 38.3 ns +- 0.9 ns
* time.monotonic_ns(): 48.8 ns +- 1.2 ns
Most clocks have the same performance except perf_counter on Windows: around 140 ns whereas other clocks are around 45 ns (on Linux and Windows): 3x slower. Maybe the "bad" perf_counter performance can be explained by the fact that I'm running Windows in a VM, which is not ideal for benchmarking. Or maybe my C implementation of time.perf_counter() is slow?
Note: I expect that a significant part of the numbers is the cost of Python function calls. Reading these clocks using the Python C functions is likely faster.
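If the perf module is not installed, a rough equivalent can be sketched with the standard timeit module. This is only an approximation of the benchmark above: it reports the best run instead of a mean and standard deviation, and call_cost_ns is a hypothetical helper name.

```python
import time
import timeit


def call_cost_ns(clock, number=100000, repeat=5):
    """Approximate cost of one clock() call, in nanoseconds.

    Hypothetical helper: takes the best of `repeat` runs of `number` calls.
    """
    timer = timeit.Timer(clock)  # timeit accepts a callable as the statement
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


for name in ("time", "perf_counter", "monotonic"):
    cost = call_cost_ns(getattr(time, name))
    print("time.%s(): %.1f ns" % (name, cost))
```

The numbers include the overhead of the call made by timeit itself, so they are expected to be somewhat higher than the perf results.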
Victor
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/