<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace"><span style="font-family:arial,sans-serif">On Sun, Oct 15, 2017 at 8:17 PM, Antoine Pitrou </span><span dir="ltr" style="font-family:arial,sans-serif"><<a href="mailto:solipsis@pitrou.net" target="_blank">solipsis@pitrou.net</a>></span><span style="font-family:arial,sans-serif"> wrote:</span><br></div><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

Since new APIs are expensive and we'd like to be future-proof, why not<br>

move to picoseconds?  That would be safe until clocks reach the THz<br>

barrier, which is quite far away from us.<br>

<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:monospace,monospace">I somewhat like the thought, but would everyone then end up thinking about what power of 1000 they need to multiply by?</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">-- Koos</div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Regards<br>

<br>

Antoine.<br>

<br>

<br>

On Fri, 13 Oct 2017 16:12:39 +0200<br>

Victor Stinner <<a href="mailto:victor.stinner@gmail.com">victor.stinner@gmail.com</a>><br>

wrote:<br>

> Hi,<br>

><br>

> I would like to add new functions to return time as a number of<br>

> nanoseconds (Python int), especially time.time_ns().<br>

><br>

> It would enhance the time.time() clock resolution. In my experience,<br>

> it decreases the minimum non-zero delta between two clock reads by about 3 times,<br>

> new "ns" clock versus current clock: 84 ns (2.8x better) vs 239 ns on<br>

> Linux, and 318 us (2.8x better) vs 894 us on Windows, measured in<br>

> Python.<br>

><br>

> The question of this long email is whether it's worth adding more "_ns"<br>

> time functions than just time.time_ns()?<br>

><br>

> I would like to add:<br>

><br>

> * time.time_ns()<br>

> * time.monotonic_ns()<br>

> * time.perf_counter_ns()<br>

> * time.clock_gettime_ns()<br>

> * time.clock_settime_ns()<br>

><br>

> time(), monotonic() and perf_counter() clocks are the 3 most common<br>

> clocks and users use them to get the best available clock resolution.<br>

> clock_gettime/settime() are the generic UNIX API to access these<br>

> clocks and so should also be enhanced to get nanosecond resolution.<br>
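
<br>

A minimal sketch of how an integer nanosecond timestamp might be consumed, assuming an interpreter that provides time.time_ns() (Python 3.7 or later). The helper name split_ns is illustrative and not part of the proposal:<br>

<pre>
```python
import time

NS_PER_SECOND = 10**9

def split_ns(timestamp_ns):
    """Split an integer nanosecond timestamp into (seconds, nanoseconds)."""
    return divmod(timestamp_ns, NS_PER_SECOND)

# Read the wall clock as an int, so no precision is lost to a float.
now_ns = time.time_ns()
seconds, nanoseconds = split_ns(now_ns)
print(seconds, nanoseconds)
```
</pre>

Integer arithmetic keeps every digit, so the split is exact however large the timestamp gets.<br>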

><br>

><br>

> == Nanosecond resolution ==<br>

><br>

> More and more clocks have a frequency in MHz, up to GHz for the "TSC"<br>

> CPU clock, and so the clocks resolution is getting closer to 1<br>

> nanosecond (or even better than 1 ns for the TSC clock!).<br>

><br>

> The problem is that Python returns time as a floating point number<br>

> which is usually a 64-bit binary floating point number (in the IEEE 754<br>

> format). This type starts to lose nanoseconds after 104 days.<br>

> Conversion from nanoseconds (int) to seconds (float) and then back to<br>

> nanoseconds (int) to check if conversions lose precision:<br>

><br>

> # no precision loss<br>

> >>> x=2**52+1; int(float(x * 1e-9) * 1e9) - x<br>

> 0<br>

> # precision loss! (1 nanosecond)<br>

> >>> x=2**53+1; int(float(x * 1e-9) * 1e9) - x<br>

> -1<br>

> >>> import datetime<br>

> >>> print(datetime.timedelta(<wbr>seconds=2**53 / 1e9))<br>

> 104 days, 5:59:59.254741<br>

><br>

> While a system administrator can be proud to have an uptime longer<br>

> than 104 days, the problem also exists for the time.time() clock which<br>

> returns the number of seconds since the UNIX epoch (1970-01-01). This<br>

> clock started to lose nanoseconds in mid-April 1970 (47 years ago):<br>

><br>

> >>> import datetime<br>

> >>> print(datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=2**<wbr>53 / 1e9))<br>

> 1970-04-15 05:59:59.254741<br>
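
<br>

The same effect can be estimated without a round trip, by asking how far apart adjacent doubles are at a given timestamp. A minimal sketch, assuming Python 3.9 or later for math.ulp(); the helper name float_spacing_ns and the sample timestamp 1.5e9 (roughly 2017) are illustrative:<br>

<pre>
```python
import math

def float_spacing_ns(seconds):
    """Gap between `seconds` and the next representable double, in nanoseconds."""
    return math.ulp(seconds) * 1e9

# 2**53 nanoseconds expressed in seconds (about 104 days).
uptime = 2**53 / 1e9
# A wall-clock timestamp from around 2017, in seconds since the UNIX epoch.
epoch_2017 = 1.5e9

print(float_spacing_ns(uptime))
print(float_spacing_ns(epoch_2017))
```
</pre>

For a timestamp of 1.5e9 seconds the spacing is about 238 ns, which is in line with the 239 ns minimum delta reported for time.time() on Linux.<br>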

><br>

><br>

> == PEP 410 ==<br>

><br>

> Five years ago, I proposed a large and complex change in all Python<br>

> functions returning time to support nanosecond resolution using the<br>

> decimal.Decimal type:<br>

><br>

>    <a href="https://www.python.org/dev/peps/pep-0410/" rel="noreferrer" target="_blank">https://www.python.org/dev/<wbr>peps/pep-0410/</a><br>

><br>

> The PEP was rejected for different reasons:<br>

><br>

> * it wasn't clear if hardware clocks really had a resolution of 1<br>

> nanosecond, especially when the clock is read from Python, since<br>

> reading a clock in Python also takes time...<br>

><br>

> * Guido van Rossum rejected the idea of adding a new optional<br>

> parameter to change the result type: it's an uncommon programming<br>

> practice (bad design in Python)<br>

><br>

> * decimal.Decimal is not widely used, it might be surprising to get such a type<br>

><br>

><br>

> == CPython enhancements of the last 5 years ==<br>

><br>

> Since this PEP was rejected:<br>

><br>

> * the os.stat_result got 3 fields for timestamps as nanoseconds<br>

> (Python int): st_atime_ns, st_ctime_ns, st_mtime_ns<br>

><br>

> * Python 3.3 got 3 new clocks: time.monotonic(), time.perf_counter()<br>

> and time.process_time()<br>

><br>

> * I enhanced the private C API of Python handling time (API called<br>

> "pytime") to store all timings as the new _PyTime_t type which is a<br>

> simple 64-bit signed integer. The unit of _PyTime_t is not part of the<br>

> API, it's an implementation detail. The unit is currently 1<br>

> nanosecond.<br>
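
<br>

As a rough check on what a signed 64-bit nanosecond counter can hold, the range works out to roughly 292 years either side of the reference point. A minimal sketch; the 365.25-day year is an assumption made for the estimate, and the names are illustrative rather than taken from the "pytime" API:<br>

<pre>
```python
INT64_MAX = 2**63 - 1                  # largest signed 64-bit integer
NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: 365.25-day year

def int64_ns_range_years():
    """Years representable on one side of zero by a 64-bit nanosecond count."""
    return INT64_MAX / NS_PER_SECOND / SECONDS_PER_YEAR

print(round(int64_ns_range_years(), 1))
```
</pre>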

><br>

><br>

> This week, I converted one of the last clocks to the new _PyTime_t format:<br>

> time.perf_counter() now has internally a resolution of 1 nanosecond,<br>

> instead of using the C double type.<br>

><br>

> XXX technically <a href="https://github.com/python/cpython/pull/3983" rel="noreferrer" target="_blank">https://github.com/python/<wbr>cpython/pull/3983</a> is not<br>

> merged yet :-)<br>

><br>

><br>

><br>

> == Clocks resolution in Python ==<br>

><br>

> I implemented time.time_ns(), time.monotonic_ns() and<br>

> time.perf_counter_ns() which are similar to the functions without the<br>

> "_ns" suffix, but return time as nanoseconds (Python int).<br>

><br>

> I computed the smallest difference between two clock reads (ignoring<br>

> differences of zero):<br>

><br>

> Linux:<br>

><br>

> * time_ns(): 84 ns <=== !!!<br>

> * time(): 239 ns <=== !!!<br>

> * perf_counter_ns(): 84 ns<br>

> * perf_counter(): 82 ns<br>

> * monotonic_ns(): 84 ns<br>

> * monotonic(): 81 ns<br>

><br>

> Windows:<br>

><br>

> * time_ns(): 318000 ns <=== !!!<br>

> * time(): 894070 ns <=== !!!<br>

> * perf_counter_ns(): 100 ns<br>

> * perf_counter(): 100 ns<br>

> * monotonic_ns(): 15000000 ns<br>

> * monotonic(): 15000000 ns<br>

><br>

> The difference on time.time() is significant: 84 ns (2.8x better) vs<br>

> 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows. The<br>

> difference will grow in the coming years since every day adds<br>

> 86,400,000,000,000 nanoseconds to the system clock :-) (please don't<br>

> bug me with leap seconds! you got my point)<br>
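
<br>

The measurement script itself is not shown in the mail. One way such a measurement might be written, as a sketch only: the function name min_nonzero_delta and the sample count are assumptions, and time.time_ns() requires Python 3.7 or later:<br>

<pre>
```python
import time

def min_nonzero_delta(clock, samples=100_000):
    """Smallest positive difference between consecutive reads of `clock`."""
    # Read the clock in a tight loop first, then compare neighbours.
    reads = [clock() for _ in range(samples)]
    deltas = [b - a for a, b in zip(reads, reads[1:]) if b > a]
    # None means the clock never advanced during the sampling window.
    return min(deltas, default=None)

# scale converts the clock's unit to nanoseconds.
for name, clock, scale in [("time_ns()", time.time_ns, 1),
                           ("time()", time.time, 1e9)]:
    delta = min_nonzero_delta(clock)
    if delta is None:
        print(name, "no tick observed")
    else:
        print(name, f"{delta * scale:.0f} ns")
```
</pre>

The result depends on the platform and on how long the loop runs, so the numbers will not match the ones above exactly.<br>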

><br>

> The differences on perf_counter and monotonic clocks are not visible in<br>

> this quick script since my script runs less than 1 minute, my computer<br>

> uptime is smaller than 1 week, ... and Python internally starts these<br>

> clocks at zero *to reduce the precision loss*! Using an uptime larger<br>

> than 104 days, you would probably see a significant difference (at<br>

> least +/- 1 nanosecond) between the regular (seconds as double) and<br>

> the "_ns" (nanoseconds as int) clocks.<br>

><br>

><br>

><br>

> == How many new nanosecond clocks? ==<br>

><br>

> PEP 410 proposed to modify the following functions:<br>

><br>

> * os module: fstat(), fstatat(), lstat(), stat() (st_atime, st_ctime<br>

> and st_mtime fields of the stat structure), sched_rr_get_interval(),<br>

> times(), wait3() and wait4()<br>

><br>

> * resource module: ru_utime and ru_stime fields of getrusage()<br>

><br>

> * signal module: getitimer(), setitimer()<br>

><br>

> * time module: clock(), clock_gettime(), clock_getres(), monotonic(),<br>

> time() and wallclock() ("wallclock()" was finally called "monotonic",<br>

> see PEP 418)<br>

><br>

><br>

> According to my tests of the previous section, the precision loss<br>

> starts after 104 days (stored in nanoseconds). I don't know if it's<br>

> worth it to modify functions which return "CPU time" or "process time"<br>

> of processes, since most processes live shorter than 104 days. Do you<br>

> care about a resolution of 1 nanosecond for the CPU and process time?<br>

><br>

> Maybe we need 1 nanosecond resolution for profiling and benchmarks.<br>

> But in that case, you might want to implement your profiler in C<br>

> rather than in Python, like the hotshot module, no? The "pytime" private<br>

> API of CPython gives you clocks with a resolution of 1 nanosecond.<br>

><br>

><br>

> == Annex: clock performance ==<br>

><br>

> To get an idea of how the cost of reading a clock affects the clock<br>

> resolution in Python, I also ran a microbenchmark on *reading* a<br>

> clock. Example:<br>

><br>

> $ ./python -m perf timeit --duplicate 1024 -s 'import time; t=time.time' 't()'<br>

><br>

> Linux (Mean +- std dev):<br>

><br>

> * time.time(): 45.4 ns +- 0.5 ns<br>

> * time.time_ns(): 47.8 ns +- 0.8 ns<br>

> * time.perf_counter(): 46.8 ns +- 0.7 ns<br>

> * time.perf_counter_ns(): 46.0 ns +- 0.6 ns<br>

><br>

> Windows (Mean +- std dev):<br>

><br>

> * time.time(): 42.2 ns +- 0.8 ns<br>

> * time.time_ns(): 49.3 ns +- 0.8 ns<br>

> * time.perf_counter(): 136 ns +- 2 ns <===<br>

> * time.perf_counter_ns(): 143 ns +- 4 ns <===<br>

> * time.monotonic(): 38.3 ns +- 0.9 ns<br>

> * time.monotonic_ns(): 48.8 ns +- 1.2 ns<br>

><br>

> Most clocks have the same performance except for perf_counter on<br>

> Windows: around 140 ns whereas other clocks are around 45 ns (on Linux<br>

> and Windows): 3x slower. Maybe the "bad" perf_counter performance can<br>

> be explained by the fact that I'm running Windows in a VM, which is<br>

> not ideal for benchmarking. Or maybe my C implementation of<br>

> time.perf_counter() is slow?<br>

><br>

> Note: I expect that a significant part of the numbers is the cost of<br>

> Python function calls. Reading these clocks using the Python C<br>

> functions is likely faster.<br>
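
<br>

A comparable measurement can be sketched with the standard library's timeit module instead of the perf module used above. The helper name read_cost_ns and the loop counts are assumptions, and the "_ns" clocks require Python 3.7 or later:<br>

<pre>
```python
import time
import timeit

def read_cost_ns(clock, number=200_000, repeat=5):
    """Best-of-`repeat` mean cost of one call to `clock`, in nanoseconds."""
    # timeit accepts a callable; its own call overhead is part of the result.
    timings = timeit.repeat(clock, number=number, repeat=repeat)
    return min(timings) / number * 1e9

for clock in (time.time, time.time_ns, time.perf_counter, time.perf_counter_ns):
    print(f"time.{clock.__name__}(): {read_cost_ns(clock):.1f} ns")
```
</pre>

Passing a callable to timeit adds one extra Python-level call per iteration, so the absolute numbers are not directly comparable with the perf results. The relative cost between clocks is the more useful reading.<br>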

><br>

><br>

> Victor<br>

> ______________________________<wbr>_________________<br>

> Python-ideas mailing list<br>

> <a href="mailto:Python-ideas@python.org">Python-ideas@python.org</a><br>

> <a href="https://mail.python.org/mailman/listinfo/python-ideas" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/python-ideas</a><br>

> Code of Conduct: <a href="http://python.org/psf/codeofconduct/" rel="noreferrer" target="_blank">http://python.org/psf/<wbr>codeofconduct/</a><br>

><br>

<br>

<br>

<br>


</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">+ Koos Zevenhoven + <a href="http://twitter.com/k7hoven" target="_blank">http://twitter.com/k7hoven</a> +</div>

</div></div>