<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace"><span style="font-family:arial,sans-serif">On Sun, Oct 15, 2017 at 8:17 PM, Antoine Pitrou </span><span dir="ltr" style="font-family:arial,sans-serif"><<a href="mailto:solipsis@pitrou.net" target="_blank">solipsis@pitrou.net</a>></span><span style="font-family:arial,sans-serif"> wrote:</span><br></div><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Since new APIs are expensive and we'd like to be future-proof, why not<br>
move to picoseconds? That would be safe until clocks reach the THz<br>
barrier, which is quite far away from us.<br>
<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:monospace,monospace">I somewhat like the thought, but would everyone then end up thinking about what power of 1000 they need to multiply with?</div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">-- Koos</div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Regards<br>
<br>
Antoine.<br>
<br>
<br>
On Fri, 13 Oct 2017 16:12:39 +0200<br>
Victor Stinner <<a href="mailto:victor.stinner@gmail.com">victor.stinner@gmail.com</a>><br>
wrote:<br>
> Hi,<br>
><br>
> I would like to add new functions to return time as a number of<br>
> nanosecond (Python int), especially time.time_ns().<br>
><br>
> It would enhance the time.time() clock resolution. In my experience,<br>
> it decreases the minimum non-zero delta between two clock reads by<br>
> about 3x, new "ns" clock versus current clock: 84 ns (2.8x better) vs<br>
> 239 ns on Linux, and 318 us (2.8x better) vs 894 us on Windows,<br>
> measured in Python.<br>
><br>
> The question of this long email is whether it's worth adding more<br>
> "_ns" time functions than just time.time_ns().<br>
><br>
> I would like to add:<br>
><br>
> * time.time_ns()<br>
> * time.monotonic_ns()<br>
> * time.perf_counter_ns()<br>
> * time.clock_gettime_ns()<br>
> * time.clock_settime_ns()<br>
><br>
> time(), monotonic() and perf_counter() are the 3 most common clocks,<br>
> and users rely on them to get the best available clock resolution.<br>
> clock_gettime()/clock_settime() are the generic UNIX APIs to access<br>
> these clocks, so they should also be enhanced to nanosecond resolution.<br>
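As a short sketch of how the proposed API would be used (time.time_ns() eventually shipped in Python 3.7; it is shown here as proposed):

```python
import time

# time.time_ns() returns nanoseconds since the Unix epoch as a Python
# int, with no float rounding involved.
t_ns = time.time_ns()
t_float = time.time()

assert isinstance(t_ns, int)
# The two readings, taken back to back, agree to well within a second.
assert abs(t_ns / 1e9 - t_float) < 1.0
```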
><br>
><br>
> == Nanosecond resolution ==<br>
><br>
> More and more clocks have a frequency in MHz, up to GHz for the "TSC"<br>
> CPU clock, so clock resolutions are getting closer to 1 nanosecond (or<br>
> even better than 1 ns for the TSC clock!).<br>
><br>
> The problem is that Python returns time as a floating-point number,<br>
> usually a 64-bit binary float (in the IEEE 754 format). This type<br>
> starts to lose nanoseconds after 104 days. Converting from nanoseconds<br>
> (int) to seconds (float) and then back to nanoseconds (int) shows<br>
> whether the conversions lose precision:<br>
><br>
> # no precision loss<br>
> >>> x=2**52+1; int(float(x * 1e-9) * 1e9) - x<br>
> 0<br>
> # precision loss! (1 nanosecond)<br>
> >>> x=2**53+1; int(float(x * 1e-9) * 1e9) - x<br>
> -1<br>
> >>> print(datetime.timedelta(seconds=2**53 / 1e9))<br>
> 104 days, 5:59:59.254741<br>
><br>
> While a system administrator can be proud to have an uptime longer<br>
> than 104 days, the problem also exists for the time.time() clock, which<br>
> returns the number of seconds since the UNIX epoch (1970-01-01). This<br>
> clock started losing nanoseconds in mid-April 1970 (47 years ago):<br>
><br>
> >>> import datetime<br>
> >>> print(datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=2**53 / 1e9))<br>
> 1970-04-15 05:59:59.254741<br>
><br>
><br>
> == PEP 410 ==<br>
><br>
> Five years ago, I proposed a large and complex change in all Python<br>
> functions returning time to support nanosecond resolution using the<br>
> decimal.Decimal type:<br>
><br>
> <a href="https://www.python.org/dev/peps/pep-0410/" rel="noreferrer" target="_blank">https://www.python.org/dev/<wbr>peps/pep-0410/</a><br>
><br>
> The PEP was rejected for different reasons:<br>
><br>
> * it wasn't clear if hardware clocks really had a resolution of 1<br>
> nanosecond, especially when the clock is read from Python, since<br>
> reading a clock in Python also takes time...<br>
><br>
> * Guido van Rossum rejected the idea of adding a new optional<br>
> parameter to change the result type: it's an uncommon programming<br>
> practice (bad design in Python)<br>
><br>
> * decimal.Decimal is not widely used; it might be surprising to get<br>
> such a type<br>
><br>
><br>
> == CPython enhancements of the last 5 years ==<br>
><br>
> Since this PEP was rejected:<br>
><br>
> * os.stat_result got 3 new fields for timestamps as nanoseconds<br>
> (Python int): st_atime_ns, st_ctime_ns, st_mtime_ns<br>
><br>
> * Python 3.3 got 3 new clocks: time.monotonic(), time.perf_counter()<br>
> and time.process_time()<br>
><br>
> * I enhanced the private C API of Python handling time (API called<br>
> "pytime") to store all timings as the new _PyTime_t type which is a<br>
> simple 64-bit signed integer. The unit of _PyTime_t is not part of the<br>
> API, it's an implementation detail. The unit is currently 1<br>
> nanosecond.<br>
><br>
><br>
> This week, I converted one of the last clocks to the new _PyTime_t<br>
> format: time.perf_counter() now internally has a resolution of 1<br>
> nanosecond, instead of using the C double type.<br>
><br>
> XXX technically <a href="https://github.com/python/cpython/pull/3983" rel="noreferrer" target="_blank">https://github.com/python/<wbr>cpython/pull/3983</a> is not<br>
> merged yet :-)<br>
><br>
><br>
><br>
> == Clocks resolution in Python ==<br>
><br>
> I implemented time.time_ns(), time.monotonic_ns() and<br>
> time.perf_counter_ns(), which are similar to the functions without the<br>
> "_ns" suffix, but return time as nanoseconds (Python int).<br>
><br>
> I computed the smallest difference between two clock reads (ignoring<br>
> differences of zero):<br>
><br>
> Linux:<br>
><br>
> * time_ns(): 84 ns <=== !!!<br>
> * time(): 239 ns <=== !!!<br>
> * perf_counter_ns(): 84 ns<br>
> * perf_counter(): 82 ns<br>
> * monotonic_ns(): 84 ns<br>
> * monotonic(): 81 ns<br>
><br>
> Windows:<br>
><br>
> * time_ns(): 318000 ns <=== !!!<br>
> * time(): 894070 ns <=== !!!<br>
> * perf_counter_ns(): 100 ns<br>
> * perf_counter(): 100 ns<br>
> * monotonic_ns(): 15000000 ns<br>
> * monotonic(): 15000000 ns<br>
><br>
> The difference on time.time() is significant: 84 ns (2.8x better) vs<br>
> 239 ns on Linux and 318 us (2.8x better) vs 894 us on Windows. The<br>
> difference will grow in the coming years, since every day adds<br>
> 86,400,000,000,000 nanoseconds to the system clock :-) (please don't<br>
> bug me with leap seconds! you got my point)<br>
><br>
> The difference on the perf_counter and monotonic clocks is not visible<br>
> in this quick script since my script runs for less than 1 minute, my<br>
> computer's uptime is shorter than 1 week, ... and Python internally<br>
> starts these clocks at zero *to reduce the precision loss*! With an<br>
> uptime longer than 104 days, you would probably see a significant<br>
> difference (at least +/- 1 nanosecond) between the regular (seconds as<br>
> double) and the "_ns" (nanoseconds as int) clocks.<br>
><br>
><br>
><br>
> == How many new nanosecond clocks? ==<br>
><br>
> The PEP 410 proposed to modify the following functions:<br>
><br>
> * os module: fstat(), fstatat(), lstat(), stat() (st_atime, st_ctime<br>
> and st_mtime fields of the stat structure), sched_rr_get_interval(),<br>
> times(), wait3() and wait4()<br>
><br>
> * resource module: ru_utime and ru_stime fields of getrusage()<br>
><br>
> * signal module: getitimer(), setitimer()<br>
><br>
> * time module: clock(), clock_gettime(), clock_getres(), monotonic(),<br>
> time() and wallclock() ("wallclock()" was finally called "monotonic",<br>
> see PEP 418)<br>
><br>
><br>
> According to my tests in the previous section, the precision loss<br>
> starts after 104 days (stored in nanoseconds). I don't know if it's<br>
> worth modifying functions which return "CPU time" or "process time"<br>
> of processes, since most processes live for less than 104 days. Do you<br>
> care about a resolution of 1 nanosecond for the CPU and process time?<br>
><br>
> Maybe we need 1 nanosecond resolution for profiling and benchmarks.<br>
> But in that case, you might want to implement your profiler in C<br>
> rather than in Python, like the hotshot module, no? The "pytime"<br>
> private API of CPython gives you clocks with a resolution of 1<br>
> nanosecond.<br>
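For pure-Python profiling, the proposed perf_counter_ns() keeps all arithmetic in exact integers; a minimal sketch (bench_ns is a hypothetical helper, not an API from this email):

```python
import time

def bench_ns(func, loops=1000):
    """Mean cost of calling `func`, in integer nanoseconds (no float rounding)."""
    t0 = time.perf_counter_ns()
    for _ in range(loops):
        func()
    t1 = time.perf_counter_ns()
    return (t1 - t0) // loops  # integer division keeps the result an int

cost = bench_ns(lambda: None)
```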
><br>
><br>
> == Annex: clock performance ==<br>
><br>
> To have an idea of the cost of reading the clock on the clock<br>
> resolution in Python, I also ran a microbenchmark on *reading* a<br>
> clock. Example:<br>
><br>
> $ ./python -m perf timeit --duplicate 1024 -s 'import time; t=time.time' 't()'<br>
><br>
> Linux (Mean +- std dev):<br>
><br>
> * time.time(): 45.4 ns +- 0.5 ns<br>
> * time.time_ns(): 47.8 ns +- 0.8 ns<br>
> * time.perf_counter(): 46.8 ns +- 0.7 ns<br>
> * time.perf_counter_ns(): 46.0 ns +- 0.6 ns<br>
><br>
> Windows (Mean +- std dev):<br>
><br>
> * time.time(): 42.2 ns +- 0.8 ns<br>
> * time.time_ns(): 49.3 ns +- 0.8 ns<br>
> * time.perf_counter(): 136 ns +- 2 ns <===<br>
> * time.perf_counter_ns(): 143 ns +- 4 ns <===<br>
> * time.monotonic(): 38.3 ns +- 0.9 ns<br>
> * time.monotonic_ns(): 48.8 ns +- 1.2 ns<br>
><br>
> Most clocks have the same performance, except perf_counter on<br>
> Windows: around 140 ns whereas the other clocks are around 45 ns (on<br>
> Linux and Windows): 3x slower. Maybe the "bad" perf_counter<br>
> performance can be explained by the fact that I'm running Windows in a<br>
> VM, which is not ideal for benchmarking. Or maybe my C implementation<br>
> of time.perf_counter() is slow?<br>
><br>
> Note: I expect that a significant part of these numbers is the cost<br>
> of Python function calls. Reading these clocks using the Python C<br>
> functions is likely faster.<br>
><br>
><br>
> Victor<br>
> ______________________________<wbr>_________________<br>
> Python-ideas mailing list<br>
> <a href="mailto:Python-ideas@python.org">Python-ideas@python.org</a><br>
> <a href="https://mail.python.org/mailman/listinfo/python-ideas" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/python-ideas</a><br>
> Code of Conduct: <a href="http://python.org/psf/codeofconduct/" rel="noreferrer" target="_blank">http://python.org/psf/<wbr>codeofconduct/</a><br>
><br>
<br>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">+ Koos Zevenhoven + <a href="http://twitter.com/k7hoven" target="_blank">http://twitter.com/k7hoven</a> +</div>
</div></div>