<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 10, 2016 at 1:13 PM, Victor Stinner <span dir="ltr"><<a href="mailto:victor.stinner@gmail.com" target="_blank">victor.stinner@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi,<br>

<br>

Over the last few weeks, I researched how to get stable and reliable<br>

benchmarks, especially for the corner case of microbenchmarks. The<br>

first result is a series of articles; here are the first three:<br>

<br>

<a href="https://haypo.github.io/journey-to-stable-benchmark-system.html" rel="noreferrer" target="_blank">https://haypo.github.io/journey-to-stable-benchmark-system.html</a><br>

<a href="https://haypo.github.io/journey-to-stable-benchmark-deadcode.html" rel="noreferrer" target="_blank">https://haypo.github.io/journey-to-stable-benchmark-deadcode.html</a><br>

<a href="https://haypo.github.io/journey-to-stable-benchmark-average.html" rel="noreferrer" target="_blank">https://haypo.github.io/journey-to-stable-benchmark-average.html</a><br>

<br>

The second result is a new perf module which includes all "tricks"<br>

discovered in my research: compute average and standard deviation,<br>

spawn multiple worker child processes, automatically calibrate the<br>

number of outer-loop iterations, automatically pin worker processes<br>

to isolated CPUs, and more.<br>

<br>
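To illustrate the "average and standard deviation" trick listed above, here is a minimal sketch. The function name summarize and the use of the sample standard deviation are assumptions made for illustration, not necessarily what perf does internally. The timings are the first three runs of the telco example shown later in this email.<br>
<pre>
```python
import statistics

# Timings in milliseconds, one list per worker run: the first value is the
# warmup, the rest are samples (first three runs of the telco example).
RUNS = [
    [26.9, 26.8, 26.8, 26.7],
    [26.8, 26.7, 26.7, 26.7],
    [26.9, 26.8, 26.9, 26.8],
]


def summarize(runs, warmups=1):
    """Return (mean, stdev) of all samples, skipping warmups in each run."""
    samples = [value for run in runs for value in run[warmups:]]
    # statistics.stdev() is the sample standard deviation (an assumption here).
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, stdev = summarize(RUNS)
    print("Average: %.1f ms +- %.1f ms" % (mean, stdev))
```
</pre>
Dropping the warmup value of each run keeps one-time costs (imports, cold caches) out of the average.<br>
<br>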

The perf module can store benchmark results as JSON so that they can be<br>

analyzed in depth later. It helps you configure a benchmark correctly and<br>

check manually whether it is reliable.<br>

<br>

The perf documentation also explains how to get stable and reliable<br>

benchmarks (e.g. how to tune Linux to isolate CPUs).<br>

<br>
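As a rough sketch of what pinning a process to isolated CPUs can look like on Linux with Python 3.3 or newer: the helper name pin_to_cpus is hypothetical, and this is not the code used by perf.<br>
<pre>
```python
import os


def pin_to_cpus(cpus):
    """Pin the current process to the given CPUs; return the new affinity.

    Returns None where os.sched_setaffinity() is unavailable
    (Python 2, or platforms other than Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return os.sched_getaffinity(0)

# Hypothetical usage, assuming CPUs 2 and 3 are the isolated ones:
#     pin_to_cpus({2, 3})
```
</pre>
The call raises OSError if none of the requested CPUs is available to the process, so the set of CPUs has to match the machine.<br>
<br>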

perf has 3 builtin CLI commands:<br>

<br>

* python -m perf: show and compare JSON results<br>

* python -m perf.timeit: a new, better and more reliable implementation of timeit<br>

* python -m perf.metadata: display collected metadata<br>

<br>

Python 3 is recommended: it provides time.perf_counter(), the new<br>

accurate statistics module, automatic CPU pinning (I will implement it<br>

on Python 2 later), etc. Python 2.7 is also supported; fallbacks<br>

are implemented when needed.<br>

<br>
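A minimal sketch of such a fallback for the timer, assuming timeit.default_timer is an acceptable substitute on Python 2.7 (this choice and the helper name time_func are assumptions, not necessarily what perf does):<br>
<pre>
```python
import time
import timeit

# time.perf_counter() exists on Python 3.3+; on Python 2.7 fall back to
# timeit.default_timer (an assumed fallback, not necessarily perf's choice).
perf_counter = getattr(time, "perf_counter", timeit.default_timer)


def time_func(func, loops=10):
    """Return the elapsed time in seconds per call, averaged over loops."""
    start = perf_counter()
    for _ in range(loops):
        func()
    return (perf_counter() - start) / loops
```
</pre>
<br>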

Example with the patched telco benchmark (benchmark for the decimal<br>

module) on a Linux system with two isolated CPUs.<br>

<br>

First run the benchmark:<br>

---<br>

$ python3 telco.py --json-file=telco.json<br>

.........................<br>

Average: 26.7 ms +- 0.2 ms<br>

---<br>

<br>

<br>

Then show the JSON content to see all details:<br>

---<br>

$ python3 -m perf -v show telco.json<br>

Metadata:<br>

- aslr: enabled<br>

- cpu_affinity: 2, 3<br>

- cpu_count: 4<br>

- cpu_model_name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz<br>

- hostname: smithers<br>

- loops: 10<br>

- platform: Linux-4.4.9-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three<br>

- python_executable: /usr/bin/python3<br>

- python_implementation: cpython<br>

- python_version: 3.4.3<br>

<br>

Run 1/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.8 ms, 26.7 ms<br>

Run 2/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms<br>

Run 3/25: warmup (1): 26.9 ms; samples (3): 26.8 ms, 26.9 ms, 26.8 ms<br>

(...)<br>

Run 25/25: warmup (1): 26.8 ms; samples (3): 26.7 ms, 26.7 ms, 26.7 ms<br>

<br>

Average: 26.7 ms +- 0.2 ms (25 runs x 3 samples; 1 warmup)<br>

---<br>

<br>

Note: benchmarks can be analyzed with Python 2.<br>

<br>

I'm posting my email to python-dev because providing timeit results is<br>

commonly requested in reviews of optimization patches.<br>

<br>

The next step is to patch the CPython benchmark suite to use the perf<br>

module. I already forked the repository and started to patch some<br>

benchmarks.<br>

<br>

If you are interested in Python performance in general, please join us<br>

on the speed mailing list!<br>

<a href="https://mail.python.org/mailman/listinfo/speed" rel="noreferrer" target="_blank">https://mail.python.org/mailman/listinfo/speed</a><br>

<br>

Victor<br>


</blockquote></div><br>This is very interesting and also somewhat related to psutil. I wonder: would increasing the process priority help isolate benchmarks even more? By this I mean "os.nice(-20)".</div><div class="gmail_extra">Extra: perhaps even the I/O priority: <a href="https://pythonhosted.org/psutil/#psutil.Process.ionice">https://pythonhosted.org/psutil/#psutil.Process.ionice</a> ?</div><div class="gmail_extra"><br></div><div class="gmail_extra"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>Giampaolo - <a href="http://grodola.blogspot.com" target="_blank">http://grodola.blogspot.com</a></div><div><br></div></div></div>
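<div>A minimal sketch of the os.nice() idea from the reply above, assuming a Unix system. The helper name try_raise_priority is hypothetical, and whether a higher priority actually makes benchmarks more stable remains an open question.<br>
<pre>
```python
import os


def try_raise_priority(increment=-20):
    """Try to lower the niceness (raise priority); return it, or None."""
    if not hasattr(os, "nice"):
        return None  # os.nice() is not available on this platform
    try:
        # os.nice() adds the increment and returns the new niceness.
        return os.nice(increment)
    except OSError:
        # Negative increments usually require root privileges.
        return None
```
</pre>
Returning None instead of raising lets a benchmark continue at normal priority when the process is not allowed to raise it.</div>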

</div></div>