[docs] [issue15369] pybench and test.pystone poorly documented

Marc-Andre Lemburg report at bugs.python.org
Thu Sep 15 05:21:34 EDT 2016


Marc-Andre Lemburg added the comment:

On 15.09.2016 11:11, STINNER Victor wrote:
> 
> STINNER Victor added the comment:
> 
> Hum, since the discussion restarted, I reopen the issue ...
> 
> "Well, pybench is not just one benchmark, it's a whole collection of benchmarks for various different aspects of the CPython VM and per concept it tries to calibrate itself per benchmark, since each benchmark has different overhead."
> 
> In the performance module, you now get individual timing for each pybench benchmark, instead of an overall total which was less useful.

pybench had the same intention. Adding an overall timing to each
suite run was a design mistake; the original idea was to compare
each benchmark individually.

Perhaps it would make sense to try to port the individual benchmark
tests in pybench to performance.
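
Roughly along these lines (just a sketch, assuming the Runner API of
the perf module as I currently understand it, nowadays distributed as
pyperf; the benchmark name and workload are only illustrative):

    import perf  # later renamed to pyperf

    def simple_integer_arithmetic():
        # workload loosely modelled on pybench's SimpleIntegerArithmetic test
        for _ in range(1000):
            x = 2 + 3
            y = x * 5
            x = y // 7

    runner = perf.Runner()
    runner.bench_func('pybench.SimpleIntegerArithmetic',
                      simple_integer_arithmetic)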

> "The number of iterations per benchmark will not change between runs, since this number is fixed in each benchmark."
> 
> Please take a look at the new performance module, it has a different design. Calibration is based on a minimum time per sample, no longer on hardcoded values. I modified all benchmarks, not only pybench.

I think we are talking about different things here: calibration in
pybench means trying to determine the overhead of the outer loop and
any setup code that is needed to run the test.

pybench runs a calibration method which has the same
code as the main test, but without the actual operations that you
want to test, in order to determine the timing of the overhead.

It then takes the minimum timing from the overhead runs and uses
this as the baseline for the actual test runs (the overhead timing
is subtracted from the test run results).

This may not be ideal in all cases, but it was the closest I could
get to timing just the test operations at the time.
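
To illustrate the idea (a much simplified sketch only, not the actual
pybench code; the loop counts and workload are made up):

    import time

    LOOPS = 100000
    CALIBRATION_RUNS = 20
    ROUNDS = 10

    def test_loop():
        # loop containing the operations under test
        for _ in range(LOOPS):
            x = 2 + 3
            x = x * 5

    def calibration_loop():
        # same loop structure, but without the operations under test
        for _ in range(LOOPS):
            pass

    def timed(func):
        start = time.perf_counter()
        func()
        return time.perf_counter() - start

    # the minimum timing of the empty loop is used as the overhead baseline
    overhead = min(timed(calibration_loop) for _ in range(CALIBRATION_RUNS))

    # the overhead is subtracted from each real test run
    results = [timed(test_loop) - overhead for _ in range(ROUNDS)]
    print(min(results))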

I'll have a look at what performance does.

> "BTW: Why would you want to run benchmarks in child processes and in parallel ?"
> 
> Child processes are run sequentially.

Ah, ok.

> Running benchmarks in multiple processes helps to get more reliable benchmarks. Read my article if you want to learn more about the design of my perf module:
> http://haypo-notes.readthedocs.io/microbenchmark.html#my-articles

Will do, thanks.

> "Ideally, the pybench process should be the only CPU intense work load on the entire CPU to get reasonable results."
> 
> The perf module automatically uses isolated CPUs. It strongly suggests using this amazing Linux feature to run benchmarks!
> https://haypo.github.io/journey-to-stable-benchmark-system.html
> 
> I started to write advices to get stable benchmarks:
> https://github.com/python/performance#how-to-get-stable-benchmarks
> 
> Note: See also the https://mail.python.org/mailman/listinfo/speed mailing list ;-)

I've read some of your blog posts and articles on the subject and
your journey. Interesting stuff, definitely. Benchmarking these days
appears to have gotten harder, not simpler, compared to the days of
pybench some 19 years ago.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15369>
_______________________________________

