[Speed] Disable hash randomization to get reliable benchmarks

Victor Stinner victor.stinner at gmail.com
Tue Apr 26 15:11:40 EDT 2016


2016-04-26 18:36 GMT+02:00 Antoine Pitrou <solipsis at pitrou.net>:
> The minimum is a reasonable metric for quick throwaway benchmarks as
> timeit is designed for, as it has a better hope of alleviating the
> impact of system load (as such throwaway benchmarks are often run on
> the developer's workstation).

IMHO we must at least display the standard deviation. Maybe we can do
better and provide 4 numbers:

* Average
* Standard deviation
* Minimum
* Maximum

The maximum helps to detect rare events, as Maciej said (something in
the OS, a garbage collection, etc.).

For example, we can use this format:

   Average: 293.5 ms +/- 143.2 ms (min: 213.9 ms, max: 629.7 ms)

This is the result of the same microbenchmark as before,
bm_call_simple.py, run on my laptop. As you can see, the deviation is
large: 143 ms / 293 ms is 49%, so the benchmark is unstable. Maybe we
should say explicitly that the result is not significant? Example:

   Average: 293.5 ms +/- 143.2 ms (min: 213.9 ms, max: 629.7 ms) -- not significant
   The benchmark is unstable, maybe the system is heavily loaded?
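
A minimal sketch of such a report, using Python's statistics module.
The function name summarize(), the 10% threshold and the sample timings
are assumptions made for illustration; none of them comes from timeit
or from this thread.

```python
import statistics


def summarize(timings_ms, max_relative_stdev=0.10):
    """Format timings as 'Average: ... +/- ... (min: ..., max: ...)'.

    max_relative_stdev is a hypothetical threshold: when the standard
    deviation exceeds that fraction of the average, the result is
    flagged as not significant.
    """
    mean = statistics.mean(timings_ms)
    stdev = statistics.stdev(timings_ms)  # sample standard deviation
    line = (f"Average: {mean:.1f} ms +/- {stdev:.1f} ms "
            f"(min: {min(timings_ms):.1f} ms, "
            f"max: {max(timings_ms):.1f} ms)")
    if stdev > max_relative_stdev * mean:
        line += " -- not significant"
    return line


# Made-up timings in milliseconds, with one slow outlier.
print(summarize([214.0, 220.5, 231.2, 629.7, 218.3]))
```

statistics.stdev() needs at least two timings. The right threshold is
a judgment call; 10% is only a placeholder.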

By the way, "293.5 ms +/- 143.2 ms" is misleading because it suggests
more precision than the data has. Maybe we should display it as
"0.3 sec +/- 0.1 sec" to avoid showing inaccurate digits?
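
One possible way to do that rounding, as a sketch: keep only as many
decimals as the first significant digit of the standard deviation.
The helper name format_with_uncertainty() is hypothetical, and values
are assumed to be in seconds.

```python
import math


def format_with_uncertainty(mean, stdev):
    """Round mean and stdev (in seconds) to the first significant
    digit of stdev."""
    if stdev <= 0:
        return f"{mean} sec"
    # Decimal position of the first significant digit of the deviation:
    # 0.1432 -> 1 decimal, 0.0016 -> 3 decimals, 12.0 -> 0 decimals.
    decimals = max(0, -math.floor(math.log10(stdev)))
    return f"{mean:.{decimals}f} sec +/- {stdev:.{decimals}f} sec"


print(format_with_uncertainty(0.2935, 0.1432))  # 0.3 sec +/- 0.1 sec
```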

Another example, same laptop but using CPU isolation:

   Average: 219.5 ms +/- 1.6 ms (min: 215.9 ms, max: 223.8 ms)

In this example, we can see that "+/- 1.6" is the standard
deviation; it is unrelated to the minimum and maximum.
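
To illustrate with made-up numbers (not benchmark results): two
samples can share the same minimum, maximum and average and still have
different standard deviations.

```python
import statistics

# Made-up samples with identical minimum, maximum and average.
clustered = [1, 5, 5, 5, 9]
spread = [1, 1, 5, 9, 9]

for name, sample in (("clustered", clustered), ("spread", spread)):
    print(f"{name}: min={min(sample)} max={max(sample)} "
          f"mean={statistics.mean(sample)} "
          f"stdev={statistics.stdev(sample):.2f}")
```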

Victor

