Re: [Speed] Disable hash randomization to get reliable benchmarks

26 Apr 2016


On Tue, Apr 26, 2016 at 11:46 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
...
Hi,
2016-04-26 10:56 GMT+02:00 Armin Rigo <arigo@tunes.org>:
...
Hi,
On 25 April 2016 at 08:25, Maciej Fijalkowski <fijall@gmail.com> wrote:
...
The problem with disabled ASLR is that you change the measurement from
a statistical distribution to a single draw from that distribution,
measured repeatedly. There is no way around doing multiple runs
and averaging over them.
You should mention that it is usually enough to do the following:
instead of running once with PYTHONHASHSEED=0, run five or ten times
with PYTHONHASHSEED in range(5 or 10).  In this way, you get all
benefits: not-too-long benchmarking, no randomness, but still some
statistically relevant sampling.
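
The suggestion above (one run per value of PYTHONHASHSEED in a small range) can be sketched roughly as follows. This is only an illustration, not how perf.py works: the workload string and the helper names `run_with_seed` and `sample_seeds` are made up for the example, and each run is a fresh child interpreter because the hash seed is fixed at interpreter startup. Note that PYTHONHASHSEED=0 disables hash randomization, while the other values are fixed seeds.

```python
import os
import statistics
import subprocess
import sys

# Hypothetical workload: build and probe a dict with string keys, whose
# hash values (and therefore collision patterns) depend on PYTHONHASHSEED.
WORKLOAD = """
import time
keys = [str(i) for i in range(20000)]
start = time.perf_counter()
d = {k: None for k in keys}
hits = sum(1 for k in keys if k in d)
print(time.perf_counter() - start)
"""


def run_with_seed(seed):
    """Run the workload in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", WORKLOAD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout)


def sample_seeds(n_seeds=5):
    """Return one timing (in seconds) per seed in range(n_seeds)."""
    return [run_with_seed(seed) for seed in range(n_seeds)]


if __name__ == "__main__":
    timings = sample_seeds(5)
    print("mean  %.6f s" % statistics.mean(timings))
    print("stdev %.6f s" % statistics.stdev(timings))
```

Every seed is deterministic, so the whole set of runs is reproducible, while the mean and standard deviation still reflect more than one hash layout.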
I guess that the number of required runs to get a nice distribution
depends on the size of the largest dictionary in the benchmark. I
mean, the dictionaries that matter in performance.
The best would be to handle this transparently in perf.py. Either
disable all sources of randomness, or run multiple processes to have a
uniform distribution, rather than only having one sample for one
specific config. Maybe it could be an option: by default, run multiple
processes, but have an option to only run one process using
PYTHONHASHSEED=0.
By the way, timeit has a very similar issue. I'm quite sure that most
Python developers run "python -m timeit ..." at least 3 times and take
the minimum. "python -m timeit" could maybe be modified to also spawn
child processes to get a better distribution, and maybe also modified
to display the minimum, the average and the standard deviation? (not
only the minimum)
taking the minimum is a terrible idea anyway, none of the statistical
discussion makes sense if you do that
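
For illustration, the kind of summary proposed above for timeit (minimum, average and standard deviation instead of the minimum alone) could look roughly like this. It is a sketch built on the standard `timeit.repeat` and `statistics` functions; the `summarize` helper and its default arguments are assumptions made for the example, not an existing timeit feature. It also runs in a single process, so it does not sample different hash seeds.

```python
import statistics
import timeit


def summarize(stmt, setup="pass", repeat=5, number=10000):
    """Time stmt `repeat` times; return per-loop min, mean and stdev (seconds)."""
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return {
        "min": min(per_loop),
        "mean": statistics.mean(per_loop),
        "stdev": statistics.stdev(per_loop),
    }


if __name__ == "__main__":
    stats = summarize("d['key']", setup="d = {'key': 1}")
    for name, value in stats.items():
        print("%-5s %.3e s" % (name, value))
```

Reporting the spread next to the minimum makes it visible when two results differ by less than the run-to-run noise.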
...
Well, the question is also whether it's a good thing to have such a really
tiny microbenchmark as bm_call_simple in the Python benchmark suite.
I spent 2 or 3 days analyzing CPython running bm_call_simple with the
Linux perf tool, callgrind and cachegrind. I'm still unable to
understand the link between my changes to the C code and the result.
IMHO this specific benchmark depends on very low-level things like the
CPU L1 cache.  Maybe bm_call_simple helps in some very specific use
cases, like trying to make Python function calls faster. But in other
cases, it can be a source of noise, confusion and frustration...
Victor
maybe it's just a terrible benchmark (it surely is for pypy for example)


Maciej Fijalkowski
