[Speed] Disable hash randomization to get reliable benchmarks

Victor Stinner victor.stinner at gmail.com
Tue Apr 26 06:06:58 EDT 2016


Hi,

2016-04-26 11:01 GMT+02:00 Antonio Cuni <anto.cuni at gmail.com>:
> On Mon, Apr 25, 2016 at 12:49 AM, Victor Stinner <victor.stinner at gmail.com>
> wrote:
>> Last months, I spent a lot of time on microbenchmarks. Probably too
>> much time :-) I found a great Linux config to get a much more stable
>> system to get reliable microbenchmarks:
>> https://haypo-notes.readthedocs.org/microbenchmark.html
>>
>> * isolate some CPU cores
>
> you might be interested in cpusets and the cset utility: in theory, they
> allow you to isolate one CPU without having to reboot to change the kernel
> parameters:
>
> http://skebanga.blogspot.it/2012/06/cset-shield-easily-configure-cpusets.html
> https://github.com/lpechacek/cpuset

Ah, I didn't know this tool. Basically, it looks similar to the Linux
isolcpus kernel command line parameter, but done in userspace. I see one
advantage: it can be used temporarily, without having to reboot.


> However, I never did a scientific comparison between cpusets and isolcpu to
> see if the former behaves exactly like the latter.

I have a simple test:

* run a benchmark when the system is idle
* run a benchmark when the system is *very* busy (e.g. system load > 5)

Using CPU isolation + nohz_full + blocking IRQs on isolated CPUs, the
benchmark result is the *same* in both cases. Try it on a Linux system
without any specific config and you will see a huge difference: for
example, performance divided by two.
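
A minimal sketch of that idle-versus-busy comparison, assuming the
benchmark is a plain Python callable. The names bench() and slowdown()
are hypothetical helpers, not part of any existing tool. Run bench() once
on an idle system and once under load, then compare the medians:

```python
import statistics
import time


def bench(func, repeat=5, loops=1000):
    """Return one per-call timing (in seconds) for each of `repeat` runs."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(loops):
            func()
        timings.append((time.perf_counter() - start) / loops)
    return timings


def slowdown(idle_timings, busy_timings):
    """Ratio of the busy median to the idle median.

    A value close to 1.0 means the system load did not affect the
    benchmark; 2.0 means performance was divided by two.
    """
    return statistics.median(busy_timings) / statistics.median(idle_timings)
```

With working isolation, slowdown() should stay close to 1.0 even when the
rest of the machine is heavily loaded.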

I'm using CPU isolation to be able to run benchmarks while I'm still
working on my PC: using Firefox and Thunderbird, running heavy unit
tests, compiling C code, etc.

Right now, I have dedicated 2 physical cores to benchmarks and kept 2
physical cores for regular work. Maybe it's too much. It looks like
almost all benchmarks only use one logical core in practice (whereas 2
physical cores give me 4 logical cores). Next time I will probably
only dedicate 1 physical core. The advantage of having two dedicated
physical cores is being able to run two "isolated" benchmarks in
parallel ;-)

I wrote a simple tool to keep the system load above a given minimum:
https://bitbucket.org/haypo/misc/src/tip/bin/system_load.py
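
The idea can be sketched as follows. This is not the script linked above,
only an illustration under assumptions: it relies on os.getloadavg()
(Unix only), and burn(), should_spawn() and keep_loaded() are
hypothetical names. It starts one busy process per interval while the
1-minute load average is below the requested minimum:

```python
import math
import multiprocessing
import os
import time


def burn():
    """Busy loop that keeps one logical CPU fully occupied."""
    while True:
        pass


def should_spawn(load, min_load, n_workers, max_workers):
    """True if the load is still too low and the worker cap is not reached."""
    return load < min_load and n_workers < max_workers


def keep_loaded(min_load, duration, interval=5.0):
    """Try to push the 1-minute load average above min_load.

    Runs for `duration` seconds and returns the number of busy
    processes that were started.
    """
    # The load average lags behind reality, so cap the number of workers
    # instead of spawning until the reported value catches up.
    max_workers = math.ceil(min_load) + 1
    workers = []
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            load = os.getloadavg()[0]  # 1-minute load average
            if should_spawn(load, min_load, len(workers), max_workers):
                proc = multiprocessing.Process(target=burn, daemon=True)
                proc.start()
                workers.append(proc)
            time.sleep(interval)
    finally:
        # Always stop the busy processes, even on Ctrl+C.
        for proc in workers:
            proc.terminate()
            proc.join()
    return len(workers)
```

For example, keep_loaded(min_load=5.0, duration=60.0) would keep the
machine busy for about a minute while a benchmark runs on the isolated
cores.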

I also started to write a script to configure a system for CPU isolation:
https://bitbucket.org/haypo/misc/src/tip/bin/isolcpus.py

* Block IRQs on isolated CPU cores
* Disable ASLR
* Force performance CPU speed on isolated cores, but not on other
cores. I don't want to burn my PC :-) Intel P-state is still enabled
on all CPU cores, so the power state of isolated cores still changes
dynamically in practice. You can see it using powertop, for example.
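
To check what such a configuration looks like from userspace, here is a
read-only sketch. It is not the isolcpus.py script; it only reads
standard Linux files (/sys/devices/system/cpu/isolated,
/proc/sys/kernel/randomize_va_space and the per-CPU cpufreq
scaling_governor file), assuming they are available on the running
kernel. The function names are hypothetical:

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "2-3,6" into a set of integers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_text(path):
    """Return the stripped content of a file, or None if it is unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def isolation_report():
    """Collect the isolated CPUs, the ASLR state and the CPU governors."""
    isolated = parse_cpu_list(read_text("/sys/devices/system/cpu/isolated") or "")
    aslr = read_text("/proc/sys/kernel/randomize_va_space")
    governors = {
        cpu: read_text(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor")
        for cpu in sorted(isolated)
    }
    return {
        "isolated_cpus": isolated,
        # randomize_va_space is "0" when ASLR is disabled; None if unknown.
        "aslr_disabled": (aslr == "0") if aslr is not None else None,
        "governors": governors,
    }
```

On a machine without isolated cores, or on a non-Linux system, the report
simply contains an empty set and None values.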

CPU isolation is not perfect: you still have random sources of noise.
There are also System Management Interrupts (SMI) and other low-level
things. I hope that running multiple iterations of the benchmark is
enough to reduce (or remove) the other sources of noise.
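
As an illustration of why multiple iterations help, here is a small
sketch (summarize() is a hypothetical helper). A single sample hit by an
interrupt shifts the mean, but barely moves the minimum or the median:

```python
import statistics


def summarize(samples):
    """Summarize benchmark timings with outlier-resistant statistics."""
    median = statistics.median(samples)
    # Median absolute deviation: a spread estimate that ignores rare spikes.
    mad = statistics.median([abs(s - median) for s in samples])
    return {
        "min": min(samples),
        "median": median,
        "mean": statistics.mean(samples),
        "mad": mad,
    }


# Four clean iterations and one disturbed by a noisy event.
stats = summarize([1.0, 1.0, 1.0, 1.0, 9.0])
```

Here the mean is pulled far above the typical timing by the single noisy
sample, while the minimum and the median are unaffected.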

By the way, search "Linux realtime" to find good information about
"sources of noise" on Linux. Example:
https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application#Hardware

Fortunately, my timing requirements are more relaxed than hard realtime ;-)

Victor

