Re: [Speed] Disable hash randomization to get reliable benchmarks

Hi,
2016-04-26 11:01 GMT+02:00 Antonio Cuni <anto.cuni@gmail.com>:
> On Mon, Apr 25, 2016 at 12:49 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
>> Last months, I spent a lot of time on microbenchmarks. Probably too much time :-) I found a great Linux config to get a much more stable system to get reliable microbenchmarks: https://haypo-notes.readthedocs.org/microbenchmark.html
>> - isolate some CPU cores
>
> you might be interested in cpusets and the cset utility: in theory, they allow you to isolate one CPU without having to reboot to change the kernel parameters:
> http://skebanga.blogspot.it/2012/06/cset-shield-easily-configure-cpusets.htm...
> https://github.com/lpechacek/cpuset
Ah, I didn't know this tool. Basically, it looks similar to the Linux isolcpus kernel command line parameter, but done in user space. I see one advantage: it can be used temporarily, without having to reboot.
However, I never did a scientific comparison between cpusets and isolcpus to see if the former behaves exactly like the latter.
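For readers who want to experiment with CPU placement before setting up isolcpus or cset, here is a minimal sketch of pinning the current process to one CPU from Python. It is only an illustration: os.sched_setaffinity() is Linux-only, the helper name pin_to_cpu is hypothetical, and pinning alone does not keep other tasks off that CPU (that is what isolcpus or a cset shield is for).

```python
import os

def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU.

    Returns the previous affinity set so the caller can restore it.
    """
    previous = os.sched_getaffinity(0)  # 0 means "the calling process"
    os.sched_setaffinity(0, {cpu})
    return previous

def restore_affinity(previous):
    """Put back an affinity set returned by pin_to_cpu()."""
    os.sched_setaffinity(0, previous)
```

A benchmark would call pin_to_cpu() with the number of an isolated core before timing anything, and restore_affinity() afterwards.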
I have a simple test:
- run a benchmark when the system is idle
- run a benchmark when the system is *very* busy (ex: system load > 5)
Using CPU isolation + nohz_full + blocking IRQs on isolated CPUs, the benchmark result is the *same* in both cases. Try it on a Linux system without any specific configuration and you will see a huge difference: for example, performance divided by two.
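The idle versus busy comparison can be sketched in a few lines of Python. This is not the benchmark used above: workload() is a hypothetical stand-in for a real benchmark, and the function names time_runs and slowdown are illustrative only.

```python
import statistics
import time

def time_runs(func, repeat=5):
    """Call func() `repeat` times and return each wall-clock duration in seconds."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings

def slowdown(idle_timings, busy_timings):
    """Ratio of the median busy time to the median idle time (1.0 means no change)."""
    return statistics.median(busy_timings) / statistics.median(idle_timings)

def workload():
    # Hypothetical CPU-bound workload standing in for a real benchmark.
    return sum(i * i for i in range(200_000))
```

Call time_runs(workload) once on an idle system and once while the system is loaded, then pass both lists to slowdown(). On a well isolated core the ratio should stay close to 1.0.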
I use CPU isolation so that I can run benchmarks while still working on my PC: using Firefox and Thunderbird, running heavy unit tests, compiling C code, etc.
Right now, I have dedicated 2 physical cores to benchmarks and kept 2 physical cores for regular work. Maybe it's too much. It looks like almost all benchmarks only use one logical core in practice (whereas 2 physical cores give me 4 logical cores). Next time I will probably only dedicate 1 physical core. The advantage of having two dedicated physical cores is being able to run two "isolated" benchmarks in parallel ;-)
I wrote a simple tool to get a system load larger than a minimum: https://bitbucket.org/haypo/misc/src/tip/bin/system_load.py
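The idea of such a load generator can be sketched as follows. This is not system_load.py itself, only an assumption about how one could be written: it relies on the rule of thumb that each CPU-bound process adds roughly 1 to the load average once it has been running for a while, and all function names are hypothetical.

```python
import math
import multiprocessing
import os
import time

def extra_workers(current_load, min_load):
    """Number of additional busy processes needed to reach min_load."""
    return max(0, math.ceil(min_load - current_load))

def _spin():
    # Burn CPU until the process is terminated.
    while True:
        pass

def hold_load(min_load, duration):
    """Run busy processes for `duration` seconds; return how many were started."""
    baseline = os.getloadavg()[0]  # 1-minute load average
    workers = [
        multiprocessing.Process(target=_spin, daemon=True)
        for _ in range(extra_workers(baseline, min_load))
    ]
    for worker in workers:
        worker.start()
    try:
        time.sleep(duration)
    finally:
        for worker in workers:
            worker.terminate()
            worker.join()
    return len(workers)
```

When used as a script, hold_load() should be called under an `if __name__ == "__main__":` guard, and the load average needs about a minute to catch up with the number of busy processes.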
I also started to write a script to configure a system for CPU isolation: https://bitbucket.org/haypo/misc/src/tip/bin/isolcpus.py
- Block IRQs on isolated CPU cores
- Disable ASLR
- Force performance CPU speed on isolated cores, but not on other cores. I don't want to burn my PC :-) Intel P-state is still enabled on all CPU cores, so the power state of isolated cores still changes dynamically in practice. You can see it using powertop for example.
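The three steps above can be sketched as a dry run that only computes which files would be written. This is not isolcpus.py: the function names are hypothetical, the plain hexadecimal mask assumes a machine with at most 32 CPUs, and a real script would also have to rewrite /proc/irq/<n>/smp_affinity for IRQs that already exist and run as root.

```python
def cpu_mask(cpus):
    """Hexadecimal bitmask with one bit set per CPU number (smp_affinity format)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return "%x" % mask

def isolation_plan(all_cpus, isolated):
    """Return (path, value) pairs to write. Nothing is written here."""
    housekeeping = sorted(set(all_cpus) - set(isolated))
    plan = [
        # New IRQs are routed to the non-isolated CPUs only.
        ("/proc/irq/default_smp_affinity", cpu_mask(housekeeping)),
        # 0 disables address space layout randomization (ASLR).
        ("/proc/sys/kernel/randomize_va_space", "0"),
    ]
    for cpu in sorted(isolated):
        # Performance governor on isolated cores only.
        path = "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor" % cpu
        plan.append((path, "performance"))
    return plan
```

For example, isolation_plan(range(4), [2, 3]) lists the writes for a 4-CPU machine where CPUs 2 and 3 are isolated. Printing the plan before applying it makes it easy to review.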
CPU isolation is not perfect: you still have random sources of noise. There are also System Management Interrupts (SMI) and other low-level things. I hope that running multiple iterations of the benchmark is enough to reduce (or remove) other sources of noise.
By the way, search for "Linux realtime" to find good information about "sources of noise" on Linux. Example: https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application#Hardware
Fortunately, my timing requirements are more relaxed than hard realtime ;-)
Victor