Hi,

Your document doesn't explain how you configured the host to run benchmarks. Maybe you didn't tune Linux or anything else? Be careful with modern hardware, which can produce funny (or not so funny) surprises.

See my recent talk at FOSDEM (last month): "How to run a stable benchmark"
https://fosdem.org/2017/schedule/event/python_stable_benchmark/

Factors impacting Python benchmarks:

* Linux Address Space Layout Randomization (ASLR), /proc/sys/kernel/randomize_va_space
* Python random hash function: PYTHONHASHSEED
* Command line arguments and environment variables: enabling ASLR helps here (?)
* CPU power saving and performance features: disable Intel Turbo Boost and/or use a fixed CPU frequency.
* Temperature: temperature has a limited impact on benchmarks. If the CPU is below 95°C, Intel CPUs still run at full speed. With a correct cooling system, temperature is not an issue.
* Linux perf probes: /proc/sys/kernel/perf_event_max_sample_rate
* Code locality, CPU L1 instruction cache (L1c): Profile Guided Optimization (PGO) helps here
* Other processes and the kernel: CPU isolation (CPU pinning) helps here. Use isolcpus=cpu_list and rcu_nocbs=cpu_list on the Linux kernel command line
* ...

Reboot?

Sadly, other unknown factors may still impact benchmarks. Sometimes, it helps to reboot to restore normal performance.

https://haypo-notes.readthedocs.io/microbenchmark.html#factors-impacting-ben...

Victor
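
P.S. To make some of the factors listed above easier to check, here is a minimal sketch that prints the current value of a few of these settings before a benchmark run. The two /proc paths and PYTHONHASHSEED come from the list above. The two /sys paths (intel_pstate/no_turbo and cpu/isolated) and the names read_setting and collect_settings are assumptions of this sketch, and the files may be missing depending on the kernel and CPU driver, in which case the value is reported as unavailable.

```python
import os

# The two /proc paths are named in the list above. The two /sys paths are
# assumptions of this sketch and may not exist on every kernel or CPU driver.
SETTINGS = {
    "aslr": "/proc/sys/kernel/randomize_va_space",
    "perf_event_max_sample_rate": "/proc/sys/kernel/perf_event_max_sample_rate",
    "intel_no_turbo": "/sys/devices/system/cpu/intel_pstate/no_turbo",
    "isolated_cpus": "/sys/devices/system/cpu/isolated",
}


def read_setting(path):
    """Return the stripped content of a file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def collect_settings(environ=None, read=read_setting):
    """Collect the benchmark-related settings into a dict (None = unavailable)."""
    environ = os.environ if environ is None else environ
    report = {name: read(path) for name, path in SETTINGS.items()}
    # PYTHONHASHSEED is an environment variable, not a file.
    report["PYTHONHASHSEED"] = environ.get("PYTHONHASHSEED")
    return report


if __name__ == "__main__":
    for name, value in collect_settings().items():
        print(f"{name}: {value if value is not None else 'unavailable'}")
```

This only reads the values; it does not change anything. Recording its output next to each benchmark result makes it easier to tell afterwards whether two runs were made under the same configuration.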