On Wed, May 18, 2016 at 1:16 PM, Victor Stinner firstname.lastname@example.org wrote:
2016-05-18 8:55 GMT+02:00 Maciej Fijalkowski email@example.com:
I think you misunderstand how caches work. Cache behavior depends on the actual values of memory addresses, which can differ between runs even with ASLR disabled. Depending on those addresses, you either do or don't get cache collisions.
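The "addresses differ between runs" point is easy to observe from Python itself. A minimal sketch (assuming CPython, where id() returns the object's memory address) that allocates an object in several fresh interpreter processes and collects the addresses:

```python
import subprocess
import sys

# Print the address (CPython's id() is the memory address) of a freshly
# allocated object inside a brand-new interpreter process.
CODE = "print(id(object()))"

def address_in_fresh_process():
    out = subprocess.check_output([sys.executable, "-c", CODE])
    return int(out)

# With ASLR enabled this set usually contains several distinct addresses;
# even with ASLR disabled it is not guaranteed to collapse to one value.
addresses = {address_in_fresh_process() for _ in range(5)}
print(addresses)
```

Since the cache sets an object maps to are derived from its address, two runs with different address layouts can have genuinely different cache-collision patterns for the same code.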
Ok. I'm not sure yet that it's feasible to get exactly the same memory addresses for "hot" objects allocated by Python between two versions of the code (especially when testing a small patch). Not only do the addresses seem to depend on external parameters, but the patch itself can also add or remove memory allocations.
The concrete problem is that the benchmark depends on such low-level CPU features, and perf.py doesn't ignore minor deltas in performance, right?
Well, in my opinion the answer is really to do more statistics. That is, perf should report the average over multiple runs across multiple processes. I started a branch of the pypy benchmarks for that, but never actually finished it.
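The "multiple runs in multiple processes" idea can be sketched like this (a hypothetical micro-benchmark, not the actual perf/pypy-benchmarks code): each fresh interpreter process gets its own memory layout, so timing in several processes and aggregating smooths out address-dependent cache effects.

```python
import statistics
import subprocess
import sys

# Hypothetical snippet to benchmark; stands in for a real workload.
SNIPPET = "sum(range(1000))"

def bench_in_new_process(snippet, number=1000, repeat=5):
    # Run the benchmark in a fresh interpreter so each process gets its
    # own memory layout (addresses, hence cache behavior, can differ).
    code = (
        "import timeit; "
        f"print(min(timeit.repeat({snippet!r}, number={number}, repeat={repeat})))"
    )
    out = subprocess.check_output([sys.executable, "-c", code])
    return float(out)

# Several processes, then aggregate: report the mean and the spread
# instead of a single number from a single process.
timings = [bench_in_new_process(SNIPPET) for _ in range(4)]
print("mean: ", statistics.mean(timings))
print("stdev:", statistics.stdev(timings))
```

Reporting the standard deviation alongside the mean is what lets a reader tell a "minor delta" from a real regression: a difference smaller than the run-to-run spread is noise.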