STINNER Victor added the comment:

2016-09-15 11:21 GMT+02:00 Marc-Andre Lemburg <report@bugs.python.org>:
> I think we are talking about different things here: calibration in pybench means that you try to determine the overhead of the outer loop and possible setup code that is needed to run the test. (...) It then takes the minimum timing from the overhead runs and uses this as the baseline for the actual test runs (it subtracts the overhead timing from the test run results).
Calibration in perf means automatically computing the number of outer loops needed to get a sample of at least 100 ms (the default minimum time).

I simply removed the pybench code that estimates the overhead of the outer loop. The reason is this code:

    # Get calibration
    min_overhead = min(self.overhead_times)

There is no such "minimum timing"; it doesn't exist :-) In benchmarks, you have to work with statistics: use the average, the standard deviation, etc. If you badly estimate the minimum overhead, you can get negative timings, which are not allowed in perf (even zero is a hard error in perf). It's not possible to compute the "minimum overhead" *exactly*. Moreover, removing the code that estimates the overhead simplified pybench.
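To make the distinction concrete, here is a minimal sketch of what such a calibration loop can look like. This is not perf's actual code: the function name, the workload, and the use of time.perf_counter() are illustrative assumptions; only the 100 ms default minimum time comes from the description above.

    import time

    def calibrate_loops(func, min_time=0.1):
        # Double the number of outer loops until a single sample takes
        # at least `min_time` seconds (0.1 s = the 100 ms default
        # mentioned above). Rough sketch, not perf's implementation.
        loops = 1
        while True:
            t0 = time.perf_counter()
            for _ in range(loops):
                func()
            if time.perf_counter() - t0 >= min_time:
                return loops
            loops *= 2

    if __name__ == "__main__":
        # Illustrative workload; any callable would do.
        print(calibrate_loops(lambda: sum(range(1000))))

Contrast this with pybench's approach of measuring the outer-loop overhead separately and subtracting its minimum from each result: if that minimum is over-estimated, the subtraction can produce the negative (or zero) timings mentioned above.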
> Benchmarking these days appears to have gotten harder, not simpler, compared to the days of pybench some 19 years ago.
Benchmarking has always been a hard problem. Modern hardware (out-of-order CPUs, variable CPU frequency, power saving, etc.) probably didn't help :-)

----------
_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue15369>
_______________________________________