Re: [Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)

1 Oct 2012


On Sun, Sep 30, 2012 at 9:35 PM, Steven D'Aprano wrote:
...
On Sun, Sep 30, 2012 at 07:12:47PM -0400, Brett Cannon wrote:
...
...
python3 perf.py -T --basedir ../benchmarks -f -b py3k
../cpython/builds/2.7-wide/bin/python ../cpython/builds/3.3/bin/python3.3
...
### call_method ###
Min: 0.491433 -> 0.414841: 1.18x faster
Avg: 0.493640 -> 0.416564: 1.19x faster
Significant (t=127.21)
Stddev: 0.00170 -> 0.00162: 1.0513x smaller
> I'm not sure if this is the right place to discuss this,

The speed mailing list would be best.

...
> but what is the
> justification for recording the average and std deviation of the
> benchmarks?

When run in a more rigorous mode, the benchmarks execute many more
iterations (e.g. 50), so the average is used to even out bumps between
runs. The stddev is there to show how variable the results were in the
end.
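
To make the min/avg/stddev point concrete, here is a rough sketch of how those three figures could be derived from a list of per-iteration timings. It is not taken from perf.py; the function names and the timing values are invented for illustration, and it assumes the sample standard deviation is the one wanted.

```python
import statistics


def summarize(samples):
    """Return (min, mean, sample standard deviation) for a list of timings."""
    return min(samples), statistics.mean(samples), statistics.stdev(samples)


def compare(base, changed):
    """Compare two sets of timings; ratios above 1.0 mean 'changed' is faster."""
    base_min, base_avg, base_sd = summarize(base)
    new_min, new_avg, new_sd = summarize(changed)
    return {
        "min_speedup": base_min / new_min,
        "avg_speedup": base_avg / new_avg,
        "base_stddev": base_sd,
        "changed_stddev": new_sd,
    }


# Hypothetical timings in seconds, invented for illustration only.
base = [0.50, 0.49, 0.51, 0.50, 0.55]
changed = [0.42, 0.41, 0.43, 0.42, 0.42]

report = compare(base, changed)
print("Min speedup: %.2fx" % report["min_speedup"])
print("Avg speedup: %.2fx" % report["avg_speedup"])
print("Stddev: %.5f -> %.5f" % (report["base_stddev"], report["changed_stddev"]))
```

The single slow sample (0.55) barely moves the minimum but does pull the average up and widens the stddev, which is the kind of bump the average over many iterations is meant to smooth out.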
...
> If the benchmarks are based on timeit, the timeit docs warn against
> taking any statistic other than the minimum.

They don't use timeit.
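
For reference, the timeit advice mentioned above concerns the list returned by timeit.repeat(), where the docs suggest looking at the min of the results. A minimal sketch of that usage follows; the timed statement is made up and has nothing to do with the benchmark suite, which does not use timeit.

```python
import timeit

# Hypothetical statement to time; not part of the benchmark suite.
STMT = "sum(range(1000))"
LOOPS = 1000

# repeat() returns one total time (in seconds) per repetition.
timings = timeit.repeat(stmt=STMT, repeat=5, number=LOOPS)

# The smallest value is the one least disturbed by other activity on the machine.
best = min(timings)
print("best of %d: %.6f s per %d loops" % (len(timings), best, LOOPS))
```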