[Python-Dev] Python Benchmarks
Tim Peters
tim.peters at gmail.com
Sat Jun 3 01:44:07 CEST 2006
[MAL]
>>> Using the minimum looks like the way to go for calibration.
[Terry Reedy]
>> Or possibly the median.
[Andrew Dalke]
> Why? I can't think of why that's more useful than the minimum time.
A lot of things get mixed up here ;-) The _mean_ is actually useful
if you're using a poor-resolution timer with a fast test. For
example, suppose a test takes 1/10th the time of the span between
counter ticks. Then, "on average", in 9 runs out of 10 the reported
elapsed time is 0 ticks, and in 1 run out of 10 the reported time is 1
tick. 0 and 1 are both wrong, but the mean (1/10) is correct.
So there _can_ be sense to that. Then people vaguely recall that the
median is more robust than the mean, and all sense goes out the window
;-)
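Not from the original message, but a quick simulation (all names below are my own) makes the arithmetic concrete: with a timer that only reports whole ticks, the mean of many readings recovers the true 0.1-tick cost, while the median (and the minimum) collapse to 0.

import random
import statistics

# Hypothetical illustration: a test that truly costs 0.1 tick, measured with
# a timer that only reports whole ticks.  Each measurement starts at a random
# phase within a tick, so the reading comes out as either 0 or 1 ticks.
TRUE_COST = 0.1  # in ticks

def quantized_reading(true_cost=TRUE_COST):
    start_phase = random.random()        # where within a tick the run begins
    return int(start_phase + true_cost)  # 0 about 90% of the time, 1 about 10%

readings = [quantized_reading() for _ in range(100_000)]
print("mean:  ", statistics.mean(readings))    # ~0.1 -- recovers the true cost
print("median:", statistics.median(readings))  # 0    -- tells you nothing
print("min:   ", min(readings))                # 0    -- also useless with a coarse timer

The minimum is just as useless as the median here, which is exactly why the advice below pairs it with a high-resolution timer.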
My answer is to use the timer with the best resolution the machine
has. Using the mean is a way to worm around timer quantization
artifacts, but it's easier and clearer to use a timer with resolution
so fine that quantization doesn't make a lick of real difference.
Forcing a test to run for a long time is another way to make timer
quantization irrelevant, but then you're also vastly increasing
chances for other processes to disturb what you're testing.
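A minimal sketch of that advice with today's standard library (not code from the post): timeit picks the finest wall-clock timer available, each run is kept short, and min() over several repeats implements the "take the minimum" idea.

import timeit

# Sketch: many short runs, report the best (least disturbed) one.
runs = timeit.repeat(
    stmt="sorted(data)",
    setup="import random; data = [random.random() for _ in range(1000)]",
    number=100,   # loops per run; short runs give other processes less chance to intrude
    repeat=7,     # independent runs
)
per_loop = min(runs) / 100  # fastest run, divided by loops per run
print(f"best of 7 runs: {per_loop * 1e6:.2f} microseconds per loop")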
I liked benchmarking on Crays in the good old days. No time-sharing,
no virtual memory, and the OS believed to its core that its primary
purpose was to set the base address once at the start of a job so the
Fortran code could scream. Test times were reproducible to the
nanosecond with no effort. Running on a modern box for a few
microseconds at a time is a way to approximate that, provided you
measure the minimum time with a high-resolution timer :-)