[Speed] Median +- MAD or Mean +- std dev?
Victor Stinner
victor.stinner at gmail.com
Mon Mar 6 18:37:03 EST 2017
Hi,
Serhiy Storchaka opened a bug report against my perf module: perf
displays Median +- std dev, whereas the median should be paired with
the median absolute deviation (MAD) instead:
https://github.com/haypo/perf/issues/20
I just modified perf to display Median +- MAD, but I'm not sure that
it's better than Mean +- std dev.
The question is important when a benchmark is unstable (has a lot of
outliers). There is a good example below with "Median +- MAD: 276 ns +-
10 ns" and "Mean +- std dev: 371 ns +- 196 ns".
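
To make the two summaries concrete, here is a minimal sketch using only
the Python standard library. It is not perf's code: the helper names
(median_abs_deviation, summarize) and the ten timings are hypothetical,
chosen only to mimic a mostly stable benchmark with a couple of slow
outliers. The MAD here is the plain, unscaled median of absolute
deviations from the median.

```python
import statistics


def median_abs_deviation(values):
    """Unscaled MAD: median of the absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def summarize(values):
    """Return both candidate summaries for one list of timings."""
    return {
        "median": statistics.median(values),
        "mad": median_abs_deviation(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical timings in nanoseconds: eight stable values, two outliers.
timings = [262, 270, 274, 276, 276, 278, 280, 285, 791, 820]
summary = summarize(timings)

print("Median +- MAD: %.0f ns +- %.0f ns"
      % (summary["median"], summary["mad"]))
print("Mean +- std dev: %.0f ns +- %.0f ns"
      % (summary["mean"], summary["stdev"]))
```

On this made-up data the two outliers barely move the median and the
MAD, while they pull the mean up by roughly 100 ns and dominate the
standard deviation.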
The goal of perf is to get reproducible benchmark results. So the
question is what should be displayed (median or mean?) to get the most
reproducible output?
Median +- MAD "hides" outliers. In my experience, outliers are not
"reproducible", but caused by "noise" of the system and other
applications.
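
One way to probe the reproducibility question is a small simulation.
The sketch below is only an illustration under strong assumptions:
stable samples are drawn from a normal distribution around 270 ns, and
each sample independently has a 20% chance of being replaced by a slow
outlier between 600 and 850 ns. The function names and all the numbers
are hypothetical and are not taken from perf or from bench.json.gz. It
measures how much the median and the mean move from one simulated run
to the next.

```python
import random
import statistics


def simulate_run(rng, samples=160, outlier_rate=0.2):
    """Generate one synthetic run: mostly stable timings plus random outliers."""
    timings = []
    for _ in range(samples):
        if rng.random() < outlier_rate:
            # Assumed noise model: an outlier somewhere in 600..850 ns.
            timings.append(rng.uniform(600.0, 850.0))
        else:
            # Assumed stable timing: about 270 ns with 10 ns of jitter.
            timings.append(rng.gauss(270.0, 10.0))
    return timings


def spread_across_runs(runs=20, seed=0):
    """Return (stdev of per-run medians, stdev of per-run means)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    medians, means = [], []
    for _ in range(runs):
        timings = simulate_run(rng)
        medians.append(statistics.median(timings))
        means.append(statistics.mean(timings))
    return statistics.stdev(medians), statistics.stdev(means)


median_spread, mean_spread = spread_across_runs()
print("Run-to-run spread of the median: %.1f ns" % median_spread)
print("Run-to-run spread of the mean:   %.1f ns" % mean_spread)
```

Under this noise model the median varies much less between runs than
the mean, because the number of outliers changes from run to run and
only the mean follows it. A real system may behave differently, for
example if the slow samples are a genuine second mode of the benchmark
rather than noise.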
I feel that Median +- MAD is what I want, but I would feel more
comfortable if someone could confirm this from their own experience :-)
-----------------
haypo at selma$ PYTHONPATH=~/prog/GIT/perf ./python -m perf show --hist --stats bench.json.gz
234 ns: 3 #
264 ns: 114 ##################################################
293 ns: 9 ####
322 ns: 2 #
351 ns: 0 |
381 ns: 0 |
410 ns: 0 |
439 ns: 1 |
469 ns: 0 |
498 ns: 1 |
527 ns: 1 |
557 ns: 0 |
586 ns: 1 |
615 ns: 1 |
644 ns: 1 |
674 ns: 2 #
703 ns: 1 |
732 ns: 1 |
762 ns: 2 #
791 ns: 15 #######
820 ns: 5 ##
Total duration: 1 min 14.5 sec
Start date: 2017-03-06 23:30:49
End date: 2017-03-06 23:33:11
Raw sample minimum: 137 ms
Raw sample maximum: 444 ms
Number of runs: 42
Total number of samples: 160
Number of samples per run: 4
Number of warmups per run: 2
Loop iterations per sample: 2^19 (128 outer-loops x 4096 inner-loops)
Minimum: 262 ns (-5%)
Median +- MAD: 276 ns +- 10 ns
Mean +- std dev: 371 ns +- 196 ns
Maximum: 847 ns (+207%)
ERROR: the benchmark is very unstable, the standard deviation is very
high (stdev/mean: 53%)!
Try to rerun the benchmark with more runs, samples and/or loops
Median +- MAD: 276 ns +- 10 ns
-----------------
See attached bench.json.gz for full data.
Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.json.gz
Type: application/x-gzip
Size: 6108 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/speed/attachments/20170307/22e8b400/attachment.bin>