Median +- MAD or Mean +- std dev?
Hi,
Serhiy Storchaka opened a bug report in my perf module: perf displays Median +- std dev, whereas median absolute deviation (MAD) should be displayed instead: https://github.com/haypo/perf/issues/20
I just modified perf to display Median +- MAD, but I'm not sure that it's better than Mean +- std dev.
The question matters when a benchmark is unstable (has a lot of outliers). There is a good example below, with "Median +- MAD: 276 ns +- 10 ns" and "Mean +- std dev: 371 ns +- 196 ns".
The goal of perf is to get reproducible benchmark results. So the question is: what should be displayed (median or mean?) to get the most reproducible output?
Median +- MAD "hides" outliers. In my experience, outliers are not "reproducible": they are caused by "noise" from the system and other applications.
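To make the "hides outliers" point concrete, here is a minimal sketch using only Python's standard statistics module. The timings are made up for illustration (they are not taken from bench.json.gz), the helper names are hypothetical, and the MAD here is the plain unscaled one (median of absolute deviations from the median), which may not match exactly what perf computes.

```python
import statistics


def median_abs_deviation(values):
    """Unscaled MAD: median of the absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def summarize(values):
    """Return both pairs of statistics for a list of timings."""
    return {
        "median": statistics.median(values),
        "mad": median_abs_deviation(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical timings in nanoseconds (assumption, not real perf data):
# a tight cluster, then the same cluster plus two slow outliers.
stable = [264, 268, 270, 272, 276, 278, 280, 284]
noisy = stable + [791, 820]

for name, values in (("stable", stable), ("noisy", noisy)):
    s = summarize(values)
    print(f"{name}: Median +- MAD: {s['median']:.0f} ns +- {s['mad']:.0f} ns; "
          f"Mean +- std dev: {s['mean']:.0f} ns +- {s['stdev']:.0f} ns")
```

With these made-up numbers, adding the two outliers moves the median from 274 ns to 277 ns, but moves the mean from 274 ns to about 380 ns and inflates the standard deviation far more than the MAD.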
I feel that Median +- MAD is what I want, but I would feel more comfortable if someone could confirm this from their own experience :-)
haypo@selma$ PYTHONPATH=~/prog/GIT/perf ./python -m perf show --hist --stats bench.json.gz
234 ns: 3 #
264 ns: 114 ##################################################
293 ns: 9 ####
322 ns: 2 #
351 ns: 0 |
381 ns: 0 |
410 ns: 0 |
439 ns: 1 |
469 ns: 0 |
498 ns: 1 |
527 ns: 1 |
557 ns: 0 |
586 ns: 1 |
615 ns: 1 |
644 ns: 1 |
674 ns: 2 #
703 ns: 1 |
732 ns: 1 |
762 ns: 2 #
791 ns: 15 #######
820 ns: 5 ##

Total duration: 1 min 14.5 sec
Start date: 2017-03-06 23:30:49
End date: 2017-03-06 23:33:11
Raw sample minimum: 137 ms
Raw sample maximum: 444 ms

Number of runs: 42
Total number of samples: 160
Number of samples per run: 4
Number of warmups per run: 2
Loop iterations per sample: 2^19 (128 outer-loops x 4096 inner-loops)

Minimum: 262 ns (-5%)
Median +- MAD: 276 ns +- 10 ns
Mean +- std dev: 371 ns +- 196 ns
Maximum: 847 ns (+207%)

ERROR: the benchmark is very unstable, the standard deviation is very high (stdev/mean: 53%)!
Try to rerun the benchmark with more runs, samples and/or loops
Median +- MAD: 276 ns +- 10 ns
See attached bench.json.gz for full data.
Victor