[Speed] Median +- MAD or Mean +- std dev?

Serhiy Storchaka storchaka at gmail.com
Wed Mar 15 02:54:44 EDT 2017


On 14.03.17 16:42, Nick Coghlan wrote:
> That would suggest that the implicit assumption of a
> measure-of-centrality with a measure-of-symmetric-deviation may need to
> be challenged, as at least some meaningful performance problems are
> going to show up as non-normal distributions in the benchmark results.
>
> Network services typically get around the "inherent variance" problem by
> looking at a few key percentiles like 50%, 90% and 95%. Perhaps that
> would be appropriate here as well?

Yes, quantiles would be useful, but I suppose they are less stable. If
you have only 20 samples, that is not enough to determine the 95th
percentile.
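
To illustrate the stability concern, here is a rough simulation sketch. It is not real benchmark data: the log-normal timing distribution, the seed, and the helper name `percentile` are all assumptions chosen for the example. It repeats a 20-sample run many times and compares how much the estimated median and the estimated 95th percentile vary from run to run.

```python
import random
import statistics

def percentile(samples, q):
    """Percentile with linear interpolation between closest ranks; q in [0, 100]."""
    s = sorted(samples)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

# Assumption: timings are drawn from a skewed (log-normal) distribution.
rng = random.Random(0)
medians, p95s = [], []
for _ in range(1000):
    run = [rng.lognormvariate(0, 0.25) for _ in range(20)]  # one run of 20 samples
    medians.append(percentile(run, 50))
    p95s.append(percentile(run, 95))

# Run-to-run spread of each estimate.
spread_median = statistics.stdev(medians)
spread_p95 = statistics.stdev(p95s)
print("spread of median estimate:", spread_median)
print("spread of 95th percentile estimate:", spread_p95)
```

With only 20 samples, the 95th percentile is decided by the largest one or two values, so its estimate should vary noticeably more between runs than the median does.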

But absolute values are not important for the purposes of our 
benchmarking. We need only know whether one build is faster or slower 
than others.

I suggested calculating the probability that one build is faster than
the other when comparing two builds. This is just one number, and it
doesn't depend on assumptions about the normality of the distributions.
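
One possible way to estimate such a probability, as a minimal sketch: compare every timing from one build against every timing from the other and count how often the first is smaller. The function name `prob_faster`, the choice to count ties as half, and the sample timings are my assumptions for illustration, not part of any existing tool.

```python
from itertools import product

def prob_faster(a, b):
    """Estimate P(a timing from build a < a timing from build b).

    Uses all pairwise comparisons; ties count as half a win.
    """
    wins = 0.0
    for x, y in product(a, b):
        if x < y:
            wins += 1.0
        elif x == y:
            wins += 0.5
    return wins / (len(a) * len(b))

# Hypothetical timings in seconds; build_a has one slow outlier.
build_a = [1.02, 1.00, 1.05, 1.01, 1.30]
build_b = [1.10, 1.08, 1.12, 1.09, 1.11]

print(prob_faster(build_a, build_b))
```

A value near 0.5 would mean neither build is distinguishable from the other, while values near 0 or 1 indicate a consistent difference. Because only the ordering of the timings matters, the outlier in `build_a` costs it a few comparisons but does not otherwise distort the result the way it would distort a mean.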



