[Speed] Median +- MAD or Mean +- std dev?

Serhiy Storchaka storchaka at gmail.com
Wed Mar 15 02:41:47 EDT 2017


On 14.03.17 19:05, Antoine Pitrou wrote:
> On Tue, 14 Mar 2017 09:14:45 +0200
> Serhiy Storchaka <storchaka at gmail.com>
> wrote:
>> The median tells you that results of a half of runs will be less than
>> the median and results of other half will be larger. This is pretty
>> informative and even more informative than the mean for some
>> applications.
>
> How so?  Whether a measurement is below or above the median is a
> pointless piece of information in itself, because you don't know by how
> much.  If a sample is 0.05% below the median, it might just as well be
> 0.05% above for all I care.  If half of the samples are 1% below the
> median and half of the samples are 50% above, it's not the same thing
> at all as if half of the samples are 50% below and half of the samples
> are 1% above.  Yet "median +/- MAD" gives the exact same results in
> both cases.

"half of the samples are 1% below the median and half of the samples are 
50% above" -- this is an unrealistic example. In real benchmarks, samples
are distributed around some point, with some skew and outliers. The median
is close to the mean but less affected by outliers. For benchmarking
purposes the absolute value is not important; what matters is the change
between the measurements of two builds. The median is more stable, and
that means we have less chance of getting a false result.
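
A rough sketch of what I mean, using only the standard statistics
module. The timings are made up for illustration (they are not real
benchmark data), and the mad() helper and variable names are mine, not
part of any benchmark tool. The two runs are identical except that one
sample in the second run was hit by an outlier:

```python
import statistics


def mad(samples):
    """Median absolute deviation: the median of |x - median(samples)|."""
    med = statistics.median(samples)
    return statistics.median([abs(x - med) for x in samples])


# Hypothetical timings in milliseconds for the same build, measured twice.
clean = [100.0, 101.0, 99.0, 100.5, 99.5, 100.2, 99.8, 100.1, 99.9, 100.0]
# Same run, but the last sample was disturbed by system noise.
noisy = clean[:-1] + [150.0]

# How far a single outlier moves each summary statistic.
mean_shift = statistics.mean(noisy) - statistics.mean(clean)
median_shift = statistics.median(noisy) - statistics.median(clean)

for name, run in (("clean", clean), ("noisy", noisy)):
    print(
        f"{name}: mean={statistics.mean(run):.2f} "
        f"+- {statistics.stdev(run):.2f}, "
        f"median={statistics.median(run):.2f} +- {mad(run):.2f}"
    )
print(f"mean shift={mean_shift:.2f}, median shift={median_shift:.2f}")
```

With this data a single outlier moves the mean by about 5 ms but the
median by only about 0.05 ms, so a comparison of two builds based on the
mean is more likely to report a difference that is not really there.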



