[Numpy-discussion] Standard Deviation (std): Suggested change for "ddof" default value

alex argriffi at ncsu.edu
Tue Apr 1 17:11:59 EDT 2014

On Tue, Apr 1, 2014 at 4:54 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> On Tue, Apr 1, 2014 at 2:08 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Tue, Apr 1, 2014 at 9:02 PM, Sturla Molden <sturla.molden at gmail.com>
>> wrote:
>> > Haslwanter Thomas <Thomas.Haslwanter at fh-linz.at> wrote:
>> >
>> >> Personally I cannot think of many applications where it would be
>> >> desired
>> >> to calculate the standard deviation with ddof=0. In addition, I feel
>> >> that
>> >> there should be consistency between standard modules such as numpy,
>> >> scipy, and pandas.
>> >
>> > ddof=0 is the maximum likelihood estimate. It is also needed in
>> > Bayesian
>> > estimation.
>> It's true, but the counter-arguments are also strong. And regardless
>> of whether ddof=1 or ddof=0 is better, surely the same one is better
>> for both numpy and scipy.
>> > If you are not estimating from a sample, but rather calculating for the
>> > whole population, you always want ddof=0.
>> >
>> > What does Matlab do by default? (Yes, it is a rhetorical question.)
>> R (which is probably a more relevant comparison) does do ddof=1 by
>> default.
>> >> I am wondering if there is a good reason to stick to "ddof=0" as the
>> >> default for "std", or if others would agree with my suggestion to
>> >> change
>> >> the default to "ddof=1"?
>> >
>> > It is a bad idea to suddenly break everyone's code.
>> It would be a disruptive transition, but OTOH having inconsistencies
>> like this guarantees the ongoing creation of new broken code.
> This topic comes up regularly. The original choice was made for numpy 1.0b1
> by Travis, see this later thread. At this point it is probably best to leave
> it alone.

I don't have any opinion about this debate, but I love the
justification in that thread "Any surprise that is created by the
different default should be mitigated by the fact that it's an
opportunity to learn something about what you are doing."  This
masterpiece of rhetoric will surely help me win many internet
arguments in the future!
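
For anyone skimming the thread, the two conventions being argued over differ only in the divisor. The sketch below is an illustration, not NumPy's implementation: std_with_ddof is a hypothetical helper whose ddof argument is assumed to mean the same thing as in np.std (divide by N - ddof), the data values are made up, and only the Python standard library is used.

```python
import math
import statistics


def std_with_ddof(values, ddof=0):
    """Hypothetical helper: sqrt(sum((x - mean)**2) / (N - ddof))."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values

# ddof=0 divides by N (whole-population / maximum likelihood convention).
population_std = std_with_ddof(data, ddof=0)

# ddof=1 divides by N - 1 (sample convention, the default in R).
sample_std = std_with_ddof(data, ddof=1)

# Cross-check against the standard library's two separate functions.
assert math.isclose(population_std, statistics.pstdev(data))
assert math.isclose(sample_std, statistics.stdev(data))

print(population_std, sample_std)
```

The ddof=1 result is always the larger of the two, and the gap shrinks as N grows, so the choice of default matters most for small samples.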
