[Numpy-discussion] non-standard standard deviation

Sturla Molden sturla at molden.no
Sat Dec 5 18:52:17 EST 2009


Colin J. Williams wrote:
>   
>  suggested that 1 (one) would be a better default but Robert Kern told 
> us that it won't happen.
>
>   
I don't even see the need for this keyword argument, as you can always 
multiply the variance by n/(n-1) to get what you want.
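
As a minimal sketch of that rescaling (the sample values are made up for illustration, and the comparison assumes the keyword in question is the `ddof` argument of `np.var`):

```python
import numpy as np

# Illustrative data only; not taken from the thread.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

var_n = np.var(x)                    # default normalization: divide by n
var_n_minus_1 = var_n * n / (n - 1)  # rescale to the 1/(n-1) normalization

# The rescaled value should agree with asking numpy for ddof=1 directly.
agrees = np.isclose(var_n_minus_1, np.var(x, ddof=1))
```

The same factor applies to the standard deviation after taking its square root, i.e. multiply `np.std(x)` by `sqrt(n/(n-1))`.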

Also, normalization by n gives the ML estimate for normally distributed 
data (yes, it is biased, but it has a lower mean squared error than the 
1/(n-1) estimate, so it is better anyway). It is a common novice mistake 
to use 1/(n-1) as the normalization, probably due to poor advice in 
introductory statistics textbooks. It also seems that frequentists are 
more scared of this "bias" bogeyman than Bayesians are. It may actually 
help beginners avoid this mistake if numpy's implementation prompts them 
to ask why the normalization is 1/n.

If numpy is to change the implementation of std, var, and cov, I suggest 
using the two-pass algorithm to reduce rounding error. (I can provide C 
code.) This is much more important than changing the normalization to a 
bias-free but otherwise inferior value.
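
For readers unfamiliar with the term, here is a rough Python sketch of a two-pass variance next to the one-pass "sum of squares" formula it is meant to replace. This is an illustration only, not the C code offered above; the function names `var_two_pass` and `var_one_pass_naive` and the test data are hypothetical.

```python
import numpy as np

def var_two_pass(x):
    """Variance with 1/n normalization, computed in two passes."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    mean = x.sum() / n            # pass 1: compute the mean
    dev = x - mean                # deviations from the mean are small numbers
    return np.dot(dev, dev) / n   # pass 2: average the squared deviations

def var_one_pass_naive(x):
    """Textbook formula E[x^2] - E[x]^2; prone to cancellation error."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    return (np.dot(x, x) - x.sum() ** 2 / n) / n

# A large offset makes the squares huge relative to the spread of the data.
data = 1e9 + np.array([4.0, 7.0, 13.0, 16.0])

good = var_two_pass(data)         # exact answer for these values is 22.5
bad = var_one_pass_naive(data)    # may be far off, since x**2 exceeds 2**53
```

The two-pass version subtracts the mean before squaring, so it never has to take the difference of two nearly equal large numbers, which is where the one-pass formula loses its digits.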

Sturla







