[SciPy-user] Inconsistent standard deviation and variance implementation in scipy vs. scipy.stats
Johann Rohwer
jr at sun.ac.za
Wed Sep 24 10:05:01 EDT 2008
Hi
It seems that the default implementation of std and var differs
between numpy/scipy and scipy.stats: numpy/scipy uses the "biased"
formulation (i.e. dividing by N), whereas scipy.stats uses the
"unbiased" formulation (dividing by N-1) by default. Is this
intentional? It could be confusing... I realise that the "biased"
version can be accessed in sp.stats with a kwarg, but what is the
reason for two different implementations of the same function(s)?
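To make the two conventions concrete, here is a quick sketch (my own
illustration, not code from either library) computing both estimates
directly for the array used below:

import numpy as np

a = np.array([1., 2., 3., 2., 3., 1.])
n = a.size
ss = ((a - a.mean()) ** 2).sum()   # sum of squared deviations from the mean

biased = np.sqrt(ss / n)           # divide by N   -> 0.81649658...
unbiased = np.sqrt(ss / (n - 1))   # divide by N-1 -> 0.89442719...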
In [30]: a
Out[30]: array([ 1., 2., 3., 2., 3., 1.])
In [31]: np.std(a)
Out[31]: 0.81649658092772603
In [32]: sp.std(a)
Out[32]: 0.81649658092772603
In [33]: sp.stats.std(a)
Out[33]: 0.89442719099991586
In [34]: sp.stats.std(a, bias=True)
Out[34]: 0.81649658092772603
Same for np.var vs scipy.stats.var
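For what it's worth, np.std and np.var also accept a ddof keyword (if
I read the numpy docs correctly), so the unbiased N-1 estimate can be
obtained from numpy directly as well; a quick check with the same
array:

import numpy as np

a = np.array([1., 2., 3., 2., 3., 1.])

np.std(a)           # default ddof=0, divides by N     -> 0.81649658...
np.std(a, ddof=1)   # ddof=1, divides by N-1           -> 0.89442719...
np.var(a, ddof=1)   # same convention for the variance -> 0.8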
Johann