[Numpy-discussion] Standard Deviation (std): Suggested change for "ddof" default value

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Apr 3 13:47:29 EDT 2014

On Wed, Apr 2, 2014 at 10:06 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
> <josef.pktd at gmail.com> wrote:
>> pandas came later and thought ddof=1 is worth more than consistency.
> Pandas is a data analysis package. NumPy is a numerical array package.
> I think ddof=1 is justified for Pandas, for consistency with statistical
> software (SPSS et al.)
> For NumPy, there are many computational tasks where the Bessel correction
> is not wanted, so providing an uncorrected result is the correct thing to
> do. NumPy should be a low-level array library that does very little magic.
> Those who need the Bessel correction can multiply with sqrt(n/float(n-1))
> or specify ddof. But that belongs in the docs.
> Sturla
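
For reference, the two routes mentioned above (rescaling by sqrt(n/(n-1)) or passing ddof) can be sketched as follows. This is only an illustration; the sample array is made up and is not from the thread.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

# NumPy default (ddof=0): divide the sum of squared deviations by n.
std_uncorrected = np.std(x)

# Bessel-corrected (ddof=1): divide by n - 1 instead.
std_corrected = np.std(x, ddof=1)

# The same corrected value, obtained by rescaling the uncorrected result.
std_rescaled = std_uncorrected * np.sqrt(n / (n - 1))

print(std_uncorrected, std_corrected, std_rescaled)
```

Both routes give the same number, so the choice of default affects convenience and convention rather than what can be computed.
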
> P.S. Personally I am not convinced "unbiased" is ever a valid argument, as
> the biased estimator has smaller error. This is from experience in
> marksmanship: I'd rather shoot a tight series with small systematic error
> than scatter my bullets wildly but "unbiased" on the target. It is the
> total error that counts. The series with smallest total error gets the best
> score. It is better to shoot two series and calibrate the sight in between
> than use a calibration-free sight that doesn't allow us to aim.

calibration == bias correction ?

> That's why I
> think classical statistics got this one wrong. Unbiased is never a virtue,
> but the smallest error is. Thus, if we are to repeat an experiment, we
> should calibrate our estimator just like a marksman calibrates his sight.
> But the aim should always be calibrated to give the smallest error, not an
> unbiased scatter. No one in their right mind would claim a shotgun is more
> precise than a rifle because it has smaller bias. But that is what applying
> the Bessel correction implies.
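
The bias versus total-error trade-off described above can be checked with a small simulation. This is a sketch under stated assumptions, not something from the thread: it assumes normally distributed data with true variance 1, a sample size of 5, and a fixed seed, and it compares the variance estimators (ddof=0 vs. ddof=1) rather than the standard deviation.

```python
import numpy as np

# Assumptions for this sketch: normal data, true variance 1.0, small samples.
rng = np.random.default_rng(0)
n, trials, true_var = 5, 200_000, 1.0
samples = rng.normal(0.0, 1.0, size=(trials, n))

# One variance estimate per simulated sample, with and without Bessel correction.
var_biased = samples.var(axis=1, ddof=0)
var_unbiased = samples.var(axis=1, ddof=1)

# Bias: average estimate minus the true value.
bias_biased = var_biased.mean() - true_var
bias_unbiased = var_unbiased.mean() - true_var

# Mean squared error: the "total error" that combines bias and scatter.
mse_biased = np.mean((var_biased - true_var) ** 2)
mse_unbiased = np.mean((var_unbiased - true_var) ** 2)

print("bias:", bias_biased, bias_unbiased)
print("MSE: ", mse_biased, mse_unbiased)
```

Under these assumptions the ddof=0 estimator is biased low by about 1/n of the true variance but has the smaller mean squared error. Other distributions or sample sizes may change the size of the gap.
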


I spent several days trying to figure out what Stata is doing for
small sample corrections to reduce the bias of the rejection interval
with "uncorrected" variance estimates.


