[Numpy-discussion] var bias reason?

Wed Oct 15 11:58:24 EDT 2008

Me too.

S

On Wednesday 15 October 2008 11:31:44 am Paul Barrett wrote:
> I'm behind Travis on this one.
>
>  -- Paul
>
> On Wed, Oct 15, 2008 at 11:19 AM, David Cournapeau <cournape at gmail.com> wrote:
> > On Wed, Oct 15, 2008 at 11:45 PM, Travis E. Oliphant <oliphant at enthought.com> wrote:
> >> Gabriel Gellner wrote:
> >>> Some colleagues noticed that var uses biased formulas by default
> >>> in numpy, searching for the reason only brought up:
> >>>
> >>> http://article.gmane.org/gmane.comp.python.numeric.general/12438/match=var+bias
> >>>
> >>> which I totally agree with, but there was no response? Any reason
> >>> for this?
> >>
> >> I will try to respond to this as it was me who made the change.  I
> >> think there have been responses, but I think I've preferred to
> >> stay quiet rather than feed a flame war.   Ultimately, it is a
> >> matter of preference and I don't think there would be equal
> >> weights given to all the arguments surrounding the decision by
> >> everybody.
> >>
> >> I will attempt to articulate my reasons:  dividing by n is the
> >> maximum likelihood estimator of variance and I prefer that
> >> justification more than the "un-biased" justification for a
> >> default (especially given that bias is just one part of the
> >> "error" in an estimator).    Having every package that computes
> >> the mean return the "un-biased" estimate gives it more cultural
> >> weight than the concept deserves, I think.  Any surprise that
> >> is created by the different default should be mitigated by the
> >> fact that it's an opportunity to learn something about what you
> >> are doing.    Here is a paper I wrote on the subject that you
> >> might find useful:
> >>
> >> https://contentdm.lib.byu.edu/cdm4/item_viewer.php?CISOROOT=/EER&CISOPTR=134&CISOBOX=1&REC=1
> >> (Hopefully, they will resolve a link
> >> problem at the above site soon, but you can read the abstract).
> >
> > Yes, I hope too, I would be happy to read the article.
> >
> > On the limits of unbiasedness, the following document mentions an
> > example (in a different context than variance estimation):
> >
> > http://www.stat.columbia.edu/~gelman/research/published/badbayesresponsemain.pdf
> >
> > AFAIK, even statisticians who consider themselves "mostly
> > frequentist" (if that makes any sense) do not advocate unbiasedness
> > as such an important concept anymore (Larry Wasserman mentions it
> > in his "All of Statistics").
> >
> > cheers,
> >
> > David
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
>

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sransom at nrao.edu             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989
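
For readers who want to check the behavior discussed in this thread, here is a
minimal sketch. It assumes a NumPy version whose var() accepts the ddof
argument; the helper names, the sample values, the Gaussian data, the sample
size of 5, and the random seed are all illustrative choices, not anything taken
from the thread. The first part compares the default divide-by-n result with
the divide-by-(n-1) result. The second part is a small simulation of the point
that bias is only one part of an estimator's error: for Gaussian data, the
divide-by-n estimator is biased but has the smaller mean squared error.

```python
import numpy as np


def variance_estimates(x):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates of x."""
    x = np.asarray(x, dtype=float)
    biased = np.var(x)            # NumPy default, ddof=0: divide by n
    unbiased = np.var(x, ddof=1)  # ddof=1: divide by n - 1
    return biased, unbiased


def mse_of_estimators(n=5, trials=200_000, seed=0):
    """Monte Carlo mean squared error of both estimators.

    Assumes standard normal data, so the true variance is 1.0.
    """
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((trials, n))
    est_n = np.var(samples, axis=1, ddof=0)
    est_n1 = np.var(samples, axis=1, ddof=1)
    mse_n = np.mean((est_n - 1.0) ** 2)
    mse_n1 = np.mean((est_n1 - 1.0) ** 2)
    return mse_n, mse_n1


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
biased, unbiased = variance_estimates(sample)
print("divide by n:    ", biased)    # 32 / 8 = 4.0
print("divide by n - 1:", unbiased)  # 32 / 7

mse_n, mse_n1 = mse_of_estimators()
print("MSE, divide by n:    ", mse_n)
print("MSE, divide by n - 1:", mse_n1)
```

Passing ddof=1 explicitly is the way to get the divide-by-(n-1) estimate when
that is what an analysis calls for; the simulation only covers the Gaussian
case and says nothing about other distributions.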