[Numpy-discussion] What should be the result in some statistics corner cases?

Charles R Harris charlesr.harris at gmail.com
Sun Jul 14 17:35:29 EDT 2013


On Sun, Jul 14, 2013 at 2:55 PM, Warren Weckesser <
warren.weckesser at gmail.com> wrote:

> On 7/14/13, Charles R Harris <charlesr.harris at gmail.com> wrote:
> > Some corner cases in the mean, var, std.
> >
> > *Empty arrays*
> >
> > I think these cases should either raise an error or just return nan.
> > Warnings seem ineffective to me as they are only issued once by default.
> >
> > In [3]: ones(0).mean()
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61:
> > RuntimeWarning: invalid value encountered in double_scalars
> >   ret = ret / float(rcount)
> > Out[3]: nan
> >
> > In [4]: ones(0).var()
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> > RuntimeWarning: invalid value encountered in true_divide
> >   out=arrmean, casting='unsafe', subok=False)
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> > RuntimeWarning: invalid value encountered in double_scalars
> >   ret = ret / float(rcount)
> > Out[4]: nan
> >
> > In [5]: ones(0).std()
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> > RuntimeWarning: invalid value encountered in true_divide
> >   out=arrmean, casting='unsafe', subok=False)
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> > RuntimeWarning: invalid value encountered in double_scalars
> >   ret = ret / float(rcount)
> > Out[5]: nan
> >
> > *ddof >= number of elements*
> >
> > I think these should just raise errors. The results for ddof >= #elements
> > are happenstance, and negative numbers should certainly never be returned.
> >
> > In [6]: ones(2).var(ddof=2)
> >
> > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> > RuntimeWarning: invalid value encountered in double_scalars
> >   ret = ret / float(rcount)
> > Out[6]: nan
> >
> > In [7]: ones(2).var(ddof=3)
> > Out[7]: -0.0
> >
> > *nansum*
> >
> > Currently returns nan for empty arrays. I suspect it should return nan for
> > slices that are all nan, but 0 for empty slices. That would make it
> > consistent with sum in the empty case.
> >
>
>
> For nansum, I would expect 0 even in the case of all nans.  The point
> of these functions is to simply ignore nans, correct?  So I would aim
> for this behaviour:  nanfunc(x) behaves the same as func(x[~isnan(x)])
>
>
Agreed, although that changes current behavior. What about the other cases?
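
To make the proposed rule concrete, here is a rough sketch of what
nanfunc(x) == func(x[~isnan(x)]) would mean for nansum. The helper name
nansum_sketch is hypothetical and this is not how NumPy implements nansum;
it only covers the flattened case and ignores the axis argument.

```python
import numpy as np


def nansum_sketch(x):
    """Hypothetical helper: sum of x after dropping NaNs.

    Illustrates the rule nanfunc(x) == func(x[~isnan(x)]) for func = sum.
    Boolean indexing flattens the input, so there is no axis support.
    """
    x = np.asarray(x, dtype=float)
    return x[~np.isnan(x)].sum()


# Empty input: nothing to sum, so the result matches sum() on an empty array.
empty = nansum_sketch([])

# All-NaN input: every element is dropped, leaving an empty array again.
all_nan = nansum_sketch([np.nan, np.nan])

# Mixed input: only the finite values contribute.
mixed = nansum_sketch([1.0, np.nan, 2.0])
```

Under this rule the empty and all-nan cases both give 0.0, which is the
change from current behavior mentioned above.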

Chuck