[Numpy-discussion] the mean, var, std of empty arrays

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Nov 21 22:58:47 EST 2012


On Wed, Nov 21, 2012 at 10:35 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Wed, Nov 21, 2012 at 7:45 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Wed, Nov 21, 2012 at 9:22 PM, Olivier Delalleau <shish at keba.be> wrote:
>> > Current behavior looks sensible to me. I personally would prefer no
>> > warning
>> > but I think it makes sense to have one as it can be helpful to detect
>> > issues
>> > faster.
>>
>> I agree that nan should be the correct answer.
>> (I gave up trying to define a default for 0/0 in scipy.stats ttests.)
>>
>> some funnier cases
>>
>> >>> np.var([1], ddof=1)
>> 0.0
>
>
> This one is a nan in development.
>
>>
>> >>> np.var([1], ddof=5)
>> -0
>> >>> np.var([1,2], ddof=5)
>> -0.16666666666666666
>> >>> np.std([1,2], ddof=5)
>> nan
>>
>
> These still do this. Also
>
> In [10]: var([], ddof=1)
> Out[10]: -0
>
> Which suggests that the nan is pretty much an accidental byproduct of
> division by zero. I think it might make sense to have a definite policy for
> these corner cases.

Raising a ValueError here would also be consistent with the usual
pattern: ddof too large, size too small.
As long as we don't allow for missing values, the count is the same for
every column or row, so it can't happen that some of them get valid
answers while others don't.
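
To make the ValueError idea concrete, here is a minimal pure-Python sketch.
The name checked_var is hypothetical and not a numpy function; it only
illustrates a "ddof too large, size too small" check on a 1-d sequence and
is not a proposal for the actual implementation.

```python
def checked_var(values, ddof=0):
    """Variance of a 1-d sequence; refuses non-positive degrees of freedom.

    Hypothetical helper for illustration only, not part of numpy.
    """
    n = len(values)
    dof = n - ddof
    if dof <= 0:
        # Covers both the empty input and a ddof that is too large.
        raise ValueError(
            "ddof too large or size too small: size=%d, ddof=%d" % (n, ddof)
        )
    mean = sum(values) / n
    # Sum of squared deviations divided by the degrees of freedom.
    return sum((x - mean) ** 2 for x in values) / dof
```

With this check, checked_var([1, 2], ddof=1) returns 0.5, while
checked_var([1], ddof=1), checked_var([1, 2], ddof=5) and checked_var([])
all raise ValueError instead of returning nan, -0 or a negative number.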


Quick check with np.ma:

It looks correct, except when delegating to numpy?

>>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=5, axis=0)
>>> s
masked_array(data = [-- --],
             mask = [ True  True],
       fill_value = 1e+20)

>>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=1, axis=0)
>>> s
masked_array(data = [0.0 --],
             mask = [False  True],
       fill_value = 1e+20)

>>> s = np.ma.std([1,2], ddof=5)
>>> s
masked
>>> type(s)
<class 'numpy.ma.core.MaskedConstant'>

>>> np.ma.var([1,2], ddof=5)
-0.16666666666666666
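
The negative value is just what the textbook formula gives once
n - ddof goes negative. A small pure-Python sketch of that arithmetic
follows (naive_var is a hypothetical name, written for illustration; it
is not how numpy implements var, and it assumes a non-empty input):

```python
import math


def naive_var(values, ddof=0):
    """Sum of squared deviations over (n - ddof), with no sanity checks.

    Hypothetical helper for illustration only; requires len(values) > 0.
    """
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    return ss / (n - ddof)


# [1, 2]: mean 1.5, squared deviations 0.25 + 0.25 = 0.5, divisor 2 - 5 = -3
v = naive_var([1, 2], ddof=5)

# [1]: squared deviations sum to 0.0, divisor 1 - 5 = -4, giving negative zero
z = naive_var([1], ddof=5)
z_is_negative_zero = (z == 0.0) and math.copysign(1.0, z) == -1.0
```

Here v equals 0.5 / -3, the same -0.1666... as above, and z is a negative
zero, matching the -0 result. The standard deviation is the square root
of that negative number, which has no real value: math.sqrt(v) raises
ValueError in plain Python, and np.sqrt of a negative float returns nan.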


Josef

>
> <snip>
>
> Chuck
>
>