[Numpy-discussion] Nansum function behavior

Juan Nunez-Iglesias jni.soma at gmail.com
Sat Oct 24 02:08:18 EDT 2015


Hi Charles,


Just providing an outsider's perspective...




Your specific use-case doesn't address the general definition of nansum: perform a sum while ignoring nans. As others have pointed out (especially in the linked thread), the sum of nothing is 0. Although the current behaviour of nansum doesn't match your use-case, it clearly follows a consistent convention. "Wrong" is not the correct way to describe it.
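
A minimal sketch of that convention, assuming a current NumPy release (the variable names are only for illustration):

```python
import numpy as np

# Summing over no elements gives the additive identity, 0.
empty_sum = np.sum(np.array([]))

# nansum drops every nan first, so an all-nan array is a sum over nothing.
all_nan_sum = np.nansum(np.array([np.nan, np.nan]))

# With some real values present, only those values are added up.
mixed_sum = np.nansum(np.array([2.0, np.nan, 4.0]))

print(empty_sum, all_nan_sum, mixed_sum)  # 0.0 0.0 6.0
```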




You can easily cater to your use case as follows:




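import numpy as np

def rilhac_nansum(ar, axis=None):
    # Number of non-nan entries along the axis (0 for an all-nan slice).
    count = np.sum(~np.isnan(ar), axis=axis)
    # nanmean is nan for an all-nan slice, and nan * 0 stays nan.
    return np.nanmean(ar, axis=axis) * count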




nanmean _consistently_ returns nan for all-nan input, because the mean of nothing is nan (the sum of nothing divided by the length of nothing, i.e. 0/0).
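
As a rough check of that reasoning on the example from the quoted message below, here is a self-contained sketch. The helper name nansum_keep_nan is hypothetical, and silencing the RuntimeWarning that nanmean emits for all-nan slices is my own choice, not something the thread prescribes:

```python
import warnings
import numpy as np

def nansum_keep_nan(ar, axis=None):
    """Hypothetical helper: nan-ignoring sum that stays nan for all-nan slices."""
    ar = np.asarray(ar, dtype=float)
    # Number of non-nan entries along the axis (0 for an all-nan slice).
    count = np.sum(~np.isnan(ar), axis=axis)
    with warnings.catch_warnings():
        # nanmean warns on all-nan slices; the nan result is what we want here.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        mean = np.nanmean(ar, axis=axis)
    # mean * count recovers the sum; nan * 0 stays nan for all-nan slices.
    return mean * count

a = np.array([[2, np.nan, 4], [np.nan, np.nan, np.nan]])
row_sums = nansum_keep_nan(a, axis=1)  # first row sums to 6, second stays nan
overall = np.nanmean(row_sums)         # 6.0, since the nan row is ignored
```

With this variant the all-nan row no longer contributes a 0 to the final mean.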




Hope this helps...




Juan.

On Sat, Oct 24, 2015 at 12:44 PM, Charles Rilhac <webmastertux1 at gmail.com>
wrote:

> I saw this thread and I totally disagree with thouis's argument…
> Of course, you can have NaN if there are only NaNs. Thank goodness, there are a lot of ways to do that.
> But it's not convenient or consistent and, above all, it is logically wrong to do that. NaN does not mean zero, and an operation on only NaNs cannot return a number…
> You lose information about your array. It is easier to fill the result of nansum with zeros than to keep a mask of your original array or whatever you do.
> Why is it misleading?
> For example, you want to sum the rows of an array and take the mean of the result:
> a = np.array([[2,np.nan,4], [np.nan,np.nan, np.nan]])
> b = np.nansum(a, axis=1) # array([ 6.,  0.])
> m = np.nanmean(b) # 3.0 WRONG because you wanted to get 6
>> On 24 Oct 2015, at 09:28, Stephan Hoyer <shoyer at gmail.com> wrote:
>> 
>> Hi Charles,
>> 
>> You should read the previous discussion about this issue on GitHub:
>> https://github.com/numpy/numpy/issues/1721
>> 
>> For what it's worth, I do think the new definition of nansum is more consistent.
>> 
>> If you want to preserve NaN if there are no non-NaN values, you can often calculate this desired quantity from nanmean, which does return NaN if there are only NaNs.
>> 
>> Stephan
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion