[Numpy-discussion] Re: ndarray.fill and ma.array.filled

Michael Sorich michael.sorich at gmail.com
Mon Apr 10 17:18:15 EDT 2006


On 4/11/06, Sasha <ndarray at mac.com> wrote:
>
> On 4/10/06, Pierre GM <pgmdevlist at mailcan.com> wrote:
> > > > [... longish example snipped ...]
> > > >
> > > >>> ma.array([1,1], mask=[0,1]).sum()
> > >
> > > 1
> > So ? The result is not `masked`, the missing value has been omitted.
> >
> I am just making your point with a shorter example.
>
> > [...]
> > Mrf. I'm still not convinced, but I have nothing against it. Along with a
> > mask=False_ by default ?
> >
> It looks like there is little opposition here.  I'll submit a patch
> soon and unless better names are suggested, it will probably go in.
>
> > > With the current behavior, how would you achieve masking (no fill) a.sum()?
> > Er, why would I want to get MA.masked along one axis if one value is masked ?
>
> Because if you don't know one of the addends you don't know the sum.
> Replacing missing values with zeros is not always the right strategy.
> If you know that your data has non-zero mean, for example, you might
> want to replace missing values with the mean instead of zero.


I feel that, in general, implicitly replacing masked values will lead to
bugs in my code. Unless it is really obvious how a particular function
should deal with masked values, I would much prefer to be explicit about
it. In most cases there are several reasonable options for what can be
done. Masking the result when masked values are involved seems the most
transparent default.
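
A minimal sketch of what being explicit could look like, assuming the
current `numpy.ma` API (which may differ from the MA code under discussion
in this thread). The sample values and variable names are made up for
illustration:

```python
import numpy.ma as ma

# Four measurements, the third one is missing (masked).
a = ma.array([1.0, 2.0, 3.0, 6.0], mask=[0, 0, 1, 0])

# Option 1: treat the missing value as zero before summing.
sum_zero_filled = a.filled(0.0).sum()

# Option 2: treat the missing value as the mean of the known values.
# a.mean() ignores the masked element, so it is (1 + 2 + 6) / 3.
sum_mean_filled = a.filled(a.mean()).sum()

# Option 3: drop the missing value and sum only what is known.
sum_compressed = a.compressed().sum()
```

Options 1 and 3 agree for a sum, but they would differ for something like
a mean or a product, which is why the choice is worth spelling out.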

For example, it gives me a really bad feeling to think that sum will
automatically return the sum of all non-masked values. When dealing with
large datasets, I will not always know when I need to be careful about
missing values. Summing over only the non-masked values will often not be
the appropriate course, and I fear that I will not notice when this has
happened. If a masked value is returned, it is obvious what has happened,
and it is easy to go back and explicitly handle the masked data in another
way if appropriate.
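
The propagating behaviour can be sketched in a few lines. This assumes the
current `numpy.ma` API, and `strict_sum` is a hypothetical helper name, not
something the library provides:

```python
import numpy.ma as ma

def strict_sum(a):
    """Return ma.masked if any element is masked, else the ordinary sum."""
    a = ma.asarray(a)
    # getmaskarray always returns a full boolean array, even for nomask.
    if ma.getmaskarray(a).any():
        return ma.masked
    return a.sum()

partly_known = ma.array([1, 1], mask=[0, 1])  # example from this thread
fully_known = ma.array([1, 1])

# The plain method skips the masked element without any warning.
skipping_result = partly_known.sum()

# The strict version makes the missing addend visible in the result.
strict_result = strict_sum(partly_known)
```

Here `strict_result is ma.masked` holds, while `strict_sum(fully_known)`
gives the usual sum.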

Mike