[Numpy-discussion] Warnings in numpy.ma.test()

Eric Firing efiring at hawaii.edu
Thu Mar 18 17:12:55 EDT 2010

Ryan May wrote:
> On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker
> <Chris.Barker at noaa.gov> wrote:
>> Gael Varoquaux wrote:
>>> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote:
>>>> sure -- that's kind of my point -- if EVERY numpy array were
>>>> (potentially) masked, then folks would write code to deal with them
>>>> appropriately.
>>> That's pretty much saying: "I have a complicated problem and I want
>>> everyone else to have to deal with the full complexity of it, even if
>>> they have a simple problem".
>> Well -- I did say it was a fantasy...
>> But I disagree -- having invalid data is a very common case. What we
>> have now is a situation where we have two parallel systems, masked
>> arrays and regular arrays. Each time someone does something new with
>> masked arrays, they often find another missing feature, and have to
>> solve that. Also, the fact that masked arrays are tacked on means that
>> performance suffers.
> Case in point, I just found a bug in np.gradient where it forces the
> output to be an ndarray.
> (http://projects.scipy.org/numpy/ticket/1435).  Easy fix that doesn't
> actually require any special casing for masked arrays, just making
> sure to use the proper function to create a new array of the same
> subclass as the input.  However, now for any place that I can't patch
> I have to use a custom function until a fixed numpy is released.
> Maybe universal support for masked arrays (and masking invalid points)
> is a pipe dream, but every function in numpy should IMO deal properly
> with subclasses of ndarray.
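
For illustration, the kind of fix Ryan describes can be sketched as below.
This is not the patch from the ticket: forward_diff_drop and
forward_diff_keep are hypothetical helpers, and the only point is that
np.asarray throws away the subclass (and with it the mask), while
np.asanyarray lets an ndarray subclass pass through unchanged.

```python
import numpy as np


def forward_diff_drop(a):
    """Hypothetical helper: np.asarray discards the subclass (and the mask)."""
    a = np.asarray(a)
    return a[1:] - a[:-1]


def forward_diff_keep(a):
    """Hypothetical helper: np.asanyarray passes ndarray subclasses through."""
    a = np.asanyarray(a)
    return a[1:] - a[:-1]


# One invalid point, masked on construction.
data = np.ma.masked_invalid(np.array([1.0, np.nan, 4.0, 6.0]))

dropped = forward_diff_drop(data)  # plain ndarray; the NaN leaks into the result
kept = forward_diff_keep(data)     # MaskedArray; differences touching the NaN are masked

print(type(dropped).__name__, dropped)
print(type(kept).__name__, kept)
```

Neither helper special-cases masked arrays; the second one simply avoids
forcing the input to a base ndarray.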

1) This can't be done in general because subclasses can change things to 
the point where there is little one can count on.  The matrix subclass, 
for example, redefines multiplication and iteration, making it difficult 
to write functions that will work for ndarrays or matrices.
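
A small sketch of the matrix problem, assuming only np.array and
np.matrix; sum_of_squares_per_row is a hypothetical function written
with ndarray semantics in mind:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
m = np.matrix(a)

# Same operator, different meaning.
elementwise = a * a  # ndarray: element-by-element product
matprod = m * m      # matrix: matrix product

# Iteration differs too: ndarray rows are 1-D, matrix rows stay 2-D.
row_a = next(iter(a))  # shape (2,)
row_m = next(iter(m))  # shape (1, 2)


def sum_of_squares_per_row(x):
    """Hypothetical function that assumes ndarray behaviour."""
    return [float((row * row).sum()) for row in x]


print(sum_of_squares_per_row(a))

try:
    sum_of_squares_per_row(m)
except ValueError as exc:
    # Each row is a (1, 2) matrix, so row * row is an invalid matrix product.
    print("fails for matrix input:", exc)
```

The function never converts its input, so it honours the subclass, and
that is exactly why it breaks: the subclass has changed what "*" and
iteration mean.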

2) There is a lot that can be done to improve the handling of masked 
arrays, and I still believe that much of it should be done at the C 
level, where it can be done with speed and simplicity.  Unfortunately, 
figuring out how to do it well, and implementing it well, will require a 
lot of intensive work.  I suspect it won't get done unless we can figure 
out how to get a qualified person dedicated to it.


> Ryan
