[Numpy-discussion] Warnings in numpy.ma.test()

Darren Dale dsdale24 at gmail.com
Thu Mar 18 17:42:53 EDT 2010

On Thu, Mar 18, 2010 at 5:12 PM, Eric Firing <efiring at hawaii.edu> wrote:
> Ryan May wrote:
>> On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker
>> <Chris.Barker at noaa.gov> wrote:
>>> Gael Varoquaux wrote:
>>>> On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote:
>>>>> sure -- that's kind of my point -- if EVERY numpy array were
>>>>> (potentially) masked, then folks would write code to deal with them
>>>>> appropriately.
>>>> That's pretty much saying: "I have a complicated problem and I want
>>>> everyone else to have to deal with the full complexity of it, even if they
>>>> have a simple problem".
>>> Well -- I did say it was a fantasy...
>>> But I disagree -- having invalid data is a very common case. What we
>>> have now is a situation where we have two parallel systems, masked
>>> arrays and regular arrays. Each time someone does something new with
>>> masked arrays, they often find another missing feature, and have to
>>> solve that. Also, the fact that masked arrays are tacked on means that
>>> performance suffers.
>> Case in point, I just found a bug in np.gradient where it forces the
>> output to be an ndarray.
>> (http://projects.scipy.org/numpy/ticket/1435).  Easy fix that doesn't
>> actually require any special casing for masked arrays, just making
>> sure to use the proper function to create a new array of the same
>> subclass as the input.  However, now for any place that I can't patch
>> I have to use a custom function until a fixed numpy is released.
>> Maybe universal support for masked arrays (and masking invalid points)
>> is a pipe dream, but every function in numpy should IMO deal properly
>> with subclasses of ndarray.
> 1) This can't be done in general because subclasses can change things to
> the point where there is little one can count on.  The matrix subclass,
> for example, redefines multiplication and iteration, making it difficult
> to write functions that will work for ndarrays or matrices.

I'm more optimistic that it can be done in general, if we provide a
mechanism where the subclass with highest priority can customize the
execution of the function (ufunc or not). In principle, the subclass
could even override the buffer operation, like in the case of
matrices. This can still put a lot of responsibility on the authors of
the subclass, but what is gained is a framework where np.add (for
example) could yield the appropriate result for any subclass, as
opposed to the current situation of needing to know which add function
can be used for a particular type of input.
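
A rough sketch of what such a hook could look like. It uses the
__array_ufunc__ protocol that later NumPy releases (1.13 and newer)
provide, which is not necessarily the mechanism envisioned in this message.
The class name Tagged and its tag attribute are hypothetical, chosen only
for illustration, and the handling of out= arguments is omitted.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical ndarray subclass that carries a `tag` attribute."""

    def __new__(cls, data, tag=None):
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        # Runs for views and slices; inherit the tag when the source has one.
        self.tag = getattr(obj, "tag", None)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # View Tagged inputs as plain ndarrays so the base implementation
        # does the arithmetic (and we do not recurse into this hook).
        raw = [x.view(np.ndarray) if isinstance(x, Tagged) else x
               for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        if not isinstance(result, np.ndarray):
            # Scalars and tuples of outputs are returned unchanged here.
            return result
        # Re-wrap the result and carry over the first available tag.
        out = result.view(Tagged)
        out.tag = next(x.tag for x in inputs if isinstance(x, Tagged))
        return out


a = Tagged([1, 2, 3], tag="metres")
b = np.array([10, 20, 30])

# The same np.add call works whichever side the subclass appears on.
c = np.add(a, b)
d = np.add(b, a)
print(type(c).__name__, c.tag, type(d).__name__, d.tag)
```

The point of the sketch is that the caller keeps using np.add and the
subclass decides how the result is built, rather than the caller having to
pick a subclass-specific add function.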

All speculative, of course. I'll start throwing some examples together
when I get a chance.

