[Numpy-discussion] numpy.trapz() doesn't respect subclass

Ryan May rmay31 at gmail.com
Sat Mar 27 14:31:24 EDT 2010


On Sat, Mar 27, 2010 at 11:12 AM,  <josef.pktd at gmail.com> wrote:
> On Sat, Mar 27, 2010 at 1:00 PM, Ryan May <rmay31 at gmail.com> wrote:
>> On Mon, Mar 22, 2010 at 8:14 AM, Ryan May <rmay31 at gmail.com> wrote:
>>> On Sun, Mar 21, 2010 at 11:57 PM,  <josef.pktd at gmail.com> wrote:
>>>> On Mon, Mar 22, 2010 at 12:49 AM, Ryan May <rmay31 at gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I found that trapz() doesn't work with subclasses:
>>>>>
>>>>> http://projects.scipy.org/numpy/ticket/1438
>>>>>
>>>>> A simple patch (attached) to change asarray() to asanyarray() fixes
>>>>> the problem fine.
>>>>
>>>> Are you sure this function works with matrices and other subclasses?
>>>>
>>>> Looking only very briefly at it: the multiplication might be a problem.
>>>
>>> Correct, it probably *is* a problem in some cases with matrices.  In
>>> this case, I was using quantities (Darren Dale's unit-aware array
>>> package), and the result was that units were stripped off.
>>>
>>> The patch can't make trapz() work with all subclasses. However, right
>>> now, you have *no* hope of getting a subclass out of trapz().  With
>>> this change, subclasses that don't redefine operators can work fine.
>>> If you're passing a Matrix to trapz() and expecting it to work, IMHO
>>> you're doing it wrong.  You can still pass one in by using asarray()
>>> yourself.  Without this patch, I'm left with copying and maintaining a
>>> copy of the code elsewhere, just so I can loosen the function's input
>>> processing. That seems wrong, since there's really no need in my case
>>> to drop down to an ndarray. The input I'm giving it supports all the
>>> operations it needs, so it should just work with my original input.
>
> With asarray it gives correct results for matrices and all array_like
> and subclasses, it just doesn't preserve the type.
> Your patch would break matrices and possibly other types, masked_arrays?, ...

It would break matrices, yes.  I would argue that masked arrays are
already broken with trapz:

In [1]: x = np.arange(10)

In [2]: y = x * x

In [3]: np.trapz(y, x)
Out[3]: 244.5

In [4]: ym = np.ma.array(y, mask=(x>4)&(x<7))

In [5]: np.trapz(ym, x)
Out[5]: 244.5

In [6]: y[5:7] = 0

In [7]: ym = np.ma.array(y, mask=(x>4)&(x<7))

In [8]: np.trapz(ym, x)
Out[8]: 183.5

Because of the call to asarray(), the mask is completely discarded and
you end up with identical results to an unmasked array,
which is not what I'd expect.  Worse, the actual numeric value of the
positions that were masked affect the final answer. My patch allows
this to work as expected too.

> One solution would be using arraywrap as in numpy.linalg.

By arraywrap, I'm assuming you mean:

def _makearray(a):
    new = asarray(a)
    wrap = getattr(a, "__array_prepare__", new.__array_wrap__)
    return new, wrap

I'm not sure if that's identical to just letting the subclass handle
what's needed.  To my eyes, that doesn't look as though it'd be
equivalent, both for handling masked arrays and Quantities. For
quantities at least, the result of trapz will have different units
than either of the inputs.

> for related discussion:
> http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html

Actually, that discussion kind of makes my point.  Matrices are a pain
to make work in a general sense because they *break* ndarray
conventions--to me it doesn't make sense to help along classes that
break convention at the expense of making well-behaved classes a pain
to use.  You should need an *explicit* cast of a matrix to an ndarray
instead of the function quietly doing it for you. ("Explicit is better
than implicit") It just seems absurd that if I make my own ndarray
subclass that *just* adds some behavior to the array, but doesn't
break *any* operations, I need to do one of the following:

1) Have my own copy of trapz that works with my class
2) Wrap every call to numpy's own trapz() to put the metadata back.

Does it not seem backwards that the class that breaks conventions
"just works" while those that don't break conventions, will work
perfectly with the function as written, need help to be treated
properly?

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma



More information about the NumPy-Discussion mailing list