[Numpy-discussion] ndarray subclassing

Thu May 1 00:37:18 EDT 2008

Hi!

I ran into some strange (at least to me) issues with subclasses of
ndarray. The following minimal class definition illustrates the
problem:

====================================================

import numpy as np

class TestArray(np.ndarray):
    def __new__(cls, data, info=None, dtype=None, copy=False):
        # Build a base ndarray, then view it as the subclass.
        subarr = np.array(data, dtype=dtype, copy=copy)
        subarr = subarr.view(cls)
        return subarr

    def __array_finalize__(self, obj):
        # Report the shape of the new array (self) and of its source (obj).
        print("self: ", self.shape)
        print("obj: ", obj.shape)

=====================================================

When I run this code interactively in IPython and then create
TestArray instances, __array_finalize__ seems to get called whenever
an array with more than one dimension is printed, and self.shape seems
to drop a dimension. Everything works fine if the array has just one
dimension:

In [3]: x = TestArray(np.arange(5))
self:  (5,)
obj:  (5,)

In [4]: x
Out[4]: TestArray([0, 1, 2, 3, 4])

This is all expected behavior.
However, things change when the array is 2-D:

In [5]: x = TestArray(np.zeros((2,3)))
self:  (2, 3)
obj:  (2, 3)

In [6]: x
Out[6]: self:  (3,)
obj:  (2, 3)
self:  (3,)
obj:  (2, 3)

TestArray([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
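
For comparison, indexing a single row explicitly also calls
__array_finalize__ with self being the row and obj the full array. Here
is a minimal sketch of that check. LoggedArray and its calls list are
hypothetical names of mine (not part of numpy); it records the shapes
instead of printing them:

```python
import numpy as np


class LoggedArray(np.ndarray):
    # Hypothetical stand-in for TestArray: logs shapes instead of printing.
    calls = []  # class-level log of (self.shape, obj.shape) pairs

    def __new__(cls, data, dtype=None):
        # View-cast the input; this alone triggers __array_finalize__ once.
        return np.asarray(data, dtype=dtype).view(cls)

    def __array_finalize__(self, obj):
        # obj is None only for explicit ndarray-style construction.
        if obj is not None:
            LoggedArray.calls.append((self.shape, obj.shape))


x = LoggedArray(np.zeros((2, 3)))
del LoggedArray.calls[:]  # discard the entry logged during construction
row = x[0]                # indexing a row creates a new LoggedArray view
print(LoggedArray.calls)
```

After x[0], the log should contain an entry pairing the row shape (3,)
with the parent shape (2, 3), which is the same pattern as in the
output above.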

Now, when the array is printed, __array_finalize__ seems to get
called twice, and each time self seems to refer to only one row of the
array. Can anybody explain what is going on and why? This behavior
seems to lead to problems when the __array_finalize__ method performs
checks on the shape of the array. In the matrix class this seems to be
circumvented with a special _getitem flag that bypasses the shape
checks in __array_finalize__, and an analogous solution works for my
class, too. However, I'm still puzzled by this behavior and am hoping
that somebody here can shed some light on it.
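
To make the workaround concrete, here is a rough sketch of the kind of
flag-based bypass I mean. It is modeled loosely on the matrix class, not
copied from it: CheckedArray and the 2-D-only shape check are
hypothetical, and only the _getitem name is borrowed from matrix.

```python
import numpy as np


class CheckedArray(np.ndarray):
    # Hypothetical subclass that insists on being 2-D.
    _getitem = False  # class-level default, so every instance has the flag

    def __new__(cls, data, dtype=None):
        return np.asarray(data, dtype=dtype).view(cls)

    def __array_finalize__(self, obj):
        # obj is None only for explicit ndarray-style construction.
        if obj is None:
            return
        # Skip the shape check for arrays created while indexing the parent.
        if getattr(obj, "_getitem", False):
            return
        if self.ndim != 2:
            raise ValueError("CheckedArray must be 2-D, got %r" % (self.shape,))

    def __getitem__(self, index):
        # Set the flag on the parent while ndarray.__getitem__ builds the view.
        self._getitem = True
        try:
            return np.ndarray.__getitem__(self, index)
        finally:
            self._getitem = False


x = CheckedArray(np.zeros((2, 3)))
row = x[0]  # 1-D result, but the shape check is bypassed via the flag
```

With the flag in place, x[0] comes back as a 1-D CheckedArray without
tripping the check, while view-casting 1-D data directly (for example
CheckedArray(np.arange(5))) still raises ValueError.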

Thanks!
CTW