On Mon, Oct 24, 2011 at 5:54 PM, David Voong <voong.david@gmail.com> wrote:
Hi guys,

I have a question regarding subclassing of the numpy.matrix class.

I read through the wiki page, http://docs.scipy.org/doc/numpy/user/basics.subclassing.html

and tried to subclass numpy.matrix. I find that if I override the __array_finalize__ method, calling the sum method fails with the following error:


Traceback (most recent call last):
  File "test.py", line 61, in <module>
    print (a * b).sum()
  File "/afs/cern.ch/user/d/dvoong/programs/lib/python2.6/site-packages/numpy/matrixlib/defmatrix.py", line 435, in sum
    return N.ndarray.sum(self, axis, dtype, out)._align(axis)
  File "/afs/cern.ch/user/d/dvoong/programs/lib/python2.6/site-packages/numpy/matrixlib/defmatrix.py", line 370, in _align
    return self[0,0]
  File "/afs/cern.ch/user/d/dvoong/programs/lib/python2.6/site-packages/numpy/matrixlib/defmatrix.py", line 305, in __getitem__
    out = N.ndarray.__getitem__(self, index)
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index



Hi,

Thanks for asking this. I'm also trying to subclass np.matrix and running into similar problems; I rarely need to sum my vectors, so I hadn't noticed this particular one until now.

Anyway, for np.matrix, there are definitely particular issues beyond what is described on the array subclassing wiki. I think I have a workaround, based on struggling with my own subclass. This is really a hack since I'm not sure how some parts of matrix actually work, so if someone has a better solution please speak up!

You didn't give details on the actual subclass, but I can recreate the error with the following minimal example (tested with NumPy 1.6.1 inside EPD 7.1):

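import numpy as np

class MatSubClass1(np.matrix):
    def __new__(cls, input_array):
        # View the input data as an instance of the subclass.
        obj = np.asarray(input_array).view(cls)
        return obj
    def __array_finalize__(self, obj):
        # Overridden with a no-op, so np.matrix.__array_finalize__ never runs.
        pass
    def __array_wrap__(self, out_arr, context=None):
        return np.ndarray.__array_wrap__(self, out_arr, context)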

In [2]: m1 = MatSubClass1( [[2,0],[1,1]] )
In [3]: m1.sum()
...
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
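
The IndexError itself can be reproduced without matrix at all. Judging from the traceback, matrix.sum finishes by indexing the reduced result with [0,0], and that only works if the result is still 2-D. Here is a minimal sketch on a plain ndarray; the variable names are mine, and the exact error message depends on the NumPy version:

```python
import numpy as np

total = np.asarray([[2, 0], [1, 1]]).sum()   # full reduction of a plain ndarray
zero_d = np.asarray(total)                   # 0-d array holding the total
two_d = zero_d.reshape(1, 1)                 # same value kept as a 1x1 array

try:
    zero_d[0, 0]                             # two indices on a 0-d array
    raised = False
except IndexError:
    raised = True

print(zero_d.ndim, raised)   # 0 True
print(two_d[0, 0])           # 4
```

So the [0,0] lookup is fine as long as something has restored the result to two dimensions first.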


The problem is that the matrix class's __array_finalize__ needs to be called in order to preserve the dimensions (a matrix should always have 2 dimensions). You can't simply call the matrix __array_finalize__ unconditionally, because the first call happens when you create the object, in which case obj is an ndarray object, not a matrix. So check that obj is a matrix before calling it. In addition, there is an undocumented _getitem attribute inside matrix, and I do not know what it does. If you just set that attribute during __new__, you get something that seems to work:

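class MatSubClass2(np.matrix):
    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        # Undocumented matrix attribute; setting it by hand seems to be needed.
        obj._getitem = False
        return obj
    def __array_finalize__(self, obj):
        # Delegate only when obj is already a matrix. On the first call
        # (triggered by view() in __new__) obj is a plain ndarray.
        if isinstance(obj, np.matrix):
            np.matrix.__array_finalize__(self, obj)
    def __array_wrap__(self, out_arr, context=None):
        return np.ndarray.__array_wrap__(self, out_arr, context)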

In [4]: m2 = MatSubClass2( [[2,0],[1,1]] )

In [5]: m2.sum(), m2.sum(0), m2.sum(1)
Out[5]: (4, matrix([[3, 1]]), matrix([[2], [2]]))
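
To see which object each __array_finalize__ call receives, here is a small tracing sketch. TracedMat and calls are hypothetical names introduced only for illustration; the class follows the same pattern as MatSubClass2, minus the __array_wrap__ override, plus a log of the type of obj. I have only asserted things I expect to hold across NumPy versions, since the internals of sum have changed since 1.6.1:

```python
import numpy as np

calls = []  # type name of obj for every __array_finalize__ call

class TracedMat(np.matrix):
    # Hypothetical class: the MatSubClass2 pattern with a call log added.
    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        obj._getitem = False
        return obj

    def __array_finalize__(self, obj):
        calls.append(type(obj).__name__)
        # Delegate only once obj is itself a matrix.
        if isinstance(obj, np.matrix):
            np.matrix.__array_finalize__(self, obj)

m = TracedMat([[2, 0], [1, 1]])
print(calls)                          # ['ndarray']

row = m[0]                            # indexing a row keeps two dimensions
print(type(row).__name__, row.shape)  # TracedMat (1, 2)

col_sums = m.sum(0)
print(col_sums.shape)                 # (1, 2)
```

The first logged entry confirms that the call made during construction sees a plain ndarray, which is why the isinstance check is needed before delegating to the matrix version.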


HTH,
Aronne