Trouble subclassing ndarray

Hi all,
Sorry if this is the wrong forum for a question like this.
I'm trying to create an object with multiple inheritance, where one of the base classes is numpy.ndarray. The other gives it cacheable properties and defines a __getattr__ to deliver the cached properties. The initial instantiation is successful, but ufuncs and slices cause an infinite recursion in which __getattr__ is called but the _cacheable attribute (the one set in __init__) does not exist.
I am using docs.scipy.org/doc/numpy/user/basics.subclassing.html as a reference. Here is code that shows my problem (python 2.7, numpy 1.8.2)
=====================
from __future__ import print_function
import numpy as np

class Cacheable(object):

    def __init__(self, *args, **kwargs):
        self._cacheable = {}

    def __getattr__(self, key):
        print("getting %s" % key)
        if key in self._cacheable:
            print(" found it")
            self._cacheable[key]()
            # if the cache function doesn't update the data
            # you're going to have a bad time
            return self.__dict__[key]
        else:
            raise AttributeError

    def _clear(self):
        '''clears derived properties'''
        for key in self._cacheable:
            if key in self.__dict__:
                del self.__dict__[key]

class BaseGeometry(np.ndarray, Cacheable):
    '''Numpy array with extra attributes that are cacheable arrays'''

    def __new__(cls, input_array, dtype=np.float64, *args, **kwargs):
        # Input array is an already formed ndarray instance
        # We first cast to be our class type
        obj = np.asarray(input_array, dtype=dtype).view(cls)
        # Finally, we must return the newly created object:
        return obj

    def __init__(self, dims=None, dtype=np.float64, readonly=True):
        #TODO: sort through args and kwargs to make better
        self.readonly = readonly
        if readonly:
            self.flags.writeable = False
        self.dims = dims
        self._dtype = dtype
        Cacheable.__init__(self)

    def writeable_copy(self):
        ret = np.copy(self)
        ret.flags.writeable = True
        return ret

    def __array_finalize_(self, obj):
        #New object, will be created in __new__
        print("array_finalize")
        if obj is None:
            return
        # created from slice or template
        print("finalizing slice")
        self._cacheable = getattr(obj, '_cacheable', None)

if __name__ == "__main__":
    n = 5
    test = BaseGeometry(np.random.randint(-25, 25, (n, 2)))
    print("this works:", test._cacheable)
    broken = test[1:4]  #interestingly, no problem here
    print(broken)  #infinite recursion
===================
array_finalize is never called.
output is:
this works: {}
getting _cacheable
getting _cacheable
getting _cacheable
[and on and on]
getting _cacheable
getting _cacheable
<repr(<__main__.BaseGeometry at 0x7fa42e2921b8>) failed: RuntimeError: maximum recursion depth exceeded while calling a Python object>
-- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Trouble-subclassing-ndarray-tp40... Sent from the Numpy-discussion mailing list archive at Nabble.com.

On Fri, Apr 10, 2015 at 12:30 AM, Elliot permafacture@gmail.com wrote:
> I'm trying to create an object with multiple inheritance, one of which is from numpy.ndarray.
Sorry that this is an unhelpful answer, but I want to quickly say that this sentence sets off all kinds of alarm bells in my mind. Subclassing ndarray is almost always a bad idea (really it is always a bad idea, just sometimes you have absolutely no alternative), and multiple inheritance is almost always a bad idea (well, personally I think it actually always is a bad idea, but I recognize that opinions differ), and I am 99.999% sure that any design that can be described by the sentence quoted above is a design that you will look back on and regret. Sorry to be the bearer of bad news. Maybe you can just have a simple object that implements the cacheable behaviour and also has an ndarray as an attribute (i.e., your object would have a HAS-A relationship with ndarray instead of an IS-A one)?
Hopefully someone with a bit more time will be more helpful and figure out what is actually going on in your example, I'm sure it's some wackily weird issue...
-n
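
A minimal sketch of the HAS-A design suggested above, assuming the cached values can be computed from the array on demand. The names `GeometryHolder`, `points`, and `centroid` are hypothetical and do not come from the original post; this is one possible shape for such an object, not the poster's design.

```python
import numpy as np


class GeometryHolder(object):
    """Hypothetical wrapper that owns an ndarray instead of subclassing it."""

    def __init__(self, points, dtype=np.float64):
        self.points = np.asarray(points, dtype=dtype)
        self._cache = {}

    @property
    def centroid(self):
        # Compute on first access, then reuse the stored value.
        if 'centroid' not in self._cache:
            self._cache['centroid'] = self.points.mean(axis=0)
        return self._cache['centroid']

    def clear(self):
        """Drop all derived values, e.g. after modifying self.points."""
        self._cache.clear()

    def __getitem__(self, index):
        # Slicing returns a new wrapper with its own empty cache.
        return GeometryHolder(self.points[index])


geom = GeometryHolder([[0, 0], [2, 0], [2, 2], [0, 2]])
print(geom.centroid)
print(geom[1:3].centroid)
```

Because the array is an ordinary attribute, there is no `__array_finalize__` or `__getattr__` to get wrong; the cost is that NumPy functions must be called on `geom.points` rather than on `geom` itself.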

On Thu, 2015-04-09 at 21:30 -0700, Elliot wrote:
> I'm trying to create an object with multiple inheritance, one of which is from numpy.ndarray. The other gives it cacheable properties and defines a __getattr__ to deliver the cached properties. The initial instantiation is successful, but ufuncs and slices cause an infinite recursion where the __getattr__ function is used but the _cacheable attribute is not set (ie: from __init__ )
You have a typo in your __array_finalize__: it is missing the last underscore, which is probably why it is never called. As for the infinite recursion, I'm not sure at first sight.
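
A minimal sketch of the corrected hook, with the multiple inheritance collapsed into a single hypothetical class `Geometry` for brevity. It also illustrates one likely source of the recursion: `__getattr__` is only invoked when normal lookup fails, so reading `self._cacheable` inside it re-enters `__getattr__` whenever `_cacheable` itself is missing (as it is on a slice if `__array_finalize__` never ran). Reading from `self.__dict__` avoids that. This is an assumption-laden illustration, not the poster's final code.

```python
import numpy as np


class Geometry(np.ndarray):
    """Hypothetical single-class version of the poster's design."""

    def __new__(cls, input_array, dtype=np.float64):
        return np.asarray(input_array, dtype=dtype).view(cls)

    def __array_finalize__(self, obj):
        # Note the two trailing underscores. NumPy calls this for explicit
        # construction, view casting, and new-from-template (slices, ufunc
        # results). Views share the parent's registry dict here.
        self._cacheable = getattr(obj, '_cacheable', {})

    def __getattr__(self, key):
        # Go through __dict__ so a missing _cacheable cannot recurse.
        cacheable = self.__dict__.get('_cacheable', {})
        if key in cacheable:
            cacheable[key]()  # expected to store the value in self.__dict__
            return self.__dict__[key]
        raise AttributeError(key)


g = Geometry([[1, 2], [3, 4], [5, 6]])
part = g[1:3]
print(type(part).__name__, part._cacheable)
print(part)
```

Sharing one dict between an array and its views is a design choice made only for this sketch; copying it in `__array_finalize__` would give each view an independent registry.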

> You have a typo in your __array_finalize__ it misses the last underscore, that is probably why it is never called. About the infinite recursion, not sure on first sight.
Oh gosh, it was the underscore! No more infinite recursion. I was searching all over for a misspelled "_cacheable".
> Subclassing ndarray is almost always a bad idea (really it is always a bad idea, just sometimes you have absolutely no alternative), and multiple inheritance is almost always a bad idea (well, personally I think it actually always is a bad idea, but I recognize that opinions differ), and I am 99.999% sure that any design that can be described by the sentence quoted above is a design that you will look back on and regret.
So, now that this works, I'm open to hearing more about why this is an awful idea (if it is). Why might I regret this later? And will this add object creation overhead to every ufunc and slice, or otherwise degrade performance?
Thanks

On Fri, 2015-04-10 at 11:40 -0500, Elliot Hallmark wrote:
> So, now that this works, I'm open to hearing more about why this is an awful idea (if it is). Why might I regret this later? And will this add object creation overhead to every ufunc and slice, or otherwise degrade performance?
Performance-wise it should not matter significantly; that is not the problem. However, know that you will never get it to work quite right with some functionality. So if at some point you find a quirk and do not know how to fix it, it is quite likely that there is no fix. On the other hand, if you do not mind losing your subclass (i.e. getting a normal array back) from some numpy functions, and you do not do weirder things (messing with the shape or such), you are probably mostly fine. For bigger projects I might still worry about whether it is the right path, but for something more limited, maybe it is the simplest way to get to something good enough.
- Sebastian
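
A small sketch of what "losing your subclass" can look like in practice, using a hypothetical empty subclass `Tagged`. Slices, ufunc results, and the `.copy()` method keep the subclass, while `np.asarray` and `np.copy` (with its default arguments) hand back a plain ndarray. This is relevant to the `writeable_copy` method in the original post, which calls `np.copy(self)`.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical do-nothing subclass, used only to track the type."""
    pass


t = np.arange(6.0).view(Tagged)

kept = {
    'slice': type(t[1:4]),
    'ufunc': type(t + 1),
    'method copy': type(t.copy()),
}
lost = {
    'np.asarray': type(np.asarray(t)),
    'np.copy': type(np.copy(t)),
}

for name, kind in sorted(kept.items()):
    print(name, '->', kind.__name__)
for name, kind in sorted(lost.items()):
    print(name, '->', kind.__name__)
```

Any extra attributes attached to the subclass instance are gone wherever the result is a plain ndarray, so code that relies on them has to be prepared for that case.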
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
participants (4): Elliot, Elliot Hallmark, Nathaniel Smith, Sebastian Berg