I have subclassed the numpy.ndarray object, but need some help setting some attributes. I have read http://scipy.org/Subclasses but it doesn't provide the answer I am looking for. I create an instance of the class in my __new__ method as:

    import numpy

    class MyClass(numpy.ndarray):
        def __new__(self, …):
            # Some stuff here
            H, edges = numpy.histogramdd(…)
            return H

This sets H as the instance of my object. I would also like to have edges be an attribute of MyClass. I can't do:

    self.edges = edges

because the object hasn't been instantiated yet. Can someone show me how I can also keep the information from the variable edges?

Thanks,
Jeremy
On Sunday 04 February 2007 20:22:44 Jeremy Conlin wrote:
I have subclassed the numpy.ndarray object, but need some help setting some attributes. I have read http://scipy.org/Subclasses but it doesn't provide the answer I am looking for.
Actually, yes: in the example given in http://scipy.org/Subclasses, an attribute 'info' is defined from a class-generic one, '__defaultinfo'. Just do the same thing with your 'edges':

    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges

That should do the trick.
On 2/4/07, Pierre GM <pgmdevlist@gmail.com> wrote:
Actually, yes: in the example given in http://scipy.org/Subclasses, an attribute 'info' is defined from a class-generic one, '__defaultinfo'. Just do the same thing with your 'edges':

    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges

That should do the trick.
Thanks for clarifying that. I didn't understand what the __array_finalize__ did. Jeremy
On Monday 05 February 2007 11:32:22 Jeremy Conlin wrote:
Thanks for clarifying that. I didn't understand what the __array_finalize__ did.
That means I should clarify some points on the wiki, then. A good exercise is to put some temporary comments in your code in __new__ and __array_finalize__, to show when these methods are called and how (that's how I learned).

Thinking about it, the example you gave can't work. Your __new__ method returns H, viz. a pure ndarray. There won't be any call to __array_finalize__ in that case, which is not what you want. Force the call by accessing a view of your array:

    import numpy as N

    class myhistog(N.ndarray):
        def __new__(self, iniarray, inibin):
            # Note: in __new__ the first argument is the class, not an instance.
            (H,edges) = N.histogramdd(iniarray,inibin)
            self._defedges = edges
            return H.view(self)

Now you do return a 'myhistog' instance, not a pure 'ndarray', and __array_finalize__ is called.

        def __array_finalize__(self, obj):
            print("__array_finalize__ got %s as %s" % (obj, type(obj)))
            if not hasattr(self, 'edges'):
                self.edges = self._defedges
            myhistog._defedges = None

Note the last line: you reset the class default to None (if this is what you want). Otherwise, new 'myhistog' objects will inherit the previous edges.
On 2/5/07, Pierre GM <pgmdevlist@gmail.com> wrote:
That means I should clarify some points on the wiki, then. A good exercise is to put some temporary comments in your code in __new__ and __array_finalize__, to show when these methods are called and how (that's how I learned)
Thinking about it, the example you gave can't work. Your __new__ method returns H, viz, a pure ndarray. There won't be any call to __array_finalize__ in that case, which is not what you want. Force the call by accessing a view of your array:
    import numpy as N

    class myhistog(N.ndarray):
        def __new__(self, iniarray, inibin):
            # Note: in __new__ the first argument is the class, not an instance.
            (H,edges) = N.histogramdd(iniarray,inibin)
            self._defedges = edges
            return H.view(self)

Now you do return a 'myhistog' instance, not a pure 'ndarray', and __array_finalize__ is called.

        def __array_finalize__(self, obj):
            print("__array_finalize__ got %s as %s" % (obj, type(obj)))
            if not hasattr(self, 'edges'):
                self.edges = self._defedges
            myhistog._defedges = None

Note the last line: you reset the class default to None (if this is what you want). Otherwise, new 'myhistog' objects will inherit the previous edges.
Excellent, now it does what I want! But it raises more questions. What exactly is a "view" of H?

Thanks again,
Jeremy
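The thread does not answer this question directly. As a rough illustration only, assuming a current NumPy and a hypothetical empty subclass named Tagged (not from the thread): a view is a new array object that looks at the same memory as the original, so H.view(cls) changes the Python type without copying any data.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical empty subclass, used only to illustrate view casting."""
    pass


H = np.arange(6.0).reshape(2, 3)   # a plain ndarray
V = H.view(Tagged)                 # same data, seen as a Tagged instance

is_subclass_instance = isinstance(V, Tagged)   # the view has the new type
shares_buffer = np.shares_memory(H, V)         # no data was copied
V[0, 0] = 99.0                                 # writing through the view...
original_changed = bool(H[0, 0] == 99.0)       # ...changes the original too

print(type(H).__name__, type(V).__name__)
print(is_subclass_instance, shares_buffer, original_changed)
```

Because creating the view builds a new instance of the subclass, this is also the step at which __array_finalize__ gets its chance to run.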
    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges

So in order to get an instance attribute, one has to temporarily define it as a class attribute? What happens if there is a thread switch between __new__ and __array_finalize__? This design is not thread safe and can produce strange race conditions.

IMHO, the preferred way to set an instance attribute is to use the __init__ method, which is the 'Pythonic' way to do it.

Sturla Molden
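The shared-state concern can be seen even without threads. The sketch below is an assumption-laden illustration, not code from the thread: LeakyHist is a hypothetical class that uses the class-level default pattern without resetting it, and the sample data is made up. It shows an unrelated array picking up edges left behind by an earlier construction; it does not attempt to reproduce an actual race.

```python
import numpy as np


class LeakyHist(np.ndarray):
    """Hypothetical class using a class-level default, with no reset."""
    _defedges = None

    def __new__(cls, sample, bins):
        H, edges = np.histogramdd(sample, bins)
        cls._defedges = edges          # shared by every instance (and thread)
        return H.view(cls)

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self._defedges


rng = np.random.default_rng(0)
a = LeakyHist(rng.random((100, 2)), 4)

# An unrelated array viewed as LeakyHist inherits the edges left behind by `a`.
stray = np.zeros(3).view(LeakyHist)
leaked = stray.edges is a.edges
print(leaked)
```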
On Tue, Feb 06, 2007 at 01:06:37PM +0100, Sturla Molden wrote:
    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges

IMHO, the preferred way to set an instance attribute is to use the __init__ method, which is the 'Pythonic' way to do it.
I don't pretend to know all the inner workings of subclassing, but I don't think that would work, given the following output:

    In [1]: import numpy as N

    In [2]: import numpy as N

    In [3]: class MyArray(N.ndarray):
       ...:     def __new__(cls,data):
       ...:         return N.asarray(data).view(cls)
       ...:
       ...:     def __init__(self,obj):
       ...:         print("This is where __init__ is called")
       ...:
       ...:     def __array_finalize__(self,obj):
       ...:         print("This is where __array_finalize__ is called")
       ...:

    In [4]: x = MyArray(3)
    This is where __array_finalize__ is called
    This is where __init__ is called

    In [5]: y = N.array([1,2,3])

    In [6]: x+y
    This is where __array_finalize__ is called
    Out[6]: MyArray([4, 5, 6])

Regards
Stéfan
I don't pretend to know all the inner workings of subclassing, but I don't think that would work, given the following output:
    In [6]: x+y
    This is where __array_finalize__ is called
    Out[6]: MyArray([4, 5, 6])

Why is __new__ not called for the return value of x + y? Does it call __new__ for ndarray instead of MyArray?
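One way to explore this question is to record which hook runs for each way an instance can come into being. The following is a minimal sketch, assuming a current NumPy; the `calls` list and the variable names are illustrative additions, not part of the thread. It suggests that slicing and arithmetic build the new MyArray inside NumPy without going through the Python-level __new__, and that __array_finalize__ is the only hook that sees every new instance.

```python
import numpy as np

calls = []   # records which hook ran and what kind of object it was given


class MyArray(np.ndarray):
    def __new__(cls, data):
        calls.append("__new__")
        return np.asarray(data).view(cls)

    def __array_finalize__(self, obj):
        calls.append("__array_finalize__ from " + type(obj).__name__)


x = MyArray([1, 2, 3])        # explicit construction: __new__, then the view
after_construct = list(calls)

calls.clear()
s = x[1:]                     # slicing: finalize receives the MyArray parent
after_slice = list(calls)

calls.clear()
z = x + np.array([1, 2, 3])   # arithmetic: the result is still a MyArray
after_add = list(calls)

print(after_construct)
print(after_slice)
print(after_add, type(z).__name__)
```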
On 2/6/07, Sturla Molden <sturla@molden.no> wrote:
    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges

So in order to get an instance attribute, one has to temporarily define it as a class attribute? What happens if there is a thread switch between __new__ and __array_finalize__? This design is not thread safe and can produce strange race conditions.

IMHO, the preferred way to set an instance attribute is to use the __init__ method, which is the 'Pythonic' way to do it.
Sturla Molden
Yes, using __init__ to set an instance attribute is the Pythonic way to do this. However, I calculate/create the data in __new__. The data is unavailable to __init__.

Jeremy
Yes, using __init__ to set an instance attribute is the Pythonic way to do this. However, I calculate/create the data in __new__. The data is unavailable to __init__.
The signatures of __new__ and __init__ are:

    def __new__(cls, *args, **kwds)
    def __init__(self, *args, **kwds)

If __new__ has access to the data, __init__ has access to the data as well. But in order for __init__ to be called, __new__ must return an instance of cls. Otherwise, Python leaves the object as returned by __new__.

But it remains that the subclassing example is not thread safe. The only way to make it thread safe would be if __new__ sets a global lock and __array_finalize__ releases it. I think NumPy can get away with this because it holds the GIL inside its C extension, but when you subclass ndarray in Python, the GIL is released.
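The rule about when __init__ runs can be checked with plain Python classes, no NumPy needed. This is only a small sketch; Kept and Skipped are hypothetical names chosen for illustration.

```python
class Kept:
    def __new__(cls, *args, **kwds):
        return super().__new__(cls)      # an instance of cls, so __init__ runs

    def __init__(self, *args, **kwds):
        self.seen = args                 # the same arguments __new__ received


class Skipped:
    def __new__(cls, *args, **kwds):
        return list(args)                # not an instance of cls

    def __init__(self, *args, **kwds):
        raise AssertionError("never reached")


k = Kept(1, 2)
s = Skipped(1, 2)
print(k.seen, s)
```

Applied to the thread: a __new__ that returns the plain ndarray H behaves like Skipped, while one that returns H.view(cls) behaves like Kept. In the second case __init__ does receive the constructor arguments, but not local variables such as edges that were computed inside __new__.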
Sturla Molden wrote:
    def __new__(cls, ...):
        ...
        (H, edges) = numpy.histogramdd(..)
        cls.__defaultedges = edges

    def __array_finalize__(self, obj):
        if not hasattr(self, 'edges'):
            self.edges = self.__defaultedges
So in order to get an instance attribute, one has to temporarily define it as a class attribute?
No, you don't *have* to do it this way for all instance attributes. In this example, the user was trying to keep the edges computed during the __new__ method as an attribute. What are the possibilities?

1) Use the __new__ method to create the object in full and then store the edges in some kind of global (or class global) variable. Because it uses global variables, this solution has all of the thread problems that global variables bring.

2) Create a "dummy" array object in the __new__ method and fill it in (i.e. using setstate or resize) during the __init__ method, where the instance attribute is actually set.

The __array_finalize__ method is intended for "passing on" attributes to sub-classes from parent classes during operations where __new__ and __init__ are not called (but a new instance is still created). It was not intended to be used in all circumstances.

-Travis
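Option 2 could look roughly like the sketch below. It is one possible reading, not Travis's code: it replaces "setstate or resize" with a simpler in-place fill, and it assumes that bins is a single int and that sample is an (N, D) array, so the final shape is known before the histogram is computed. InitHist and the sample data are hypothetical.

```python
import numpy as np


class InitHist(np.ndarray):
    """Hypothetical sketch of option 2; assumes `bins` is a single int."""

    def __new__(cls, sample, bins):
        ndim = np.asarray(sample).shape[1]     # sample is assumed to be (N, D)
        # Allocate an uninitialised placeholder of the final shape.
        return super().__new__(cls, (bins,) * ndim, dtype=float)

    def __init__(self, sample, bins):
        H, edges = np.histogramdd(np.asarray(sample), bins)
        self[...] = H          # fill the placeholder in place
        self.edges = edges     # ordinary instance attribute, no class-level state

    def __array_finalize__(self, obj):
        # Views and results of operations skip __init__, so give them a default.
        self.edges = getattr(obj, 'edges', None)


rng = np.random.default_rng(0)
h = InitHist(rng.random((50, 2)), 3)
print(h.shape, len(h.edges), float(h.sum()))
```

Nothing is stored on the class here, so the thread-safety objection to option 1 does not arise, at the cost of needing to know the output shape up front.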
On Wed, 2007-02-07 at 14:36 -0700, Travis Oliphant wrote:
Sturla Molden wrote:
So in order to get an instance attribute, one has to temporarily define it as a class attribute?
No, you don't *have* to do it this way for all instance attributes.
In this example, the user was trying to keep the edges computed during the __new__ method as an attribute. What are the possibilities?
1) Use the __new__ method to create the object in full and then store the edges in some kind of global (or class global) variable.
Because it uses global variables, this solution has all of the thread problems that global variables bring.
2) Create a "dummy" arrayobject in the __new__ method and fill it in (i.e. using setstate or resize) during the __init__ method where the instance attribute is actually set.
I'm probably missing something obvious here, but why can't you just attach the attribute to the actual object in the __new__ method before returning it? For example:

    class MyClass(numpy.ndarray):
        def __new__(self, ...):
            # Some stuff here
            H, edges = numpy.histogramdd(...)
            result = H.view(MyClass)
            result.edges = edges
            return result

        def __array_finalize__(self, obj):
            self.edges = getattr(obj, 'edges', [])

If you could show me the error of my ways, it would help me in *my* attempt to subclass ndarray.
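Filled in with concrete arguments, the approach above might look like the following runnable sketch. The class name Hist, the (sample, bins) signature, and the random sample data are assumptions made for illustration; the elided parts of the original example are not known.

```python
import numpy as np


class Hist(np.ndarray):
    def __new__(cls, sample, bins):
        H, edges = np.histogramdd(sample, bins)
        result = H.view(cls)       # triggers __array_finalize__ with obj=H
        result.edges = edges       # then overwrite the default on this instance
        return result

    def __array_finalize__(self, obj):
        # Copy edges from the parent if it has them, else fall back to [].
        self.edges = getattr(obj, 'edges', [])


rng = np.random.default_rng(0)
h = Hist(rng.random((100, 2)), 4)
part = h[1:]                       # a slice is created without calling __new__
print(h.shape, len(h.edges), part.edges is h.edges)
```

Note that the slice simply carries the same edges object as its parent; nothing here adjusts the edges to match the sliced bins.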
The __array_finalize__ method is intended for "passing-on" attributes to sub-classes from parent classes during operations where __new__ and __init__ are not called (but a new instance is still created). It was not intended to be used in all circumstances.
Thanks, -Reggie
Reggie Dugard wrote:
I'm probably missing something obvious here, but why can't you just attach the attribute to the actual object in the __new__ method before returning it. For example:
Good point. I guess I thought the OP had tried that already. It turns out it works fine, too. The __array_finalize__ is useful if you want the attribute to be carried around when arrays are created automatically internally (after math operations for example). -Travis
Good point. I guess I thought the OP had tried that already. It turns out it works fine, too.
The __array_finalize__ is useful if you want the attribute to be carried around when arrays are created automatically internally (after math operations for example).
I too may be missing something here. Will using __array_finalize__ this way be thread safe or not? Sturla Molden
Sturla Molden wrote:
I too may be missing something here.
Will using __array_finalize__ this way be thread safe or not?
Yes, because __array_finalize__ is called while NumPy owns the GIL. It is called in one place during array creation (in the C routine that all array-creation routines call).

-Travis
participants (6)
- Jeremy Conlin
- Pierre GM
- Reggie Dugard
- Stefan van der Walt
- Sturla Molden
- Travis Oliphant