
On Saturday 16 December 2006 19:55, Colin J. Williams wrote:

Colin,

First of all, a disclaimer: I'm a (bad) hydrologist, not a computer scientist. I learned python/numpy by playing around, and only really got into subclassing 3-4 months ago. My explanations might not be completely accurate, so I'll ask more experienced users to correct me if I'm wrong.

`__new__` is the class constructor method. A call to `__new__(cls,...)` creates a new instance of the class `cls` but doesn't initialize it: that's the role of the `__init__` method. According to the python documentation,

    If __new__() returns an instance of cls, then the new instance's __init__() method will be invoked like "__init__(self[, ...])", where self is the new instance and the remaining arguments are the same as were passed to __new__().

    If __new__() does not return an instance of cls, then the new instance's __init__() method will not be invoked.

    __new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation.

It turns out that ndarrays behave like immutable types, and arrays returned by methods (slices, views...) never go through `__init__`, so we can't rely on it. How can we initialize the instance, then? With `__array_finalize__`.

`__array_finalize__` is called automatically once an instance is created with `__new__`. Moreover, it is called each time a new array is returned by a method, even if the method doesn't specifically call `__new__`. For example, `__add__`, `__iadd__` and `reshape` return new arrays, so `__array_finalize__` is called. Note that these methods do not create a new array from scratch, so there is no call to `__new__`. As another example, we can also modify the shape of the array with `resize`. However, this method works in place, so a new array is NOT created.

About the `obj` argument in `__array_finalize__`: the first time a subarray is created, `__array_finalize__` is called with the argument `obj` as a regular ndarray.
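The `obj` argument can be inspected directly. Below is a minimal Python 3 sketch, assuming a recent NumPy; `TaggedArray` and the `calls` list are hypothetical names used only for illustration and are not part of the script that follows. It records the type of `obj` when the array is first created from a regular ndarray, and again when a slice returns a new array without going through `__new__`:

```python
import numpy as np

calls = []  # type name of `obj` for each __array_finalize__ call (illustrative)


class TaggedArray(np.ndarray):
    """Hypothetical stand-in for InfoArray: an ndarray carrying an `info` tag."""

    def __new__(cls, data, info=None):
        # View casting from a plain ndarray; __array_finalize__ runs inside
        # .view() with obj set to that ndarray.
        arr = np.asarray(data).view(cls)
        # Assumption of this sketch: the tag lives on the instance.
        arr.info = info
        return arr

    def __array_finalize__(self, obj):
        calls.append(type(obj).__name__)
        # Inherit the tag from the parent array when it has one.
        self.info = getattr(obj, "info", None)


x = TaggedArray(np.arange(10), info={"name": "x"})
first_obj = calls[-1]   # obj was a plain ndarray

s = x[2:5]              # slicing: __new__ is not called
slice_obj = calls[-1]   # obj was x, the array being sliced

y = x + 1               # arithmetic also hands the tag over to the result
print(first_obj, slice_obj, s.info, y.info)
```

Because the tag is copied inside `__array_finalize__`, it follows the array through slices and arithmetic without any further call to `__new__`.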
Afterwards, when a new array is returned without a call to `__new__`, the `obj` argument is the initial subarray (the one calling the method).

The easiest way is to try and see what happens. Here's a small script that defines an `InfoArray` class: just an ndarray with a tag attached. That's basically the class from the wiki, with messages printed in `__new__` and `__array_finalize__`. I've included some doctests to illustrate some of the concepts; I hope they are explanatory enough.

Please let me know whether it helps. If it does, I'll update the wiki page.

##############################################
"""
Let us define a new InfoArray object
>>> x = InfoArray(N.arange(10), info={'name':'x'})
__new__ received <type 'numpy.ndarray'>
__new__ sends <type 'numpy.ndarray'> as <class '__main__.InfoArray'>
__array_finalize__ received <type 'numpy.ndarray'>
__array_finalize__ defined <class '__main__.InfoArray'>

Let's get the first element:
>>> x[0]
0

We expect a scalar, we get a scalar, everything's fine. If now we want all the elements, we can use `x[:]`, which calls `__getslice__` and returns a new array. Therefore, we expect `__array_finalize__` to get called:
>>> x[:]
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
InfoArray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Let's add 1 to the array: this operation calls the `__add__` method, which returns a new array from `x`
>>> x+1
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
InfoArray([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

Let us change the shape of the array from *(10,)* to *(2,5)* with the `reshape` method. The method returns a new array, so we expect a call to `__array_finalize__`:
>>> y = x.reshape((2,5))
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>

If we now print `y`, we call the `__str__` method, which in turn creates as many arrays as there are rows: we expect 2 calls to `__array_finalize__`:
>>> print y
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
[[0 1 2 3 4]
 [5 6 7 8 9]]

Let's change the shape of `y` back to *(10,)*, but using the `resize` method this time. `resize` works in place, so a new array isn't created, and `__array_finalize__` is not called.
>>> y.resize((10,))
>>> y.shape
(10,)

OK, and what about `transpose`? Well, it returns a new array (1 call), and as we print it, we get *rows* more calls to `__array_finalize__`, for a total of *rows+1* calls:
>>> y.resize((5,2))
>>> print y.T
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
[[0 2 4 6 8]
 [1 3 5 7 9]]

Now let's create a new array from scratch. `__new__` is called, but as the argument is already an InfoArray, the *__new__ sends...* line is bypassed. Moreover, if we don't specify the dtype, we call `data.astype`, which in turn calls `__array_finalize__`. Then `__array_finalize__` is called a second time, this time to initialize the new object.
>>> z = InfoArray(x)
__new__ received <class '__main__.InfoArray'>
__new__ saw another dtype.
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>

Note that if we specify the dtype, we don't have to call `data.astype`, and `__array_finalize__` gets called only once:
>>> z = InfoArray(x, dtype=x.dtype)
__new__ received <class '__main__.InfoArray'>
__new__ saw the same dtype.
__array_finalize__ received <class '__main__.InfoArray'>
__array_finalize__ defined <class '__main__.InfoArray'>

"""
import numpy as N

class InfoArray(N.ndarray):
    def __new__(subtype, data, info=None, dtype=None, copy=False):
        # When data is an InfoArray
        print "__new__ received %s" % type(data)
        if isinstance(data, InfoArray):
            if not copy and dtype==data.dtype:
                print "__new__ saw the same dtype."
                return data.view(subtype)
            else:
                print "__new__ saw another dtype."
                return data.astype(dtype).view(subtype)
        subtype._info = info
        subtype.info = subtype._info
        print "__new__ sends %s as %s" % (type(N.asarray(data)), subtype)
        return N.array(data).view(subtype)

    def __array_finalize__(self,obj):
        print "__array_finalize__ received %s" % type(obj)
        if hasattr(obj, "info"):
            # The object already has an info tag: just use it
            self.info = obj.info
        else:
            # The object has no info tag: use the default
            self.info = self._info
        print "__array_finalize__ defined %s" % type(self)

def _test():
    import doctest
    doctest.testmod(verbose=True)

if __name__ == "__main__":
    _test()
participants (2)
- Colin J. Williams
- Pierre GM