Subclassing ndarray with concatenate

I am trying to create a subclass of ndarray that has additional attributes. These attributes are maintained with most numpy functions if __array_finalize__ is used.
The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information.
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
I have been looking extensively online, but have not been able to find a clear answer on how to do this, or if there even is a way.
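To make the question concrete, here is a minimal sketch of the kind of subclass being described. The class name `InfoArray` and its `info` attribute are made up for illustration and are not from the original post. The attribute survives slicing and arithmetic through `__array_finalize__`, but it is not available on the result of `np.concatenate`.

```python
import numpy as np


class InfoArray(np.ndarray):
    """Hypothetical ndarray subclass carrying one extra attribute, `info`."""

    def __new__(cls, data, info=None):
        obj = np.asarray(data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # `obj` is the array this one was derived from (or None).
        # Copy the attribute over when it exists, else fall back to None.
        self.info = getattr(obj, "info", None)


a = InfoArray([1, 2, 3], info="metres")

sliced = a[1:]      # view of `a`: __array_finalize__ sees `a`
doubled = a * 2     # ufunc result: wrapped back into the subclass
joined = np.concatenate([a, a])

print(sliced.info)                    # attribute carried over
print(doubled.info)                   # attribute carried over
print(getattr(joined, "info", None))  # attribute not recovered
```

Whether `joined` is an `InfoArray` or a plain `ndarray` is not asserted here; either way the original `info` values are not reachable from inside `__array_finalize__`.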

Hey,
On Tue, 2013-01-22 at 10:21 +0100, Todd wrote:
I am trying to create a subclass of ndarray that has additional attributes. These attributes are maintained with most numpy functions if __array_finalize__ is used.
You can cover a bit more if you also implement `__array_wrap__`, though unless you want to do something fancy, that just replaces the `__array_finalize__` for the most part. But some (very few) functions currently call `__array_wrap__` explicitly.
The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information.
There are quite a few functions that simply do not preserve subclasses (though I think more could/should probably call `__array_wrap__`; even if the documentation says it is about ufuncs, there are already some examples of this). `np.concatenate` is one of these. It always returns a base array. In any case it gets a bit difficult if you have multiple input arrays (which may not matter for you).
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason.
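As a rough illustration of the masked-array approach: `np.ma.concatenate` is a real function that joins the masks along with the data. The `InfoArray` class and the module-level `concatenate` helper below are hypothetical, just a sketch of how a subclass could ship its own version in the same spirit. Collecting the inputs' `info` values into a list is an arbitrary merge rule chosen for the example.

```python
import numpy as np

# Masked arrays provide their own concatenate, which also joins the masks.
m1 = np.ma.array([1, 2], mask=[False, True])
m2 = np.ma.array([3, 4], mask=[True, False])
masked_joined = np.ma.concatenate([m1, m2])
print(masked_joined.mask)


class InfoArray(np.ndarray):
    """Hypothetical subclass with an extra `info` attribute."""

    def __new__(cls, data, info=None):
        obj = np.asarray(data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        self.info = getattr(obj, "info", None)


def concatenate(arrays, axis=0):
    """Hypothetical subclass-aware replacement for np.concatenate."""
    # Concatenate the raw data as plain ndarrays, then re-wrap the result.
    data = np.concatenate([np.asarray(a) for a in arrays], axis=axis)
    out = data.view(InfoArray)
    # Inputs lacking the attribute (e.g. plain ndarrays) are skipped.
    infos = [getattr(a, "info", None) for a in arrays]
    out.info = [i for i in infos if i is not None]
    return out


out = concatenate([InfoArray([1], info="x"), np.array([2]), InfoArray([3], info="y")])
print(out, out.info)
```

The drawback is the one already implied above: callers have to use the subclass's own `concatenate`, and `hstack`/`vstack` would need the same treatment.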
I hope that helps a bit...
Regards,
Sebastian
I have been looking extensively online, but have not been able to find a clear answer on how to do this, or if there even is a way.
NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Tue, 2013-01-22 at 13:44 +0100, Sebastian Berg wrote:
Hey,
On Tue, 2013-01-22 at 10:21 +0100, Todd wrote:
I am trying to create a subclass of ndarray that has additional attributes. These attributes are maintained with most numpy functions if __array_finalize__ is used.
You can cover a bit more if you also implement `__array_wrap__`, though unless you want to do something fancy, that just replaces the `__array_finalize__` for the most part. But some (very few) functions currently call `__array_wrap__` explicitly.
Actually have to correct myself here. The default __array_wrap__ causes __array_finalize__ to be called as you would expect, so there is no need to use it unless you want to do something fancy.
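A small sketch of that correction, using a made-up subclass `InfoArray` that defines only `__array_finalize__` and no `__array_wrap__`. The ufunc result still comes back as the subclass with the attribute copied, because the inherited `__array_wrap__` ends up triggering `__array_finalize__`.

```python
import numpy as np


class InfoArray(np.ndarray):
    """Hypothetical subclass; note there is no __array_wrap__ override."""

    def __new__(cls, data, info=None):
        obj = np.asarray(data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        self.info = getattr(obj, "info", None)


a = InfoArray([1.0, 4.0, 9.0], info="x")

# Both results go through the default ndarray.__array_wrap__.
rooted = np.sqrt(a)
shifted = np.add(a, 1)

print(type(rooted).__name__, rooted.info)
print(type(shifted).__name__, shifted.info)
```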
The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information.
There are quite a few functions that simply do not preserve subclasses (though I think more could/should probably call `__array_wrap__`; even if the documentation says it is about ufuncs, there are already some examples of this). `np.concatenate` is one of these. It always returns a base array. In any case it gets a bit difficult if you have multiple input arrays (which may not matter for you).
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason.
I hope that helps a bit...
Regards,
Sebastian
I have been looking extensively online, but have not been able to find a clear answer on how to do this, or if there even is a way.

On Tue, Jan 22, 2013 at 1:44 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hey,
On Tue, 2013-01-22 at 10:21 +0100, Todd wrote:
The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information.
There are quite a few functions that simply do not preserve subclasses (though I think more could/should probably call `__array_wrap__`; even if the documentation says it is about ufuncs, there are already some examples of this). `np.concatenate` is one of these. It always returns a base array. In any case it gets a bit difficult if you have multiple input arrays (which may not matter for you).
I don't think this is right. I tried it and it doesn't return a base array; it returns an instance of the original array subclass.
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason.
I hope that helps a bit...
Is this something that should be available? For instance a method that provides both the new array and the arrays that were used to construct it. This would seem to be an extremely common use-case for array subclasses, so letting them gracefully handle this would seem to be very important.

On Wed, 2013-01-30 at 10:24 +0100, Todd wrote:
On Tue, Jan 22, 2013 at 1:44 PM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hey,
On Tue, 2013-01-22 at 10:21 +0100, Todd wrote:
The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information.
There are quite a few functions that simply do not preserve subclasses (though I think more could/should probably call `__array_wrap__`; even if the documentation says it is about ufuncs, there are already some examples of this). `np.concatenate` is one of these. It always returns a base array. In any case it gets a bit difficult if you have multiple input arrays (which may not matter for you).
I don't think this is right. I tried it and it doesn't return a base array; it returns an instance of the original array subclass.
Yes, you are right, it preserves the type. I was fooled by `__array_priority__` being 0 by default; I thought it defaulted to more than 0 (for ufuncs everything beats arrays, not sure if it really should), so I missed it.
In any case, yes, it calls __array_finalize__, but as you noticed, it calls it without the original array. Now it would be very easy and harmless to change that; however, I am not sure if giving only the parent array is very useful (i.e. you only get the one with the highest array priority).
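One way to see this for yourself is to record what `__array_finalize__` receives. The sketch below uses a made-up subclass `Probe`, and it explicitly sets `__array_priority__ = 1.0` so that the subclass wins the priority comparison; that setting is an assumption of the example, not something from the thread. The exact object passed during `np.concatenate` may depend on the NumPy version, so the code only prints it rather than relying on a particular value.

```python
import numpy as np


class Probe(np.ndarray):
    """Hypothetical subclass that logs every __array_finalize__ argument."""

    __array_priority__ = 1.0  # assumption: make the subclass beat plain arrays
    seen = []                 # class-level log of the `obj` arguments

    def __new__(cls, data, info=None):
        obj = np.asarray(data).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        Probe.seen.append(obj)
        self.info = getattr(obj, "info", None)


a = Probe([1, 2], info="left")
b = Probe([3, 4], info="right")

Probe.seen.clear()  # forget the calls made while constructing a and b
joined = np.concatenate([a, b])

print(type(joined).__name__, joined.info)
# What was handed to __array_finalize__ during the concatenate call:
print([type(o).__name__ for o in Probe.seen])
```

Nothing in the log carries the `info` of `a` or `b`, which is why the attribute cannot be rebuilt from inside `__array_finalize__` alone.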
Another way to get around it would be maybe to call __array_wrap__ like ufuncs do (with a context, so you get all inputs, but then the non-array axis argument may not be reasonably placed into the context).
In any case, if you think it would be helpful to at least get the single parent array, that would be a very simple change, but I feel the whole subclassing could use a bit of thinking and probably quite a bit of work, since I am not quite convinced that calling __array_wrap__ with a complicated context from as many functions as possible is the right approach for allowing more complex subclasses.
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason.
I hope that helps a bit...
Is this something that should be available? For instance a method that provides both the new array and the arrays that were used to construct it. This would seem to be an extremely common use-case for array subclasses, so letting them gracefully handle this would seem to be very important.

On Wed, Jan 30, 2013 at 11:20 AM, Sebastian Berg <sebastian@sipsolutions.net> wrote:
In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there is no problem handling concatenate in a general way, assuming I can get access to the attributes.
So is there any way to subclass ndarray in such a way that concatenate can be handled properly?
Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason.
I hope that helps a bit...
Is this something that should be available? For instance a method that provides both the new array and the arrays that were used to construct it. This would seem to be an extremely common use-case for array subclasses, so letting them gracefully handle this would seem to be very important.
In any case, yes, it calls __array_finalize__, but as you noticed, it calls it without the original array. Now it would be very easy and harmless to change that; however, I am not sure if giving only the parent array is very useful (i.e. you only get the one with the highest array priority).
Another way to get around it would be maybe to call __array_wrap__ like ufuncs do (with a context, so you get all inputs, but then the non-array axis argument may not be reasonably placed into the context).
In any case, if you think it would be helpful to at least get the single parent array, that would be a very simple change, but I feel the whole subclassing could use a bit of thinking and probably quite a bit of work, since I am not quite convinced that calling __array_wrap__ with a complicated context from as many functions as possible is the right approach for allowing more complex subclasses.
I was more thinking of a new method that is called when more than one input array is used, maybe something like __multi_array_finalize__. This would allow more fine-grained handling of such cases and would not break backwards compatibility with any existing subclasses (if they don't override the method the current behavior will remain).
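To sketch what such a hook might look like from the subclass author's side: `__multi_array_finalize__` does not exist in NumPy, so the example below wires it up by hand through the `__array_function__` protocol, which was added to NumPy well after this thread (assumed here: NumPy 1.17 or later). The class name `TaggedArray`, the `tag` attribute, and the rule of merging tags into a tuple are all made up for illustration.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical subclass with a `tag` attribute and a multi-input hook."""

    def __new__(cls, data, tag=None):
        obj = np.asarray(data).view(cls)
        obj.tag = tag
        return obj

    def __array_finalize__(self, obj):
        # Single-parent case (views, slices, ufunc results).
        self.tag = getattr(obj, "tag", None)

    def __multi_array_finalize__(self, inputs):
        # Hypothetical hook: NumPy never calls this on its own.
        # Inputs without a tag (e.g. plain ndarrays) are skipped.
        tags = (getattr(a, "tag", None) for a in inputs)
        self.tag = tuple(t for t in tags if t is not None)

    def __array_function__(self, func, types, args, kwargs):
        # Let NumPy do the actual work first.
        result = super().__array_function__(func, types, args, kwargs)
        if func is np.concatenate and isinstance(result, np.ndarray):
            # Re-wrap as the subclass, then hand over all input arrays.
            result = result.view(type(self))
            result.__multi_array_finalize__(args[0])
        return result


x = TaggedArray([1, 2], tag="a")
y = TaggedArray([3], tag="b")

out = np.concatenate([x, y])
mixed = np.concatenate([x, np.array([9])])

print(out, out.tag)
print(mixed, mixed.tag)
```

Subclasses that do not define the hook would keep the current behavior, which matches the backwards-compatibility point above. Only `np.concatenate` is intercepted in this sketch; other multi-input functions would need their own handling.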