On Tue, Oct 30, 2018 at 2:22 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
The Liskov substitution principle (LSP) suggests that the set of reasonable ndarray subclasses is exactly those that could also in principle correspond to a new dtype. Of np.ndarray subclasses in widespread use, I think only the various "array with units" types come close to satisfying this criterion. They only fall short insofar as they present a misleading dtype (without unit information).

How about subclasses that only add functionality? My only use case of subclassing is exactly that:

I have a "bounding box" object (probably could have been called a rectangle) that is a subclass of ndarray, is always shape (2,2), and has various methods for merging two such boxes, adding a point, etc.

I did it that way, 'cause I had a lot of code already that simply used a (2,2) array to represent a bounding box, and I wanted all that code to still work.

I have had zero problems with it.

Maybe that's too trivial to be worth talking about, but this kind of use case can be handy.

It is a bit awkward to write the code, though -- it would be nice to have a cleaner API for this sort of subclassing (not that I have any idea how to do that).
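
To make that concrete, here is a minimal sketch of this kind of "adds functionality only" subclass. It is not the actual bounding box code; the names BBox, merge, and add_point, and the [[min_x, min_y], [max_x, max_y]] layout, are all assumptions made up for illustration.

```python
import numpy as np


class BBox(np.ndarray):
    """Hypothetical bounding box stored as [[min_x, min_y], [max_x, max_y]]."""

    def __new__(cls, data):
        # Copy the input into a plain float array and check the shape
        # before re-viewing the same memory as the subclass.
        arr = np.array(data, dtype=np.float64)
        if arr.shape != (2, 2):
            raise ValueError("a bounding box must have shape (2, 2)")
        return arr.view(cls)

    def merge(self, other):
        """Return the smallest box that contains both self and other."""
        a, b = np.asarray(self), np.asarray(other)
        return BBox([np.minimum(a[0], b[0]), np.maximum(a[1], b[1])])

    def add_point(self, point):
        """Return a box grown just enough to contain point."""
        a = np.asarray(self)
        p = np.asarray(point, dtype=np.float64)
        return BBox([np.minimum(a[0], p), np.maximum(a[1], p)])


a = BBox([[0, 0], [2, 2]])
b = BBox([[1, -1], [3, 1]])
merged = a.merge(b)            # a new BBox covering both inputs
grown = a.add_point((5, 1))    # a new BBox that also covers (5, 1)

# Code written for a bare (2, 2) array keeps working on a BBox.
width, height = a[1] - a[0]
```

Part of the awkwardness shows up even here: slices and ufunc results come back as BBox instances without passing through __new__, so the (2, 2) check is not enforced on them. That is why the methods in this sketch drop down to plain arrays with np.asarray before computing.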

The main problem with subclassing for numpy.ndarray is that it guarantees too much: a large set of operations/methods along with a specific memory layout exposed as part of its public API.

This is a big deal -- we really have two concepts here:
 - a Python class (type) with certain behaviors in Python code
 - a wrapper around a strided memory block.

Maybe it's possible to be clear about that distinction:

"Duck Arrays" are the Python API.
Maybe a C-API object would also be useful: one that shares the memory layout, but could have completely different functionality at the Python level.
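
As a rough Python-level illustration of the second concept, the existing __array_interface__ protocol already lets an object expose a strided memory block without inheriting any ndarray methods. This is only a sketch of the idea, not a proposal for the C-API object itself; the class name StridedBlock and its describe method are hypothetical.

```python
import numpy as np


class StridedBlock:
    """Hypothetical object that shares ndarray's memory layout but not its API."""

    def __init__(self, data):
        # Private backing buffer; none of ndarray's methods are exposed.
        self._arr = np.array(data, dtype=np.float64)

    @property
    def __array_interface__(self):
        # Describe the buffer (shape, typestr, data pointer, strides)
        # using NumPy's array interface protocol.
        return self._arr.__array_interface__

    def describe(self):
        # Python-level behaviour unrelated to anything ndarray provides.
        return "block of %d float64 values" % self._arr.size


block = StridedBlock([[1.0, 2.0], [3.0, 4.0]])

# NumPy can wrap the same memory without copying it.
view = np.asarray(block)
view[0, 0] = 10.0   # the write is visible through block's own buffer
```

Here block is not an ndarray and has no sum(), reshape(), and so on, yet NumPy (or C code reading the same interface) can still operate directly on its memory.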



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception