
On Tue, Oct 30, 2018 at 2:22 PM, Stephan Hoyer <shoyer@gmail.com> wrote:
> The Liskov substitution principle (LSP) suggests that the set of reasonable ndarray subclasses is exactly those that could also in principle correspond to a new dtype. Of the np.ndarray subclasses in widespread use, I think only the various "array with units" types come close to satisfying this criterion. They fall short only insofar as they present a misleading dtype (without unit information).

How about subclasses that only add functionality? That is exactly my only use case for subclassing: I have a "bounding box" object (it probably could have been called a rectangle) that is a subclass of ndarray, always has shape (2, 2), and has various methods for merging two such boxes, adding a point, and so on. I did it that way because I already had a lot of code that simply used a (2, 2) array to represent a bounding box, and I wanted all that code to keep working.

I have had zero problems with it. Maybe that's too trivial to be worth talking about, but this kind of use case can be handy. It is a bit awkward to write the code, though; it would be nice to have a cleaner API for this sort of subclassing (not that I have any idea how to do that).

> The main problem with subclassing numpy.ndarray is that it guarantees too much: a large set of operations/methods along with a specific memory layout exposed as part of its public API.

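For illustration only, here is a minimal sketch of both points: a subclass that only adds functionality, and the "guarantees too much" concern. It is not the actual bounding-box code described above; the class name BBox, the methods merge and add_point, and the [[min_x, min_y], [max_x, max_y]] layout are all assumptions made for the example.

```python
import numpy as np


class BBox(np.ndarray):
    """Hypothetical bounding box stored as [[min_x, min_y], [max_x, max_y]]."""

    def __new__(cls, data):
        arr = np.array(data, dtype=float)  # copy, so the box owns its data
        if arr.shape != (2, 2):
            raise ValueError("a bounding box must have shape (2, 2)")
        return arr.view(cls)

    def merge(self, other):
        """Return the smallest box containing both boxes."""
        a, b = np.asarray(self), np.asarray(other)
        return BBox([np.minimum(a[0], b[0]), np.maximum(a[1], b[1])])

    def add_point(self, point):
        """Return the smallest box containing this box and the point."""
        a, p = np.asarray(self), np.asarray(point, dtype=float)
        return BBox([np.minimum(a[0], p), np.maximum(a[1], p)])


a = BBox([[0, 0], [1, 1]])
b = BBox([[0.5, -1], [2, 0.5]])
merged = a.merge(b)
grown = merged.add_point([3, 0])

# Existing code that treats a box as a plain (2, 2) array keeps working.
width, height = merged[1] - merged[0]

# But every inherited operation comes along too: this slice is still
# typed as a BBox even though it no longer has shape (2, 2).
part = merged[:1]
```

The shape check lives only in __new__, so views and slices bypass it. That is one concrete form of the subclass inheriting more of the ndarray API than it can meaningfully support.
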
This is a big deal. We really have two concepts here:

 - a Python class (type) with certain behaviors in Python code
 - a wrapper around a strided memory block

Maybe it's possible to be clear about that distinction: "duck arrays" are the Python API. Maybe a C-API object would be useful, one that shares the memory layout but could have completely different functionality at the Python level.

-CHB

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
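
PS: a rough Python-level analogue of the "shares the memory layout, different Python API" idea, offered only as a sketch. The C-API object suggested above does not exist here; this example instead assumes NumPy's existing array interface protocol (__array_interface__), and the class Box and its width method are hypothetical names.

```python
import numpy as np


class Box:
    """Hypothetical non-subclass that owns a (2, 2) block of floats."""

    def __init__(self, data):
        self._data = np.array(data, dtype=float)
        if self._data.shape != (2, 2):
            raise ValueError("a box must have shape (2, 2)")

    @property
    def __array_interface__(self):
        # Expose only the memory description (pointer, shape, strides,
        # typestr), not the ndarray method set.
        return self._data.__array_interface__

    def width(self):
        return float(self._data[1, 0] - self._data[0, 0])


box = Box([[0, 0], [3, 4]])

# NumPy can wrap the same memory as a plain ndarray without copying.
view = np.asarray(box)
view[1, 0] = 5.0  # writes through to the memory the Box owns
new_width = box.width()
```

Here the Python class has only the behavior it chooses to define (no reshape, no arithmetic), while array-consuming code can still reach the strided block through np.asarray.
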