Hi Marten,

Thanks for the thoughtful reply.


On Sat, Jul 21, 2018 at 6:39 PM, Marten van Kerkwijk <m.h.vankerkwijk@gmail.com> wrote:
Hi Ralf,

Overall, this looks good. But I think the subclassing section is somewhat misleading in suggesting `ndarray` is not well designed to be subclassed. At least, in neither my work on Quantity nor that on MaskedArray have I found the design of `ndarray` itself to be a problem. Instead, the problem was the functions: most were not written with subclassing or duck typing in mind, but rather with the assumption that all input should be an array, and that it is somehow useful to pass anything users pass in through `asarray`, with layers then added on top to avoid this in specific circumstances... But perhaps this is what you meant? (I would agree, though, that some ndarray subclasses have been designed poorly - especially matrix, which then led to a problematic duck array in sparse - and that this has resulted in substantial hassle. Also, subclassing the subclasses is much more problematic than subclassing ndarray - MaskedArray being a particularly annoying example!)

You're completely right, I think. We have had problems with subclasses for a long time, but that is mainly due to np.matrix being badly behaved, which then led to code everywhere using asarray, which in turn led to lots of issues with other subclasses. This basically meant subclasses were problematic, and hence most numpy devs would prefer not to see more subclasses.
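To make the asarray dynamic concrete, here is a minimal sketch. The `Tagged` class and the two `double_*` helpers are hypothetical names made up for illustration; only `np.asarray` and `np.asanyarray` are real numpy functions.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical, do-nothing ndarray subclass (illustration only)."""


def double_asarray(x):
    # np.asarray always returns a base-class ndarray, so the subclass
    # (and anything it carries, such as units or a mask) is dropped here.
    return np.asarray(x) * 2


def double_asanyarray(x):
    # np.asanyarray passes ndarray subclasses through unchanged.
    return np.asanyarray(x) * 2


t = np.arange(3).view(Tagged)
print(type(double_asarray(t)).__name__)     # ndarray
print(type(double_asanyarray(t)).__name__)  # Tagged
```

A function written the first way silently strips any well-behaved subclass, which is the kind of issue other subclasses ran into once asarray was used everywhere to guard against np.matrix.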


The subclassing section also notes that subclassing has been discouraged for a long time. Is that so? Over time, I've certainly had comments from Nathaniel and some others in discussions of PRs that go in that direction, which perhaps reflected some internal consensus I wasn't aware of,

I think yes, there is a vague, unwritten near-consensus, due to the dynamic with asarray described above.
 
but the documentation does not seem to discourage it (check, e.g., the subclassing section [1]). I also think that it may be good to keep in mind that until `__array_ufunc__`, there wasn't much of a choice - support for duck arrays was even more half-hearted (hopefully to become much better with `__array_function__`).

True. I think that in the long term duck arrays are the way to go, because asarray is not going to disappear. But for now we just have to do the best we can in dealing with subclasses.
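For reference, a duck array in this sense could look roughly like the sketch below. `Wrapped` is a hypothetical class, not something that exists in numpy; the only real API used is the `__array_ufunc__` protocol. The sketch assumes single-output ufuncs called directly and declines everything else.

```python
import numpy as np


class Wrapped:
    """Hypothetical duck array: holds an ndarray without subclassing it."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Assumption for this sketch: only plain calls of single-output
        # ufuncs without an explicit `out` argument are handled.
        if method != "__call__" or "out" in kwargs or ufunc.nout != 1:
            return NotImplemented
        # Unwrap our own instances, apply the ufunc, and re-wrap the result.
        unwrapped = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*unwrapped, **kwargs))


w = Wrapped([1.0, 4.0, 9.0])
r = np.sqrt(w)               # dispatched to Wrapped.__array_ufunc__
s = np.add(np.ones(3), w)    # also dispatched, even with an ndarray first
print(type(r).__name__, r.data)
print(type(s).__name__, s.data)
```

This covers ufuncs only; other numpy functions would still coerce or reject such an object, which is the gap `__array_function__` is meant to address.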

The subclassing doc [1] really needs an update on what the practical issues are.


Overall, it seems to me that these days in the Python ecosystem subclassing is simply expected to work. Even within numpy there are other examples (e.g., ufuncs, dtypes) for which there has been quite a bit of discussion about the benefits subclasses would bring.

I'm now thinking about what to do with the subclassing section in the NEP. Would it be best to remove it completely? I was prompted to include it by some things Stephan said last week about subclasses being a blocker to adding new features. So if we keep the section, it would be helpful if you and Stephan could help shape it.

Cheers,
Ralf