This is exactly why pandas and xarray do not support wrapping arbitrary ndarray subclasses or duck array types. The operations we use internally (on numpy.ndarray objects) may not be what you would expect externally, and may even be implementation details that are not considered part of the public API. For example, in xarray we use numpy.nanmean() or bottleneck.nanmean() instead of numpy.mean().
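To make the nanmean point concrete, here is a minimal sketch. LabeledArray is a hypothetical stand-in for a wrapping container, not xarray's actual implementation; the only thing it is meant to show is that the reduction chosen internally can differ from the one a caller would reach for.

```python
import numpy as np


class LabeledArray:
    """Hypothetical container wrapping a plain numpy.ndarray (illustration only)."""

    def __init__(self, values):
        # Coerce to a base ndarray of floats, so internals only ever see numpy.ndarray.
        self.values = np.asarray(values, dtype=float)

    def mean(self):
        # Internal choice: skip NaNs. A caller who assumes numpy.mean() semantics
        # would expect NaN to propagate instead.
        return np.nanmean(self.values)


raw = [1.0, np.nan, 3.0]
wrapped_mean = LabeledArray(raw).mean()  # NaN is ignored by the wrapper
plain_mean = np.mean(np.asarray(raw))    # NaN propagates through numpy.mean()

print(wrapped_mean, plain_mean)
```

An array type that customizes mean() but is handed to a wrapper like this gets whatever the wrapper calls internally, which is the mismatch described above.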
For NumPy and xarray, I think we could (and should) define an interface that supports subclasses and duck types in generic operations for core use cases. My main concern with subclasses and duck arrays is undefined or untested behavior, especially where we might silently give the wrong answer or trigger some undesired operation (e.g., loading a lazily computed array into memory) rather than raising an informative error. Leaking implementation details is another concern: we have already had several cases in NumPy where a function only worked on a subclass because a particular method happened to be called internally, and it broke when that internal call was changed.
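As an illustration of the "silently wrong answer" failure mode, here is a small sketch using numpy.ma.MaskedArray, a real ndarray subclass. The helper strict_nanmean is hypothetical; it is just one possible way to raise an informative error instead of coercing, not something NumPy or xarray provides.

```python
import numpy as np


def strict_nanmean(data):
    """Hypothetical helper: accept only exact numpy.ndarray inputs."""
    if type(data) is not np.ndarray:
        raise TypeError(
            f"expected numpy.ndarray, got {type(data).__name__}; "
            "convert explicitly if dropping subclass behavior is intended"
        )
    return np.nanmean(data)


# The last value is masked, i.e. it should not take part in reductions.
masked = np.ma.masked_array([1.0, 2.0, 100.0], mask=[False, False, True])

intended = masked.mean()                   # respects the mask
silent = np.nanmean(np.asarray(masked))    # asarray drops the mask, 100.0 is included

try:
    strict_nanmean(masked)
    raised = False
except TypeError:
    raised = True

print(intended, silent, raised)
```

The coercing path returns a number with no warning, which is the outcome I would like to avoid; the strict path fails loudly and leaves the decision to the caller.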