[Numpy-discussion] asanyarray vs. asarray
shoyer at gmail.com
Tue Oct 30 17:22:04 EDT 2018
On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser <wieser.eric+numpy at gmail.com> wrote:
> The latter - changing the behavior of multiplication breaks the principle.
> But this is not the main reason for deprecating matrix - almost all of the
> problems I’ve seen have been caused by the way that matrices behave when
> sliced. The way that m[i][j] and m[i,j] are different is just one example
> of this, the fact that they must be 2d is another.
> Matrices behaving differently on multiplication isn’t super different in
> my mind to how string arrays fail to multiply at all.
It's certainly fine for arithmetic to work differently on an element-wise
basis or even to error. But np.matrix changes the shape of results from
various ndarray operations (e.g., both multiplication and indexing), which
is more than any dtype can do.
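To make the shape point concrete, here is a small sketch (the 2x2 values and the variable names are only illustrative) comparing the same operations on an ndarray and on an np.matrix built from it. Constructing an np.matrix emits a PendingDeprecationWarning, which the sketch silences.

```python
import warnings
import numpy as np

# np.matrix warns (PendingDeprecationWarning) whenever one is constructed.
warnings.filterwarnings("ignore", category=PendingDeprecationWarning)

a = np.array([[1, 2], [3, 4]])
m = np.matrix(a)

# Indexing: a row of an ndarray is 1-D, a row of a matrix stays 2-D.
row_shapes = (a[0].shape, m[0].shape)                    # (2,) vs. (1, 2)

# Reductions along an axis also stay 2-D for np.matrix.
sum_shapes = (a.sum(axis=0).shape, m.sum(axis=0).shape)  # (2,) vs. (1, 2)

# Multiplication: elementwise with broadcasting vs. matrix product.
mul_shapes = ((a[0] * a).shape, (m[0] * m).shape)        # (2, 2) vs. (1, 2)

# m[i][j] and m[i, j] differ because m[i] is still a 1 x N matrix,
# so the second index is applied to an axis of length 1.
chained_indexing_failed = False
try:
    m[0][1]
except IndexError:
    chained_indexing_failed = True
```

Code written against ndarray that relies on any of the left-hand shapes gets a different answer, or an error, when handed the matrix, which is what asanyarray lets through and asarray does not.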
The Liskov substitution principle (LSP) suggests that the set of reasonable
ndarray subclasses is exactly those that could also in principle
correspond to a new dtype. Of np.ndarray subclasses in widespread use, I
think only the various "array with units" types come close to satisfying this
criterion. They only fall short insofar as they present a misleading dtype
(without unit information).
The main problem with subclassing for numpy.ndarray is that it guarantees
too much: a large set of operations/methods along with a specific memory
layout exposed as part of its public API. Worse, ndarray itself is a little
quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays). In
practice, it's basically impossible to layer on complex behavior with these
exact semantics, so only extremely minimal ndarray subclasses don't violate
LSP.
Once we have more easily extended dtypes, I suspect most of the good use
cases for subclassing will have gone away.
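As a brief illustration of the scalar vs. 0d array quirk mentioned above, here is a minimal sketch using plain ndarray only (variable names are illustrative). A subclass that wants to keep ndarray's exact semantics would have to reproduce each of these cases.

```python
import numpy as np

x = np.array(5.0)    # 0-d array: shape (), but still an ndarray
v = np.arange(3.0)   # ordinary 1-d array

zero_d_is_array = isinstance(x, np.ndarray)        # True
# Arithmetic on a 0-d array returns a NumPy scalar, not a 0-d array.
sum_is_array = isinstance(x + 1, np.ndarray)       # False
# Integer indexing of a 1-d array also yields a scalar.
element_is_array = isinstance(v[0], np.ndarray)    # False
# On a 0-d array, x[()] extracts a scalar while x[...] gives a 0-d view.
empty_tuple_is_array = isinstance(x[()], np.ndarray)   # False
ellipsis_is_array = isinstance(x[...], np.ndarray)     # True
```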