[Numpy-discussion] asanyarray vs. asarray

Matthew Harrigan harrigan.matthew at gmail.com
Tue Oct 30 19:35:30 EDT 2018


Would the extended dtypes also violate the Liskov substitution principle?
In-place operations which would mutate the dtype are one potential issue.
Would a single dtype for an array be sufficient, e.g. for np.polynomial
coefficients?  Compared to ndarray subclasses, the memory layout issue goes
away, but there is still a large set of operations exposed as part of a
public API with various quirks.  I can imagine a new function "asunitless"
scattered around downstream projects.
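
As a rough illustration of the in-place concern, here is a minimal sketch
using only the existing integer and float dtypes as a stand-in. How an
extended dtype (units, say) would behave here is an assumption on my part,
not something any current proposal specifies.

```python
import numpy as np

a = np.arange(3)      # integer dtype
b = a + 0.5           # out-of-place: NumPy allocates a new float64 result

try:
    a += 0.5          # in-place: the float result cannot be stored back into a
    inplace_ok = True
except TypeError:     # NumPy refuses the float -> int "same_kind" cast
    inplace_ok = False

print(b.dtype, a.dtype, inplace_ok)
```

A units-like dtype would presumably face the same choice for something like
x *= x: either refuse, or change the dtype of an existing array.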

On Tue, Oct 30, 2018 at 5:23 PM Stephan Hoyer <shoyer at gmail.com> wrote:

> On Mon, Oct 29, 2018 at 9:49 PM Eric Wieser <wieser.eric+numpy at gmail.com>
> wrote:
>
>> The latter - changing the behavior of multiplication breaks the principle.
>>
>> But this is not the main reason for deprecating matrix - almost all of
>> the problems I’ve seen have been caused by the way that matrices behave
>> when sliced. The way that m[i][j] and m[i,j] are different is just one
>> example of this; the fact that they must be 2d is another.
>>
>> Matrices behaving differently on multiplication isn’t super different in
>> my mind to how string arrays fail to multiply at all.
>>
>> Eric
>>
> It's certainly fine for arithmetic to work differently on an element-wise
> basis or even to error. But np.matrix changes the shape of results from
> various ndarray operations (e.g., both multiplication and indexing), which
> is more than any dtype can do.
>
> The Liskov substitution principle (LSP) suggests that the set of
> reasonable ndarray subclasses is exactly those that could also in
> principle correspond to a new dtype. Of the np.ndarray subclasses in
> widespread use, I think only the various "array with units" types come
> close to satisfying this criterion. They only fall short insofar as they
> present a misleading dtype (without unit information).
>
> The main problem with subclassing for numpy.ndarray is that it guarantees
> too much: a large set of operations/methods along with a specific memory
> layout exposed as part of its public API. Worse, ndarray itself is a little
> quirky (e.g., with indexing, and its handling of scalars vs. 0d arrays). In
> practice, it's basically impossible to layer on complex behavior with these
> exact semantics, so only extremely minimal ndarray subclasses don't violate
> LSP.
>
> Once we have more easily extended dtypes, I suspect most of the good use
> cases for subclassing will have gone away.
>
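
For reference, the np.matrix behaviours mentioned in the quoted messages can
be reproduced with a short sketch along these lines. The variable names are
only illustrative, and the warning filter is there because np.matrix emits a
PendingDeprecationWarning in recent NumPy releases.

```python
import warnings
import numpy as np

# np.matrix is pending deprecation; silence that warning for this demo.
warnings.simplefilter("ignore", PendingDeprecationWarning)

m = np.matrix([[1, 2], [3, 4]])
a = np.asarray(m)           # asarray drops the subclass: plain ndarray
same = np.asanyarray(m)     # asanyarray passes the matrix through unchanged

# Indexing: a matrix row stays 2d, so m[0][1] asks for a second row of a
# one-row matrix, while m[0, 1] picks out the element.
print(m[0].shape, a[0].shape)
try:
    m[0][1]
    row_index_failed = False
except IndexError:
    row_index_failed = True
print(m[0, 1], a[0][1], row_index_failed)

# Multiplication: matrix product for np.matrix, element-wise for ndarray.
print(m * m)
print(a * a)
```

Code that calls asarray sees only the ndarray behaviour; code that calls
asanyarray has to cope with both.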