[Numpy-discussion] asanyarray vs. asarray

Nathaniel Smith njs at pobox.com
Fri Oct 19 22:50:01 EDT 2018


On Fri, Oct 19, 2018 at 6:23 PM, Marten van Kerkwijk
<m.h.vankerkwijk at gmail.com> wrote:
> Hi All,
>
> It seems there are two extreme possibilities for general functions:
> 1. Put `asarray` everywhere. The main benefit that I can see is that even if
> people pass in lists instead of arrays, one is guaranteed to have shape,
> dtype, etc. But it seems a bit like calling `int` on everything that might
> get used as an index, instead of letting the actual indexing do the proper
> thing and call `__index__`.
> 2. Do not coerce at all, but rather write code assuming something is an
> array already. This will often, but not always, just work for array mimics,
> with coercion done only where necessary (e.g., in lower-lying C code such as
> that of the ufuncs which has a smaller API surface and can be overridden
> more easily).
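
To make the `int` versus `__index__` analogy in point 1 concrete, here is a
minimal sketch in plain Python. The `Position` class and the variable names
are made up for illustration; only `int`, `operator.index` and the
`__index__` protocol are real.

```python
import operator


class Position:
    """Hypothetical index-like type that opts in to indexing via __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


items = ["a", "b", "c"]

# Eager coercion: int() accepts a float and silently truncates it to 1.
eager = items[int(1.9)]

# Deferred coercion: list indexing calls __index__ itself, so Position works.
deferred = items[Position(1)]

# The same protocol rejects a float instead of truncating it.
try:
    operator.index(1.9)
    rejected = False
except TypeError:
    rejected = True
```

Coercing up front accepts more inputs than the indexing protocol would, which
is the sense in which blanket `asarray` calls resemble blanket `int` calls.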

Between these two options, NumPy's APIs are very firmly on the side of
"option 1", and this is common in most public APIs I'm familiar with
(e.g. scipy). I guess you could try to reopen the discussion, but
you'd be pushing against 15+ years of precedent there...
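
For readers who have not compared the two functions in the subject line, a
small sketch of how they treat a list and an ndarray subclass, using
`np.ma.masked_array` as the subclass. The variable names are illustrative
only.

```python
import numpy as np

data = [1.0, 2.0, 3.0]
masked = np.ma.masked_array(data, mask=[False, True, False])

# A plain list becomes a base-class ndarray with shape and dtype.
from_list = np.asarray(data)

# asarray returns a base-class ndarray, so the mask is no longer attached.
stripped = np.asarray(masked)

# asanyarray passes ndarray subclasses through unchanged.
kept = np.asanyarray(masked)
```

Here `stripped` is a plain `ndarray`, while `kept` is the original
`MaskedArray` object.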

> The current __array_function__ work may well provide us with a way to
> combine both, if we (over time) move the coercion inside
> `ndarray.__array_function__` so that the actual implementation *can* assume
> it deals with a pure ndarray - then, when relevant, calling that
> implementation will be what subclasses/duck arrays can happily do (and it is
> up to them to ensure this works).
>
> Of course, the above does not really answer what to do in the meantime. But
> perhaps it helps in thinking of what we are actually aiming for.

We need some kind of asduckarray(), that coerces lists and similar but
allows duck-arrays to pass through.
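
A rough sketch of what such a helper might look like. No `asduckarray`
exists in NumPy; the function below, the `DuckArray` class, and the rule
"anything exposing `shape`, `dtype` and `ndim` counts as a duck array" are
all assumptions made for illustration, not a proposal for the actual
criterion.

```python
import numpy as np


def asduckarray(obj):
    """Hypothetical helper: coerce non-arrays, pass array-likes through."""
    # Assumption: exposing shape, dtype and ndim marks a duck array.
    if all(hasattr(obj, name) for name in ("shape", "dtype", "ndim")):
        return obj
    return np.asarray(obj)


class DuckArray:
    """Toy duck array wrapping an ndarray; not an ndarray subclass."""

    def __init__(self, values):
        self._values = np.asarray(values)
        self.shape = self._values.shape
        self.dtype = self._values.dtype
        self.ndim = self._values.ndim


duck = DuckArray([1, 2, 3])

# Lists are coerced to ndarray; the duck array is returned untouched.
coerced = asduckarray([1, 2, 3])
passed = asduckarray(duck)
```

The hard part is agreeing on the test in the `if` statement; attribute
sniffing is just the simplest stand-in.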

> One last thing: could we please stop bashing subclasses? One can subclass
> essentially everything in python, often to great advantage. Subclasses such
> as MaskedArray and, yes, Quantity, are widely used, and if they cause
> problems perhaps that should be seen as a sign that ndarray subclassing
> should be made easier and clearer.

Who's bashing? I've spent years thinking about this and come to the
conclusion that there are no viable solutions to the problems with
subclassing ndarray, but that's not the same as bashing :-). If you've
thought of something we've missed, you should share it...

(I also know lots of senior Python devs who believe that using
Python's subclassing support is pretty much always a mistake – this
talk is popularly cited: https://www.youtube.com/watch?v=3MNVP9-hglc –
but the issues with ndarray are much more severe than for the average
Python class.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

