[Numpy-discussion] NEP 31 — Context-local and global overrides of the NumPy API

Hameer Abbasi einstein.edison at gmail.com
Fri Sep 6 04:32:25 EDT 2019


That's a lot of very good questions! Let me see if I can answer them 
one-by-one.

On 06.09.19 09:49, Nathaniel Smith wrote:
> Ah, whoops, I definitely missed that :-). That does change things!
> So one of the major decision points for any duck-array API work, is
> whether to modify the numpy semantics "in place", so user code
> automatically gets access to the new semantics, or else to make a new
> namespace, that users have to switch over to manually.
>
> The major disadvantage of doing changes "in place" is, of course, that
> we have to do all this careful work to move incrementally and make
> sure that we don't break things. The major (potential) advantage is
> that we have a much better chance of moving the ecosystem with us.
>
> The major advantage of making a new namespace is that it's *much*
> easier to experiment, because there's no chance of breaking any
> projects that didn't opt in. The major disadvantage is that numpy is
> super strongly entrenched, and convincing every project to switch to
> something else is incredibly difficult and costly. (I just searched
> github for "import numpy" and got 17.7 million hits. That's a lot of
> imports to update!) Also, empirically, we've seen multiple projects
> try to do this (e.g. DyND), and so far they all failed.
>
> It sounds like unumpy is an interesting approach that hasn't been
> tried before – in particular, the promise that you can "just switch
> your imports" is a much easier transition than e.g. DyND offered. Of
> course, that promise is somewhat undermined by the reality that all
> these potential backend libraries *aren't* 100% compatible with numpy,
> and can't be...
This is true. However, with minor adjustments it should be possible to 
make your code work across backends, as long as you avoid a few obscure 
parts of NumPy.
>   it might turn out that this ends up like asanyarray,
> where you can't really use it reliably because the thing that comes
> out will generally support *most* of the normal ndarray semantics, but
> you don't know which part. Is scipy planning to switch to using this
> everywhere, including in C code?
Not at present, I think. However, it should be possible to "re-write" 
parts of scipy on top of unumpy in order to make that work. Where 
speed is required and an efficient implementation isn't available in 
terms of NumPy functions, we can make dispatchable multimethods and 
allow library authors to provide those implementations. We'll call this 
project uscipy, but that's an endgame at this point. Right now, we're 
focusing on unumpy.
> If not, then how do you expect
> projects like matplotlib to switch, given that matplotlib likes to
> pass array objects into scipy functions? Are you planning to take the
> opportunity to clean up some of the obscure corners of the numpy API?

That's a completely different thing, and answering that question 
requires a distinction between uarray and unumpy... uarray is a 
backend mechanism, independent of array computing. We hope that 
matplotlib will adopt it to switch around its GUI back-ends, for example.

> But those are general questions about unumpy, and I'm guessing no-one
> knows all the answers yet... and these questions actually aren't super
> relevant to the NEP. The NEP isn't inventing unumpy. IIUC, the main
> thing the NEP proposes is simply to make "numpy.overridable" an
> alias for "unumpy".
>
> It's not clear to me what problem this alias is solving. If all
> downstream users have to update their imports anyway, then they can
> write "import unumpy as np" just as easily as they can write "import
> numpy.overridable as np". I guess the main reason this is a NEP is
> because the unumpy project is hoping to get an "official stamp of
> approval" from numpy?

That's part of it. The concrete problems it's solving are threefold:

 1. Array creation functions can be overridden.
 2. Array coercion is now covered.
 3. "Default implementations" will allow you to re-write your NumPy
    array more easily, when such efficient implementations exist in
    terms of other NumPy functions. That will also help achieve similar
    semantics, but as I said, they're just "default"...
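
To make the "default implementations" point concrete, here is a toy sketch 
in plain Python. It is not the uarray or unumpy API: Multimethod, total, 
count, mean, ListBackend and FastBackend are all names invented for this 
illustration, and the backend is passed explicitly to keep the example 
short. The idea it shows is only that a function such as mean can fall 
back to a recipe written in terms of other overridable functions when a 
backend does not provide its own version.

```python
import statistics


class Multimethod:
    """Hypothetical overridable callable with an optional default recipe."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def __call__(self, backend, *args):
        impl = getattr(backend, self.name, None)
        if impl is not None:
            # The backend supplies its own implementation.
            return impl(*args)
        if self.default is not None:
            # Fall back to the generic recipe built from other multimethods.
            return self.default(backend, *args)
        raise NotImplementedError(
            f"{type(backend).__name__} does not implement {self.name!r}"
        )


total = Multimethod("total")
count = Multimethod("count")
# The default for `mean` is expressed through `total` and `count`.
mean = Multimethod("mean", default=lambda b, x: total(b, x) / count(b, x))


class ListBackend:
    """Implements only the primitives; `mean` comes from the default."""

    def total(self, x):
        return sum(x)

    def count(self, x):
        return len(x)


class FastBackend(ListBackend):
    """Additionally overrides `mean` with its own implementation."""

    used_native = False

    def mean(self, x):
        self.used_native = True
        return statistics.mean(x)
```

With this sketch, mean(ListBackend(), [1, 2, 3]) goes through the default, 
while a FastBackend instance uses its own mean. A backend that only needs 
"similar semantics" gets them for free and can override later for speed.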

The import numpy.overridable part is meant to help garner adoption, and 
to prefer the unumpy module if it is available (which will continue to 
be developed separately). That way it isn't so tightly coupled to the 
release cycle. One alternative Sebastian Berg mentioned (and I am on 
board with) is just moving unumpy into the NumPy organisation. Our fear 
with keeping it separate is that the simple act of a pip install unumpy 
will keep people from using it or trying it out.
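
As a rough sketch of what "prefer the unumpy module if it is available" 
could look like, the helper below tries a list of module names in order 
and returns the first one that imports. The helper name first_importable 
and the fallback module name in the usage comment are assumptions made up 
for this illustration; nothing here is taken from the NEP's actual 
implementation.

```python
import importlib


def first_importable(*module_names):
    """Return the first module in `module_names` that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or not importable); try the next candidate.
            continue
    raise ImportError(f"none of {module_names!r} could be imported")


# Hypothetical usage: "numpy._bundled_unumpy" is an invented name standing in
# for whatever copy NumPy would ship; it is not a real module.
# overridable = first_importable("unumpy", "numpy._bundled_unumpy")
```

The separately installed package comes first in the list, so a newer 
unumpy release would win over the bundled copy without waiting for a 
NumPy release.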

> But even that could be accomplished by just
> putting something in the docs. And adding the alias has substantial
> risks: it makes unumpy tied to the numpy release cycle and
> compatibility rules, and it means that we're committing to maintaining
> unumpy ~forever even if Hameer or Quansight move onto other things.
> That seems like a lot to take on for such vague benefits?

I can assure you Travis has had the goal of "replatforming SciPy" for 
as long as I've known him. He's spawned quite a few efforts in that 
direction, along with others from Quansight (and they've led to nice 
projects). Quansight, as I see it, is unlikely to abandon something like 
this if it becomes successful (and acceptance of this NEP would be a huge 
success story).

> On Tue, Sep 3, 2019 at 2:04 AM Hameer Abbasi <einstein.edison at gmail.com> wrote:
>> The fact that we're having to design more and more protocols for a lot
>> of very similar things is, to me, an indicator that we do have holistic
>> problems that ought to be solved by a single protocol.
> But the reason we've had trouble designing these protocols is that
> they're each different :-). If it was just a matter of copying
> __array_ufunc__ we'd have been done in a few minutes...
uarray borrows heavily from __array_function__. It allows substituting 
(for example) __array_ufunc__ by overriding ufunc.__call__, ufunc.reduce 
and so on. It takes, as I mentioned, a holistic approach: There are 
callables that need to be overridden, possibly with nothing to dispatch 
on. And then it builds on top of that, adding coercion/conversion.
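
A toy sketch of that idea, using only the standard library: a creation 
function such as zeros has no array argument to dispatch on, so the 
backend has to come from the surrounding context, and coercion is just 
another overridable callable layered on top. The names use_backend, 
zeros, asarray, ListBackend and TupleBackend are hypothetical and are 
not uarray's real API; plain lists and tuples stand in for array types.

```python
import contextlib
import contextvars

# Holds the backend that is active in the current context (None if unset).
_current_backend = contextvars.ContextVar("current_backend", default=None)


@contextlib.contextmanager
def use_backend(backend):
    """Make `backend` active inside the `with` block, then restore the old one."""
    token = _current_backend.set(backend)
    try:
        yield
    finally:
        _current_backend.reset(token)


def _active():
    backend = _current_backend.get()
    if backend is None:
        raise RuntimeError("no backend has been set in this context")
    return backend


def zeros(n):
    """Creation: there is no array argument, so the context picks the backend."""
    return _active().zeros(n)


def asarray(obj):
    """Coercion: convert `obj` to the active backend's container type."""
    return _active().asarray(obj)


class ListBackend:
    def zeros(self, n):
        return [0] * n

    def asarray(self, obj):
        return list(obj)


class TupleBackend:
    def zeros(self, n):
        return (0,) * n

    def asarray(self, obj):
        return tuple(obj)
```

Inside "with use_backend(TupleBackend()):" both zeros and asarray produce 
tuples, and leaving the block restores whichever backend was active 
before, which is the context-local behaviour the NEP title refers to.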
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org

