[Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

Ralf Gommers ralf.gommers at gmail.com
Thu Apr 9 07:52:12 EDT 2020


On Thu, Apr 9, 2020 at 12:02 AM Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Wed, 2020-04-08 at 17:04 -0400, Andreas Mueller wrote:
> > Hey all.
> > Is there any update on this? Is there any input we can provide as
> > users? I'm not entirely sure where you are in the decision making
> > process right now :)
> >
>
> Hey,
>
> thanks for the ping. Things are a bit stuck right now. I think what we
> need is some clarity on the implications and alternatives.
> I was thinking about organizing a small conference call with the main
> people interested in the next weeks.
>
> There are also still some alternatives to this NEP in the race, and we
> may need to clarify which ones are actually still in the race...
>
>
> Maybe to see some of the possible sticking points:
>
> 1. What do we do about SciPy: have it under this umbrella? And how
> would we want to design that?
>

Current feeling: best to ignore it for now. It's quite a bit of work to fix
API incompatibilities for linalg that no one currently seems interested in
tackling. We can revisit once that's done.


> 2. Context managers have some composition issues, maybe less so if they
> are in the downstream package. Or should we have global defaults as
> well?
>

+1 for adding this right next to get_array_module().
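
To make the composition question concrete, here is a minimal sketch of an
opt-in switch with both a global default and a context manager. Everything
in it is an assumption for illustration: the names `duck_array_dispatch`,
`set_global_default` and `dispatch_enabled` are hypothetical and are not part
of NumPy, NEP 37, or the sklearn draft. It uses the standard-library
`contextvars` module so that nested contexts restore the previous state on
exit.

```python
import contextlib
import contextvars

# Process-wide default; hypothetical, not an existing NumPy setting.
_global_default = False

# Per-context override. None means "fall back to the global default".
_override = contextvars.ContextVar("duck_array_override", default=None)


def set_global_default(enabled):
    """Set the process-wide default for duck-array dispatch."""
    global _global_default
    _global_default = bool(enabled)


def dispatch_enabled():
    """Return True if dispatch is enabled in the current context."""
    override = _override.get()
    return _global_default if override is None else override


@contextlib.contextmanager
def duck_array_dispatch(enabled=True):
    """Temporarily enable (or disable) dispatch; nests cleanly."""
    token = _override.set(bool(enabled))
    try:
        yield
    finally:
        # Restore whatever was active before this block was entered.
        _override.reset(token)
```

A library would consult `dispatch_enabled()` before calling
get_array_module(); an inner `with duck_array_dispatch(False):` block switches
dispatch off again without disturbing the enclosing context.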


> 3. How do we ensure safe transitions for users as much as possible?
>    * If you use this, can functions suddenly return a different type
>      in the future?
>    * Should we force you to cast to NumPy arrays in a transition
>      period, or force you to somehow silence a transition warning?
>
> 4. Is there a serious push to have a "reduced" API or even a versioned
> API?
>

There is, it'll take a few months.

>
> But I am probably forgetting some other things.
>
>
> In my personal opinion, I think NEP 37 with minor modifications is
> still the best duck in the race. I feel we should be able to find a
> reasonable solution for SciPy.
> Point 2. about context managers may be true, but this is much smaller
> in scope than the ones uarray proposed IIRC, and I could not figure out
> major scoping issues with it yet (the sklearn draft).
>
> About the safe transition, that may be the stickiest point. But e.g. if
> you enable `get_array_module` sklearn could limit a certain function to
> error out if it finds something other than NumPy?
> The main problem is how to opt in to future behaviour. A context
> manager can do that, although the danger is that someone just uses that
> everywhere...
>
> On the reduced/versioned API front, I would hope that we can defer that
> as a semi-orthogonal issue, basically saying that for now you have to
> provide a NumPy API that faithfully reproduces whatever NumPy version
> is installed on the system.
>

I think it would be nice to have a separate NEP 37 implementation outside
of NumPy to play with. Unlike __array_function__, I don't think it has to
go into NumPy immediately. This avoids the whole "experimental API" issue:
it would be quite useful to test this with, e.g., CuPy + scikit-learn
without being stuck with any decisions in a released NumPy version. It also
makes switching on/off very easy for users: just (don't) `pip install
numpy-array-module`.
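
For illustration, such a standalone package could start from something as
small as the sketch below. It follows the lookup described in NEP 37 in
simplified form (it skips the subclass-before-superclass ordering), and
nothing in it is an agreed API. `DuckArray`, `OtherArray`, `fake_numpy` and
`fake_duck` are made-up stand-ins so the example runs without NumPy; in
practice `default` would be the `numpy` module.

```python
import types


def get_array_module(*arrays, default=None):
    """Return the module offered by the arguments' __array_module__.

    Simplified sketch: unlike the NEP, no subclass ordering is applied.
    """
    seen_types = []
    implementing = []
    for array in arrays:
        t = type(array)
        # Collect one representative argument per implementing type.
        if t not in seen_types and hasattr(t, "__array_module__"):
            seen_types.append(t)
            implementing.append(array)

    if not implementing:
        if default is None:
            raise TypeError("no array module found and no default given")
        return default

    # Ask each implementer in turn; the first non-NotImplemented answer wins.
    for array in implementing:
        module = array.__array_module__(tuple(seen_types))
        if module is not NotImplemented:
            return module
    raise TypeError("no common array module found")


# Hypothetical stand-ins for real modules such as numpy or cupy.
fake_numpy = types.SimpleNamespace(__name__="fake_numpy")
fake_duck = types.SimpleNamespace(__name__="fake_duck")


class DuckArray:
    def __array_module__(self, types_):
        # Accept only if every participating type is a DuckArray.
        if all(issubclass(t, DuckArray) for t in types_):
            return fake_duck
        return NotImplemented


class OtherArray:
    def __array_module__(self, types_):
        # An unrelated duck array that never agrees to share a module.
        return NotImplemented
```

With these definitions, `get_array_module(DuckArray())` returns `fake_duck`,
plain Python inputs fall back to `default`, and mixing `DuckArray` with
`OtherArray` raises `TypeError` instead of silently picking one.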

Cheers,
Ralf


> Cheers,
>
> Sebastian
>
>
> > Cheers,
> > Andy
> >
> > On 3/3/20 6:34 PM, Sebastian Berg wrote:
> > > On Fri, 2020-02-28 at 11:28 -0500, Allan Haldane wrote:
> > > > On 2/23/20 6:59 PM, Ralf Gommers wrote:
> > > > > > One of the main rationales for the whole NEP, and the argument
> > > > > > in multiple places
> > > > > > (https://numpy.org/neps/nep-0037-array-module.html#opt-in-vs-opt-out-for-users)
> > > > > > is that it's now opt-in while __array_function__ was opt-out.
> > > > > > This isn't really true - the problem is simply *moved*, from
> > > > > > the duck array libraries to the array-consuming libraries. The
> > > > > > end user will still see the backwards incompatible change, with
> > > > > > no way to turn it off. It will be easier with __array_module__
> > > > > > to warn users, but this should be expanded on in the NEP.
> > > > > Might it be possible to flip this NEP back to opt-out while
> > > > > keeping the nice simplifications and configurable array-creation
> > > > > routines, relative to __array_function__?
> > > >
> > > > > That is, what if we define two modules, "numpy" and
> > > > > "numpy_strict"? "numpy_strict" would raise an exception on
> > > > > duck-arrays defining __array_module__ (as numpy currently does).
> > > > > "numpy" would be a wrapper around "numpy_strict" that decorates
> > > > > all numpy methods with a call to
> > > > > "get_array_module(inputs).func(inputs)".
> > > > This would be possible, but I think we strongly leaned against the
> > > > idea. Basically, if you have to opt-out, from a library perspective
> > > > there may be `np.asarray` calls, which for example later call into
> > > > C and expect arrays.
> > > > So, I have large doubts that an opt-out solution works easily for
> > > > library authors. Array function is opt-out, but effectively most
> > > > clean library code already opted out...
> > >
> > > > We had previously discussed the opposite, of having a namespace of
> > > > implicit dispatching based on get_array_module, but if we keep
> > > > array function around, I am not sure there is much reason for it.
> > >
> > > > > Then end-user code that did "import numpy as np" would accept
> > > > > ducktypes by default, while library developers who want to signal
> > > > > they don't support ducktypes can opt-out by doing "import
> > > > > numpy_strict as np". Issues with `np.asarray` seem mitigated
> > > > > compared to __array_function__ since that method would now be
> > > > > ducktype-aware.
> > > > My tendency is that if we want to go there, we would need to push
> > > > ahead with the `np.duckarray()` idea instead.
> > > >
> > > > To be clear: I currently very much prefer the get_array_module()
> > > > idea. It just seems much cleaner for library authors, and they are
> > > > the primary issue at the moment in my opinion.
> > >
> > > - Sebastian
> > >
> > >
> > > > -Allan
> > > > _______________________________________________
> > > > NumPy-Discussion mailing list
> > > > NumPy-Discussion at python.org
> > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > > >
> > > >
> > > > _______________________________________________
> > > > NumPy-Discussion mailing list
> > > > NumPy-Discussion at python.org
> > > > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>