[Numpy-discussion] NEP 31 — Context-local and global overrides of the NumPy API
einstein.edison at gmail.com
Tue Sep 10 11:28:34 EDT 2019
On 07.09.19 22:06, Sebastian Berg wrote:
> On Fri, 2019-09-06 at 14:45 -0700, Ralf Gommers wrote:
> Let me try to move the discussion from the github issue here (this may
> not be the best place). (https://github.com/numpy/numpy/issues/14441
> which asked for easier creation functions together with `__array_function__`).
> I think an important note mentioned here is how users interact with
> unumpy, vs. __array_function__. The former is an explicit opt-in, while
> the latter is an implicit choice based on an `array-like` abstract base
> class and functional type based dispatching.
> To quote NEP 18 on this: "The downsides are that this would require an
> explicit opt-in from all existing code, e.g., import numpy.api as np,
> and in the long term would result in the maintenance of two separate
> NumPy APIs. Also, many functions from numpy itself are already
> overloaded (but inadequately), so confusion about high vs. low level
> APIs in NumPy would still persist."
> (I do think this is a point we should not just ignore, `uarray` is a
> thin layer, but it has a big surface area)
> Now there are things where explicit opt-in is obvious. And the FFT
> example is one of those, there is no way to implicitly choose another
> backend (except by just replacing it, i.e. monkeypatching). And
> right now I think these are _very_ different.
> Now for end users, choosing one array-like over another seems nicer
> as an implicit mechanism (why should I not mix sparse, dask and numpy
> arrays!?). This is the promise `__array_function__` tries to make.
> Unless convinced otherwise, my guess is that most library authors would
> strive for implicit support (i.e. sklearn, skimage, scipy).
You can: once you register a backend it becomes implicit, so all
registered backends are tried until one succeeds, unless you explicitly
say "I do not want another backend" (only/coerce=True).
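To make the "tried until one succeeds" behaviour concrete, here is a minimal toy model in plain Python. It is only a sketch of the idea, not uarray's implementation: every name in it (`dispatch`, `prefer`, `only`, the two example backends) is hypothetical, and the signatures are not claimed to match uarray's.

```python
class BackendError(Exception):
    """Raised when no permitted backend can handle a call."""


_registered = []  # backends tried implicitly, in registration order


def register_backend(backend):
    _registered.append(backend)


def dispatch(name, *args, prefer=None, only=False):
    # A preferred backend is tried first; with only=True nothing else is tried.
    candidates = [] if prefer is None else [prefer]
    if not only:
        candidates += [b for b in _registered if b is not prefer]
    for backend in candidates:
        result = backend(name, *args)
        if result is not NotImplemented:
            return result
    raise BackendError(f"no backend implements {name!r}")


def list_backend(name, *args):
    # Hypothetical backend that only understands lists.
    if name == "total" and isinstance(args[0], list):
        return sum(args[0])
    return NotImplemented


def tuple_backend(name, *args):
    # Hypothetical backend that only understands tuples.
    if name == "total" and isinstance(args[0], tuple):
        return sum(args[0])
    return NotImplemented


register_backend(list_backend)
register_backend(tuple_backend)
```

With both backends registered, `dispatch("total", (1, 2))` falls through `list_backend` and is handled by `tuple_backend`, while `dispatch("total", (1, 2), prefer=list_backend, only=True)` raises `BackendError` because no fallback is allowed.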
> Circling back to creation and coercion. In a purely Object type system,
> these would be classmethods, I guess, but in NumPy and the libraries
> above, we are lost.
> Solution 1: Create explicit opt-in, e.g. through uarray. (NEP-31)
> * Required end-user opt-in.
> * Seems cleaner in many ways
> * Requires a full copy of the API.
> Solution 2: Add some coercion "protocol" (NEP-30) and expose a way to
> create new arrays more conveniently. This would practically mean adding
> an `array_type=np.ndarray` argument.
> * _Not_ used by end-users! End users should use dask.linspace!
> * Adds "strange" API somewhere in numpy, and possibly a new
> "protocol" (additionally to coercion).
> I still feel these solve different issues. The second one is intended
> to make array likes work implicitly in libraries (without end users
> having to do anything). While the first seems to force the end user to
> opt in, sometimes unnecessarily:
> def my_library_func(array_like):
>     exp = np.exp(array_like)
>     idx = np.arange(len(exp))
>     return idx, exp
> Would have all the information for implicit opt-in/Array-like support,
> but cannot do it right now. This is what I have been wondering, if
> uarray/unumpy, can in some way help me make this work (even _without_
> the end user opting in). The reason is simply that right now I am very
> clear on the need for this use case, but not sure about the need for
> end user opt in, since end users can just use dask.arange().
Sure, the end user can, but library authors cannot. And end users may
want to easily port code to GPU or between back-ends, just as library
authors might.
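The gap in `my_library_func` can be reproduced with a toy model of argument-type dispatch. This is a sketch under stated assumptions, not NumPy's `__array_function__` machinery: the `Wrapped` class, the `__toy_function__` hook and the `_dispatch` helper are all hypothetical names. It shows that a function taking the array-like can dispatch on it, while a creation function that only receives an integer has nothing to dispatch on.

```python
import math


class Wrapped:
    """Stand-in for a non-NumPy array-like (hypothetical)."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __toy_function__(self, func, args):
        # Called by the dispatcher when a Wrapped instance is among the arguments.
        if func is exp:
            return Wrapped(math.exp(x) for x in self.data)
        return NotImplemented


def _dispatch(func, args, default):
    # Look for an argument whose type provides the hook; otherwise use the default.
    for arg in args:
        hook = getattr(type(arg), "__toy_function__", None)
        if hook is not None:
            result = hook(arg, func, args)
            if result is not NotImplemented:
                return result
    return default(*args)


def exp(values):
    return _dispatch(exp, (values,), lambda v: [math.exp(x) for x in v])


def arange(n):
    # The only argument is an int, so there is no array type to dispatch on.
    return _dispatch(arange, (n,), lambda k: list(range(k)))


def my_library_func(array_like):
    out = exp(array_like)
    idx = arange(len(out))
    return idx, out
```

Calling `my_library_func(Wrapped([0.0, 1.0]))` returns a `Wrapped` result for `out` but a plain list for `idx`, which is the mismatch a library author cannot currently avoid.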
> To be honest, I do think a lot of the "issues" around
> monkeypatching exist just as much with backend choosing; the main
> difference seems to me to be that:
> 1. monkeypatching was not done explicitly
> (import mkl_fft; mkl_fft.monkeypatch_numpy())?
> 2. A backend system allows libraries to prefer one locally?
> (which I think is a big advantage)
> There are the options of adding `linspace_like` functions somewhere
> in a numpy submodule, or adding `linspace(..., array_type=np.ndarray)`,
> or simply inventing a new "protocol" (which is not really a protocol?),
> and making it `ndarray.__numpy_like_creation_functions__.arange()`.
Handling things like RandomState can get complicated here.
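For illustration, the `array_type=` option could look roughly like the following in plain Python. This is a hedged sketch only: the `Wrapped` class and the `__toy_arange__` classmethod hook are hypothetical stand-ins, and nothing here is an existing NumPy API. Stateful creation such as RandomState is deliberately left out.

```python
class Wrapped:
    """Hypothetical array-like that knows how to create instances of itself."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    @classmethod
    def __toy_arange__(cls, n):
        # Hypothetical creation hook, looked up on the requested type.
        return cls(range(n))


def arange(n, array_type=list):
    # Delegate creation to the requested type if it provides the hook.
    hook = getattr(array_type, "__toy_arange__", None)
    if hook is not None:
        return hook(n)
    return array_type(range(n))


def my_library_func(array_like):
    # The library, not the end user, forwards the type of its input.
    return arange(len(array_like), array_type=type(array_like))
```

Here `my_library_func(Wrapped([5, 6, 7]))` returns a `Wrapped` instance without the end user opting in, while a plain list input still yields a plain list.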