New draft of NEP 31 — Context-local and global overrides of the NumPy API
Hello everyone,

I've improved the content of NEP 31 to make it simpler. In keeping with the new NEP template, only part of the NEP is being sent to the mailing list. For the full NEP, please see PR 14793 <https://github.com/numpy/numpy/pull/14793>.

============================================================
NEP 31 — Context-local and global overrides of the NumPy API
============================================================

:Author: Hameer Abbasi <habbasi@quansight.com>
:Author: Ralf Gommers <rgommers@quansight.com>
:Author: Peter Bell <pbell@quansight.com>
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an extensible backend mechanism.

Acceptance of this NEP means NumPy would provide global and context-local overrides in a separate namespace, as well as a dispatch mechanism similar to NEP-18 [2]_. First experiences with ``__array_function__`` show that it is necessary to be able to override NumPy functions that *do not take an array-like argument*, and hence aren't overridable via ``__array_function__``. The most pressing need is array creation and coercion functions, such as ``numpy.zeros`` or ``numpy.asarray``; see e.g. NEP-30 [9]_.

This NEP proposes to allow, in an opt-in fashion, overriding any part of the NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and obviates the need to add an ever-growing list of new protocols for each new type of function or object that needs to become overridable.

Motivation and Scope
--------------------

The primary end-goal of this NEP is to make the following possible:

.. code:: python

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)  # Can be done within Dask itself

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, the backend (``da`` in this example) can be any compatible object defined either by NumPy or an external library, such as Dask or CuPy. Ideally, it should be the module ``dask.array`` or ``cupy`` itself.

These kinds of overrides are useful for both end-users and library authors. End-users may have written, or wish to write, code that they later speed up or move to a different implementation, say PyData/Sparse. They can do this simply by setting a backend. Library authors may also wish to write code that is portable across array implementations; for example, ``sklearn`` may wish to write code for a machine learning algorithm that is portable across array implementations while also using array creation functions.

This NEP takes a holistic approach: it assumes that there are parts of the API that need to be overridable, and that these will grow over time. It provides a general framework and a mechanism to avoid designing a new protocol each time this is required. This was the goal of ``uarray``: to allow for overrides in an API without needing the design of a new protocol.

This NEP proposes the following: that ``unumpy`` [8]_ becomes the recommended override mechanism for the parts of the NumPy API not yet covered by ``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is vendored into a new namespace within NumPy to give users and downstream dependencies access to these overrides. This vendoring mechanism is similar to what SciPy decided to do for making ``scipy.fft`` overridable (see [10]_).


The motivation behind ``uarray`` is manifold. First, there have been several attempts to allow dispatch of parts of the NumPy API, most prominently the ``__array_ufunc__`` protocol in NEP-13 [4]_ and the ``__array_function__`` protocol in NEP-18 [2]_, but these have shown the need for further protocols to be developed, including a protocol for coercion (see [5]_, [9]_). The reasons these overrides are needed have been extensively discussed in the references, and this NEP will not attempt to go into the details; in short, it is necessary for library authors to be able to coerce arbitrary objects into arrays of their own types, such as CuPy needing to coerce to a CuPy array instead of a NumPy array. In simpler words, one needs things like ``np.asarray(...)`` or an alternative to "just work" and return duck-arrays.

Usage and Impact
----------------

This NEP allows for global and context-local overrides, as well as automatic overrides à la ``__array_function__``.

Here are some use-cases this NEP would enable, besides the first one stated in the motivation section:

The first is allowing alternate dtypes to return their respective arrays.

.. code:: python

    # Returns an XND array
    x = unp.ones((5, 5), dtype=xnd_dtype)  # Or torch dtype

The second is allowing overrides for parts of the API. This is to allow alternate and/or optimised implementations for ``np.linalg``, BLAS, and ``np.random``.

.. code:: python

    import numpy as np
    import pyfftw  # Or mkl_fft

    # Makes pyfftw the default for FFT
    np.set_global_backend(pyfftw)

    # Uses pyfftw without monkeypatching
    np.fft.fft(numpy_array)

    with np.set_backend(pyfftw):  # Or mkl_fft, or numpy
        # Uses the backend you specified
        np.fft.fft(numpy_array)

This will allow an official way for overrides to work with NumPy without monkeypatching or distributing a modified version of NumPy.

Here are a few other use-cases, implied but not already stated:

.. code:: python

    from dask import array as da

    data = da.from_zarr('myfile.zarr')
    # result should still be dask, all things being equal
    result = library_function(data)
    result.to_zarr('output.zarr')

This second one would work if ``magic_library`` were built on top of ``unumpy``.

.. code:: python

    from dask import array as da
    from magic_library import pytorch_predict

    data = da.from_zarr('myfile.zarr')
    # normally here one would use e.g. data.map_overlap
    result = pytorch_predict(data)
    result.to_zarr('output.zarr')

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.