[Numpy-discussion] NEP 37: A dispatch protocol for NumPy-like modules

Eric Wieser wieser.eric+numpy at gmail.com
Wed Feb 5 13:13:52 EST 2020


>  scipy.linalg is a superset of numpy.linalg

This isn't completely accurate - numpy.linalg supports almost all
operations* over stacks of matrices via gufuncs, but scipy.linalg does not
appear to.

Eric

*: not lstsq due to an ungeneralizable public API

On Wed, 5 Feb 2020 at 17:38, Ralf Gommers <ralf.gommers at gmail.com> wrote:

>
>
> On Wed, Feb 5, 2020 at 10:01 AM Andreas Mueller <t3kcit at gmail.com> wrote:
>
>> A bit late to the NEP 37 party.
>> I just wanted to say that at least from my perspective it seems a great
>> solution that will help sklearn move towards more flexible compute engines.
>> I think one of the biggest issues is array creation (including random
>> arrays), and that's handled quite nicely with NEP 37.
>>
>> There's some discussion on the scikit-learn side here:
>> https://github.com/scikit-learn/scikit-learn/pull/14963
>> https://github.com/scikit-learn/scikit-learn/issues/11447
>>
>> Two different groups of people tried to use __array_function__ to
>> delegate to MxNet and CuPy respectively in scikit-learn, and ran into the
>> same issues.
>>
>> There are some remaining issues in sklearn that will not be handled by NEP
>> 37, but they go beyond NumPy in some sense.
>> Just to briefly bring them up:
>>
>> - We use scipy.linalg in many places, and we would need to do a separate
>> dispatching to check whether we can use module.linalg instead
>>  (that might be an issue for many libraries but I'm not sure).
>>
>
> That is an issue, and goes in the opposite direction we need -
> scipy.linalg is a superset of numpy.linalg, so we'd like to encourage using
> scipy. This is something we may want to consider fixing by making the
> dispatch decorator public in numpy and adopting in scipy.
>
> Cheers,
> Ralf
>
>
>
>>
>> - Some models have several possible optimization algorithms, some of
>> which are pure numpy and some which are Cython. If someone provides a
>> different array module,
>>  we might want to choose an algorithm that is actually supported by that
>> module. While this exact issue is maybe sklearn specific, a similar issue
>> could appear for most downstream libs that use Cython in some places.
>>  Many Cython algorithms could be implemented in pure numpy with a
>> potential slowdown, but once we have NEP 37 there might be a benefit to
>> having a pure NumPy implementation as an alternative code path.
>>
>>
>> Anyway, NEP 37 seems a great step in the right direction and would enable
>> sklearn to actually dispatch in some places. Dispatching based only on
>> __array_function__ has not seemed feasible so far.
>>
>> Best,
>> Andreas Mueller
>>
>>
>> On 1/6/20 11:29 PM, Stephan Hoyer wrote:
>>
>> I am pleased to present a new NumPy Enhancement Proposal for discussion:
>> "NEP-37: A dispatch protocol for NumPy-like modules." Feedback would be
>> very welcome!
>>
>> The full text follows. The rendered proposal can also be found online at
>> https://numpy.org/neps/nep-0037-array-module.html
>>
>> Best,
>> Stephan Hoyer
>>
>> ===================================================
>> NEP 37 — A dispatch protocol for NumPy-like modules
>> ===================================================
>>
>> :Author: Stephan Hoyer <shoyer at google.com>
>> :Author: Hameer Abbasi
>> :Author: Sebastian Berg
>> :Status: Draft
>> :Type: Standards Track
>> :Created: 2019-12-29
>>
>> Abstract
>> --------
>>
>> NEP-18's ``__array_function__`` has been a mixed success. Some projects
>> (e.g.,
>> dask, CuPy, xarray, sparse, Pint) have enthusiastically adopted it. Others
>> (e.g., PyTorch, JAX, SciPy) have been more reluctant. Here we propose a
>> new
>> protocol, ``__array_module__``, that we expect could eventually subsume
>> most
>> use-cases for ``__array_function__``. The protocol requires explicit
>> adoption
>> by both users and library authors, which ensures backwards compatibility,
>> and
>> is also significantly simpler than ``__array_function__``, both of which
>> we
>> expect will make it easier to adopt.
>>
>> Why ``__array_function__`` hasn't been enough
>> ---------------------------------------------
>>
>> There are two broad ways in which NEP-18 has fallen short of its goals:
>>
>> 1. **Maintainability concerns**. `__array_function__` has significant
>>    implications for libraries that use it:
>>
>>    - Projects like `PyTorch
>>      <https://github.com/pytorch/pytorch/issues/22402>`_, `JAX
>>      <https://github.com/google/jax/issues/1565>`_ and even `scipy.sparse
>>      <https://github.com/scipy/scipy/issues/10362>`_ have been reluctant
>> to
>>      implement `__array_function__` in part because they are concerned
>> about
>>      **breaking existing code**: users expect NumPy functions like
>>      ``np.concatenate`` to return NumPy arrays. This is a fundamental
>>      limitation of the ``__array_function__`` design, which we chose to
>> allow
>>      overriding the existing ``numpy`` namespace.
>>    - ``__array_function__`` currently requires an "all or nothing"
>> approach to
>>      implementing NumPy's API. There is no good pathway for **incremental
>>      adoption**, which is particularly problematic for established
>> projects
>>      for which adopting ``__array_function__`` would result in breaking
>>      changes.
>>    - It is no longer possible to use **aliases to NumPy functions** within
>>      modules that support overrides. For example, both CuPy and JAX set
>>      ``result_type = np.result_type``.
>>    - Implementing **fall-back mechanisms** for unimplemented NumPy
>> functions
>>      by using NumPy's implementation is hard to get right (but see the
>>      `version from dask <https://github.com/dask/dask/pull/5043>`_),
>> because
>>      ``__array_function__`` does not present a consistent interface.
>>      Converting all arguments of array type requires recursing into
>> generic
>>      arguments of the form ``*args, **kwargs``.
>>
>> 2. **Limitations on what can be overridden.** ``__array_function__`` has
>> some
>>    important gaps, most notably array creation and coercion functions:
>>
>>    - **Array creation** routines (e.g., ``np.arange`` and those in
>>      ``np.random``) need some other mechanism for indicating what type of
>>      arrays to create. `NEP 36 <https://github.com/numpy/numpy/pull/14715
>> >`_
>>      proposed adding optional ``like=`` arguments to functions without
>>      existing array arguments. However, we still lack any mechanism to
>>      override methods on objects, such as those needed by
>>      ``np.random.RandomState``.
>>    - **Array conversion** can't reuse the existing coercion functions like
>>      ``np.asarray``, because ``np.asarray`` sometimes means "convert to an
>>      exact ``np.ndarray``" and other times means "convert to something
>> _like_
>>      a NumPy array." This led to the `NEP 30
>>      <https://numpy.org/neps/nep-0030-duck-array-protocol.html>`_
>> proposal for
>>      a separate ``np.duckarray`` function, but this still does not
>> resolve how
>>      to cast one duck array into a type matching another duck array.
>>
>> ``get_array_module`` and the ``__array_module__`` protocol
>> ----------------------------------------------------------
>>
>> We propose a new user-facing mechanism for dispatching to a duck-array
>> implementation, ``numpy.get_array_module``. ``get_array_module`` performs
>> the
>> same type resolution as ``__array_function__`` and returns a module with
>> an API
>> promised to match the standard interface of ``numpy`` that can implement
>> operations on all provided array types.
>>
>> The protocol itself is both simpler and more powerful than
>> ``__array_function__``, because it doesn't need to worry about actually
>> implementing functions. We believe it resolves most of the
>> maintainability and
>> functionality limitations of ``__array_function__``.
>>
>> The new protocol is opt-in, explicit and with local control; see
>> :ref:`appendix-design-choices` for discussion on the importance of these
>> design
>> features.
>>
>> The array module contract
>> =========================
>>
>> Modules returned by ``get_array_module``/``__array_module__`` should make
>> a
>> best effort to implement NumPy's core functionality on new array type(s).
>> Unimplemented functionality should simply be omitted (e.g., accessing an
>> unimplemented function should raise ``AttributeError``). In the future, we
>> anticipate codifying a protocol for requesting restricted subsets of
>> ``numpy``;
>> see :ref:`requesting-restricted-subsets` for more details.
>>
>> How to use ``get_array_module``
>> ===============================
>>
>> Code that wants to support generic duck arrays should explicitly call
>> ``get_array_module`` to determine an appropriate array module from which
>> to
>> call functions, rather than using the ``numpy`` namespace directly. For
>> example:
>>
>> .. code:: python
>>
>>     # calls the appropriate version of np.something for x and y
>>     module = np.get_array_module(x, y)
>>     module.something(x, y)
>>
>> Both array creation and array conversion are supported, because
>> dispatching is
>> handled by ``get_array_module`` rather than via the types of function
>> arguments. For example, to use random number generation functions or
>> methods,
>> we can simply pull out the appropriate submodule:
>>
>> .. code:: python
>>
>>     def duckarray_add_random(array):
>>         module = np.get_array_module(array)
>>         noise = module.random.randn(*array.shape)
>>         return array + noise
>>
>> We can also write the duck-array ``stack`` function from `NEP 30
>> <https://numpy.org/neps/nep-0030-duck-array-protocol.html>`_, without
>> the need
>> for a new ``np.duckarray`` function:
>>
>> .. code:: python
>>
>>     def duckarray_stack(arrays):
>>         module = np.get_array_module(*arrays)
>>         arrays = [module.asarray(arr) for arr in arrays]
>>         shapes = {arr.shape for arr in arrays}
>>         if len(shapes) != 1:
>>             raise ValueError('all input arrays must have the same shape')
>>         expanded_arrays = [arr[module.newaxis, ...] for arr in arrays]
>>         return module.concatenate(expanded_arrays, axis=0)
>>
>> By default, ``get_array_module`` will return the ``numpy`` module if no
>> arguments are arrays. This fall-back can be explicitly controlled by
>> providing the ``default`` keyword-only argument (as in the reference
>> implementation below). It is also possible to indicate that an exception
>> should be raised instead of returning a default array module by setting
>> ``default=None``.
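>>
>> As a concrete illustration, here is a small self-contained sketch of the
>> fall-back behavior. The dispatch logic follows the NEP's pure-Python
>> reference implementation (where the keyword is spelled ``default``), and a
>> stub module stands in for ``numpy`` so the example runs anywhere:

```python
import types as pytypes

numpy_stub = pytypes.ModuleType('numpy_stub')  # stand-in for the numpy module

def get_array_module(*arrays, default=numpy_stub):
    # Argument types that implement the __array_module__ protocol.
    arg_types = [type(a) for a in arrays
                 if hasattr(type(a), '__array_module__')]
    if not arg_types:
        if default is not None:
            return default  # no duck arrays: fall back to the default module
        raise TypeError("no array module found and default is None")
    for array in arrays:
        if hasattr(type(array), '__array_module__'):
            module = array.__array_module__(arg_types)
            if module is not NotImplemented:
                return module
    raise TypeError("no common array module found")

# Plain scalars fall back to the default module...
assert get_array_module(1.0, 2.0) is numpy_stub

# ...while default=None turns the fall-back into an error:
try:
    get_array_module(1.0, default=None)
except TypeError as e:
    print(e)
```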
>>
>> How to implement ``__array_module__``
>> =====================================
>>
>> Libraries implementing a duck array type that want to support
>> ``get_array_module`` need to implement the corresponding protocol,
>> ``__array_module__``. This new protocol is based on Python's dispatch
>> protocol
>> for arithmetic, and is essentially a simpler version of
>> ``__array_function__``.
>>
>> Only one argument is passed into ``__array_module__``, a Python
>> collection of
>> unique array types passed into ``get_array_module``, i.e., the types of
>> all arguments with an ``__array_module__`` attribute.
>>
>> The special method should either return a namespace with an API matching
>> ``numpy``, or ``NotImplemented``, indicating that it does not know how to
>> handle the operation:
>>
>> .. code:: python
>>
>>     class MyArray:
>>         def __array_module__(self, types):
>>             if not all(issubclass(t, MyArray) for t in types):
>>                 return NotImplemented
>>             return my_array_module
>>
>> Returning custom objects from ``__array_module__``
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> ``my_array_module`` will typically, but need not always, be a Python
>> module.
>> Returning a custom object (e.g., with functions implemented via
>> ``__getattr__``) may be useful for some advanced use cases.
>>
>> For example, custom objects could allow for partial implementations of
>> duck
>> array modules that fall back to NumPy (although this is not recommended in
>> general because such fall-back behavior can be error prone):
>>
>> .. code:: python
>>
>>     class MyArray:
>>         def __array_module__(self, types):
>>             if all(issubclass(t, MyArray) for t in types):
>>                 return ArrayModule()
>>             else:
>>                 return NotImplemented
>>
>>     class ArrayModule:
>>         def __getattr__(self, name):
>>             import base_module
>>             return getattr(base_module, name, getattr(numpy, name))
>>
>> Subclassing from ``numpy.ndarray``
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> All of the same guidance about well-defined type casting hierarchies from
>> NEP-18 still applies. ``numpy.ndarray`` itself contains a matching
>> implementation of ``__array_module__``,  which is convenient for
>> subclasses:
>>
>> .. code:: python
>>
>>     class ndarray:
>>         def __array_module__(self, types):
>>             if all(issubclass(t, ndarray) for t in types):
>>                 return numpy
>>             else:
>>                 return NotImplemented
>>
>> NumPy's internal machinery
>> ==========================
>>
>> The type resolution rules of ``get_array_module`` follow the same model as
>> Python and NumPy's existing dispatch protocols: subclasses are called
>> before
>> super-classes, and otherwise left to right. ``__array_module__`` is
>> guaranteed
>> to be called only a single time on each unique type.
>>
>> The actual implementation of `get_array_module` will be in C, but should
>> be
>> equivalent to this Python code:
>>
>> .. code:: python
>>
>>     def get_array_module(*arrays, default=numpy):
>>         implementing_arrays, types = _implementing_arrays_and_types(arrays)
>>         if not implementing_arrays and default is not None:
>>             return default
>>         for array in implementing_arrays:
>>             module = array.__array_module__(types)
>>             if module is not NotImplemented:
>>                 return module
>>         raise TypeError("no common array module found")
>>
>>     def _implementing_arrays_and_types(relevant_arrays):
>>         types = []
>>         implementing_arrays = []
>>         for array in relevant_arrays:
>>             t = type(array)
>>             if t not in types and hasattr(t, '__array_module__'):
>>                 types.append(t)
>>                 # Subclasses before superclasses, otherwise left to right
>>                 index = len(implementing_arrays)
>>                 for i, old_array in enumerate(implementing_arrays):
>>                     if issubclass(t, type(old_array)):
>>                         index = i
>>                         break
>>                 implementing_arrays.insert(index, array)
>>         return implementing_arrays, types
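>>
>> A runnable check of the ordering rule, using the reference implementation
>> together with two toy classes (``Base`` and ``Sub`` are placeholder names):
>> the subclass's ``__array_module__`` is consulted first regardless of
>> argument order:

```python
# Pure-Python dispatch, copied from the NEP's reference implementation
# (with default=None so no real numpy module is needed).
def get_array_module(*arrays, default=None):
    implementing_arrays, types = _implementing_arrays_and_types(arrays)
    if not implementing_arrays and default is not None:
        return default
    for array in implementing_arrays:
        module = array.__array_module__(types)
        if module is not NotImplemented:
            return module
    raise TypeError("no common array module found")

def _implementing_arrays_and_types(relevant_arrays):
    types = []
    implementing_arrays = []
    for array in relevant_arrays:
        t = type(array)
        if t not in types and hasattr(t, '__array_module__'):
            types.append(t)
            # Subclasses before superclasses, otherwise left to right
            index = len(implementing_arrays)
            for i, old_array in enumerate(implementing_arrays):
                if issubclass(t, type(old_array)):
                    index = i
                    break
            implementing_arrays.insert(index, array)
    return implementing_arrays, types

class Base:
    def __array_module__(self, types):
        return 'base_module'  # stand-in for a real module object

class Sub(Base):
    def __array_module__(self, types):
        return 'sub_module'

# The subclass wins even when the superclass instance comes first:
assert get_array_module(Base(), Sub()) == 'sub_module'
assert get_array_module(Sub(), Base()) == 'sub_module'
# With only superclass instances, the superclass module is returned:
assert get_array_module(Base(), Base()) == 'base_module'
```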
>>
>> Relationship with ``__array_ufunc__`` and ``__array_function__``
>> ----------------------------------------------------------------
>>
>> These older protocols have distinct use-cases and should remain
>> ===============================================================
>>
>> ``__array_module__`` is intended to resolve limitations of
>> ``__array_function__``, so it is natural to consider whether it could
>> entirely
>> replace ``__array_function__``. This would offer dual benefits: (1)
>> simplifying
>> the user-story about how to override NumPy and (2) removing the slowdown
>> associated with checking for dispatch when calling every NumPy function.
>>
>> However, ``__array_module__`` and ``__array_function__`` are pretty
>> different
>> from a user perspective: the former requires explicit calls to
>> ``get_array_module``,
>> rather than simply reusing original ``numpy`` functions. This is probably
>> fine
>> for *libraries* that rely on duck-arrays, but may be frustratingly
>> verbose for
>> interactive use.
>>
>> Some of the dispatching use-cases for ``__array_ufunc__`` are also solved
>> by
>> ``__array_module__``, but not all of them. For example, it is still
>> useful to
>> be able to define non-NumPy ufuncs (e.g., from Numba or SciPy) in a
>> generic way
>> on non-NumPy arrays (e.g., with dask.array).
>>
>> Given their existing adoption and distinct use cases, we don't think it
>> makes
>> sense to remove or deprecate ``__array_function__`` and
>> ``__array_ufunc__`` at
>> this time.
>>
>> Mixin classes to implement ``__array_function__`` and ``__array_ufunc__``
>> =========================================================================
>>
>> Despite the user-facing differences, ``__array_module__`` and a module
>> implementing NumPy's API still provide sufficient functionality to
>> implement dispatching with the existing duck array protocols.
>>
>> For example, the following mixin classes would provide sensible defaults
>> for
>> these special methods in terms of ``get_array_module`` and
>> ``__array_module__``:
>>
>> .. code:: python
>>
>>     class ArrayUfuncFromModuleMixin:
>>
>>         def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
>>             arrays = inputs + kwargs.get('out', ())
>>             try:
>>                 array_module = np.get_array_module(*arrays)
>>             except TypeError:
>>                 return NotImplemented
>>
>>             try:
>>                 # Note this may have false positive matches, if
>>                 # ufunc.__name__ matches the name of a ufunc defined by
>>                 # NumPy. Unfortunately there is no way to determine in
>>                 # which module a ufunc was defined.
>>                 new_ufunc = getattr(array_module, ufunc.__name__)
>>             except AttributeError:
>>                 return NotImplemented
>>
>>             try:
>>                 callable = getattr(new_ufunc, method)
>>             except AttributeError:
>>                 return NotImplemented
>>
>>             return callable(*inputs, **kwargs)
>>
>>     class ArrayFunctionFromModuleMixin:
>>
>>         def __array_function__(self, func, types, args, kwargs):
>>             array_module = self.__array_module__(types)
>>             if array_module is NotImplemented:
>>                 return NotImplemented
>>
>>             # Traverse submodules to find the appropriate function
>>             modules = func.__module__.split('.')
>>             assert modules[0] == 'numpy'
>>             module = array_module
>>             for submodule in modules[1:]:
>>                 module = getattr(module, submodule, None)
>>             new_func = getattr(module, func.__name__, None)
>>             if new_func is None:
>>                 return NotImplemented
>>
>>             return new_func(*args, **kwargs)
>>
>> To make it easier to write duck arrays, we could also add these mixin
>> classes
>> into ``numpy.lib.mixins`` (but the examples above may suffice).
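>>
>> For illustration, here is a condensed, runnable variant of the
>> ``__array_ufunc__`` idea above, with stub ``Duck``/``Module`` classes
>> standing in for a duck-array library (all names are illustrative, and only
>> the ufunc's ``__call__`` method is handled):

```python
import numpy as np

class Module:
    # Stub "array module" implementing one function with a ufunc's name.
    @staticmethod
    def negative(x):
        return Duck(-x.value)

class Duck:
    def __init__(self, value):
        self.value = value

    def __array_module__(self, types):
        return Module

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != '__call__':
            return NotImplemented
        module = self.__array_module__([type(a) for a in inputs])
        # As noted above, matching by ufunc.__name__ can false-positive.
        fn = getattr(module, ufunc.__name__, None)
        if fn is None:
            return NotImplemented
        return fn(*inputs, **kwargs)

# NumPy's ufunc machinery dispatches to Module.negative via the protocol:
assert np.negative(Duck(3)).value == -3
```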
>>
>> Alternatives considered
>> -----------------------
>>
>> Naming
>> ======
>>
>> We like the name ``__array_module__`` because it mirrors the existing
>> ``__array_function__`` and ``__array_ufunc__`` protocols. Another
>> reasonable
>> choice could be ``__array_namespace__``.
>>
>> It is less clear what the NumPy function that calls this protocol should
>> be
>> called (``get_array_module`` in this proposal). Some possible
>> alternatives:
>> ``array_module``, ``common_array_module``, ``resolve_array_module``,
>> ``get_namespace``, ``get_numpy``, ``get_numpylike_module``,
>> ``get_duck_array_module``.
>>
>> .. _requesting-restricted-subsets:
>>
>> Requesting restricted subsets of NumPy's API
>> ============================================
>>
>> Over time, NumPy has accumulated a very large API surface, with over 600
>> attributes in the top level ``numpy`` module alone. It is unlikely that
>> any
>> duck array library could or would want to implement all of these
>> functions and
>> classes, because the frequently used subset of NumPy is much smaller.
>>
>> We think it would be a useful exercise to define "minimal" subset(s) of
>> NumPy's
>> API, omitting rarely used or non-recommended functionality. For example,
>> minimal NumPy might include ``stack``, but not the other stacking
>> functions
>> ``column_stack``, ``dstack``, ``hstack`` and ``vstack``. This could
>> clearly
>> indicate to duck array authors and users what functionality is core and
>> what
>> functionality they can skip.
>>
>> Support for requesting a restricted subset of NumPy's API would be a
>> natural
>> feature to include in ``get_array_module`` and ``__array_module__``,
>> e.g.,
>>
>> .. code:: python
>>
>>     # array_module is only guaranteed to contain "minimal" NumPy
>>     array_module = np.get_array_module(*arrays, request='minimal')
>>
>> To facilitate testing with NumPy and use with any valid duck array
>> library,
>> NumPy itself would return restricted versions of the ``numpy`` module when
>> ``get_array_module`` is called only on NumPy arrays. Omitted functions
>> would
>> simply not exist.
>>
>> Unfortunately, we have not yet figured out what these restricted subsets
>> should
>> be, so it doesn't make sense to do this yet. When/if we do, we could
>> either add
>> new keyword arguments to ``get_array_module`` or add new top level
>> functions,
>> e.g., ``get_minimal_array_module``. We would also need to add either a new
>> protocol patterned off of ``__array_module__`` (e.g.,
>> ``__array_module_minimal__``), or could add an optional second argument to
>> ``__array_module__`` (catching errors with ``try``/``except``).
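>>
>> The optional-second-argument variant could be sketched as follows (the
>> ``request`` parameter and the try/except compatibility shim are
>> illustrative, not a settled API):

```python
class NewStyleArray:
    # Hypothetical implementation that understands subset requests.
    def __array_module__(self, types, request=None):
        return ('new_module', request)

class OldStyleArray:
    # Pre-existing implementation with the one-argument signature.
    def __array_module__(self, types):
        return 'old_module'

def call_array_module(array, types, request=None):
    try:
        return array.__array_module__(types, request=request)
    except TypeError:
        # Implementation predates the second argument; retry without it.
        # (Note: this can also mask unrelated TypeErrors, one downside
        # of the try/except approach.)
        return array.__array_module__(types)

assert call_array_module(NewStyleArray(), [], request='minimal') == \
    ('new_module', 'minimal')
assert call_array_module(OldStyleArray(), [], request='minimal') == 'old_module'
```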
>>
>> A new namespace for implicit dispatch
>> =====================================
>>
>> Instead of supporting overrides in the main `numpy` namespace with
>> ``__array_function__``, we could create a new opt-in namespace, e.g.,
>> ``numpy.api``, with versions of NumPy functions that support dispatching.
>> These
>> overrides would need new opt-in protocols, e.g.,
>> ``__array_function_api__``
>> patterned off of ``__array_function__``.
>>
>> This would resolve the biggest limitations of ``__array_function__`` by
>> being
>> opt-in and would also allow for unambiguously overriding functions like
>> ``asarray``, because ``np.api.asarray`` would always mean "convert an
>> array-like object."  But it wouldn't solve all the dispatching needs met
>> by
>> ``__array_module__``, and would leave us with supporting a considerably
>> more
>> complex protocol both for array users and implementors.
>>
>> We could potentially implement such a new namespace *via* the
>> ``__array_module__`` protocol. Certainly some users would find this
>> convenient,
>> because it is slightly less boilerplate. But this would leave users with a
>> confusing choice: when should they use `get_array_module` vs.
>> `np.api.something`. Also, we would have to add and maintain a whole new
>> module,
>> which is considerably more expensive than merely adding a function.
>>
>> Dispatching on both types and arrays instead of only types
>> ==========================================================
>>
>> Instead of supporting dispatch only via unique array types, we could also
>> support dispatch via array objects, e.g., by passing an ``arrays``
>> argument as
>> part of the ``__array_module__`` protocol. This could potentially be
>> useful for
>> dispatch for arrays with metadata, such as provided by Dask and Pint, but
>> would
>> impose costs in terms of type safety and complexity.
>>
>> For example, a library that supports arrays on both CPUs and GPUs might
>> decide
>> on which device to create new arrays from functions like ``ones`` based
>> on
>> input arguments:
>>
>> .. code:: python
>>
>>     class Array:
>>         def __array_module__(self, types, arrays):
>>             useful_arrays = tuple(a for a in arrays if isinstance(a, Array))
>>             if not useful_arrays:
>>                 return NotImplemented
>>             prefer_gpu = any(a.prefer_gpu for a in useful_arrays)
>>             return ArrayModule(prefer_gpu)
>>
>>     class ArrayModule:
>>         def __init__(self, prefer_gpu):
>>             self.prefer_gpu = prefer_gpu
>>
>>         def __getattr__(self, name):
>>             import base_module
>>             base_func = getattr(base_module, name)
>>             return functools.partial(base_func, prefer_gpu=self.prefer_gpu)
>>
>> This might be useful, but it's not clear if we really need it. Pint seems
>> to
>> get along OK without any explicit array creation routines (favoring
>> multiplication by units, e.g., ``np.ones(5) * ureg.m``), and for the most
>> part
>> Dask is also OK with existing ``__array_function__`` style overrides (e.g.,
>> favoring ``np.ones_like`` over ``np.ones``). Choosing whether to place an
>> array
>> on the CPU or GPU could be solved by `making array creation lazy
>> <https://github.com/google/jax/pull/1668>`_.
>>
>> .. _appendix-design-choices:
>>
>> Appendix: design choices for API overrides
>> ------------------------------------------
>>
>> There is a large range of possible design choices for overriding NumPy's
>> API.
>> Here we discuss three major axes of the design decision that guided our
>> design
>> for ``__array_module__``.
>>
>> Opt-in vs. opt-out for users
>> ============================
>>
>> The ``__array_ufunc__`` and ``__array_function__`` protocols provide a
>> mechanism for overriding NumPy functions *within NumPy's existing
>> namespace*.
>> This means that users need to explicitly opt-out if they do not want any
>> overridden behavior, e.g., by casting arrays with ``np.asarray()``.
>>
>> In theory, this approach lowers the barrier for adopting these protocols
>> in
>> user code and libraries, because code that uses the standard NumPy
>> namespace is
>> automatically compatible. But in practice, this hasn't worked out. For
>> example,
>> most well-maintained libraries that use NumPy follow the best practice of
>> casting all inputs with ``np.asarray()``, which they would have to
>> explicitly
>> relax to use ``__array_function__``. Our experience has been that making a
>> library compatible with a new duck array type typically requires at least
>> a
>> small amount of work to accommodate differences in the data model and
>> operations
>> that can be implemented efficiently.
>>
>> These opt-out approaches also considerably complicate backwards
>> compatibility
>> for libraries that adopt these protocols, because by opting in as a
>> library
>> they also opt-in their users, whether they expect it or not. For winning
>> over
>> libraries that have been unable to adopt ``__array_function__``, an opt-in
>> approach seems like a must.
>>
>> Explicit vs. implicit choice of implementation
>> ==============================================
>>
>> Both ``__array_ufunc__`` and ``__array_function__`` have implicit control
>> over
>> dispatching: the dispatched functions are determined via the appropriate
>> protocols in every function call. This generalizes well to handling many
>> different types of objects, as evidenced by its use for implementing
>> arithmetic
>> operators in Python, but it has two downsides:
>>
>> 1. *Speed*: it imposes additional overhead in every function call,
>> because each
>>    function call needs to inspect each of its arguments for overrides.
>> This is
>>    why arithmetic on builtin Python numbers is slow.
>> 2. *Readability*: it is no longer immediately evident to readers of code
>> what
>>    happens when a function is called, because the function's
>> implementation
>>    could be overridden by any of its arguments.
>>
>> In contrast, importing a new library (e.g., ``import dask.array as da``)
>> with
>> an API matching NumPy is entirely explicit. There is no overhead from
>> dispatch
>> or ambiguity about which implementation is being used.
>>
>> Explicit and implicit choice of implementations are not mutually exclusive
>> options. Indeed, most implementations of NumPy API overrides via
>> ``__array_function__`` that we are familiar with (namely, dask, CuPy and
>> sparse, but not Pint) also include an explicit way to use their version of
>> NumPy's API by importing a module directly (``dask.array``, ``cupy`` or
>> ``sparse``, respectively).
>>
>> Local vs. non-local vs. global control
>> ======================================
>>
>> The final design axis is how users control the choice of API:
>>
>> - **Local control**, as exemplified by multiple dispatch and Python
>> protocols for
>>   arithmetic, determines which implementation to use either by checking
>> types
>>   or calling methods on the direct arguments of a function.
>> - **Non-local control** such as `np.errstate
>>   <
>> https://docs.scipy.org/doc/numpy/reference/generated/numpy.errstate.html
>> >`_
>>   overrides behavior with global-state via function decorators or
>>   context-managers. Control is determined hierarchically, via the
>> inner-most
>>   context.
>> - **Global control** provides a mechanism for users to set default
>> behavior,
>>   either via function calls or configuration files. For example,
>> matplotlib
>>   allows setting a global choice of plotting backend.
>>
>> Local control is generally considered a best practice for API design,
>> because
>> control flow is entirely explicit, which makes it the easiest to
>> understand.
>> Non-local and global control are occasionally used, but generally either
>> due to
>> ignorance or a lack of better alternatives.
>>
>> In the case of duck typing for NumPy's public API, we think non-local or
>> global
>> control would be mistakes, mostly because they **don't compose well**. If
>> one
>> library sets/needs one set of overrides and then internally calls a
>> routine
>> that expects another set of overrides, the resulting behavior may be very
>> surprising. Higher order functions are especially problematic, because the
>> context in which functions are evaluated may not be the context in which
>> they
>> are defined.
>>
>> One class of override use cases where we think non-local and global
>> control are
>> appropriate is for choosing a backend system that is guaranteed to have an
>> entirely consistent interface, such as a faster alternative
>> implementation of
>> ``numpy.fft`` on NumPy arrays. However, these are out of scope for the
>> current
>> proposal, which is focused on duck arrays.
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>>
>>
>

