[Numpy-discussion] Rules for argument parsing/forwarding in __array_function__ and __array_ufunc__
Sebastian Berg
sebastian at sipsolutions.net
Wed Dec 2 18:34:46 EST 2020
Hi all,
I am curious about the correct approach for "normalizing" and
forwarding arguments in `__array_function__` and `__array_ufunc__`. I
have listed below the three rules as I understand the "right" way to
be.
Mainly this is just to write down the rules that I think we should aim
for in case it comes up.
I admit, I do have "hidden agendas" where it may come up:
* PyTorch breaks rule 2 in their copy of `__array_function__`, because
doing so makes it easy to optimize away some overhead.
* `__array_ufunc__` breaks rule 1 (it allows too much) and I think we
should be OK to change that [1].
* A few `__array_function__`s break rule 3. That is completely
harmless, but it might be nice to just be clear whether we consider
it technically wrong. [2]
Rules
-----
1. If an argument is invalid in NumPy it is considered an error.
For example:
np.log(arr, my_weird_argument=True)
is always an error even if the `__array_function__` implementation
of `arr` would support it.
NEP 18 explicitly says that allowing forwarding could be done, but
will not be done at this time.
2. Arguments must only be forwarded if they are passed in:
np.mean(cupy_array)
ends up as `cupy.mean(cupy_array)` and not:
cupy.mean(cupy_array, axis=None, dtype=None, out=None,
keepdims=False, where=True)
meaning that CuPy does not need to implement all of those kwargs and
NumPy can add new ones without breaking anyone's code.
3. NumPy should not check the *validity* of the arguments. For example:
`np.add.reduce(xarray, axis="long")` should probably work in xarray.
(`xarray.DataArray` does not actually implement the above.)
But a string cannot be used as an axis in NumPy.
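To make the three rules concrete, here is a minimal sketch, assuming a
NumPy version where `__array_function__` dispatch is enabled (the default
since 1.17). `Recorder` is a hypothetical class made up for illustration;
it only records what NumPy hands to it. I use `np.mean` for all three
rules, so rule 3 is shown through `__array_function__` rather than the
`np.add.reduce` example above.

```python
import numpy as np


class Recorder:
    """Hypothetical duck array that records what NumPy forwards to it."""

    def __init__(self):
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # Store the arguments exactly as NumPy passed them on.
        self.calls.append((func, args, dict(kwargs)))
        return "handled"


rec = Recorder()

# Rule 2: only arguments the caller actually passed are forwarded.
np.mean(rec)
rule2_kwargs = rec.calls[-1][2]
assert rule2_kwargs == {}  # no axis=None, dtype=None, ... filled in

# Rule 3: NumPy does not validate argument values before dispatching.
np.mean(rec, axis="long")
rule3_kwargs = rec.calls[-1][2]
assert rule3_kwargs == {"axis": "long"}

# Rule 1: an argument that NumPy itself does not know is an error,
# and the override is never called.
try:
    np.mean(rec, my_weird_argument=True)
except TypeError:
    rule1_rejected = True
else:
    rule1_rejected = False
assert rule1_rejected
assert len(rec.calls) == 2
```

An implementation that fills in defaults before dispatching would break
the first assertion, which is the situation rule 2 is meant to exclude.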
Cheers,
Sebastian
[1] I think `dask` breaks this rule by using an `output_dtypes`
keyword. I would just consider this a valid exception and keep allowing
it. In fact, `output_dtypes` may very well be a useful argument for
NumPy itself. `dtype` looks like it serves that purpose, but it does
not have quite the same meaning.
This has been discussed also here:
https://github.com/numpy/numpy/issues/8892
[2] This is done for performance reasons, although it is entirely
avoidable. However, avoiding it might just add a bunch of annoying code
unless part of a larger maintenance effort.