[Numpy-discussion] Rules for argument parsing/forwarding in __array_function__ and __array_ufunc__

Sebastian Berg sebastian at sipsolutions.net
Fri Dec 4 16:51:29 EST 2020


On Wed, 2020-12-02 at 21:07 -0800, Stefan van der Walt wrote:
> Hi Sebastian,
> 
> Looking at these three rules, they all seem to stem from one simple
> question: do we desire for a single code snippet to be runnable on
> multiple array implementations?
> 
> On Wed, Dec 2, 2020, at 15:34, Sebastian Berg wrote:
> > 1. If an argument is invalid in NumPy it is considered an error.
> >    For example:
> > 
> >        np.log(arr, my_weird_argument=True)
> > 
> >    is always an error even if the `__array_function__`
> > implementation
> >    of `arr` would support it.
> >    NEP 18 explicitly says that allowing forwarding could be done,
> > but
> >    will not be done at this time.
> 
> Relaxing this rule will mean that code working for one array
> implementation (which has this keyword) may not work for another.


Indeed, while NEP 18 mentions it, I personally don't see why we should
relax it. (The NEP 13 implementation does so, but this is an
unintentional, and not optimal, implementation detail.)
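
To make rule 1 concrete, here is a toy sketch. It is not NumPy's
actual dispatch code; `Duck` and this standalone `log` are made up for
illustration. Because the public function has a fixed signature,
Python itself rejects the unknown keyword before any dispatching code
runs, whatever the array type would have accepted:

```python
class Duck:
    """Hypothetical array-like whose own log() accepts an extra keyword."""

    def log(self, my_weird_argument=False):
        return "Duck.log(my_weird_argument=%r)" % my_weird_argument


def log(x, out=None):
    """Toy stand-in for a public function with a fixed signature."""
    if isinstance(x, Duck):
        # Only reached for calls that already match the public signature.
        return x.log()
    raise NotImplementedError("toy example only handles Duck")


def try_weird_call():
    """Report what happens when an unknown keyword is passed."""
    try:
        log(Duck(), my_weird_argument=True)
    except TypeError:
        return "TypeError"
    return "forwarded"
```

Calling `try_weird_call()` reports a `TypeError`, even though
`Duck.log` would have been happy to take the keyword.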


> 
> > 2. Arguments must only be forwarded if they are passed in:
> > 
> >        np.mean(cupy_array)
> > 
> >    ends up as `cupy.mean(cupy_array)` and not:
> > 
> >        cupy.mean(cupy_array, axis=None, dtype=None, out=None,
> >                  keepdims=False, where=True)
> > 
> >    meaning that CuPy does not need to implement all of those kwargs
> > and
> > >    NumPy can add new ones without breaking anyone's code.
> 
> This may ultimately make it harder for array implementors (they will
> only see errors once someone tries to pass in an argument that they
> forgot to implement).  Perhaps better to pass all so they know what
> they're dealing with?

True, we do this for `np.mean(obj)`, etc. which end up calling
`obj.mean()`, but compared to protocols which explicitly ask for NumPy
compatibility, those method forwards are not as clearly defined.
So maybe we should actually pass on everything (including the default
values?). That would actually be safer if we ever update a default.

The downside would remain that a newer NumPy is likely to cause a break
until the project updates (e.g. if we add a keyword argument).
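
A small sketch of the trade-off, under stated assumptions: the
sentinel `_NO_VALUE`, `third_party_mean`, and both wrapper functions
are hypothetical names for this example, not NumPy or CuPy API. The
third-party function deliberately lacks `keepdims`, standing in for a
keyword that NumPy added after the project last updated:

```python
_NO_VALUE = object()  # hypothetical sentinel: "caller did not pass this"


def third_party_mean(a, axis=None):
    """Hypothetical implementation that only knows about `axis`."""
    return ("third_party_mean", axis)


def mean_forward_passed(a, axis=_NO_VALUE, keepdims=_NO_VALUE):
    """Forward only the keyword arguments the caller actually supplied."""
    candidates = {"axis": axis, "keepdims": keepdims}
    kwargs = {k: v for k, v in candidates.items() if v is not _NO_VALUE}
    return third_party_mean(a, **kwargs)


def mean_forward_all(a, axis=None, keepdims=False):
    """Forward every argument, including untouched defaults."""
    return third_party_mean(a, axis=axis, keepdims=keepdims)


def outcome(func, *args, **kwargs):
    """Return the call's result, or the exception class name if it raises."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__
```

Here `mean_forward_passed([1, 2])` works, while
`outcome(mean_forward_all, [1, 2])` gives `"TypeError"`: forwarding
everything surfaces the missing keyword immediately, at the price of
breaking callers who never used it.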

If we were open to this (plus an insignificant change in subclass
handling), it would be easy to at least halve the overhead of Python
`__array_function__` dispatching.
That is because it would allow us to inline (in Python):

    def function(arg1, arg2, kwarg1=None):
        dispatched = dispatch((arg1,), arg1, arg2, kwarg1=kwarg1)
        if dispatched is not NotImplemented:
            return dispatched

        # normal code here (some argument validation could come first)


This may look strange, but it only has to go through 1-2 function
calls, where currently we go through 4.
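
For anyone who wants to play with the idea, here is a runnable toy
version of the inlined pattern. It is only a sketch: this `dispatch`
helper is hypothetical (it additionally takes the public function so
that it can be handed to the override), and unlike NumPy it simply
falls back to the default code when every override declines. The
`__array_function__(self, func, types, args, kwargs)` signature is the
one from NEP 18; `MyArray` is made up:

```python
def dispatch(relevant_args, public_func, *args, **kwargs):
    """Hypothetical helper: let the first overriding argument handle the call.

    Returns NotImplemented if no relevant argument defines
    __array_function__, or if all of them decline.
    """
    for arg in relevant_args:
        override = getattr(type(arg), "__array_function__", None)
        if override is None:
            continue
        result = override(arg, public_func, (type(arg),), args, kwargs)
        if result is not NotImplemented:
            return result
    return NotImplemented


def function(arg1, arg2, kwarg1=None):
    # Inlined dispatch: a single extra call on the fast path.
    dispatched = dispatch((arg1,), function, arg1, arg2, kwarg1=kwarg1)
    if dispatched is not NotImplemented:
        return dispatched

    # "Normal code": the default implementation.
    return ("default", arg1, arg2, kwarg1)


class MyArray:
    """Hypothetical array type that overrides every function."""

    def __array_function__(self, func, types, args, kwargs):
        return ("MyArray override", func.__name__, kwargs)
```

Note that `function(MyArray(), 2)` hands the override
`{"kwarg1": None}`: once the dispatch is inlined like this, the
override always sees every keyword, default or not, which is why this
only works if we are open to passing on everything.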

The other change would also allow us to remove *all* overhead for
functions defined in C.


> 
> > 3. NumPy should not check the *validity* of the arguments. For
> > example:
> >    `np.add.reduce(xarray, axis="long")` should probably work in
> > xarray.
> >    (`xarray.DataArray` does not actually implement the above.)
> >    But a string cannot be used as an axis in NumPy.
> 
> Getting back to the original question: if this code is to be run on
> multiple implementations, we should ensure that no strange values
> pass through.
> 
> Personally, I like the idea of a single API that works on multiple
> backends.  As such, I would 1) not pass through unknown arguments, 2)
> always pass through all arguments, and 3) validate inputs to each
> call.


Thanks for the input!  I think point 2) is in a sense the most
interesting, because the approach `pytorch` takes to remove the
overhead of array-function gets very complicated without it.

In the end, whether we validate arguments should maybe be considered
an implementation detail... I.e., if there is a good reason why
validating is a problem, we can stop doing it; otherwise there is no
need to worry about it. (Although for ufuncs, I would go the
non-validating route for now personally.)
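
As a toy illustration of the two routes (all names here are
hypothetical; this is neither the ufunc machinery nor xarray's API), a
validating front end rejects a string axis before the array type ever
sees it, while a non-validating one leaves the decision to the
implementation:

```python
class LabeledArray:
    """Hypothetical array type that accepts dimension names as `axis`."""

    def reduce_add(self, axis):
        return ("LabeledArray.reduce_add", axis)


def reduce_validating(x, axis=0):
    """Check the argument against NumPy-style rules before forwarding."""
    if not isinstance(axis, int):
        raise TypeError("axis must be an integer")
    return x.reduce_add(axis)


def reduce_non_validating(x, axis=0):
    """Forward the argument untouched; the implementation decides."""
    return x.reduce_add(axis)


def outcome(func, *args, **kwargs):
    """Return the call's result, or the exception class name if it raises."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__
```

With `axis="long"`, `outcome(reduce_validating, LabeledArray(),
axis="long")` gives `"TypeError"`, whereas `reduce_non_validating`
passes the name through to `LabeledArray`.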

Cheers,

Sebastian


> 
> Best regards,
> Stéfan
