[Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs

Marten van Kerkwijk m.h.vankerkwijk at gmail.com
Fri Jun 1 17:41:18 EDT 2018

Hi Nathaniel,

On Matt's prompting, I added release notes to the frozen/flexible PR [1];
see text attached below.

Having done that, I felt the examples actually justified the frozen
dimensions quite well. Given that you're the one who expressed most doubts
about them, could you have a look? Ideally, I'd avoid having to write a NEP
for this, and the examples do seem to make it quite obvious that this
change to the signature is the way to go, as its meaning is dead obvious.
And the implementation is super-straightforward...

For the broadcasted core dimensions, I do agree the case is less strong and
the meaning perhaps less obvious (though the implementation is relatively
simple), and
I think a short NEP may be called for (unless others on the list have
super-convincing use cases...). I will add here, though, that even if we
implement `all_equal` as a method on `equal`, it would still be useful to
have a signature that can actually describe it.

-- Marten

[1] https://github.com/numpy/numpy/pull/11175/files

Generalized ufunc signatures now allow fixed-size dimensions

By using a numerical value in the signature of a generalized ufunc, one can
indicate that the given function requires input or output to have dimensions
with the given size. E.g., the signature of a function that converts a polar
angle to a two-dimensional cartesian unit vector would be ``()->(2)``; that
for one that converts two spherical angles to a three-dimensional unit
vector would be ``(),()->(3)``; and that for the cross product of two
three-dimensional vectors would be ``(3),(3)->(3)``.
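
As a rough illustration of the shapes these signatures describe, the sketch
below emulates them with plain NumPy calls. It does not define an actual
generalized ufunc; the helper name ``polar_to_unit_vector`` is hypothetical and
only mimics what a function with signature ``()->(2)`` would return, and
``np.cross`` is used merely because its shape behaviour on length-3 vectors
matches ``(3),(3)->(3)``.

```python
import numpy as np


def polar_to_unit_vector(phi):
    """Hypothetical helper: mimics the shape behaviour of signature ()->(2)."""
    phi = np.asarray(phi, dtype=float)
    # Loop dimensions are kept; a core dimension of fixed size 2 is appended.
    return np.stack([np.cos(phi), np.sin(phi)], axis=-1)


angles = np.linspace(0.0, np.pi, 4)        # loop shape (4,), no core dimension
vectors = polar_to_unit_vector(angles)     # loop shape (4,) + core shape (2,)
print(vectors.shape)

# Cross product of length-3 vectors: the pattern described by (3),(3)->(3).
a = np.arange(15.0).reshape(5, 3)          # five vectors of fixed size 3
b = np.array([0.0, 0.0, 1.0])              # one vector, broadcast over the loop
crossed = np.cross(a, b)
print(crossed.shape)
```

With a fixed size in the signature, an input whose last dimension is not 3
could be rejected up front instead of inside the inner loop.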

Note that to the elementary function these dimensions are not treated any
differently from variable ones indicated with a letter; the loop still is
passed the corresponding size, but it can now count on that being equal to
the fixed size given in the signature.

Generalized ufunc signatures now allow flexible dimensions

Some functions, in particular numpy's implementation of ``@`` as ``matmul``,
are very similar to generalized ufuncs in that they operate over core
dimensions, but one could not present them as such because they were able to
deal with inputs in which a dimension is missing. To support this, it is now
allowed to postfix a dimension name with a question mark to indicate that
the dimension does not necessarily have to be present.

With this addition, the signature for ``matmul`` can be expressed as
``(m?,n),(n,p?)->(m?,p?)``.  This indicates that if, e.g., the second operand
has only one dimension, for the purposes of the elementary function it will be
treated as if that input has core shape ``(n, 1)``, and the output has the
corresponding core shape of ``(m, 1)``. The actual output array, however, has
the flexible dimension removed, i.e., it will have shape ``(..., m)``.
Similarly, if both arguments have only a single dimension, the inputs will be
presented as having shapes ``(1, n)`` and ``(n, 1)`` to the elementary
function, and the output as ``(1, 1)``, while the actual output array
will have shape ``()``. In this way, the signature allows one to use a
single elementary function for four related but different signatures,
``(m,n),(n,p)->(m,p)``, ``(n),(n,p)->(p)``, ``(m,n),(n)->(m)`` and
``(n),(n)->()``.
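
The output shapes for the four cases can be checked directly with
``np.matmul``. This is only a sketch of the observable behaviour, with
arbitrarily chosen sizes for ``m``, ``n`` and ``p``; the ``signature``
attribute is read defensively because it is assumed to exist only in NumPy
versions where ``matmul`` is implemented as a generalized ufunc.

```python
import numpy as np

m, n, p = 2, 3, 4              # arbitrary illustrative sizes
A = np.ones((m, n))
B = np.ones((n, p))
v = np.ones(n)                 # operand with the flexible dimension missing

shapes = {
    "(m,n),(n,p)->(m,p)": np.matmul(A, B).shape,
    "(n),(n,p)->(p)": np.matmul(v, B).shape,
    "(m,n),(n)->(m)": np.matmul(A, v).shape,
    "(n),(n)->()": np.matmul(v, v).shape,
}
for signature, shape in shapes.items():
    print(signature, shape)

# Loop dimensions still broadcast as usual in front of the core dimensions.
stacked = np.ones((5, m, n))
batched = np.matmul(stacked, v)    # loop shape (5,) + core shape (m,)
print(batched.shape)

# Assumption: only available where matmul is a generalized ufunc.
print(getattr(np.matmul, "signature", None))
```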