[Numpy-discussion] Add an axis argument to generalized ufuncs?
shoyer at gmail.com
Sat Oct 18 00:56:01 EDT 2014
Yesterday I created a GitHub issue proposing adding an axis argument to
I was told I should repost this on the mailing list, so here's the recap:
I would like to write generalized ufuncs (probably using numba), to create
fast functions such as nanmean (signature '(n)->()') or rolling_mean
(signature '(n),()->(n)') that take the axis along which to aggregate as a
keyword argument, e.g., nanmean(x, axis=0) or rolling_mean(x, window=5,
Of course, I could write my own wrapper for this that reorders dimensions
using swapaxes or transpose. But I also think that an "axis" argument to
allow for specifying the core dimensions of gufuncs would be more generally
useful, and we should consider adding it to numpy.
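For reference, a minimal sketch of the kind of wrapper described above. The names nanmean_last and with_axis are hypothetical, np.nanmean stands in for a compiled gufunc with signature '(n)->()', and np.moveaxis is used in place of swapaxes/transpose:

```python
import numpy as np


def nanmean_last(x):
    # Stand-in for a gufunc with signature '(n)->()':
    # it always reduces over the last axis.
    return np.nanmean(x, axis=-1)


def with_axis(core_func):
    """Wrap a last-axis reducer so that it accepts an `axis` keyword."""
    def wrapper(x, axis=-1):
        x = np.asarray(x)
        # Move the requested axis to the end, where the core dimension lives.
        return core_func(np.moveaxis(x, axis, -1))
    return wrapper


nanmean = with_axis(nanmean_last)

x = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan]])
result = nanmean(x, axis=0)
```

This works, but every gufunc author has to repeat it, which is part of the motivation for handling the axis argument in numpy itself.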
Nathaniel and Jaime added some good points, noting that such an axis
argument should cleanly handle multiple input and output arguments and have
a plan for handling optional dimensions (e.g., (m?,n),(n,p?)->(m?,p?) for
the new dot).
Here are my initial thoughts on the syntax:
(1) Generally speaking, I think the "nested tuple" syntax (e.g., axis=[(0,
1), (2, 3)]) would be most congruous with the axis arguments numpy already
(2) For gufuncs with simpler signatures, we should support supplying an
integer or an unnested tuple, e.g.,
- axis=0 for (n)->()
- axis=(0, 1) for (n),(m)->() or (n,m)->()
- axis=[(0, 1), 2] for (n,m),(o)->().
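To make the shorthand forms concrete, here is a rough sketch of how they could be expanded into one tuple of axes per input. The helper normalize_axis and its core_ndims parameter are hypothetical, not an existing numpy API, and treating a tuple as the unnested form and a list as the nested form is an assumption of this sketch:

```python
from itertools import islice


def normalize_axis(axis, core_ndims):
    """Expand a shorthand `axis` into one tuple of axes per input.

    core_ndims gives the number of core dimensions of each input,
    e.g. [2, 1] for the signature '(n,m),(o)->()'.
    """
    if isinstance(axis, int):
        axis = (axis,)
    if isinstance(axis, tuple):
        # Unnested tuple: hand the axes out to the inputs in order.
        if len(axis) != sum(core_ndims):
            raise ValueError("axis does not match the core dimensions")
        it = iter(axis)
        return [tuple(islice(it, n)) for n in core_ndims]
    # Nested form: one entry per input, each an int or a tuple of ints.
    if len(axis) != len(core_ndims):
        raise ValueError("need one axis entry per input argument")
    result = []
    for entry, n in zip(axis, core_ndims):
        entry = (entry,) if isinstance(entry, int) else tuple(entry)
        if len(entry) != n:
            raise ValueError("axis entry does not match the core dimensions")
        result.append(entry)
    return result


single = normalize_axis(0, [1])                # (n)->()
two_inputs = normalize_axis((0, 1), [1, 1])    # (n),(m)->()
one_input = normalize_axis((0, 1), [2])        # (n,m)->()
nested = normalize_axis([(0, 1), 2], [2, 1])   # (n,m),(o)->()
```

Note that axis=(0, 1) only becomes unambiguous once the signature is known, which is why the sketch needs core_ndims.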
(3) If we require a full axis specification for core dimensions, we could
use the axis argument for unambiguous control of optional core dimensions:
e.g., axis=(0, 1) would indicate that you want the "vectorized inner
product" version of the new dot operator, rather than matrix
multiplication, and axis=[(-2, -1), -1] would mean that you want the
"vectorized matrix-vector product". This seems relatively tidy, although I
admit I am not convinced that optional core dimensions are necessary.
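As an illustration of the two readings mentioned in (3), the sketch below spells out the "vectorized matrix-vector product" and the "vectorized inner product" with np.einsum, with the core dimensions on the trailing axes. The array shapes are arbitrary examples, and this only shows the intended semantics, not how an axis argument would be implemented:

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)   # a stack of two 3x4 matrices
b = np.arange(8.0).reshape(2, 4)       # a stack of two length-4 vectors

# '(m,n),(n)->(m)': core dims are the last two axes of `a` and the last
# axis of `b`, i.e. what axis=[(-2, -1), -1] would select.
matvec = np.einsum('...mn,...n->...m', a, b)

u = np.arange(6.0).reshape(2, 3)
v = np.ones((2, 3))

# '(n),(n)->()': one core dim per input, here the last axis of each.
inner = np.einsum('...n,...n->...', u, v)
```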
(4) We can either include the output axis as part of the signature, or add
another argument "axis_out" or "out_axis". I think I prefer the separate
argument, particularly if we require "axis" to specify all core dimensions,
which may be a good idea even if we don't use "axis" for controlling
optional core dimensions.