This PR intends to address the limitation that the weights supplied as an optional argument in np.average()
can only be the same shape as the input array, or one-dimensional. With these changes, weights that have the same shape as the input array over the specified axis
can be supplied. The PR also makes some minor improvements to the sanity checks on the weights. We have implemented the extension for both np.average()
and np.ma.average()
, and added tests for the new functionality (all preexisting tests pass).
Some functionalities of the numpy.average()
function can be improved, and the treatment of weights in particular.
Currently, np.average()
is restricted to weights that are either
This restriction is not necessary and is disadvantageous when averaging over subdimensions the input array via supplying the axis
argument.
If it's possible to average over subdimensions of the input array via axis
, then it is useful to be able to supply
weights in accordance, checking that they have the required dimensionality and shape.
Consider, for example,
y = np.arange(12).reshape(2, 2, 3) w = np.array([0., 0., 1., .5, .5, 0., 0., .5, .5, 1., 0., 0.]).reshape(2, 2, 3)
y
and the weights w
have shape (2, 2, 3)
.If I want to average over the whole array, that is OK:
np.average(y, weights=w) 5.5
axis=0
):np.average(y, axis=0 weights=w[:, 0, 1]) array([[ 6., 7., 8.], [ 9., 10., 11.]])
y
that are not one-dimensional,and supply weights with the same shape as the subdimensions, that is not possible:
np.average(y, axis=(0, 1) weights=w[:, :, 1]) TypeError: 1D weights expected when shapes of a and weights differ.
shape of y
and get around this limitation, however, depending on the shape ofa
in the dimension that is not being averaged over, that might be very
expensive in terms of memory and perhaps prohibitively so (it was for me in one case).
This PR addresses this limitation.
Moreover, some of the checks on the weights could be improved,
for example with the previous implementation if I supplied 1d weights but a tuple for the axis
:
np.average(y, axis=(0, 1), weights=w[0, 0, :]) TypeError: tuple indices must be integers or slices, not tuple
This PR improves that as well.
The main idea was to generalise the function np.average()
, but I ended up doing the same for np.ma.average()
too, for completeness, even though I have relatively little experience with masked arrays.
The implementations have been tested in lib/tests/test_function_base.py
andma/tests/test_extras.py
, respectively. The two sets of tests mirror one another. All preexisting tests pass.
I'd appreciate if people could kindly take a look at this PR and let know any
feedback they have.
Thank you
James