Jaime brought up the same issue recently, along with some other issues for ufunc.reduceat:
https://mail.scipy.org/pipermail/numpy-discussion/2016-March/075199.html

I completely agree with both of you that the current behavior for empty slices is very strange and should be changed to remove the special case. Nathaniel Smith voiced the same opinion on the GitHub issue [1].

I think the path forward here (as Nathaniel writes) is pretty clear:
1. Start issuing a FutureWarning about a future change (see the sketch below).
2. Fix this in a release or two.
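For reference, a minimal sketch of what step 1 might look like; the exact message and its placement inside reduceat are hypothetical:

import warnings

# hypothetical: emitted by reduceat when it encounters an empty slice,
# i.e. indices[i] >= indices[i+1]
warnings.warn(
    "reduceat's behavior for empty slices will change in a future "
    "release: it will return ufunc.identity instead of a[indices[i]]",
    FutureWarning,
    stacklevel=2,
)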

[1] https://github.com/numpy/numpy/issues/834

On Fri, Jul 29, 2016 at 11:42 AM, Erik Brinkman <erik.brinkman@umich.edu> wrote:
Hi,

The behavior of a ufunc's reduceat on empty slices seems a little strange, and I wonder if there's a reason behind it / if there's a route to potentially changing it. First, I'll go into a little background.

I've been making a lot of use of ufuncs' reduceat functionality on staggered arrays. In general, I'll have "n" arrays, the i-th with size "s[i]", and I'll store them in one array "x", such that "s.sum() == x.size". reduceat is great because I use

ufunc.reduceat(x, np.insert(s[:-1].cumsum(), 0, 0))

to get some summary information about each array. However, reduceat seems to behave strangely for empty slices. To make things concrete, let's assume:

import numpy as np
s = np.array([3, 0, 2])
x = np.arange(s.sum())
inds = np.insert(s[:-1].cumsum(), 0, 0)
# [0, 3, 3]
np.add.reduceat(x, inds)
# [3, 3, 7] not [3, 0, 7]
# This is distinct from
np.fromiter(map(np.add.reduce, np.array_split(x, inds[1:])), x.dtype, s.size)
# [3, 0, 7] what I wanted

The current documentation on reduceat first states:

For i in range(len(indices)), reduceat computes ufunc.reduce(a[indices[i]:indices[i+1]])
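Taken literally (with the final slice running to the end of a), that rule gives exactly what I wanted for the example above:

[np.add.reduce(x[i:j]) for i, j in zip(inds, list(inds[1:]) + [len(x)])]
# [3, 0, 7]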

That would suggest the outcome I expected. However, the examples section then goes through a bunch of examples that contradict that statement and instead suggest that the actual algorithm is more akin to:

ufunc.reduce(a[indices[i]:indices[i+1]]) if indices[i+1] > indices[i] else a[indices[i]]
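To spell that out, here's a pure-Python model (mine, not the actual C implementation) that reproduces the observed results for 1-d input:

def reduceat_current(ufunc, a, indices):
    # model of the current semantics
    ends = list(indices[1:]) + [len(a)]
    out = []
    for start, end in zip(indices, ends):
        if end > start:
            out.append(ufunc.reduce(a[start:end]))
        else:
            # empty slice: a[indices[i]] is copied through unreduced
            out.append(a[start])
    return np.array(out)

reduceat_current(np.add, x, inds)
# [3, 3, 7], matching np.add.reduceat(x, inds)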

Looking at the source, it seems like it's copying a[indices[i]], and then, while there are more elements to process, it keeps reducing, resulting in this unexpected behavior. It seems like the proper thing to do would be to start with ufunc.identity and then reduce. This is slightly less performant than what's implemented, but more "correct." There could, of course, just be a switch to copy the identity only when the slice is empty.
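Under that fix, the empty-slice branch would produce the identity instead; a sketch, reusing the model above:

def reduceat_fixed(ufunc, a, indices):
    ends = list(indices[1:]) + [len(a)]
    return np.array([
        ufunc.reduce(a[start:end]) if end > start else ufunc.identity
        for start, end in zip(indices, ends)
    ])

reduceat_fixed(np.add, x, inds)
# [3, 0, 7]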

Is there a reason it's implemented like this? Is it just for performance, or is this strange behavior useful somewhere? It seems like "fixing" this would be risky because it changes somewhat-documented functionality in a backwards-incompatible way. What would the best approach to "fixing" this be? Add another function "reduceat_"? Add a flag to reduceat to do the proper thing for empty slices?

Finally, is there a good way to work around this? I think for now I'm just going to mask out the empty slices and add them back in afterwards (a sketch is below), but if I'm missing an obvious solution, I'll look at that too. I need to mask them out because np.add.reduceat(x, [5]) would ideally return 0 (an empty slice starting at the end of x), but instead it throws an error since 5 is out of range...
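Concretely, the workaround I have in mind looks something like this (a sketch using a boolean mask and assignment rather than np.insert; the 0 fill is np.add.identity):

nonempty = s > 0
out = np.zeros(s.size, x.dtype)  # identity fill for the empty slices
out[nonempty] = np.add.reduceat(x, inds[nonempty])
# out == [3, 0, 7]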

Thanks for indulging my curiosity,
Erik

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion