Two of the oldest issues in the tracker (#834 and #835) are about how .reduceat() handles its indices parameter. I have been taking a look at the source code, and the behavior would be relatively easy to modify; the hardest part is figuring out exactly what it should be.

Current behavior is that np.ufunc.reduceat(x, ind) returns [np.ufunc.reduce(x[ind[i]:ind[i+1]]) for i in range(len(ind))] with a couple of caveats:
  1. if ind[i] >= ind[i+1], then x[ind[i]] is returned, rather than a reduction over an empty slice.
  2. an index of len(x) is appended to the indices argument, to be used as the endpoint of the last slice to reduce over.
  3. aside from this last case, the indices are required to be strictly in bounds, 0 <= index < len(x), or an error is raised.
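
To make the three caveats concrete, here is a small sketch that runs the real np.add.reduceat next to a pure-Python emulation. The helper emulate_current is my own illustration, not anything in NumPy, and it only handles 1-D input with in-bounds, non-negative indices:

```python
import numpy as np

x = np.arange(8)
ind = [0, 4, 1, 5, 2, 6, 3, 7]

# What NumPy does today.
current = np.add.reduceat(x, ind)


def emulate_current(ufunc, x, ind):
    """Illustrative helper (not a NumPy API): mimic today's 1-D reduceat."""
    # Caveat 2: len(x) is appended as the endpoint of the last slice.
    ends = list(ind[1:]) + [len(x)]
    out = []
    for lo, hi in zip(ind, ends):
        if lo >= hi:
            # Caveat 1: a single element is returned, not the identity.
            out.append(x[lo])
        else:
            out.append(ufunc.reduce(x[lo:hi]))
    return np.array(out)


emulated = emulate_current(np.add, x, ind)

# Caveat 3: an explicit index equal to len(x) is rejected.
try:
    np.add.reduceat(x, [0, len(x)])
    oob_error = None
except IndexError as exc:
    oob_error = exc

print(current)    # [ 6  4 10  5 14  6 18  7]
print(emulated)   # same values
print(oob_error)
```

Every second entry (4, 5, 6) comes from caveat 1, and the trailing 7 comes from the implicitly appended len(x).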
The proposed new behavior, with some optional behaviors, would be:
  1. if ind[i] >= ind[i+1], then a reduction over an empty slice, i.e. the ufunc identity, is returned. This includes raising an error if the ufunc does not have an identity, e.g. np.minimum.
  2. to fully support the "reduction over slices" idea, some form of out-of-bounds indices should be allowed. This could mean one of:
    1. only index = len(x) is allowed without raising an error, to allow computing the full reduction anywhere, not just as the last entry of the return, or
    2. allow any index in -len(x) <= index <= len(x), with the usual meaning given to negative values, or
    3. any index is allowed, with reduction results clipped to existing values (and the usual meaning for negative values).
  3. Regarding the appending of that last index of len(x) to indices, we could:
    1. keep appending it, or
    2. never append it, since you can now request it without an error being raised, or
    3. only append it if the last index is smaller than len(x).
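
As a reference for discussion, the combination of 1, 2.2 and 3.1 could be sketched in pure Python roughly as below. The function name reduceat_proposed and its append_end flag are hypothetical, made up only for this illustration (append_end=False would correspond to 3.2), and the sketch only covers 1-D input:

```python
import numpy as np


def reduceat_proposed(ufunc, x, ind, append_end=True):
    """Hypothetical sketch of options 1 + 2.2 (+ 3.1 when append_end=True)."""
    x = np.asarray(x)
    n = len(x)
    bounds = []
    for i in ind:
        # Option 2.2: accept -len(x) <= index <= len(x).
        if not -n <= i <= n:
            raise IndexError(f"index {i} out of bounds for length {n}")
        bounds.append(i + n if i < 0 else i)
    if append_end:
        # Option 3.1: keep appending len(x) as the final endpoint.
        bounds.append(n)
    # Option 1: x[lo:hi] is empty when lo >= hi, so ufunc.reduce returns the
    # identity, or raises ValueError for ufuncs without one (e.g. np.minimum).
    return np.array(
        [ufunc.reduce(x[lo:hi]) for lo, hi in zip(bounds[:-1], bounds[1:])]
    )


x = np.arange(8)
swapped = reduceat_proposed(np.add, x, [0, 4, 1, 5, 2, 6, 3, 7])
full = reduceat_proposed(np.add, x, [0, 8], append_end=False)
tail = reduceat_proposed(np.add, x, [-3])

try:
    reduceat_proposed(np.minimum, x, [4, 1])
    no_identity_error = None
except ValueError as exc:
    no_identity_error = exc

print(swapped)  # [ 6  0 10  0 14  0 18  7]
print(full)     # [28]
print(tail)     # [18]
```

Compared with today's output for the same indices, the entries where ind[i] >= ind[i+1] become 0 (the identity of np.add) instead of single elements, which is exactly the change that would need a deprecation path.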
My thoughts on the options:
  • The minimal, more conservative approach would go with 2.1 and 3.1. And of course 1: if we don't implement that, none of this makes sense.
  • I kind of think 2.2 or even 2.3 are a nice enhancement that shouldn't break too much stuff.
  • 3.2 I'm not sure about, probably hurts more than it helps at this point, although in a brand new design you probably would either not append the last index or also prepend a zero, as in np.split.
  • And 3.3 seems too magical, probably not a good idea, only listed it for completeness.
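
For reference on the np.split comparison in the 3.2 bullet, a quick illustration of how the two index conventions line up today. This uses only existing NumPy calls; the variable names are just for the example:

```python
import numpy as np

x = np.arange(8)

# np.split takes only the interior cut points: the leading 0 and the
# trailing len(x) are both implied.
pieces = np.split(x, [3, 5])
split_sums = [int(p.sum()) for p in pieces]

# reduceat needs the leading 0 spelled out; only len(x) is implied.
reduceat_sums = np.add.reduceat(x, [0, 3, 5]).tolist()

print(split_sums)     # [3, 7, 18]
print(reduceat_sums)  # [3, 7, 18]
```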
Any other thoughts or votes on what, if anything, we should implement, and what the deprecation of current behavior should look like?

Jaime

--
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him with his plans for world domination.