[Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

Stephan Hoyer shoyer at gmail.com
Wed Jun 27 00:48:40 EDT 2018


On Tue, Jun 26, 2018 at 12:46 AM Robert Kern <robert.kern at gmail.com> wrote:

> I think having more self-contained descriptions of the semantics of each
> of these would be a good idea. The current description of `.vindex` spends
> more time talking about what it doesn't do, compared to the other methods,
> than what it does.
>

Will do.


> I'm still leaning towards not warning on current, unproblematic common
> uses. It's unnecessary churn for currently working, understandable code. I
> would still reserve warnings and deprecation for the cases where the
> current behavior gives us something that no one wants. Those are the real
> traps that people need to be warned away from.
>
> If someone is mixing slices and integer indices, that's a really good sign
> that they thought indexing behaved in a different way (e.g. orthogonal
> indexing).
>

I agree, but I'm still not entirely sure where to draw the line on
behavior that should issue a warning. Some options, in roughly descending
order of severity:
1. Warn if [] would give a different result than .oindex[]. This is the
current proposal in the NEP, but based on the feedback we should hold back
on it for now.
2. Warn if there is a mixture of arrays/slice objects in indices for [],
even implicitly (e.g., including arr[idx] when it is equivalent to arr[idx,
:]). In this case, the indexed dimensions end up in the same place for both
legacy_index and vindex, but arguably that is only a happy coincidence.
3. Warn if [] would give a different result from .vindex[]. This is a
little weaker than the previous condition, because arr[idx, :] or arr[idx,
...] would not give a warning. However, cases like arr[..., idx] or arr[:,
idx, :] would still start to give warnings, even though they are arguably
well defined according to either outer indexing (if idx.ndim == 1) or
legacy indexing (due to dimension reordering rules that will be omitted
from vindex).
4. Warn if there are multiple arrays/integer indices separated by a slice
object, e.g., arr[idx1, :, idx2]. This is the edge case that really trips
up users.
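
To make these cases concrete, here is a small sketch of what current (legacy)
indexing returns for each pattern. The array shape and index values are
arbitrary choices for illustration, and the "outer" result is only emulated
with np.ix_, since .oindex[] does not exist in NumPy today.

```python
import numpy as np

arr = np.zeros((3, 4, 5))
idx = np.array([0, 1])
idx1 = np.array([0, 1])
idx2 = np.array([2, 3])

# A single array mixed with slices: the indexed axis stays where it was.
shape_first = arr[idx, :].shape       # (2, 4, 5)
shape_middle = arr[:, idx, :].shape   # (3, 2, 5)
shape_last = arr[..., idx].shape      # (3, 4, 2)

# Adjacent arrays are broadcast together and replace their axes in place.
shape_adjacent = arr[:, idx1, idx2].shape   # (3, 2)

# Arrays (or an integer and an array) separated by a slice: the broadcast
# dimension is moved to the front of the result. This is the case in option 4.
shape_separated = arr[idx1, :, idx2].shape  # (2, 4)
shape_int_sep = arr[0, :, idx2].shape       # (2, 4), not (4, 2)

# Outer-style selection, emulated here with np.ix_ (an assumption standing in
# for the proposed .oindex[]), keeps one axis per index instead.
shape_outer = arr[np.ix_(idx1, np.arange(4), idx2)].shape  # (2, 4, 2)
```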

As I said in my other response, in the long term, I would prefer to either
(a) drop support for vectorized indexing in [] or (b) if we stick with
supporting vectorized indexing in [], at least ensure consistent dimension
ordering rules for [] and vindex[]. That would suggest using either my
proposed rule 2 or 3.

I also agree with you that anyone mixing slices and integers probably is
confused about how indexing works, at least in edge cases. But given the
lengths that legacy indexing goes to in order to support "outer indexing-like"
behavior in the common case of a single integer array and many slices, I am
hesitant to start warning in this case. The result of arr[..., idx, :] is
relatively easy to understand, even though it uses its own set of rules,
which happen to be more consistent with oindex[] than vindex[].
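
As a rough illustration of that last point, the sketch below compares
arr[..., idx, :] against an outer-style selection and against a vindex-style
result. Neither .oindex[] nor .vindex[] exists in NumPy today, so both are
emulated here: np.take stands in for outer indexing, and np.moveaxis stands in
for the NEP's rule that vindex puts the indexed dimension first. Treat both
stand-ins as assumptions, not as the proposed implementations.

```python
import numpy as np

arr = np.arange(3 * 4 * 5).reshape(3, 4, 5)
idx = np.array([3, 0])

# Legacy indexing: the indexed axis stays in position 1.
legacy = arr[..., idx, :]

# Outer-style selection along axis 1 (stand-in for arr.oindex[..., idx, :]).
outer_like = np.take(arr, idx, axis=1)

# vindex-style result (stand-in): same values, indexed dimension moved first.
vindex_like = np.moveaxis(legacy, 1, 0)

same_as_outer = np.array_equal(legacy, outer_like)      # True
same_shape_as_vindex = legacy.shape == vindex_like.shape  # False
```

Here legacy and outer_like both have shape (3, 2, 5), while vindex_like has
shape (2, 3, 5).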

We certainly could make the conservative choice of only adopting 4 for now
and leaving further cleanup for later. I guess this uncertainty about
whether direct indexing should be more like vindex[] or oindex[] in the
long term is a good argument for holding off on other warnings for now. But
I think we are almost certainly going to want to make further
warnings/deprecations of some form.
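
For what it's worth, the condition in option 4 is cheap to detect from the
index tuple alone. The helper below is purely hypothetical (the name
separated_by_slice and its exact rules are my own, not anything in NumPy or
the NEP) and only sketches the idea: at least one array index, and a
non-array, non-integer entry sitting between two array/integer indices.

```python
import numpy as np


def separated_by_slice(key):
    """Hypothetical check for option 4; not part of NumPy."""
    if not isinstance(key, tuple):
        key = (key,)

    def is_array(k):
        return isinstance(k, (list, np.ndarray))

    def is_index(k):
        return is_array(k) or isinstance(k, (int, np.integer))

    # Pure basic indexing (no arrays at all) has no reordering problem.
    if not any(is_array(k) for k in key):
        return False

    positions = [i for i, k in enumerate(key) if is_index(k)]
    if len(positions) < 2:
        return False

    # Anything else (slice, Ellipsis, None) between the first and last
    # array/integer index means they are not adjacent.
    first, last = positions[0], positions[-1]
    return any(not is_index(k) for k in key[first:last + 1])


idx = np.array([0, 1])
case_split = separated_by_slice((idx, slice(None), idx))     # arr[idx, :, idx]
case_adjacent = separated_by_slice((slice(None), idx, idx))  # arr[:, idx, idx]
case_int_split = separated_by_slice((0, slice(None), idx))   # arr[0, :, idx]
case_basic = separated_by_slice((0, slice(None), 1))         # arr[0, :, 1]
```

Only case_split and case_int_split come out True here; a real implementation
would also have to handle boolean arrays and other array-likes.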