[Numpy-discussion] Rethinking multiple dimensional indexing with sequences?

Tue Feb 18 15:59:13 EST 2014

So to be clear - what's being suggested is that code like this will be
deprecated in 1.9, and then in some future release break:

slices = []
for i in ...:
    slices.append(make_slice(...))
subarray = arr[slices]

Instead, you will have to do:

subarray = arr[tuple(slices)]
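
To make that concrete, here is a minimal runnable sketch of the pattern. The array shape and the "first two entries along every axis" rule are made up for illustration and simply stand in for whatever make_slice(...) would do in real code:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Build the index one axis at a time in a mutable list.
# Hypothetical rule: keep at most the first two entries along each axis.
slices = []
for axis_length in arr.shape:
    slices.append(slice(0, min(2, axis_length)))

# Convert to a tuple right before indexing, so the index is read as
# "one entry per axis" and never as an array-like.
subarray = arr[tuple(slices)]
print(subarray.shape)
```

The list stays convenient for building the index up; only the final indexing step changes.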

The reason is that when we allow multi-dimensional indexes to be
passed as lists instead of tuples, numpy has no reliable way to tell
what to do with something like

   arr[[0, 1]]

Maybe it means

   arr[0, 1]

Or maybe it means

   arr[np.asarray([0, 1])]

Who knows? Right now we have some heuristics to guess based on what
exact index objects are in there, but really making a guess at all is
a pretty broken approach, and will be getting more broken as more
non-ndarray array-like types come into common use -- in particular,
the way things are right now, arr[pandas_series] will soon be (or is
already) triggering this same guessing logic.
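
As a small illustration of how different the two readings are, the sketch below uses only the two unambiguous spellings (the 3x4 array is an arbitrary example, not something from the original report):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Tuple index: one entry per axis, so this selects a single element.
as_tuple = arr[0, 1]                 # same as arr[(0, 1)]

# Integer-array index: selects rows 0 and 1 along the first axis.
as_array = arr[np.asarray([0, 1])]

print(as_tuple)        # 1
print(as_array.shape)  # (2, 4)
```

One reading gives a scalar and the other a 2x4 array, so a wrong guess is not a subtle difference.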

So, any objections to requiring tuples here?

-n

On Tue, Feb 18, 2014 at 11:09 AM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> Hey all,
>
> currently in numpy this is possible:
>
> a = np.zeros((5, 5))
> a[[0, slice(None, None)]]
> #this behaviour has its quirks, since the "correct" way is:
> a[(0, slice(None, None))] # or identically a[0, :]
>
> The problem with using an arbitrary sequence is that an arbitrary
> sequence is also typically an "array like", so there is a lot of guessing
> involved:
>
> a[[0, slice(None, None)]]  == a[(0, slice(None, None))]
> # but:
> a[[0, 1]] == a[np.array([0, 1])]
>
> Now, NumPy also commonly uses lists here to build up indexing tuples
> (since they are mutable). However, would it really be so bad if we had to
> do `arr[tuple(slice_list)]` in the end to resolve this issue? So the
> proposal would be to deprecate anything but (base class) tuples, or
> maybe at least only allow this weird logic for lists and not all
> sequences. I do not believe we can find a logic to decide what to do
> which will not be broken in some way...
>
> PS: The code implementing the "advanced index or nd-index" logic is
> here:
> https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/mapping.c#L196
>
> - Sebastian
>
>
> Another confusing example:
>
> In [9]: a = np.arange(10)
>
> In [10]: a[[(0, 1), (2, 3)] * 17] # a[np.array([(0, 1), (2, 3)] * 17)]
> Out[10]:
> array([[0, 1],
>       <snip>
>        [2, 3]])
>
> In [11]: a[[(0, 1), (2, 3)]] # a[np.array([0, 1]), np.array([2, 3])]
> ---------------------------------------------------------------------------
> IndexError                                Traceback (most recent call last)
> <ipython-input-11-57b93f64dfa6> in <module>()
> ----> 1 a[[(0, 1), (2, 3)]]
>
> IndexError: too many indices for array
>
>

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org