[Numpy-discussion] Rethinking multiple dimensional indexing with sequences?
Nathaniel Smith
njs at pobox.com
Tue Feb 18 15:59:13 EST 2014
So to be clear - what's being suggested is that code like this will be
deprecated in 1.9, and then in some future release break:
slices = []
for i in ...:
    slices.append(make_slice(...))
subarray = arr[slices]
Instead, you will have to do:
subarray = arr[tuple(slices)]
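As a minimal sketch of that pattern (the array shape and the per-axis slices below are made up for illustration and stand in for the elided make_slice(...) call):

```python
import numpy as np

# Hypothetical 3-D array, chosen only for illustration.
arr = np.arange(24).reshape(2, 3, 4)

# Build the index one axis at a time in a mutable list.
# These slices are placeholders for whatever make_slice(...) would return.
slices = []
for axis_len in arr.shape:
    slices.append(slice(0, axis_len - 1))

# Convert to a tuple at the point of indexing, so each entry is
# unambiguously an index for one axis.
subarray = arr[tuple(slices)]
print(subarray.shape)  # (1, 2, 3)
```

The list stays convenient for building the index; only the final lookup changes.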
And the reason is that when we allow multi-dimensional indexes to be
passed as lists instead of a tuple, numpy has no reliable way to tell
what to do with something like
arr[[0, 1]]
Maybe it means
arr[0, 1]
Or maybe it means
arr[np.asarray([0, 1])]
Who knows? Right now we have some heuristics to guess based on what
exact index objects are in there, but really making a guess at all is
a pretty broken approach, and will be getting more broken as more
non-ndarray array-like types come into common use -- in particular,
the way things are right now, arr[pandas_series] will soon be (or is
already) triggering this same guessing logic.
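To make the two readings concrete, here is a small sketch on a made-up 3x4 array (not from the original message). It uses only the two unambiguous spellings:

```python
import numpy as np

# Hypothetical 2-D array, chosen only for illustration.
arr = np.arange(12).reshape(3, 4)

# Reading 1: a multi-dimensional index, one entry per axis.
as_tuple = arr[0, 1]                 # a single element

# Reading 2: one integer-array index applied along the first axis.
as_array = arr[np.asarray([0, 1])]   # the first two rows

print(as_tuple)        # 1
print(as_array.shape)  # (2, 4)
```

The two results differ in both shape and content, which is why a guess in either direction can silently do the wrong thing.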
So, any objections to requiring tuples here?
-n
On Tue, Feb 18, 2014 at 11:09 AM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> Hey all,
>
> currently in numpy this is possible:
>
> a = np.zeros((5, 5))
> a[[0, slice(None, None)]]
> #this behaviour has its quirks, since the "correct" way is:
> a[(0, slice(None, None))] # or identically a[0, :]
>
> The problem with using an arbitrary sequence is that an arbitrary
> sequence is also typically "array-like", so there is a lot of guessing
> involved:
>
> a[[0, slice(None, None)]] == a[(0, slice(None, None))]
> # but:
> a[[0, 1]] == a[np.array([0, 1])]
>
> Now, NumPy also commonly uses lists here to build up indexing tuples
> (since lists are mutable). However, would it really be so bad if we had to
> do `arr[tuple(slice_list)]` in the end to resolve this issue? So the
> proposal would be to deprecate anything but (base class) tuples, or
> maybe at least only allow this weird logic for lists and not all
> sequences. I do not believe we can find a logic to decide what to do
> which will not be broken in some way...
>
> PS: The code implementing the "advanced index or nd-index" logic is
> here:
> https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/mapping.c#L196
>
> - Sebastian
>
>
> Another confusing example:
>
> In [9]: a = np.arange(10)
>
> In [10]: a[[(0, 1), (2, 3)] * 17] # a[np.array([(0, 1), (2, 3)] * 17)]
> Out[10]:
> array([[0, 1],
> <snip>
> [2, 3]])
>
> In [11]: a[[(0, 1), (2, 3)]] # a[np.array([0, 1]), np.array([2, 3])]
> ---------------------------------------------------------------------------
> IndexError                                Traceback (most recent call last)
> <ipython-input-11-57b93f64dfa6> in <module>()
> ----> 1 a[[(0, 1), (2, 3)]]
>
> IndexError: too many indices for array
>
>
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org