Multidimensional Indexing
With the indexing example from the documentation:

    >>> y = np.arange(35).reshape(5,7)

Why does selecting an item from explicitly every row work as I’d expect:

    >>> y[np.array([0,1,2,3,4]),np.array([0,0,0,0,0])]
    array([ 0,  7, 14, 21, 28])

But doing so from a full slice (which I would naively expect to mean “Every Row”) has some…other… behaviour:

    >>> y[:,np.array([0,0,0,0,0])]
    array([[ 0,  0,  0,  0,  0],
           [ 7,  7,  7,  7,  7],
           [14, 14, 14, 14, 14],
           [21, 21, 21, 21, 21],
           [28, 28, 28, 28, 28]])

What is going on in this example, and how do I get what I expect? By explicitly passing in an extra array with value===index? What is the rationale for this difference in behaviour?

Thanks,
Nick
On Tue, 2015-04-07 at 00:49 +0100, Nicholas Devenish wrote:
What is going on in this example, and how do I get what I expect? By explicitly passing in an extra array with value===index? What is the rationale for this difference in behaviour?
The rationale is historical. Indexing with arrays (advanced indexing) works differently from slicing. Two index arrays are iterated together, while slices are not; each slice selects along its own axis independently. We sometimes call that outer/orthogonal indexing, and there is a big discussion about it at the moment.

These are different beasts. You can basically get the slicing-like behaviour by adding appropriate axes to your indexing arrays:

    y[np.array([[0],[1],[2],[3],[4]]), np.array([0,0,0,0,0])]

The other way around is not possible. Note that if slices were iterated together the way arrays are, y[:, :] would give the diagonal (if possible) and not the full array, which is what you would probably expect.

One warning: if you index with more than one array (scalars are also arrays in this sense, so `[0, :, array]` is an example) in combination with slices, the result can be transposed in a confusing way (it is not that difficult, but usually unexpected).

- Sebastian
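A minimal sketch of the points in the reply above, not part of the original message. It reproduces the added-axis trick, shows np.ix_ as a helper that builds the same kind of broadcastable index arrays, and illustrates the transposition warning. The 3-D array `z` and the index array `idx` are made up purely for illustration.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows = np.array([0, 1, 2, 3, 4])
cols = np.array([0, 0, 0, 0, 0])

# Two 1-D index arrays are iterated together: one element per (row, col) pair.
paired = y[rows, cols]                  # shape (5,)

# Giving the row index an extra axis makes the two arrays broadcast to (5, 5),
# which matches what the slice version y[:, cols] returns.
outer = y[rows[:, np.newaxis], cols]
assert np.array_equal(outer, y[:, cols])

# np.ix_ builds index arrays of shape (5, 1) and (1, 5) for the same purpose.
assert np.array_equal(y[np.ix_(rows, cols)], outer)

# Transposition warning, on a hypothetical 3-D array.
z = np.zeros((2, 3, 4))
idx = np.array([0, 1, 2, 3, 0])
adjacent = z[:, 0, idx]    # advanced indices (0 and idx) sit next to each other
separated = z[0, :, idx]   # advanced indices are split by a slice
print(adjacent.shape)      # (2, 5): the indexed dimension stays in place
print(separated.shape)     # (5, 3): the indexed dimension moves to the front
```

The second shape is the surprise: when a slice separates the advanced indices, the dimension produced by them comes first in the result.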
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
I think the rationale is to allow selection of whole rows / columns. If you want to choose a single element from each row/column, then, yes, you have to pass np.arange(...). There is also the np.choose function, but as far as I understand it is not recommended for such cases. I'm not an expert, though.

Nikolay.
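A short sketch of the np.arange approach mentioned above, assuming a made-up column choice `cols` with one entry per row. The np.take_along_axis alternative is an addition here, not something from the thread; to my knowledge it only exists in NumPy releases newer than this discussion.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
cols = np.array([0, 3, 6, 1, 2])    # hypothetical: one column index per row

# Pair row i with cols[i] by passing an explicit row index array.
picked = y[np.arange(y.shape[0]), cols]
print(picked)                       # [ 0 10 20 22 30]

# Same selection via np.take_along_axis (newer NumPy only).
alt = np.take_along_axis(y, cols[:, np.newaxis], axis=1)[:, 0]
assert np.array_equal(picked, alt)
```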
From: misnomer@gmail.com
Date: Tue, 7 Apr 2015 00:49:34 +0100
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] Multidimensional Indexing
On Mon, Apr 6, 2015 at 4:49 PM, Nicholas Devenish wrote:
What is going on in this example, and how do I get what I expect? By explicitly passing in an extra array with value===index? What is the rationale for this difference in behaviour?
To understand this example, it is important to understand that for multi-dimensional arrays, NumPy attempts to make the index arrays along each dimension the same size, using broadcasting. So in your original example,

    y[np.array([0,1,2,3,4]),np.array([0,0,0,0,0])]

the arrays are the same size, and the behavior is as you'd expect.

In the second case, the first index is a slice and the second index is an array. Documentation for this case can be found in the indexing docs under "Combining index arrays with slices". Here's the relevant portion:
In effect, the slice is converted to an [new] index array ... that is broadcast with the [other] index array
So in your case, the slice ":" is *first* converted to np.arange(5), and *then* broadcast across the shape of the [other] index array, so that it is ultimately transformed into something like np.repeat(np.arange(5)[:,np.newaxis], 5, axis=1), giving you:

    array([[0, 0, 0, 0, 0],
           [1, 1, 1, 1, 1],
           [2, 2, 2, 2, 2],
           [3, 3, 3, 3, 3],
           [4, 4, 4, 4, 4]])

At this point you have converted your slice to a [new] index array of shape (5,5), and your [other] index array has shape (5,). So NumPy applies broadcasting rules to the second array to get it into shape (5,5) as well. This time the array is repeated along the first axis rather than the second, so your [other] index array becomes:

    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])

Pairing the two arrays element by element then gives the result you saw: entry [i, j] of the output is y[i, 0].

Now, you may ask: once the slice was converted to np.arange(5), why was it then broadcast to shape (5,5) rather than kept at shape (5,), which would work? The reason (I suspect, at least) is to keep it consistent with other types of slices. Consider if you did something like:

    y[1:3, np.array([0,0,0,0,0])]

Then the same operation would apply as above, except that when the slice was converted to an array, it would be converted to np.arange(1,3), which has shape (2,). Obviously this isn't compatible with the second index array of shape (5,), so it *has* to be broadcast.

One final note: in this case, you can instead use either of the following:

    y[np.array([0,1,2,3,4]), 0]

or

    y[:, 0]

Using the same steps as above, the slice is converted to np.arange(5), and then the shapes are compared: (5,) versus (). The integer index is then broadcast to shape (5,), which gives you what you want.

Hope that helps.
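The broadcasting steps described above can be checked with a small sketch. It treats the slice as np.arange with a trailing axis, which is only a mental model of the result and not a claim about how NumPy implements slicing internally. np.broadcast_arrays is used here just to make the effective index arrays visible.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
cols = np.array([0, 0, 0, 0, 0])

# Model ":" as np.arange(5) with a trailing axis, then broadcast it with cols.
row_idx, col_idx = np.broadcast_arrays(np.arange(5)[:, np.newaxis], cols)
print(row_idx.shape, col_idx.shape)     # (5, 5) (5, 5)
print(row_idx[:, 0])                    # [0 1 2 3 4]: row number varies down
print(col_idx.sum())                    # 0: the column index is zero everywhere

# Indexing with the broadcast pair reproduces the slice-plus-array result.
assert np.array_equal(y[row_idx, col_idx], y[:, cols])

# The 1:3 case: shapes (2, 1) and (5,) broadcast to (2, 5).
r2, c2 = np.broadcast_arrays(np.arange(1, 3)[:, np.newaxis], cols)
assert r2.shape == (2, 5)
assert np.array_equal(y[r2, c2], y[1:3, cols])

# A scalar column index pairs with every row and gives the 1-D result.
assert np.array_equal(y[np.arange(5), 0], y[:, 0])
```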
participants (4)
- Chad Fulton
- Nicholas Devenish
- Nikolay Mayorov
- Sebastian Berg