Mailman 3 Is this an indexing bug? - NumPy-Discussion

Is this an indexing bug?

Tom K.

13 Jun 2007

12:56 p.m.

>>> h = zeros((1, 4, 100))
>>> h[0,:,arange(14)].shape
(14, 4)

Sven Schreiber

16 Jun 2007

2:49 p.m.

Tom K. wrote:

> >>> h = zeros((1, 4, 100))
> >>> h[0,:,arange(14)].shape
> (14, 4)

After reading section 3.4.2.1 of the numpy book, I also still don't expect this result. So if it's not a bug, I'd be glad if some expert could explain why not. Thanks, Sven

Sven Schreiber

19 Jun 2007

10:19 a.m.

Sven Schreiber wrote:

> Tom K. wrote:
>
> > >>> h = zeros((1, 4, 100))
> > >>> h[0,:,arange(14)].shape
> > (14, 4)
>
> After reading section 3.4.2.1 of the numpy book, I also still don't expect this result. So if it's not a bug, I'd be glad if some expert could explain why not.

To be more specific, I would expect shape==(4,14). I am going to file a ticket soon if nobody explains why everything is just right and it's only Tom and I who just don't get it ;-) -sven

Sturla Molden

10:14 a.m.

On 6/19/2007 12:19 PM, Sven Schreiber wrote:

> To be more specific, I would expect shape==(4,14).

>>> h = numpy.zeros((1,4,14))
>>> h[0,:,numpy.arange(14)].shape
(14, 4)
>>> h[0,:,:].shape
(4, 14)

h[0,:,numpy.arange(14)] is a case of "advanced indexing". You can also see that

>>> h[0,:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13]].shape
(14, 4)

Citing from Travis' book, page 83:

"Example 2: Now let X.shape be (10,20,30,40,50) and suppose ind1 and ind2 are broadcastable to the shape (2,3,4). Then X[:,ind1,ind2] has shape (10,2,3,4,40,50) because the (20,30)-shaped subspace from X has been replaced with the (2,3,4) subspace from the indices. However, X[:,ind1,:,ind2,:] has shape (2,3,4,10,30,50) because there is no unambiguous place to drop in the indexing subspace, thus it is tacked-on to the beginning. It is always possible to use .transpose() to move the subspace anywhere desired. This example cannot be replicated using take."

So I think this strange behaviour is actually correct.

Sturla Molden
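The two shapes in the quoted Example 2 can be checked directly. This is only a minimal sketch, assuming a current NumPy; the names ind1 and ind2 follow the quote, but their values are arbitrary in-range indices chosen for illustration, and int8 is used only to keep the array small.

```python
import numpy as np

# Same shape as in the quoted example; int8 keeps memory use low.
X = np.zeros((10, 20, 30, 40, 50), dtype=np.int8)

# Two index arrays that broadcast together to shape (2, 3, 4).
# The values are illustrative assumptions; any in-range integers would do.
ind1 = np.arange(6).reshape(2, 3, 1)
ind2 = np.arange(4)

adjacent = X[:, ind1, ind2]          # index arrays next to each other
separated = X[:, ind1, :, ind2, :]   # index arrays separated by a slice

print(adjacent.shape)    # (10, 2, 3, 4, 40, 50)
print(separated.shape)   # (2, 3, 4, 10, 30, 50)

# As the quote says, transpose() can move the index subspace elsewhere,
# here to just after the leading length-10 axis.
moved = separated.transpose(3, 0, 1, 2, 4, 5)
print(moved.shape)       # (10, 2, 3, 4, 30, 50)
```

When the index arrays are adjacent, the broadcast (2,3,4) block replaces the indexed axes in place; when a slice separates them, the block goes to the front of the result.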

Sturla Molden

10:35 a.m.

On 6/19/2007 12:14 PM, Sturla Molden wrote:

> h[0,:,numpy.arange(14)] is a case of "advanced indexing". You can also see that
>
> >>> h[0,:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13]].shape
> (14, 4)

Another way to explain this is that numpy.arange(14) and [0,1,2,3,4,5,6,7,8,9,10,11,12,13] are both sequences (i.e. iterable). So when NumPy iterates over the sequence, the iterator yields a single integer; let's call it I. Using this integer as an index into h gives a = h[0,:,I], which has shape=(4,). This gives us a sequence of arrays of length 4. In other words,

>>> a = numpy.zeros(4)
>>> numpy.array([a,a,a,a,a,a,a,a,a,a,a,a,a,a]).shape
(14, 4)

That is analogous to array([(0,0,0,0), (0,0,0,0), ...., (0,0,0,0)])

Sturla Molden

Stefan van der Walt

11:28 a.m.

On Tue, Jun 19, 2007 at 12:35:05PM +0200, Sturla Molden wrote:

> On 6/19/2007 12:14 PM, Sturla Molden wrote:
>
> > h[0,:,numpy.arange(14)] is a case of "advanced indexing". You can also see that
> >
> > >>> h[0,:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13]].shape
> > (14, 4)
>
> Another way to explain this is that numpy.arange(14) and [0,1,2,3,4,5,6,7,8,9,10,11,12,13] is a sequence (i.e. iterator). So when NumPy iterates the sequence, the iterator yields a single integer, let's call it I. Using this integer as an index to h, gives a = h[0,:,I] which has shape=(4,). This gives us a sequence of arrays of length 4. In

If you follow this analogy,

x = N.arange(100).reshape((10,10))
x[:,N.arange(5)].shape

should be (5, 10), while in reality it is (10, 5).

Cheers
Stéfan
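The "iterate over the index sequence" picture and Stéfan's counterexample can be put side by side. This is a rough sketch, assuming a current NumPy; the names stacked3 and stacked2 are made up for illustration.

```python
import numpy as np

idx = np.arange(5)

# 3-D case: an integer and an index array separated by a slice.
x3 = np.arange(100).reshape(1, 10, 10)
stacked3 = np.array([x3[0, :, i] for i in idx])  # the "iterate over idx" picture
print(stacked3.shape, x3[0, :, idx].shape)       # (5, 10) (5, 10)
print(np.array_equal(stacked3, x3[0, :, idx]))   # True

# 2-D case: a slice followed by a single index array.
x2 = np.arange(100).reshape(10, 10)
stacked2 = np.array([x2[:, i] for i in idx])
print(stacked2.shape, x2[:, idx].shape)          # (5, 10) (10, 5)
print(np.array_equal(stacked2.T, x2[:, idx]))    # True
```

The stacking picture matches the 3-D result but gives the transpose of the 2-D result, so it does not hold as a general description of the indexing rule.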

Sturla Molden

12:05 p.m.

On 6/19/2007 1:28 PM, Stefan van der Walt wrote:

> x = N.arange(100).reshape((10,10))
> x[:,N.arange(5)].shape
>
> should be (5, 10), while in reality it is (10, 5).

>>> y = numpy.arange(100).reshape((10,10))
>>> y[:,numpy.arange(5)].shape
(10, 5)

>>> x = numpy.arange(100).reshape((1,10,10))
>>> x[0,:,numpy.arange(5)].shape
(5, 10)

hm...

Sturla Molden

Sturla Molden

12:15 p.m.

>>> x = numpy.arange(100).reshape((1,10,10))
>>> x[0,:,numpy.arange(5)].shape
(5, 10)
>>> x[:,:,numpy.arange(5)].shape
(1, 10, 5)

It looks like a bug that needs to be squashed. S.M.

Sven Schreiber

5:13 p.m.

Sturla Molden wrote:

> >>> x = numpy.arange(100).reshape((1,10,10))
> >>> x[0,:,numpy.arange(5)].shape
> (5, 10)
> >>> x[:,:,numpy.arange(5)].shape
> (1, 10, 5)
>
> It looks like a bug that needs to be squashed.
>
> S.M.

And you already had me convinced ;-) I'm still curious which one's the bug and which one is expected... Anybody? -sven

Travis Oliphant

20 Jun 2007

6:43 p.m.

Sturla Molden wrote:

> >>> x = numpy.arange(100).reshape((1,10,10))
> >>> x[0,:,numpy.arange(5)].shape
> (5, 10)
> >>> x[:,:,numpy.arange(5)].shape
> (1, 10, 5)
>
> It looks like a bug that needs to be squashed.

These are both correct. See my previous posts about the rule.

The first case is exactly the example we saw before: we start with a (1,10,10)-shaped array and replace the first and last-dimension (1,10)-shaped array with a (5,)-shaped array. Not having a clear place to put the extracted (5,)-shaped subspace, it is tacked on to the front.

In the second example, the last-dimension (10,)-shaped sub-space is replaced with a (5,)-shaped sub-space. There is no ambiguity in this case and the result is a (1,10,5)-shaped array.

There is no bug here. Perhaps unexpected behavior with "advanced indexing" combined with single-integer indexing in separated dimensions, but no bug. The result does follow an understandable and generalizable rule.

In addition, you can get what you seem to want very easily using

x[0][:,numpy.arange(5)]

-Travis
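The two cases and the suggested two-step indexing can be compared in one place. This is a minimal sketch, assuming a current NumPy; the variable names front, in_place and two_step are made up for illustration.

```python
import numpy as np

x = np.arange(100).reshape(1, 10, 10)
idx = np.arange(5)

front = x[0, :, idx]     # integer and index array separated by a slice
in_place = x[:, :, idx]  # a single index array, nothing to disambiguate
two_step = x[0][:, idx]  # index the first axis, then index the 2-D result

print(front.shape)       # (5, 10)
print(in_place.shape)    # (1, 10, 5)
print(two_step.shape)    # (10, 5)

# The same elements are selected in every case; only the axis order differs.
print(np.array_equal(front.T, two_step))      # True
print(np.array_equal(in_place[0], two_step))  # True
```

Since the selected elements are identical, a transpose of x[0,:,idx] is an alternative to the two-step form when the (10, 5) layout is wanted.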

Sturla Molden

7:35 p.m.

> These are both correct. See my previous posts about the rule.
>
> The first case is exactly the example we saw before: we start with a (1,10,10)-shaped array and replace the first and last-dimension (1,10)-shaped array with a (5,)-shaped array. Not having a clear place to put the extracted (5,)-shaped subspace, it is tacked on to the front.
>
> In the second example, the last-dimension (10,)-shaped sub-space is replaced with a (5,)-shaped sub-space. There is no ambiguity in this case and the result is a (1,10,5)-shaped array.
>
> There is no bug here. Perhaps unexpected behavior with "advanced indexing" combined with single-integer indexing in separated dimensions, but no bug. The result does follow an understandable and generalizable rule.

Travis, I agree that there is no bug here, as the software follows the specified behaviour. But it may be debated whether the specified behaviour is sensible or not. I think the source of the confusion is the different behaviour of 'advanced indexing' in NumPy and Matlab. This is what Matlab does:

>> x = zeros(1,10,10);
>> size( x(1,:,[1 2 3 4 5]) )

ans = 1 10 5

>> size( x(:,:,[1 2 3 4 5]) )

ans = 1 10 5

It might be worth pointing this out in the NumPy documentation (on the SciPy web site and in your book), so users don't expect similar behaviour of 'advanced indexing' in NumPy and Matlab.

Regards,
Sturla Molden
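For readers coming from Matlab, here are a few ways to get the Matlab-style (1, 10, 5) result in NumPy. This is only a sketch, assuming a current NumPy; none of these alternatives were proposed in the thread itself, and the names via_slice, via_take and via_ix are made up for illustration.

```python
import numpy as np

x = np.zeros((1, 10, 10))
idx = [0, 1, 2, 3, 4]  # 0-based counterpart of Matlab's [1 2 3 4 5]

# A length-1 slice instead of the integer 0 leaves only one advanced index,
# so the indexed axis stays where it was.
via_slice = x[0:1, :, idx]
print(via_slice.shape)  # (1, 10, 5)

# np.take selects along one axis and leaves the other axes in place.
via_take = np.take(x, idx, axis=2)
print(via_take.shape)   # (1, 10, 5)

# np.ix_ builds an open mesh, which selects a cross product of indices
# much like Matlab's subscripting does.
via_ix = x[np.ix_([0], np.arange(10), idx)]
print(via_ix.shape)     # (1, 10, 5)
```

The slice form returns the same shape Matlab reports for x(1,:,[1 2 3 4 5]); the integer form x[0,:,idx] is the one that moves the indexed axis to the front.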

Sven Schreiber

8:30 p.m.

Travis Oliphant wrote:

> Sturla Molden wrote:
>
> > >>> x = numpy.arange(100).reshape((1,10,10))
> > >>> x[0,:,numpy.arange(5)].shape
> > (5, 10)
> > >>> x[:,:,numpy.arange(5)].shape
> > (1, 10, 5)
> >
> > It looks like a bug that needs to be squashed.
>
> These are both correct. See my previous posts about the rule.
>
> The first case is exactly the example we saw before: we start with a (1,10,10)-shaped array and replace the first and last-dimension (1,10)-shaped array with a (5,)-shaped array. Not having a clear place to put the extracted (5,)-shaped subspace, it is tacked on to the front.
>
> In the second example, the last-dimension (10,)-shaped sub-space is replaced with a (5,)-shaped sub-space. There is no ambiguity in this case and the result is a (1,10,5)-shaped array.
>
> There is no bug here. Perhaps unexpected behavior with "advanced indexing" combined with single-integer indexing in separated dimensions, but no bug. The result does follow an understandable and generalizable rule.

Thanks for the explanation. Maybe some of these examples could be added to the relevant section of the book. Although I must say I'm glad that I currently don't need to use this stuff, I'm not sure I would get it right :-) cheers, sven

Travis Oliphant

6:23 p.m.

Stefan van der Walt wrote:

> On Tue, Jun 19, 2007 at 12:35:05PM +0200, Sturla Molden wrote:
>
> > On 6/19/2007 12:14 PM, Sturla Molden wrote:
> >
> > > h[0,:,numpy.arange(14)] is a case of "advanced indexing". You can also see that
> > >
> > > >>> h[0,:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13]].shape
> > > (14, 4)
> >
> > Another way to explain this is that numpy.arange(14) and [0,1,2,3,4,5,6,7,8,9,10,11,12,13] is a sequence (i.e. iterator). So when NumPy iterates the sequence, the iterator yields a single integer, let's call it I. Using this integer as an index to h, gives a = h[0,:,I] which has shape=(4,). This gives us a sequence of arrays of length 4. In
>
> If you follow this analogy,
>
> x = N.arange(100).reshape((10,10))
> x[:,N.arange(5)].shape
>
> should be (5, 10), while in reality it is (10, 5).

No, in this case, there is no ambiguity regarding where to put the sub-space, so it is put in the "expected" position. It could be argued that when a single integer is used in one of the indexing dimensions then there is also no ambiguity, but the indexing code does not check for that special case.

There is no bug here as far as I can tell. It is just perhaps somewhat unexpected behavior of a general rule about how "indirect" or "advanced" indexing is handled. You can always do

h[0][:,arange(14)]

to get the result you seem to want.

-Travis

> Cheers
> Stéfan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion

Travis Oliphant

6:38 p.m.

Sturla Molden wrote:

> On 6/19/2007 12:19 PM, Sven Schreiber wrote:
>
> > To be more specific, I would expect shape==(4,14).
>
> >>> h = numpy.zeros((1,4,14))
> >>> h[0,:,numpy.arange(14)].shape
> (14, 4)
> >>> h[0,:,:].shape
> (4, 14)
>
> h[0,:,numpy.arange(14)] is a case of "advanced indexing". You can also see that
>
> >>> h[0,:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13]].shape
> (14, 4)
>
> Citing from Travis' book, page 83:
>
> "Example 2: Now let X.shape be (10,20,30,40,50) and suppose ind1 and ind2 are broadcastable to the shape (2,3,4). Then X[:,ind1,ind2] has shape (10,2,3,4,40,50) because the (20,30)-shaped subspace from X has been replaced with the (2,3,4) subspace from the indices. However, X[:,ind1,:,ind2,:] has shape (2,3,4,10,30,50) because there is no unambiguous place to drop in the indexing subspace, thus it is tacked-on to the beginning. It is always possible to use .transpose() to move the subspace anywhere desired. This example cannot be replicated using take."
>
> So I think this strange behaviour is actually correct.

Yes, Sturla is right. Now, obviously, "advanced" indexing was not designed with this particular case in mind. But it is the expected outcome given the rule.

Let's look at the application of the rule to this particular case. h is a (1,4,14) array. Now, ind1 is 0 and ind2 is [0,1,...,13]. The rules for "advanced" indexing apply because ind2 is a list. Thus, ind1 is broadcast to ind2, which means ind1 acts as if it were [0,0,...,0]. So the indexing implies an extraction of a (14,)-shaped array from the (1,14)-shaped array.

Now, where should this (14,)-shaped array be attached in the result? The rule states that if ind1 and ind2 are next to each other, then it will replace the (1,14)-shaped portion of the array. In this case, however, they are not right next to each other. Thus, there is an ambiguity regarding where to place the (14,)-shaped array. The rule states that when this kind of ambiguity arises (notice there is no special-case checking to see if ind1 or ind2 comes from a scalar), the resulting sub-space is placed at the beginning. Thus, the (1,4,14)-shaped array becomes a (14,4)-shaped array on selection using h[ind1,:,ind2].

This behavior follows the rule precisely. Admittedly it is a bit unexpected in this instance, but it does follow a specific rule that can be explained once you understand it, and it generalizes to all kinds of crazy situations where it is more difficult to see what the behavior "should" be.

One may complain that h[ind1][:,ind2] != h[ind1,:,ind2], but that is generally true when using slices or lists for indexing.

-Travis
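The broadcasting step in this explanation can be made explicit. This is a rough sketch, assuming a current NumPy and the (1,4,14) array from Sturla's example; np.broadcast_arrays is used here only to illustrate what "ind1 acts as if it were [0,0,...,0]" means, not as a description of NumPy's internals.

```python
import numpy as np

h = np.arange(1 * 4 * 14).reshape(1, 4, 14)
ind1 = 0
ind2 = np.arange(14)

result = h[ind1, :, ind2]
print(result.shape)  # (14, 4)

# Broadcast the scalar against the index array by hand: ind1 becomes
# an array of fourteen zeros, matching ind2 element for element.
b1, b2 = np.broadcast_arrays(ind1, ind2)
print(b1.shape, b2.shape)  # (14,) (14,)

explicit = h[b1, :, b2]
print(np.array_equal(result, explicit))  # True

# Row i of the result is h[0, :, ind2[i]]; because ind2 covers every
# column here, that is simply the transpose of h[0].
print(np.array_equal(result, h[0].T))    # True

# The chained form indexes a 2-D array with one index array,
# so the indexed axis stays in place.
chained = h[ind1][:, ind2]
print(chained.shape)  # (4, 14)
```

Indexing with the hand-broadcast pair gives the same array as the scalar form, which shows that the scalar gets no special treatment once an index array is present.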
