[Numpy-discussion] Logical indexing and higher-dimensional arrays.

Wed Feb 8 09:11:55 EST 2012

On 8 February 2012 00:01, Travis Oliphant <travis at continuum.io> wrote:

>
> On Feb 7, 2012, at 12:24 PM, Sturla Molden wrote:
>
> > On 07.02.2012 19:17, Benjamin Root wrote:
> >
> >> >>> print x.shape
> >> (2, 3, 4)
> >> >>> print x[0, :, :].shape
> >> (3, 4)
> >> >>> print x[0, :, idx].shape
> >> (2, 3)
> >
> > That looks like a bug to me. The length of the first dimension should be
> > the same.
>
> What you are probably expecting is (3,2) for this selection, but whenever
> you have ':' dimensions in-between "fancy-indexing", the rules that govern
> fancy-indexing are ambiguous in general about how to handle this case.  In
> this specific case (with a scalar being broadcast against the idx) it is
> pretty clear what to do, and I consider it a bug that a special case for
> this situation is not there.
>
> Recall that the shape of the output with fancy indexing is determined by
> broadcasting together the indexing objects and using that as the shape of
> the output:
>
> x[ind1, ind2] will produce an output with the shape of "broadcast(ind1,
> ind2)" whose elements are selected by the broadcasted tuple. When this
> is combined with standard slicing like so: x[ind1, :, ind2], the question
> is what the shape of the output should be. If ind1 is a scalar there is
> no ambiguity (and this should be special cased --- but unfortunately
> isn't). If ind1 is not a scalar, then what should the shape be under the
> rules of "zip-based" indexing? I don't know. So, in fact, what happens
> is that the broadcasted shape is determined and used as the "first part" of
> the shape. The "second part" of the shape is the shape of the slice-based
> selection.
>
> So, in this case 0 and idx broadcast to the (2,) part of the shape,
> which is placed at the front of the result. The last part of the shape is
> the middle dimension (3,), resulting in the final shape (2,3).
>
> It could be argued that, in fact, this is a good example of why fancy
> indexing should follow cross-product semantics, and the current zip-based
> semantics should be moved to a method --- where the difficult-to-understand
> behavior with intermediate slices is also harder to spell because you have
> to explicitly create slice objects with "slice". What do others think?
> Obviously this couldn't change immediately, but it could be on the
> road-map for NumPy 2.0 or later.
>
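
For reference, the shape rule described above can be reproduced with a short
sketch. The array contents and the value of idx are assumptions here (the
thread only shows that idx has length 2); this is an illustration of the
behavior, not a statement about how NumPy implements it.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
idx = np.array([1, 3])  # assumed value; any length-2 index behaves the same

# The scalar 0 and idx are separated by a slice, so the broadcast shape
# of (0, idx) comes first and the sliced dimension comes last.
a = x[0, :, idx]
print(np.broadcast(np.asarray(0), idx).shape)  # (2,)
print(a.shape)                                 # (2, 3)

# Indexing in two steps gives the layout many people expect instead.
b = x[0][:, idx]
print(b.shape)                                 # (3, 2)

# Same data, only the axis order differs.
print(np.array_equal(a, b.T))                  # True
```

Applying the scalar index first, as in x[0][:, idx], is one way to get the
(3, 2) layout without relying on a special case.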

From a user perspective, I would definitely prefer cross-product semantics
for fancy indexing. In fact, I had never used fancy indexing with more than
one array index, so the behavior described in this thread totally baffled
me. If for instance x is a matrix, I think it's intuitive to expect x[0:2,
0:2] and x[[0, 1], [0, 1]] to return the same data.
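
A minimal sketch of the difference, assuming a small 4x4 example matrix (the
values are only for illustration). np.ix_ is an existing helper that gives
the cross-product result with today's indexing rules:

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

block = x[0:2, 0:2]                  # slicing: the 2x2 top-left block
zipped = x[[0, 1], [0, 1]]           # zip-based: elements (0, 0) and (1, 1)
crossed = x[np.ix_([0, 1], [0, 1])]  # cross product of the two index lists

print(block.shape)    # (2, 2)
print(zipped)         # [0 5]
print(np.array_equal(block, crossed))  # True
```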

-=- Olivier