[Numpy-discussion] DEP: Deprecate boolean array indices with non-matching shape #4353

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Jun 5 08:36:02 EDT 2015


On Fri, Jun 5, 2015 at 3:16 AM, Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Do, 2015-06-04 at 18:04 -0700, Nathaniel Smith wrote:
> > On Thu, Jun 4, 2015 at 5:57 PM, Nathaniel Smith <njs at pobox.com> wrote:
> > > So specifically the question is -- if you have an array with five items,
> > > and a Boolean array with three items, then currently you can use the
> > > latter to index the former:
> > >
> > > arr = np.arange(5)
> > > mask = np.asarray([True, False, True])
> > > arr[mask] # returns array([0, 2])
> > >
> > > This is justified by the rule that indexing with a Boolean array should
> > > be the same as indexing with the same array that's been passed to
> > > np.nonzero(). Empirically, though, this causes constant confusion and
> > > does not seem very useful, so the question is whether we should
> > > deprecate it.
> >
> > One place where the current behavior is particularly baffling and
> > annoying is when you have multiple boolean masks in the same indexing
> > operation. I think everyone would expect this to index separately on
> > each axis ("outer product indexing" style, like slices do), and that's
> > really the only useful interpretation, but that's not what it does...:
>
>
> This is not being deprecated in the PR for the moment; it is a different
> discussion. Though maybe we can improve the error message to mention
> that the array was originally boolean. That has always been bugging me a
> bit (it used to be mentioned in some cases, but it is not anymore).
>
> - Sebastian
>
>
> > In [3]: a = np.arange(9).reshape((3, 3))
> >
> > In [4]: a
> > Out[4]:
> > array([[0, 1, 2],
> >        [3, 4, 5],
> >        [6, 7, 8]])
> >
> > In [6]: a[np.asarray([True, False, True]), np.asarray([False, True, True])]
> > Out[6]: array([1, 8])
> >
> > In [7]: a[np.asarray([True, False, True]), np.asarray([False, False, True])]
> > Out[7]: array([2, 8])
> >
> > In [8]: a[np.asarray([True, False, True]), np.asarray([True, True, True])]
> > ---------------------------------------------------------------------------
> > IndexError                                Traceback (most recent call last)
> > <ipython-input-8-30b3427bec2a> in <module>()
> > ----> 1 a[np.asarray([True, False, True]), np.asarray([True, True, True])]
> >
> > IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (2,) (3,)
> >
> >
> > -n
> >
> > --
> > Nathaniel J. Smith -- http://vorpus.org
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

What is actually being deprecated?
It looks like there are different examples.

Wrong length: Nathaniel's first example above, where the mask is not
broadcastable to the original array because the mask is longer or shorter
than shape[axis].
I also wouldn't have expected this to work. Although I use np.nonzero and
boolean mask indexing interchangeably, I would assume we need the correct
length for the mask.
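
To make the wrong-length case concrete, here is a minimal sketch. The array
follows Nathaniel's example; the names short_mask and full_mask are only
illustrative. Going through np.nonzero explicitly works for a short mask as
long as the resulting indices are in bounds. Passing the short mask directly
is the behavior under discussion, and whether it returns a result or raises
depends on the NumPy version, so the sketch catches the error instead of
assuming either outcome.

```python
import numpy as np

arr = np.arange(5)
short_mask = np.asarray([True, False, True])               # length 3, arr has length 5
full_mask = np.asarray([True, False, True, False, False])  # matches arr.shape[0]

# Explicit integer indexing via np.nonzero does not care about the mask
# length, only that the resulting indices are in bounds.
via_nonzero = arr[np.nonzero(short_mask)]

# A mask of the correct length selects the same elements.
via_full_mask = arr[full_mask]

# Direct indexing with the short mask: version dependent, so do not
# assume it succeeds.
try:
    direct = arr[short_mask]
except IndexError as exc:
    direct = None
    print("short mask rejected:", exc)

print(via_nonzero, via_full_mask, direct)
```

Padding the mask to the full length, as full_mask does, gives the same
selection without relying on the mismatched-shape behavior.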

The second case, where the boolean mask has an extra dimension of length
one or where there are several boolean arrays, might need more checking.
I'm pretty sure I have used various versions of this, assuming they are a
feature, and when I see arrays, I usually don't assume "outer product
indexing" (that might lead to a discussion similar to the recent one on
fancy versus orthogonal indexing).
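
As a rough illustration of the two readings for several boolean arrays,
using the array from Nathaniel's session: the names rows, cols, paired and
outer are just for this sketch. Indexing with both masks at once pairs up
the nonzero positions (fancy indexing), while np.ix_ builds the "outer
product" selection explicitly.

```python
import numpy as np

a = np.arange(9).reshape((3, 3))
rows = np.asarray([True, False, True])
cols = np.asarray([False, True, True])

# Fancy-indexing reading: each mask becomes its nonzero indices, and the
# index arrays are paired up element by element.
paired = a[rows, cols]
same_as_nonzero = a[np.nonzero(rows)[0], np.nonzero(cols)[0]]

# Outer-product reading: select rows and columns independently.
outer = a[np.ix_(rows, cols)]

print(paired)           # elements a[0, 1] and a[2, 2]
print(same_as_nonzero)  # same pairing, written with explicit indices
print(outer)            # 2x2 block from rows 0, 2 and columns 1, 2
```

Spelling out np.ix_ keeps the intent visible to the reader, whichever
reading the plain a[rows, cols] form ends up having.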


Josef