[Numpy-discussion] What is up with raw boolean indices (like a[False])?

Sebastian Berg sebastian at sipsolutions.net
Wed Aug 19 20:55:03 EDT 2020


On Wed, 2020-08-19 at 18:07 -0600, Aaron Meurer wrote:
> > > 3. If you have multiple advanced indexing you get annoying
> > > broadcasting
> > >    of all of these. That is *always* confusing for boolean
> > > indices.
> > >    0-D should not be too special there...
> 
> OK, now that I am learning more about advanced indexing, this
> statement is confusing to me. It seems that scalar boolean indices do
> not broadcast. For example:

Well, broadcasting means you broadcast the *nonzero result* unless I am
very confused... There is a reason I dismissed it. We could (and
arguably should) just deprecate it.  And I have doubts anyone would
even notice.
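
To make "broadcast the nonzero result" concrete, here is a minimal sketch, assuming a reasonably recent NumPy. The array `a` and the masks `m0`/`m1` are made up for illustration and do not come from the thread:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
m0 = np.array([True, False, True])          # True at positions 0 and 2
m1 = np.array([False, True, False, False])  # True at position 1

# Indexing directly with two boolean arrays.
direct = a[m0, m1]

# The same selection spelled out: call nonzero() on each mask first,
# then broadcast the resulting *integer* index arrays against each other.
(i0,) = np.nonzero(m0)                 # array([0, 2])
(i1,) = np.nonzero(m1)                 # array([1])
j0, j1 = np.broadcast_arrays(i0, i1)   # both now have shape (2,)
via_nonzero = a[j0, j1]

print(direct, via_nonzero)             # both select a[0, 1] and a[2, 1]
```

The masks themselves are never broadcast against each other here; only their nonzero() results are.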

> 
> >>> np.arange(2)[False, np.array([True, False])]
> array([], dtype=int64)
> >>> np.arange(2)[tuple(np.broadcast_arrays(False, np.array([True, False])))]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: too many indices for array: array is 1-dimensional, but 2
> were indexed
> 
> And indeed, the docs even say, as you noted, "the nonzero equivalence
> for Boolean arrays does not hold for zero dimensional boolean
> arrays,"
> which I guess also applies to the broadcasting.

I actually think that probably also holds. Nonzero just behaves weirdly
for 0-D arrays (because it returns a tuple).
But since broadcasting the nonzero result is so weird, and since 0-D
booleans require some additional logic and don't generalize 100% (code
wise), I won't rule out there are differences.

> From what I can tell, the logic is that all integer and boolean
> arrays

Did you try that? Because as I said above, IIRC broadcasting the
boolean array without first calling `nonzero` isn't really what's going
on. And I don't know how it could be what's going on, since adding
dimensions to a boolean index would have many more implications.
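
A small sketch of why adding dimensions to a boolean index is not harmless, again assuming a recent NumPy and using made-up names (`a`, `m`, `m2d`): an N-dimensional boolean index consumes N axes of the array, so a mask that has been broadcast to a higher dimension is simply a different index.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
m = np.array([True, False, True])

# A 1-D mask applied to the last axis keeps the first axis intact.
per_axis = a[:, m]                  # shape (2, 2)

# Broadcasting the mask itself up to a.shape gives a 2-D mask, which
# consumes both axes and returns a flat selection instead.
m2d = np.broadcast_to(m, a.shape)
flat = a[m2d]                       # shape (4,)

print(per_axis.shape, flat.shape)
```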

- Sebastian


> (and scalar ints) are broadcast together, *except* for boolean
> scalars. Then the first boolean scalar is replaced with and(all
> boolean scalars) and the rest are removed from the index. Then that
> index adds a length 1 axis if it is True and 0 if it is False.
> 
> So they don't broadcast, but rather "fake broadcast". I still contend
> that it would be much more useful, if True were a synonym for newaxis
> and False worked like newaxis but instead added a length 0 axis.
> Alternately, True and False scalars should behave exactly like all
> other boolean arrays with no exceptions (i.e., work like
> np.nonzero(),
> broadcast, etc.). This would be less useful, but more consistent.
> 
> Aaron Meurer
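
For reference, a minimal sketch of the scalar boolean behaviour described in the quoted text, assuming a recent NumPy. The array `a` is made up for illustration; the last expression is the one from the thread:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# A lone scalar boolean adds a leading axis: length 1 for True,
# length 0 for False.
shape_true = a[True].shape            # (1, 2, 3)
shape_false = a[False].shape          # (0, 2, 3)

# For comparison, newaxis always adds a length-1 axis.
shape_newaxis = a[np.newaxis].shape   # (1, 2, 3)

# Several scalar booleans collapse into one added axis, which is
# empty as soon as any of them is False.
shape_mixed = a[False, True, ...].shape   # (0, 2, 3)

# The example from the thread: a False scalar next to a 1-D mask.
empty = np.arange(2)[False, np.array([True, False])]

print(shape_true, shape_false, shape_newaxis, shape_mixed, empty.shape)
```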
