[Numpy-discussion] What is up with raw boolean indices (like a[False])?
Aaron Meurer
asmeurer at gmail.com
Mon Jul 6 14:39:19 EDT 2020
I've been trying to figure out this behavior. It doesn't seem to be
documented at https://numpy.org/doc/stable/reference/arrays.indexing.html
>>> a = np.empty((2, 3))
>>> a.shape
(2, 5)
>>> a[True].shape
(1, 2, 5)
>>> a[False].shape
(0, 2, 5)
It seems like indexing with a raw boolean (True or False) adds an axis
with a dimension 1 or 0, resp.
Except it only works once:
>>> a[:,False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False, False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False,True].shape
(2, 0, 3)
>>> a[:,True,False].shape
(2, 0, 3)
The docs say "A single boolean index array is practically identical to
x[obj.nonzero()]". I have a hard time seeing this as an extension of
that, since indexing by `np.nonzero(False)` or `np.nonzero(True)`
*replaces* the given axis.
>>> a[np.nonzero(True)].shape
(1, 3)
>>> a[np.nonzero(False)].shape
(0, 3)
I think at best this behavior should be documented. I'm trying to
understand the motivation for it, or if it's even intentional. And in
particular, why do multiple boolean indices not insert multiple axes?
It would actually be useful to be able to generically add length 0
axes using an index, similar to how `newaxis` adds a length 1 axis.
Aaron Meurer
More information about the NumPy-Discussion
mailing list