[Numpy-discussion] boolean indexing of structured arrays
Benjamin Root
ben.root at ou.edu
Wed Jun 6 09:08:43 EDT 2012
Not sure if this is a bug or not. I am using a fairly recent master branch.
>>> # Setting up...
>>> import numpy as np
>>> a = np.zeros((10, 1), dtype=[('foo', 'f4'), ('bar', 'f4'), ('spam', 'f4')])
>>> a['foo'] = np.random.random((10, 1))
>>> a['bar'] = np.random.random((10, 1))
>>> a['spam'] = np.random.random((10, 1))
>>> a
array([[(0.8748096823692322, 0.08278043568134308, 0.2463584989309311)],
[(0.27129432559013367, 0.9645473957061768, 0.41787904500961304)],
[(0.4902191460132599, 0.6772263646125793, 0.07460898905992508)],
[(0.13542482256889343, 0.8646988868713379, 0.98673015832901)],
[(0.6527929902076721, 0.7392181754112244, 0.5919206738471985)],
[(0.11248272657394409, 0.5818713903427124, 0.9287213087081909)],
[(0.47561103105545044, 0.48848700523376465, 0.7108170390129089)],
[(0.47087424993515015, 0.6080209016799927, 0.6583810448646545)],
[(0.08447299897670746, 0.39479559659957886, 0.13520188629627228)],
[(0.7074970006942749, 0.8426893353462219, 0.19329732656478882)]],
dtype=[('foo', '<f4'), ('bar', '<f4'), ('spam', '<f4')])
>>> b = (a['bar'] > 0.4)
>>> b
array([[False],
[ True],
[ True],
[ True],
[ True],
[ True],
[ True],
[ True],
[False],
[ True]], dtype=bool)
>>> # ---- Boolean indexing of structured array with a (10,1) boolean array ----
>>> a[b]['foo']
array([ 0.27129433, 0.49021915, 0.13542482, 0.65279299, 0.11248273,
0.47561103, 0.47087425, 0.707497 ], dtype=float32)
>>> # ---- Boolean indexing of structured array with a (10,) boolean array ----
>>> a[b[:,0]]
array([[(0.27129432559013367, 0.9645473957061768, 0.41787904500961304)],
[(0.4902191460132599, 0.6772263646125793, 0.07460898905992508)],
[(0.13542482256889343, 0.8646988868713379, 0.98673015832901)],
[(0.6527929902076721, 0.7392181754112244, 0.5919206738471985)],
[(0.11248272657394409, 0.5818713903427124, 0.9287213087081909)],
[(0.47561103105545044, 0.48848700523376465, 0.7108170390129089)],
[(0.47087424993515015, 0.6080209016799927, 0.6583810448646545)],
[(0.7074970006942749, 0.8426893353462219, 0.19329732656478882)]],
dtype=[('foo', '<f4'), ('bar', '<f4'), ('spam', '<f4')])
So, if I index with a (10, 1) boolean array, I get back an (N,) shaped result
(regardless of whether I am accessing a field or not). But if I index with
a (10,) boolean array, I get back an (N, 1) result. Note that other forms
of indexing, such as slicing and fancy indexing, return (N, 1) shaped
results. Now, admittedly, this is consistent with boolean indexing of
regular numpy arrays. I just wanted to make sure that this is intentional.
This caused some confusion for me recently when I (perhaps falsely)
expected that a boolean index of a structured array would return a
similarly shaped array. The use-case was to modify an existing function by
removing the unwanted "rows" with a simple boolean index statement instead
of a slice.
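To make the shape difference easy to check, here is a minimal sketch of the comparison. It assumes a deterministic 'bar' column (np.linspace values instead of the random ones above), and the names full_mask, row_mask and fancy are just illustrative, not anything from numpy itself.

```python
import numpy as np

# Structured array of shape (10, 1), mirroring the setup above.
dt = [('foo', 'f4'), ('bar', 'f4'), ('spam', 'f4')]
a = np.zeros((10, 1), dtype=dt)

# Assumption: deterministic values (0.05, 0.15, ..., 0.95) instead of random ones.
a['bar'] = np.linspace(0.05, 0.95, 10).reshape(10, 1)

b = a['bar'] > 0.4            # boolean mask with the same shape as a: (10, 1)

# A mask matching the full shape selects individual elements, so the
# result is flattened to one dimension.
full_mask = a[b]

# A 1-D mask applies to the first axis only, so the trailing axis survives.
row_mask = a[b[:, 0]]

# Integer (fancy) indexing along the first axis also keeps the trailing axis.
fancy = a[np.flatnonzero(b[:, 0])]

print(full_mask.shape, row_mask.shape, fancy.shape)
```

For the "remove unwanted rows" use-case, indexing with the 1-D mask b[:, 0] looks like the form that keeps the original dimensionality.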
Cheers!
Ben Root