Choosing sub array based on values in a column
I feel kinda stupid, as I think this must be easier than I am making it. Below my question you will see an answer I got to an earlier question. I thought I would be able to complete the last steps myself, but I was wrong :)

So if I have an array
Y = np.rec.array([(1.0, 0.0, 3.0, 3.5), (0.0, 0.0, 6.0, 6.5), (1.0, 1.0, 9.0, 9.5)], dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8'), ('var4', '<f8')])
this works like I would expect:
Y[['var3','var4']][Y['var1']==1]
array([(3.0, 3.5), (9.0, 9.5)],
      dtype=[('var3', '<f8'), ('var4', '<f8')])
But I would like to do this:
Y[['var3','var4']][Y['var1']==0 and Y['var2']==0]
Traceback (most recent call last):
  File "<string>", line 1, in <fragment>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I tried some any() and all() combinations, but nothing worked. What's the right way to go about this?

Thanks
Vincent

This answer was received on the mailing list from Skipper Seabold:

If you have a rec array
Y = np.rec.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)], dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8')])
you can access the rows like
Y[['var1','var2','var3']]
Note the list within [].
If you want a "normal" array, I like this way that Pierre recently pointed out. 3 is the number of columns, and it fills in the number of rows.
Y[['var1','var2','var3']].view((float,3))
Note the tuple for the view, which works if they're all floats. Taking a view might not work if the var# fields have different types, like ints and floats.
If you want the mean of the rows (mean over the columns, axis = 1):
Y[['var1','var2','var3']].view((float,3)).mean(1)
Some shortcuts:
Y[list(Y.dtype.names)].view((float,len(Y.dtype))).mean(1)
Also, for now, the columns will be given back to you in the order they're in in the array, no matter which way you ask for them. A patch has been submitted for returning them in the order you ask for, which I hope gets picked up...
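
As a rough sketch of the conversion Skipper describes, the snippet below turns the record array into a plain 2-D float array and takes row means. It does not use the .view((float, 3)) trick from the thread. It swaps in numpy.lib.recfunctions.structured_to_unstructured, which is an assumption on my part: it is not mentioned in the original answer and needs NumPy 1.16 or later. The names plain, subset and row_means are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same three-column float record array as in the answer above
Y = np.rec.array(
    [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
    dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8')],
)

# np.asarray drops the recarray subclass; the field names are kept.
# structured_to_unstructured then returns an ordinary (rows, columns) array.
plain = rfn.structured_to_unstructured(np.asarray(Y[list(Y.dtype.names)]))

# Mean of each row, i.e. the mean taken over the columns (axis=1)
row_means = plain.mean(axis=1)

# A subset of the columns can be converted the same way
subset = rfn.structured_to_unstructured(np.asarray(Y[['var1', 'var3']]))

print(plain.shape)   # (3, 3)
print(row_means)     # [2. 5. 8.]
```

Unlike a raw view, this helper casts fields to a common type, so it should also cope with the mixed int/float case that the answer warns about, at the cost of a possible copy.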
Zachary Pincus replied to Vincent Davis's message of Mar 22, 2010, 11:00 PM:

Y[['var3','var4']][Y['var1']==0 and Y['var2']==0]
should be
Y[['var3','var4']][(Y['var1']==0) & (Y['var2']==0)]

Long story, explained elsewhere, but Python's 'and' is expected to coerce its operands into a single boolean True or False, which is ambiguous for an array ('any' vs. 'all' interpretation), so NumPy just raises an error instead. To do elementwise logical operations like you want, use the bitwise operators & | ^ on boolean arrays, but note that the == operator has lower precedence than these operators, so parenthesization is a must. Or use logical_and(), logical_or(), etc.

Zach
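
To make the fix concrete, here is a small sketch that runs Vincent's example with both spellings Zach suggests, the parenthesized & form and np.logical_and. The variable names mask, mask_alt and selected are only illustrative and do not come from the thread.

```python
import numpy as np

# Vincent's four-column record array from the question
Y = np.rec.array(
    [(1.0, 0.0, 3.0, 3.5), (0.0, 0.0, 6.0, 6.5), (1.0, 1.0, 9.0, 9.5)],
    dtype=[('var1', '<f8'), ('var2', '<f8'), ('var3', '<f8'), ('var4', '<f8')],
)

# Elementwise AND of two boolean arrays; each comparison needs its own
# parentheses because & binds more tightly than ==
mask = (Y['var1'] == 0) & (Y['var2'] == 0)

# Equivalent spelling with the ufunc, where no extra parentheses are needed
mask_alt = np.logical_and(Y['var1'] == 0, Y['var2'] == 0)

# Keep only the var3/var4 columns of the rows where both conditions hold
selected = Y[['var3', 'var4']][mask]

print(mask)                                  # [False  True False]
print(selected['var3'], selected['var4'])    # [6.] [6.5]
```

Only the second row has var1 == 0 and var2 == 0, so the selection contains the single record (6.0, 6.5).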
Participants (2): Vincent Davis, Zachary Pincus