[Numpy-discussion] finding elements that match any in a set

Michael Katz michaeladamkatz at yahoo.com
Sat May 28 15:18:11 EDT 2011


Yes, thanks, np.in1d is what I needed. I didn't know how to find that.

It still seems counterintuitive to me that

    indexes = np.where( records.integer_field in values )

does not work whereas

    indexes = np.where( records.integer_field > 5 )

does.

In one case numpy is overriding the > operator; it's not checking if an array is 
greater than 5, but whether each element in the array is greater than 5.

From a naive user's point of view, not knowing much about the difference between 
the ">" and "in" operators from a python point of view, it seems like "in" would 
get overridden the same way.
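
For later readers, a small sketch of why the two operators behave differently. 
As far as I understand the Python data model, ">" hands back whatever __gt__ 
returns, while "in" calls __contains__ on the right-hand operand (the list, in 
my case, not the array) and always collapses the result to a single True or 
False. The Demo class below is hypothetical and only there for illustration:

```python
class Demo:
    """Hypothetical class, for illustration only (not part of numpy)."""

    def __gt__(self, other):
        # Whatever object is returned here is what `d > 5` evaluates to.
        return ["per", "element", "result"]

    def __contains__(self, item):
        # Python passes this return value through bool() before
        # handing it back as the result of `item in d`.
        return ["per", "element", "result"]


d = Demo()

gt_result = d > 5   # the list comes back unchanged
in_result = 5 in d  # the non-empty list is collapsed to True

print(gt_result)
print(in_result)
```

So numpy can make ">" element-wise, but no class can make "in" return an array.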



________________________________
From: Christopher Barker <Chris.Barker at noaa.gov>
To: Discussion of Numerical Python <numpy-discussion at scipy.org>
Sent: Fri, May 27, 2011 5:48:37 PM
Subject: Re: [Numpy-discussion] finding elements that match any in a set

On 5/27/11 9:48 AM, Michael Katz wrote:
> I have a numpy array, records, with named fields including a field named
> "integer_field". I have an array (or list) of values of interest, and I
> want to get the indexes where integer_field has any of those values.
>
> Because I can do
>
> indexes = np.where( records.integer_field > 5 )
>
> I thought I could do
>
> indexes = np.where( records.integer_field in values )
>
> But that doesn't work. (As a side question I'm interested in why that
> doesn't work, when values is a python list.)

That doesn't work because the python list "in" operator doesn't 
understand arrays -- so it is looking to see if the entire array is in 
the list. Actually, it doesn't even get that far:

In [16]: a in l
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/Users/chris.barker/<ipython console> in <module>()

ValueError: The truth value of an array with more than one element is 
ambiguous. Use a.any() or a.all()

The ValueError results because it was decided that numpy arrays should 
not have a boolean value, to avoid confusion -- i.e. is an array true 
whenever it is non-empty (like a list), or when all its elements are 
true, or????
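
A minimal way to reproduce this, assuming a small integer array and a plain 
list (the contents of a and l are made up here; the session above does not 
show them). The list compares each of its items to the whole array with ==, 
gets an array of booleans back, and then fails when it asks for that array's 
truth value:

```python
import numpy as np

# Assumed sample values; only the names a and l come from the session above.
a = np.array([0, 1, 2])
l = [0, 1, 2]

try:
    a in l  # list.__contains__ evaluates bool(0 == a), which is ambiguous
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# The two explicit reductions that the error message suggests:
any_nonzero = bool(a.any())  # True: at least one element is non-zero
all_nonzero = bool(a.all())  # False: the first element is zero
print(any_nonzero, all_nonzero)
```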

When I read this question, I thought -- hmmm, numpy needs something like 
"in", since the usual way, np.any(), would require a loop in this case. 
Then I read Skipper's message:

On 5/27/11 9:55 AM, Skipper Seabold wrote:
> Check out this recent thread. I think the proposed class does what you
> want. It's more efficient than in1d, if values is small compared to
> the length of records.

So that class may be worthwhile, but I think np.in1d is exactly what you 
are looking for:

indexes = np.where( np.in1d( records.integer_field, values ) )

Funny I'd never noticed that before.
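
A small sketch of how this fits together, using made-up sample records (only 
the field name integer_field comes from the question). The membership test 
returns a boolean mask with one entry per record, and np.where turns that 
mask into positions. The sketch calls np.isin, the newer counterpart of 
np.in1d, on the assumption that a reasonably recent numpy is installed; for a 
1-d field the two should give the same mask. The explicit loop is the 
np.any-style alternative mentioned above, included only as a cross-check:

```python
import numpy as np

# Hypothetical sample data; the second field and all values are invented.
records = np.array(
    [(1, 0.5), (7, 1.5), (3, 2.5), (9, 3.5), (7, 4.5)],
    dtype=[("integer_field", "i8"), ("other_field", "f8")],
).view(np.recarray)
values = [7, 9]

# Element-wise membership test: one boolean per record.
mask = np.isin(records.integer_field, values)

# np.where converts the mask into the positions of the matching records.
indexes = np.where(mask)[0]

# Loop-based equivalent: OR together one comparison per value of interest.
loop_mask = np.zeros(len(records), dtype=bool)
for v in values:
    loop_mask |= (records.integer_field == v)

print(mask)
print(indexes)
print(records[mask])  # the mask can also be used directly for selection
```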

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov