[Numpy-discussion] finding elements that match any in a set

Sat May 28 15:30:05 EDT 2011

On Sat, May 28, 2011 at 14:18, Michael Katz <michaeladamkatz at yahoo.com> wrote:
> Yes, thanks, np.in1d is what I needed. I didn't know how to find that.
>
> It still seems counterintuitive to me that
>
>     indexes = np.where( records.integer_field in values )
>
> does not work whereas
>
>     indexes = np.where( records.integer_field > 5 )
>
> does.
>
> In one case numpy is overriding the > operator; it's not checking if an
> array is greater than 5, but whether each element in the array is greater
> than 5.
>
> From a naive user's point of view, not knowing much about the difference
> between > and in from a python point of view, it seems like in would get
> overridden the same way.

The Python operators are turned into special method calls on one of
the objects. Most of the special methods that define the mathematical
operators come in pairs: __lt__ and __gt__, __add__ and __radd__, etc.
So if we have (x > y) then x.__gt__(y) is checked first. If x does not
know about the type of y, then y.__lt__(x) is checked. Similarly, for
(x + y), x.__add__(y) is checked first, then y.__radd__(x) is checked.
So for (myarray > 5), myarray.__gt__(5) is checked. numpy arrays do know
about ints, so that works.
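
To make the lookup order concrete, here is a minimal pure-Python sketch.
The Elementwise class is a hypothetical stand-in for an array, invented
for this illustration; it is not how numpy is implemented. It only shows
how returning NotImplemented lets Python fall back to the other operand.

```python
class Elementwise:
    """Hypothetical array-like wrapper; a toy, not how numpy is implemented."""

    def __init__(self, items):
        self.items = list(items)

    def __gt__(self, other):
        # Tried first for (self > other); also the fallback for (other < self).
        if isinstance(other, (int, float)):
            return [item > other for item in self.items]
        return NotImplemented  # tells Python to try the other operand

    def __add__(self, other):
        # Tried first for (self + other).
        if isinstance(other, (int, float)):
            return [item + other for item in self.items]
        return NotImplemented

    def __radd__(self, other):
        # Tried for (other + self) after other.__add__(self) gives up.
        return self.__add__(other)


a = Elementwise([3, 6, 9])
direct = a > 5       # a.__gt__(5)
reflected = 5 < a    # int does not know Elementwise, so a.__gt__(5) is used
shifted = 1 + a      # int.__add__ gives up, so a.__radd__(1) is used
```

In all three expressions the toy class ends up deciding the result, even
when it sits on the right-hand side of the operator.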

However, (myarray in mylist) turns into mylist.__contains__(myarray).
Only the list object is ever checked for this method. There is no
paired method myarray.__rcontains__(mylist) so there is nothing that
numpy can override to make this operation do anything different from
what lists normally do, which is check if the given object is equal to
one of the items in the list.
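
A small sketch of that asymmetry, again in plain Python. The Probe class
and its __rcontains__ method are hypothetical names made up for this
example; the point is that Python never looks for such a method on the
left-hand object, it only lets the list compare its items for equality.

```python
class Probe:
    """Hypothetical element type that records which methods Python calls."""

    calls = []

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # The list's membership test compares its items with the probe.
        Probe.calls.append("__eq__")
        return self.value == other

    def __rcontains__(self, container):
        # Not a real hook: Python never calls this for (probe in container).
        Probe.calls.append("__rcontains__")
        return True


values = [1, 2, 3]
found = Probe(2) in values   # runs values.__contains__(probe)

# The elementwise membership test has to be spelled out explicitly,
# here as a list comprehension over a plain list of field values.
field = [2, 7, 3, 9]
mask = [item in values for item in field]
```

The explicit comprehension at the end is the plain-Python analogue of
the elementwise test that np.in1d, mentioned earlier in this thread,
provides for arrays.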

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco