Can I rely on...

R. David Murray rdmurray at bitdance.com
Thu Mar 19 15:19:08 EDT 2009


"Emanuele D'Arrigo" <manu3d at gmail.com> wrote:
> Thank you everybody for the informative replies.
> 
> I'll have to comb my code for all the instances of "item in sequence"
> statement because I suspect some of them are as unsafe as my first
> example. Oh well. One more lesson learned.

You may have far fewer unsafe cases than you think, depending
on how you understood the answers you got, some of which
were a bit confusing.  Just to make sure it is clear
what is going on in your example....

From the documentation of 'in':

    x in s   True if an item of s is equal to x, else False

(http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange)

Note the use of 'equal' there.  So for lists and tuples,

    if x in s: do_something()

is the same as

    for item in s:
        if item == x:
            do_something()
            break

So:

    >>> s = ['sdb*&', 'uuyh', 'foo']
    >>> x = 'sdb*&'
    >>> x is s[0]
    False
    >>> x in s
    True

(I used a string with special characters in it to avoid Python's
interning of identifier-like strings so that x and s[0] would not be
the same object).
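
The same thing holds for your own classes, as long as they define
equality.  Here is a minimal sketch; Point and contains_by_loop are
hypothetical names invented for this illustration, not anything from
your code:

```python
class Point:
    """Hypothetical value type: equal when the coordinates match."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


def contains_by_loop(s, x):
    """Spell out the equality loop that 'x in s' stands for."""
    for item in s:
        if item == x:
            return True
    return False


points = [Point(1, 2), Point(3, 4)]
probe = Point(1, 2)  # a distinct object that is equal to points[0]

same_object = any(item is probe for item in points)
found_with_in = probe in points
found_with_loop = contains_by_loop(points, probe)
```

Here probe is not the same object as anything in the list, yet both
'in' and the explicit loop find it, because both compare by equality.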

Your problem with the regex example is that re makes no promise that
patterns compiled from the same source string will compare equal to
each other.  Thus their _equality_ is not guaranteed.  Switching to
using an equals comparison won't help you avoid your problem in
the example you showed.
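
If what you actually care about is whether a pattern built from the
same source string and flags is already in the list, one possible
workaround is to compare those attributes yourself.  This is only a
sketch under that assumption: pattern_key and has_equivalent_pattern
are hypothetical helpers, and it deliberately does not depend on how
compiled patterns compare to each other.

```python
import re


def pattern_key(compiled):
    """Identify a compiled pattern by its source string and flags."""
    return (compiled.pattern, compiled.flags)


def has_equivalent_pattern(patterns, candidate):
    """True if some pattern in 'patterns' has the same source and flags."""
    wanted = pattern_key(candidate)
    return any(pattern_key(p) == wanted for p in patterns)


known = [re.compile(r"\d+"), re.compile(r"[a-z]+", re.IGNORECASE)]

same_source = has_equivalent_pattern(known, re.compile(r"\d+"))
other_flags = has_equivalent_pattern(known, re.compile(r"[a-z]+"))
other_source = has_equivalent_pattern(known, re.compile(r"\s+"))
```

The pattern and flags attributes are documented on compiled pattern
objects, so the comparison is between plain strings and integers.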

Now, if you have a custom sequence type, 'in' and an '==' loop
might produce different results, since 'in' is evaluated by the special
method __contains__ if it exists (and by iterating and comparing for
equality if it doesn't).  But the _intent_ of __contains__ is that comparison be
by equality, not object identity, so if the two are not the same something
weird is going on and there'd better be a good reason for it :)
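
To make that concrete, here is a small sketch with two hypothetical
container classes (PlainBag and CaseFoldedBag are made up for this
example, and the items are assumed to be strings).  The first has no
__contains__, so 'in' falls back to iteration; the second defines
__contains__ with a looser, case-insensitive notion of equality.

```python
class PlainBag:
    """No __contains__: 'in' iterates and compares with ==."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


class CaseFoldedBag(PlainBag):
    """__contains__ compares strings without regard to case."""

    def __contains__(self, x):
        return any(item.lower() == x.lower() for item in self._items)


def found_by_loop(container, x):
    """The explicit '==' loop, for comparison with 'in'."""
    return any(item == x for item in container)


plain = PlainBag(["Foo", "Bar"])
folded = CaseFoldedBag(["Foo", "Bar"])

plain_in = "foo" in plain                  # uses iteration and ==
folded_in = "foo" in folded                # uses __contains__
folded_loop = found_by_loop(folded, "foo")  # bypasses __contains__
```

With CaseFoldedBag, 'in' and the '==' loop disagree about "foo".  That
is still comparison by a kind of equality rather than identity, which
is the sort of good reason I mean.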

In summary, 'in' is the thing to use if you want to know if your
sample object is _equal to_ any of the objects in the container.
As long as equality is meaningful for the objects involved, there's
no reason to switch to a loop.

--
R. David Murray           http://www.bitdance.com



