Usefulness of the "not in" operator

Sun Nov 6 18:23:02 EST 2011

On Oct 16, 12:05 am, Steven D'Aprano <steve+comp.lang.pyt... at pearwood.info> wrote:
> On Sat, 15 Oct 2011 15:04:24 -0700, DevPlayer wrote:
> > I thought "x not in y" was later added as syntax sugar for "not x in y"
> > meaning they used the same set of tokens. (Too lazy to check the actual
> > tokens)
Stated in response to the OP wanting a separate token for "not in" versus
"is not".

> Whether the compiler has a special token for "not in" is irrelevant.
I don't know.

> Perhaps it uses one token, or two, or none at all because a
> pre-processor changes "x not in y" to "not x in y". That's
> an implementation detail.
I agree.

> What's important is whether it is valid syntax or not, and how it is
> implemented.
I agree.

> As it turns out, the Python compiler does not distinguish the two forms:
>
> >>> from dis import dis
> >>> dis(compile('x not in y', '', 'single'))
>
>   1           0 LOAD_NAME                0 (x)
>               3 LOAD_NAME                1 (y)
>               6 COMPARE_OP               7 (not in)
>               9 PRINT_EXPR
>              10 LOAD_CONST               0 (None)
>              13 RETURN_VALUE
>
> >>> dis(compile('not x in y', '', 'single'))
>
>   1           0 LOAD_NAME                0 (x)
>               3 LOAD_NAME                1 (y)
>               6 COMPARE_OP               7 (not in)
>               9 PRINT_EXPR
>              10 LOAD_CONST               0 (None)
>              13 RETURN_VALUE

So cool! Thanks for showing how to do that.
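
For anyone repeating this on Python 3, where the opcode names have changed, here is a minimal sketch of the same comparison. The helper names same_bytecode and same_result are made up for illustration, and I'm assuming CPython. I would expect the bytecode check to print True there, but that is an implementation detail and I have not checked every version. The second check only confirms that the two spellings evaluate alike.

```python
def same_bytecode(expr_a, expr_b):
    """Return True if both expression strings compile to identical bytecode."""
    code_a = compile(expr_a, "<expr>", "eval")
    code_b = compile(expr_b, "<expr>", "eval")
    return code_a.co_code == code_b.co_code


def same_result(expr_a, expr_b, namespace):
    """Return True if both expressions evaluate to the same value."""
    # Each eval gets its own copy so neither call can affect the other.
    return eval(expr_a, dict(namespace)) == eval(expr_b, dict(namespace))


if __name__ == "__main__":
    # Implementation detail: may vary between interpreters and versions.
    print(same_bytecode("x not in y", "not x in y"))
    # Language-level guarantee: the two forms mean the same thing.
    print(same_result("x not in y", "not x in y", {"x": 3, "y": [1, 2, 3]}))
```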

I was trying to say that implementing a separate method would not be efficient.

> Also for what it is worth, "x not in y" goes back to at least Python 1.5,
> and possibly even older. (I don't have any older versions available to
> test.)
So "not in" was added as an alternative (just a long time ago).
I too am glad they added it.

> (2) Instead of writing "True if blah else False", write "bool(blah)".
Good tip! I like.
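
A quick sketch of that tip, for my own notes. The function names are hypothetical; the point is only that both spellings apply the same truth test.

```python
def has_items_verbose(container):
    # Long form: spells out both branches by hand.
    return True if container else False


def has_items(container):
    # Short form: bool() applies the same truth test directly.
    return bool(container)


if __name__ == "__main__":
    for value in ([], [0], "", "a", 0, 1, None):
        # Both functions should agree for every value.
        print(repr(value), has_items_verbose(value), has_items(value))
```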

>
> > class Y(object):
> >     def __contains__(self, x):
> >         for item in y:
> >             if x == item:
> >                 return True
> >         return False
>
> You don't have to define a __contains__ method if you just want to test
> each item sequentially. All you need is to obey the sequence protocol and
> define a __getitem__ that works in the conventional way:

I didn't intend to show how to implement __contains__ using "==" and
__not_contained__ using "<>" in Python, but to show that Python doesn't
benefit from a separate not-in loop the way assembly language, for
example, benefits from its paired loop instructions (x86 LOOPE/LOOPZ vs
LOOPNZ/LOOPNE).

>
> >>> class Test:
> ...     def __init__(self, args):
> ...             self._data = list(args)
> ...     def __getitem__(self, i):
> ...             return self._data[i]
> ...
> >>> t = Test("abcde")
> >>> "c" in t
> True
> >>> "x" in t
> False
Another new thing for me.

>
> Defining a specialist __contains__ method is only necessary for non-
> sequences, or if you have some fast method for testing whether an item is
> in the object quickly. If all you do is test each element one at a time,
> in numeric order, don't bother writing __contains__.
>
> > And if you wanted "x not in y" to be a different token you'd have to ADD
>
> Tokens are irrelevant. "x not in y" is defined to be the same as "not x
> in y" no matter what.
> You can't define "not in" to do something completely different.
I agree they are not implemented differently.
I agree that they shouldn't be implemented differently.
I disagree that they cannot be implemented differently. I think they can.
But I see no reason to.
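
To make concrete what "defined to be the same" means in Python as it stands: as far as I can tell there is no separate hook for "not in"; the interpreter calls __contains__ and negates the result. A small sketch, where the Evens class and its calls counter are hypothetical names for illustration:

```python
class Evens:
    """Hypothetical container that 'contains' every even integer."""

    def __init__(self):
        self.calls = 0  # how many times __contains__ has run

    def __contains__(self, x):
        self.calls += 1
        return x % 2 == 0


evens = Evens()
in_result = 4 in evens          # calls __contains__ once
not_in_result = 4 not in evens  # calls the same method, then negates

print(in_result, not_in_result, evens.calls)  # True False 2
```

Both tests went through the one method, so there is nowhere to hang different behaviour for "not in" without changing the language.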

> > class Y(object):
> >     def __not_contained__(self, x):
> >         for item in self:
> >             if x == item:
> >                 return False
> >         return True
>
> > AND with __not_contained__() you'd always have to iterate the entire
> > sequence to make sure even the last item doesn't match.
> > SO with one token "x not in y" you DON'T have to iterate through the
> > entire sequence thus it is more efficient.
> That's not correct.
> Steven
I tried to prove my point and failed, and instead proved (to myself) that
you are correct. It is not more efficient. Also, I should have used
"if x <> y: continue" to make the point better, but it wouldn't have
mattered. I still would have been wrong.
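
Roughly how I convinced myself, as a sketch rather than a benchmark. CountingSeq and reads_for are names I made up; the class follows the __getitem__ approach quoted above and just counts how many items get read.

```python
class CountingSeq:
    """Sequence-protocol container that counts __getitem__ calls."""

    def __init__(self, items):
        self._data = list(items)
        self.reads = 0

    def __getitem__(self, i):
        self.reads += 1  # counted even when the index is out of range
        return self._data[i]


def reads_for(test, items="abcde"):
    """Run a membership test on a fresh CountingSeq; return (result, reads)."""
    seq = CountingSeq(items)
    result = test(seq)
    return result, seq.reads


print(reads_for(lambda s: "c" in s))      # (True, 3)
print(reads_for(lambda s: "c" not in s))  # (False, 3)
print(reads_for(lambda s: "x" not in s))  # (True, 6): five items plus the read that raises IndexError
```

"in" and "not in" stop at the same place when the item is present, and both have to walk the whole sequence when it is absent, so neither form is cheaper than the other.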

But I did walk away from this topic with some goodie tips. Thanks
Steven.