[Python-ideas] Changing `Sequence.__contains__`

Steven D'Aprano steve at pearwood.info
Mon Jul 21 03:41:43 CEST 2014


On Sun, Jul 20, 2014 at 03:06:33PM -0700, Ram Rachum wrote:

> Why does the default `Sequence.__contains__` iterate through the items 
> rather than use `.index`, which may sometimes be more efficient?

Because having an index() method is not a requirement to be a sequence. 
It is optional. The implementation of Sequence.__contains__ that makes 
the fewest assumptions about the class is to iterate over the items.


> I suggest an implementation like this: 
> 
>     def __contains__(self, i):
>         try: self.index(i)
>         except ValueError: return False
>         else: return True
>         
> What do you think? 

That means every sequence type would have to define an index method 
in order to count as a sequence. Not only that, but the index method 
would have to follow a standard API, which not all sequence types do.

This would be marginally better:


    def __contains__(self, obj):
        try: 
            index = type(self).index
        except AttributeError:
            for o in self:
                if o is obj or o == obj:
                    return True
            return False
        else:
            try:
                # index was looked up on the class, so pass self explicitly
                index(self, obj)
            except ValueError:
                return False
            else:
                return True


but it has at least two problems I can see:

- it's not backwards compatible with sequence types which may already 
define an index attribute which does something different, e.g.:

    class Book:
        def index(self):
            # return the index of the book
            ...
        def __getitem__(self, n):
            # return page n
            ...

- for a default implementation, it's too complicated.
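
To make the first problem concrete, here is a rough sketch. The names 
contains_via_index and the page contents are made up for illustration, 
and the lookup is written as a free function rather than as a method on 
Sequence. Plain iteration handles the Book class fine, while the 
index-first lookup trips over the unrelated index method:

```python
def contains_via_index(seq, obj):
    """Index-first membership test, written as a free function (hypothetical)."""
    try:
        index = type(seq).index
    except AttributeError:
        # No index attribute on the class: fall back to a linear scan.
        return any(o is obj or o == obj for o in seq)
    try:
        index(seq, obj)
    except ValueError:
        return False
    return True


class Book:
    """Hypothetical class whose index() has nothing to do with searching."""

    def __init__(self, pages):
        self._pages = list(pages)

    def index(self):
        # Stand-in for a back-of-book index; takes no search argument.
        return sorted(set(self._pages))

    def __getitem__(self, n):
        return self._pages[n]


book = Book(["intro", "body", "end"])

# With no __contains__ or __iter__, "in" falls back to __getitem__ iteration.
print("body" in book)

try:
    contains_via_index(book, "body")
except TypeError as exc:
    # Book.index() accepts no search argument, so the call itself fails.
    print("index-first lookup fails:", exc)
```

The TypeError is not a ValueError, so it escapes the except clause and 
a membership test that used to work starts raising instead.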

If your sequence class has an efficient index method (or an efficient 
find method, or __getitem__ method, or any other efficient way of 
testing whether something exists in the sequence quickly) it's not much 
more work to define a custom __contains__ to take advantage of that. 
There's no need for the default Sequence fallback to try to guess what 
time-saving methods you might provide.
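
As a minimal sketch of what that looks like (SortedInts is a 
hypothetical class, and keeping the items sorted is an assumption made 
only so that a binary search is valid), a Sequence subclass can 
override __contains__ directly:

```python
from bisect import bisect_left
from collections.abc import Sequence


class SortedInts(Sequence):
    """Hypothetical immutable sequence that keeps its items sorted."""

    def __init__(self, items):
        self._items = sorted(items)

    def __getitem__(self, n):
        return self._items[n]

    def __len__(self):
        return len(self._items)

    def __contains__(self, obj):
        # Binary search instead of the default linear scan.
        i = bisect_left(self._items, obj)
        return i < len(self._items) and self._items[i] == obj


s = SortedInts([5, 3, 1])
print(3 in s, 4 in s)
```

Only the class author knows the items are sorted, which is why the 
shortcut belongs in the subclass and not in the generic fallback.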

For some historical perspective, you should be aware that until 
Python 2.6, tuples had no index method:

[steve at ando ~]$ python2.5
Python 2.5.4 (r254:67916, Nov 25 2009, 18:45:43)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> (1, 2).index
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'index'


There's no reason to expect that all sequences will have an index 
method, and certainly no reason to demand that they do.



-- 
Steven

