[Python-ideas] Changing `Sequence.__contains__`
Steven D'Aprano
steve at pearwood.info
Mon Jul 21 03:41:43 CEST 2014
On Sun, Jul 20, 2014 at 03:06:33PM -0700, Ram Rachum wrote:
> Why does the default `Sequence.__contains__` iterate through the items
> rather than use `.index`, which may sometimes be more efficient?
Because having an index() method is not a requirement to be a sequence.
It is optional. The implementation of Sequence.__contains__ which makes
the fewest assumptions about the class is to iterate over the items.
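
To make that concrete, here is a minimal sketch of the iterate-over-the-items approach. It is an illustration only, not a copy of the standard library source, and the class name MinimalSeq is made up for this example. Note that the class has no index() method at all:

```python
class MinimalSeq:
    """Hypothetical sequence that defines only __len__ and __getitem__."""

    def __init__(self, *items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, n):
        return self._items[n]

    def __contains__(self, value):
        # Needs nothing beyond iteration, which Python derives from
        # __getitem__ (indexing from 0 until IndexError is raised).
        for item in self:
            if item is value or item == value:
                return True
        return False


seq = MinimalSeq(1, 2, 3)
found = 2 in seq
missing = 5 in seq
```

Because the loop depends only on iteration, it works for any class that can be iterated, whatever other methods it does or does not provide.
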
> I suggest an implementation like this:
>
> def __contains__(self, i):
>     try: self.index(i)
>     except ValueError: return False
>     else: return True
>
> What do you think?
That now means that sequence types will have to define an index method
in order to be a sequence. Not only that, but the index method has to
follow a standard API, which some sequence types may not do.
This would be marginally better:
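    def __contains__(self, obj):
        try:
            # Look the method up on the class, not the instance.
            index = type(self).index
        except AttributeError:
            # No index method: fall back to a linear scan.
            for o in self:
                if o is obj or o == obj:
                    return True
            return False
        else:
            try:
                # index was fetched from the class, so pass self explicitly.
                index(self, obj)
            except ValueError:
                return False
            else:
                return True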
but it has at least two problems I can see:
- it's not backwards compatible with sequence types which may already
define an index attribute which does something different, e.g.:
    class Book:
        def index(self):
            # return the index of the book
            ...
        def __getitem__(self, n):
            # return page n
            ...
- for a default implementation, it's too complicated.
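
A quick sketch of the first problem, under some assumptions: contains_via_index is the index-based fallback written as a plain function, and the Book attributes (pages, shelf_position) are invented for the example. With a class like Book, whose index() takes no argument, the fallback raises TypeError where plain iteration would have worked:

```python
def contains_via_index(seq, obj):
    # The index-based fallback, written as a standalone function.
    try:
        index = type(seq).index
    except AttributeError:
        return any(o is obj or o == obj for o in seq)
    try:
        index(seq, obj)
    except ValueError:
        return False
    return True


class Book:
    # Hypothetical class: index() here has nothing to do with searching.
    def __init__(self, pages, shelf_position):
        self.pages = list(pages)
        self.shelf_position = shelf_position

    def index(self):
        # Where the book sits on the shelf, not where an item is.
        return self.shelf_position

    def __getitem__(self, n):
        return self.pages[n]


book = Book(["intro", "chapter 1"], shelf_position=7)

try:
    contains_via_index(book, "intro")
except TypeError:
    outcome = "TypeError"
else:
    outcome = "no error"

# Plain iteration (via __getitem__) is unaffected by the odd index().
plain = "intro" in book
```

Here the ordinary `in` test still succeeds, because it never touches index().
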
If your sequence class has an efficient index method (or an efficient
find method, or __getitem__ method, or any other efficient way of
testing whether something exists in the sequence quickly) it's not much
more work to define a custom __contains__ to take advantage of that.
There's no need for the default Sequence fallback to try to guess what
time-saving methods you might provide.
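
For example, a sequence that keeps its items sorted can override __contains__ with a binary search in a few lines. This is only a sketch under stated assumptions: SortedSeq is a hypothetical class, and it assumes the items are mutually comparable.

```python
from bisect import bisect_left
from collections.abc import Sequence


class SortedSeq(Sequence):
    """Hypothetical read-only sequence whose items are kept sorted."""

    def __init__(self, items):
        self._items = sorted(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, n):
        return self._items[n]

    def __contains__(self, value):
        # Binary search: O(log n) rather than the default O(n) scan.
        i = bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value


s = SortedSeq([5, 3, 1])
has_three = 3 in s
has_four = 4 in s
```

The class author knows the items are sorted; a generic default has no way of knowing that, which is why the shortcut belongs in the subclass.
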
For a historical view, you should be aware that until recently, tuples
had no index method:
[steve at ando ~]$ python2.5
Python 2.5.4 (r254:67916, Nov 25 2009, 18:45:43)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> (1, 2).index
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'index'
There's no reason to expect that all sequences will have an index
method, and certainly no reason to demand that they do.
--
Steven