[Tutor] __getitem__

Kent Johnson kent37 at tds.net
Tue Jan 17 14:31:01 CET 2006


Kent Johnson wrote:
> Alan Gauld wrote:
>>>What does 'in' have to do with indexing?
>>
>>
>>Nothing unless its implementation uses a while loop
>>and index, but thats unlikely.
> 
> 
> But that is pretty close to what actually happens, according to the 
> language ref docs for 'in' (see my previous post).

I'm curious enough about this (OK, I admit it, I like to be right, too 
;) to dig into the details, if anyone is interested. One of the 
benefits of Python being open source is that you can find out how it works.

First step, look at the bytecodes:

  >>> import dis
  >>> def f(x, y):
  ...   return x in y
  ...
  >>> dis.dis(f)
   2           0 LOAD_FAST                0 (x)
               3 LOAD_FAST                1 (y)
               6 COMPARE_OP               6 (in)
               9 RETURN_VALUE

So 'in' is implemented as a COMPARE_OP. Looking in ceval.c for 
COMPARE_OP, it has some optimizations for a few fast compares, then 
calls cmp_outcome() which, for 'in', calls PySequence_Contains().

PySequence_Contains() is implemented in abstract.c. If the container 
implements __contains__, that is called, otherwise 
_PySequence_IterSearch() is used.
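To see that dispatch from the Python side, here is a small sketch. The 
two classes and their call counters are made up for illustration; they 
only show which method 'in' ends up using, not how abstract.c does it.

```python
class WithContains:
    """Hypothetical container that defines __contains__."""
    def __init__(self, items):
        self.items = list(items)
        self.contains_calls = 0

    def __contains__(self, value):
        # 'in' comes here directly; __iter__ is never consulted.
        self.contains_calls += 1
        return value in self.items

    def __iter__(self):
        raise AssertionError("not reached: __contains__ takes priority")


class WithoutContains:
    """Hypothetical container with __iter__ but no __contains__."""
    def __init__(self, items):
        self.items = list(items)
        self.iter_calls = 0

    def __iter__(self):
        # 'in' falls back to iterating, so it asks for an iterator here.
        self.iter_calls += 1
        return iter(self.items)


a = WithContains([1, 2, 3])
b = WithoutContains([1, 2, 3])
found_a = 2 in a      # answered by __contains__
found_b = 2 in b      # answered by iterating
missing_b = 5 in b    # iterates to the end without a match
```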

_PySequence_IterSearch() calls PyObject_GetIter() to construct an 
iterator on the sequence, then goes into an infinite loop (for (;;)) 
calling PyIter_Next() on the iterator until the item is found, the 
iterator is exhausted, or the call to PyIter_Next() reports an error.
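In Python terms the search loop looks roughly like the sketch below. 
The function name iter_search is mine, not CPython's, and the comments 
map each line to the C call it stands in for. The identity check before 
== follows what the current language reference says about 'in'; I have 
not checked that every old release behaves the same way.

```python
def iter_search(container, target):
    """Rough Python rendering of the fallback search used by 'in'."""
    it = iter(container)              # stands in for PyObject_GetIter()
    while True:                       # stands in for for (;;)
        try:
            item = next(it)           # stands in for PyIter_Next()
        except StopIteration:
            return False              # iterator exhausted, item not found
        if item is target or item == target:
            return True               # found a match, stop early
```

Any other exception raised while iterating or comparing simply 
propagates to the caller, which matches the error case in the C code.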

PyObject_GetIter() is also in abstract.c. If the object has an 
__iter__() method, that is called, otherwise PySeqIter_New() is called 
to construct an iterator.

PySeqIter_New() is implemented in iterobject.c. Its next() method is in 
iter_iternext(). This method calls __getitem__() on its wrapped object 
and increments an index for next time; an IndexError from __getitem__() 
ends the iteration.
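Here is a minimal Python sketch of what that sequence iterator does. 
SeqIterSketch and Squares are names I made up; the real object is 
written in C, so treat this as an approximation of its behaviour rather 
than a copy of it. It uses the Python 3 spelling __next__().

```python
class SeqIterSketch:
    """Approximation of the iterator built for __getitem__-only objects."""
    def __init__(self, seq):
        self.seq = seq
        self.index = 0                # the counter kept between calls

    def __iter__(self):
        return self

    def __next__(self):
        if self.seq is None:          # already exhausted
            raise StopIteration
        try:
            item = self.seq[self.index]   # calls seq.__getitem__(index)
        except (IndexError, StopIteration):
            self.seq = None           # drop the sequence, stay exhausted
            raise StopIteration
        self.index += 1               # increment for next time
        return item


class Squares:
    """Hypothetical container that defines only __getitem__."""
    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


values = list(SeqIterSketch(Squares()))
```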

So, though the details are complex, I think it is pretty fair to say 
that the implementation uses a while loop (in _PySequence_IterSearch()) 
and a counter (wrapped in PySeqIter_Type) to implement 'in' on a 
container that defines __getitem__ but not __iter__.

By the way, the implementation of 'for' also calls PyObject_GetIter(), so 
it uses the same mechanism to generate an iterator for a sequence that 
defines __getitem__().
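You can watch the counter at work without reading any C. The Recorder 
class below is a made-up example that defines only __getitem__() and 
logs every index it is asked for, first by 'in' and then by 'for'.

```python
class Recorder:
    """Hypothetical container: __getitem__ only, logs requested indexes."""
    def __init__(self, items):
        self._items = list(items)
        self.seen = []

    def __getitem__(self, index):
        self.seen.append(index)
        return self._items[index]     # raises IndexError past the end


r = Recorder("abcd")

found = "c" in r                      # stops as soon as "c" turns up
indexes_for_in = list(r.seen)

r.seen = []                           # reset the log
looped = []
for ch in r:                          # runs until IndexError
    looped.append(ch)
indexes_for_for = list(r.seen)

print(indexes_for_in)                 # [0, 1, 2]
print(indexes_for_for)                # [0, 1, 2, 3, 4]
```

The extra index 4 in the 'for' case is the call that raises IndexError 
and ends the loop.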

Kent


