[Python-Dev] Single- vs. Multi-pass iterability
Guido van Rossum
guido@python.org
Wed, 17 Jul 2002 17:21:26 -0400
> __iter__ is a red herring. It has nothing to do with the act of
> iterating. It exists only to support the use of "for" directly
> on the iterator. Iterators that currently implement "next" but
> not "__iter__" will work in some places and not others. For
> example, given this:
>
> class Counter:
>     def __init__(self, last):
>         self.i = 0
>         self.last = last
>
>     def next(self):
>         self.i += 1
>         if self.i > self.last: raise StopIteration
>         return self.i
>
> class Container:
>     def __init__(self, size):
>         self.size = size
>
>     def __iter__(self):
>         return Counter(self.size)
>
> This will work:
>
> >>> for x in Container(3): print x
> ...
> 1
> 2
> 3
>
> But this will fail:
>
> >>> for x in Counter(3): print x
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: iteration over non-sequence
>
> It's more accurate to say that there are two distinct protocols here.
>
> 1. An object is "for-able" if it implements __iter__ or __getitem__.
> This is a subset of the sequence protocol.
>
> 2. An object can be iterated if it implements next.
>
> The Container supports only protocol 1, and the Counter supports
> only protocol 2, with the above results.
>
> Iterators are currently asked to support both protocols. The
> semantics of iteration come only from protocol 2; protocol 1 is
> an effort to make iterators look sorta like sequences. But the
> analogy is very weak -- these are "sequences" that destroy
> themselves while you look at them -- not like any typical
> sequence I've ever seen!
>
> The short of it is that whenever any Python programmer says
> "for x in y", he or she had better be darned sure of whether
> this is going to destroy y. Whatever we can do to make this
> clear would be a good idea.
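
The warning about "for x in y" possibly destroying y can be sketched as
follows. This is only an illustration, not code from the thread. It assumes
the Python 3 spelling (`__next__` in place of the `next` method quoted
above), and it assumes a Counter that has been given an `__iter__` returning
itself, so that it supports both protocols the way iterators are asked to.

```python
class Counter:
    """An iterator: for-able (assumed __iter__ added), but single-pass."""

    def __init__(self, last):
        self.i = 0
        self.last = last

    def __iter__(self):
        # Assumption for this sketch: the iterator returns itself.
        return self

    def __next__(self):
        # Python 3 spelling of the quoted next() method.
        self.i += 1
        if self.i > self.last:
            raise StopIteration
        return self.i


class Container:
    """A container: each loop gets a fresh Counter, so it is multi-pass."""

    def __init__(self, size):
        self.size = size

    def __iter__(self):
        return Counter(self.size)


container = Container(3)
counter = Counter(3)

# Looping over the container twice yields the same items both times.
container_first = list(container)
container_second = list(container)

# Looping over the iterator consumes it; the second pass finds nothing.
counter_first = list(counter)
counter_second = list(counter)
```

Nothing in the loop syntax distinguishes the two cases, which is the
concern raised above: the caller has to know which kind of object y is.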
This is a very good summary of the two iterator protocols. Ping,
would you mind adding this to PEP 234?
--Guido van Rossum (home page: http://www.python.org/~guido/)
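
The two protocols summarized in the quoted message can be sketched as a pair
of checks. This is a minimal sketch under stated assumptions, not part of
the original post: it uses the Python 3 spelling (`__next__` rather than
`next`), and `is_forable` and `can_be_iterated` are hypothetical helper
names introduced here only for illustration.

```python
class Counter:
    """Protocol 2 only: can be advanced, but defines no __iter__."""

    def __init__(self, last):
        self.i = 0
        self.last = last

    def __next__(self):
        # Python 3 spelling of the quoted next() method.
        self.i += 1
        if self.i > self.last:
            raise StopIteration
        return self.i


class Container:
    """Protocol 1 only: for-able, hands out a Counter on request."""

    def __init__(self, size):
        self.size = size

    def __iter__(self):
        return Counter(self.size)


def is_forable(obj):
    # Hypothetical helper: protocol 1, __iter__ or __getitem__.
    return hasattr(obj, "__iter__") or hasattr(obj, "__getitem__")


def can_be_iterated(obj):
    # Hypothetical helper: protocol 2, a next method (__next__ in Python 3).
    return hasattr(obj, "__next__")


# Protocol 1: the container works directly in a for loop.
via_container = [x for x in Container(3)]

# Protocol 2: the counter can be advanced step by step.
counter = Counter(3)
via_next = [next(counter), next(counter), next(counter)]

# Without __iter__, using the counter directly in a for loop fails.
try:
    for x in Counter(3):
        pass
    for_on_counter = "ok"
except TypeError:
    for_on_counter = "TypeError"
```

Under these assumptions the Container passes only the first check and the
Counter only the second, matching the behavior described in the quoted
message.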