[Python-Dev] Single- vs. Multi-pass iterability
Wed, 17 Jul 2002 15:58:11 -0500 (CDT)
On Wed, 17 Jul 2002, Clark C . Evans wrote:
> On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote:
> | Naming
> | the method "next" means that any object with a "next" method
> | cannot be adapted to support the iterator protocol.
> Right, but such objects wouldn't be misleading because they'd
> be missing a __iter__ method, correct?
__iter__ is a red herring. It has nothing to do with the act of
iterating. It exists only to support the use of "for" directly
on the iterator. Iterators that currently implement "next" but
not "__iter__" will work in some places and not others. For
example, given this:
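
    class Counter:
        def __init__(self, last):
            self.i = 0
            self.last = last

        def next(self):
            self.i += 1
            if self.i > self.last: raise StopIteration
            return self.i

    class Container:
        def __init__(self, size):
            self.size = size

        def __iter__(self):
            return Counter(self.size)
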
This will work:
    >>> for x in Container(3): print x
But this will fail:
    >>> for x in Counter(3): print x
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: iteration over non-sequence
It's more accurate to say that there are two distinct protocols here.
1. An object is "for-able" if it implements __iter__ or __getitem__.
This is a subset of the sequence protocol.
2. An object can be iterated if it implements next.
The Container supports only protocol 1, and the Counter supports
only protocol 2, with the above results.
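
To make the split concrete, here is a rough sketch of the two protocols exercised separately. It assumes a current Python 3 interpreter, where the iterator method is spelled __next__ rather than next; the helpers drain and is_forable are hypothetical names introduced only for illustration.

```python
# Sketch only: assumes Python 3, where the method is __next__ (not next).
# Counter and Container mirror the two classes discussed above.

class Counter:
    """Supports only protocol 2: it can be stepped, but is not for-able."""
    def __init__(self, last):
        self.i = 0
        self.last = last

    def __next__(self):
        self.i += 1
        if self.i > self.last:
            raise StopIteration
        return self.i


class Container:
    """Supports only protocol 1: for-able, but cannot be stepped itself."""
    def __init__(self, size):
        self.size = size

    def __iter__(self):
        return Counter(self.size)


def drain(iterator):
    """Hypothetical helper: drive protocol 2 by hand, without 'for'."""
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            return items


def is_forable(obj):
    """Hypothetical helper: report whether 'for x in obj' can even start."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True
```

With these definitions, drain(Counter(3)) steps the Counter directly, while is_forable(Counter(3)) is false and is_forable(Container(3)) is true, which is the same asymmetry as in the session above.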
Iterators are currently asked to support both protocols. The
semantics of iteration come only from protocol 2; protocol 1 is
an effort to make iterators look sorta like sequences. But the
analogy is very weak -- these are "sequences" that destroy
themselves while you look at them -- not like any typical
sequence I've ever seen!
The short of it is that whenever any Python programmer says
"for x in y", he or she had better be darned sure of whether
this is going to destroy y. Whatever we can do to make this
clear would be a good idea.
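
A small sketch of the hazard, assuming a current Python 3 interpreter: the same "for x in y" loop is run twice over a list and twice over an iterator. The function total is a hypothetical consumer, not anything from the standard library.

```python
def total(y):
    """Hypothetical consumer: sums y using an ordinary for loop."""
    result = 0
    for x in y:
        result += x
    return result


# A real sequence survives being looped over.
numbers = [1, 2, 3]
first_list = total(numbers)
second_list = total(numbers)

# An iterator is used up by the first loop; the second loop sees nothing.
stream = iter([1, 2, 3])
first_iter = total(stream)
second_iter = total(stream)
```

Here first_list and second_list are both 6, but first_iter is 6 and second_iter is 0. Nothing in the body of total reveals which behaviour the caller will get.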