[Python-Dev] Single- vs. Multi-pass iterability

Oren Tirosh oren-py-d@hishome.net
Fri, 12 Jul 2002 01:43:32 -0400


On Thu, Jul 11, 2002 at 11:31:47AM -0400, Andrew Koenig wrote:
> Right now, every iterator, and every object that supports
> iteration, must have an __iter__() method.  Suppose we augment
> that with the following:
> 
>         A new kind of iterator, called a multiple iterator, that
>         supports multiple iterations over the same sequence.
...
>             __copy__()      return a distinct, newly created multiple
>                             iterator that iterates over the same
>                             sequence as the original, starting from
>                             the current element.


There is no need for a new type of iterator. It's ok that iterators are
disposable.  If I need multiple iterations I don't want to copy the
iterator - I prefer to ask the original iterable object for a new iterator.
All I need is some way to know whether the iterable object (container) can 
produce multiple iterators that generate the same sequence.

  An object is re-iterable if its iterators do not modify its state.

The iterator of an iterator is itself.  Calling the next method, by
definition, modifies the internal state of an object. Therefore anything 
that has a next method is not re-iterable. 
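
As a rough illustration of the distinction, here is a minimal sketch. It
assumes present-day Python, where the method this message calls next is
spelled __next__ and is invoked through the built-in next(). The variable
names are only for illustration.

```python
numbers = [1, 2, 3]            # a container: has __iter__ but no __next__

first = iter(numbers)
second = iter(numbers)
assert first is not second     # each request yields a fresh iterator

next(first)                    # advancing one iterator ...
assert next(second) == 1       # ... does not disturb the other
assert list(numbers) == [1, 2, 3]   # the container itself is unchanged

# The iterator of an iterator is itself, so it cannot be restarted.
assert iter(first) is first
assert list(first) == [2, 3]   # continues from its current position
assert list(first) == []       # now exhausted
```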

"hasattr(obj,'__iter__') and hasattr(obj, 'next')" is a good signature of
a non-re-iterable object.  Unfortunately, the converse is not true: an
object without a next method is not necessarily re-iterable.  One iterable
object in Python produces iterators that modify its state when their
.next() method is called - the file object.
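
The signature check could be wrapped up roughly as follows. This is a
sketch only: is_single_pass is a hypothetical helper name, and the check
is written for present-day Python, so it tests for '__next__' where the
expression above tests for 'next'.

```python
def is_single_pass(obj):
    """Hypothetical helper: True if obj looks like an iterator."""
    # Assumption: modern spelling, '__next__' in place of 'next'.
    return hasattr(obj, '__iter__') and hasattr(obj, '__next__')


assert not is_single_pass([1, 2, 3])            # list: a container
assert not is_single_pass({'a': 1})             # dict: a container
assert is_single_pass(iter([1, 2, 3]))          # list iterator
assert is_single_pass(x for x in range(3))      # generator
```

A False result only says that the object is not itself an iterator; it is
re-iterable only if its iterators also leave its state alone.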

I have just submitted a patch that makes a file into an iterator (i.e. adds 
a .next method to files).  With this change all Python objects that have
an __iter__ method and no next method produce iterators that do not modify
the container.  Another possibility would be to make file iterators that
use seek or re-open the file to avoid modifying the file position of the
parent file object.  I don't think that would be a good idea because files
can be devices, pipes or sockets which are not seekable. 
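
A small sketch of the behaviour in question, assuming present-day Python,
where file objects are their own iterators. io.StringIO is used here as
an in-memory stand-in for a real file, so nothing is read from disk.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")   # stand-in for an open file

assert iter(f) is f                    # the file is its own iterator
assert hasattr(f, '__next__')          # so the signature check flags it

assert next(f) == "one\n"
# A second pass does not restart: it resumes at the current position.
assert list(f) == ["two\n", "three\n"]
assert list(f) == []                   # nothing left to read
```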

I think it may be a good idea to add a note to the documentation pages
about the iterator protocol that the iterators of a container should not
modify the state of the container.  If you think they must, it's probably
a good sign that your 'container' is not really a container and should
itself be an iterator rather than produce iterators of itself.
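
To make the guideline concrete, here is a minimal sketch in present-day
Python. Countdown and Drain are hypothetical classes invented for this
example: the first is a container whose iterators leave it untouched, the
second consumes its own state and is therefore written as an iterator.

```python
class Countdown:
    """Hypothetical container: iterating never changes its state."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call builds an independent generator; self is not mutated.
        n = self.start
        while n > 0:
            yield n
            n -= 1


class Drain:
    """Hypothetical iterator: iterating consumes the stored items."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return self                    # an iterator is its own iterator

    def __next__(self):
        if not self.items:
            raise StopIteration
        return self.items.pop(0)       # state changes on every step


c = Countdown(3)
assert list(c) == [3, 2, 1]
assert list(c) == [3, 2, 1]            # re-iterable

d = Drain([1, 2, 3])
assert list(d) == [1, 2, 3]
assert list(d) == []                   # single pass only
```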

	Oren