[Tutor] Here is newbie doc on how to implement generators

Tue Jul 17 00:13:35 CEST 2007

Dave Kuhlman wrote:
> I find iterators and generators fascinating.  So, in order to try
> to understand them better myself, I've written up some notes.  I'm
> hoping that these notes might help someone new to the generators
> and iterators in Python.  You can find it here:
> 
>     http://www.rexx.com/~dkuhlman/python_comments.html
>     http://www.rexx.com/~dkuhlman/python_comments.html#iterators-and-generators
> 
> I'll appreciate any comments and suggestions that might help me
> improve it.

In the Consumers section, the first time you mention the iterator 
protocol, you omit mention of __iter__() - this is a required method of 
an iterator.

In the section "The iterator protocol", the __iter__() bullet, "(2) 
return the value returned by a generator method" is not correct. An 
iterator must return self from __iter__(). An object that returns a 
(new) generator from __iter__() is an iterable, not an iterator.
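
Here is a minimal sketch of the distinction. The class names CountDown and CountDownIterable are hypothetical and are not taken from your notes. The code is written to run on current Python, where the method is spelled __next__(); the alias keeps the Python 2 spelling next() working as well.

```python
class CountDown(object):
    """An iterator: __iter__() returns self, so there is one shared state."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # required of every iterator

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 spelling of the same method


class CountDownIterable(object):
    """An iterable: every __iter__() call hands out a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountDown(self.start)


it = CountDown(3)
same_object = iter(it) is it             # an iterator returns itself

src = CountDownIterable(3)
distinct = iter(src) is not iter(src)    # an iterable returns new objects
first = list(src)
second = list(src)                       # a second pass starts over
```

An iterator is used up after one pass; an iterable can be looped over as many times as you like.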

In some cases there is no need to write a separate generator method and 
call it in __iter__(); __iter__() itself can be written as a generator 
method using yield. This works if the generator method doesn't need any 
arguments.
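
A small sketch of __iter__() written directly as a generator method. The class name Evens and its limit parameter are made up for illustration.

```python
class Evens(object):
    """Iterable over the even numbers below a limit."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # Because of the yield, calling __iter__() returns a new generator,
        # and a generator is itself an iterator.
        n = 0
        while n < self.limit:
            yield n
            n += 2


evens = Evens(7)
first_pass = list(evens)
second_pass = list(evens)  # each loop gets its own generator
```

No separate next() method is needed; the generator supplies it.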

Your Doubler class will not behave correctly because of the 
re-initialization of the index in __iter__(). Calling __iter__() should 
not have any side effects. Here is an example of why this is a problem, 
using Doubler as you have written it:

In [9]: d=Doubler(range(5))
In [10]: d.next()
Out[10]: 0
In [11]: d.next()
Out[11]: 2
In [12]: for i in d: print i
    ....:
0
2
4
6
8

Notice how the for loop resets the iterator (because it calls __iter__() 
to make sure it has an iterator). Compare with a correctly implemented 
iterator:

In [13]: it = iter(range(5))
In [14]: it.next()
Out[14]: 0
In [15]: it.next()
Out[15]: 1
In [16]: for i in it: print i
    ....:
2
3
4
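
The difference between the two sessions can be reproduced with the sketch below. ResettingDoubler is only my guess at the shape of the problem (an index that is reset in __iter__()), not a copy of your class, and FixedDoubler is a hypothetical correction. It uses the built-in next(), which needs Python 2.6 or later; on older versions call the next() method instead.

```python
class ResettingDoubler(object):
    """Assumed reconstruction: __iter__() has a side effect."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        self.index = 0  # the side effect: every for loop starts over
        return self

    def __next__(self):
        if self.index >= len(self.seq):
            raise StopIteration
        value = self.seq[self.index] * 2
        self.index += 1
        return value

    next = __next__  # Python 2 spelling


class FixedDoubler(ResettingDoubler):
    """Same iterator, but __iter__() only returns self."""

    def __iter__(self):
        return self


bad = ResettingDoubler(range(5))
next(bad)
next(bad)
bad_rest = list(bad)    # starts again from the beginning

good = FixedDoubler(range(5))
next(good)
next(good)
good_rest = list(good)  # continues where it left off
```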

Doubler would actually be easier to implement with __iter__() as a 
generator method.
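
A sketch of what that might look like. I am assuming only that Doubler wraps a sequence and yields each item doubled, which is what the session above suggests. The built-in next() needs Python 2.6 or later.

```python
class Doubler(object):
    """Iterable that yields each item of seq multiplied by two."""

    def __init__(self, seq):
        self.seq = seq

    def __iter__(self):
        # No index to maintain and nothing to reset.
        for item in self.seq:
            yield item * 2


d = Doubler(range(5))
it = iter(d)
first = next(it)
second = next(it)
rest = list(it)     # carries on after the two items already taken
again = list(d)     # a new loop over d gets a fresh generator
```

Written this way, Doubler is an iterable rather than an iterator, so you call iter(d) to get something with a next() method.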

I already commented on the pitfalls of making an object its own 
iterator. In the standard library, a file is its own iterator. That 
makes sense because it is a wrapper around a singleton state - the seek 
position of the file. I don't think it makes much sense in general to 
have an object be its own iterator unless the object is just an iterator.
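
To see the file behaviour for yourself, here is a short sketch written for current Python. The file name demo.txt is arbitrary, and the file lives in a temporary directory that is removed afterwards.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w") as out:
        out.write("a\nb\nc\n")

    with open(path) as f:
        is_own_iterator = iter(f) is f  # the file object returns itself
        first = next(f)                 # consumes the first line
        rest = list(f)                  # continues from the current position
```

A loop over the file picks up after the line already read, because there is only one read position to share.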

Another reference is the What's New doc for Python 2.2:
http://www.python.org/doc/2.2.3/whatsnew/node4.html
and the following page.

Kent