[Python-ideas] should there be a difference between generators and iterators?

Bruce Frederiksen dangyogi at gmail.com
Fri Sep 5 17:24:14 CEST 2008


Bruce Leban wrote:
>
> I don't think I like the other suggestions. Having an exception in 
> some random part of a loop throw *into* the iterator of the loop, just 
> seems weird. For the examples you give, couldn't break_ do the throw 
> itself?
First, the easy part.  Yes, break could also do a throw (see my response 
to Guido).

There are two ways to try to visualize this.

First explanation:

A function, say 'bar', has input and produces output.  When it raises an 
exception, there could be two reasons for this:

1.  It doesn't like its input (in which case the input might be fixed 
and new values provided), or
2.  It's unable to produce its output.

With traditional functions, the caller of 'bar' is responsible for 
providing its input and also receives its output:

    input = foo(...)
    output = bar(input)

or, simply:

    output = bar(foo(...))

Since the function called to produce the input ('foo') is no longer 
around when 'bar' is called, the distinction above hasn't been 
important, because either way the caller has to deal with the problem.  
In the first case, the caller might produce some other input value and 
call 'bar' again.  In the second case, the caller must proceed without 
the output.

But when generators provide input to a function, the generator is still 
around when the function is run.  So it makes sense, in the first case, 
to raise the exception in the generator and give it a chance to fix the 
input value.  And this is exactly how the new (in 2.5) 'throw' method is 
defined to act on the generator side (in PEP 342).
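
As a minimal sketch of that PEP 342 behaviour (written for current Python 3, so it uses next(g) rather than the 2.5-era g.next()): the generator below catches an exception thrown in at its 'yield' and yields a corrected value, which becomes the return value of 'throw'.  The body of 'foo', the choice of ValueError as the "bad input" signal, and the sample values are assumptions made up for illustration.

```python
def foo(values):
    """Yield each value; if the consumer throws ValueError back in,
    yield a replacement (here, the absolute value) instead."""
    for v in values:
        try:
            yield v
        except ValueError:
            # The consumer rejected v.  The value yielded here is what
            # g.throw() returns to the consumer.
            yield abs(v)


g = foo([4, -9])
first = next(g)                                # 4
second = next(g)                               # -9
fixed = g.throw(ValueError("negative input"))  # 9, the repaired input
```

If the generator did not catch the exception, 'throw' would simply re-raise it in the caller.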

If we knew which exceptions meant "bad input" vs "output not possible", 
we could raise only the first kind in the generator.  But we don't know 
this.

So it makes sense to first raise all exceptions on the input side in the 
generator.  If the generator recognizes the exception (i.e., as an 
'input error' exception) and can fix the problem, then 'bar' may still 
be able to produce output.  If not, then forward the exception on to the 
output side of 'bar' (as an 'output not possible' exception).

Applying this logic to the 'for' statement is what leads to my point #2:

    for input in foo(...):
        output = bar(input)

If 'bar' raises an exception, it should first go to 'foo' (if 'foo' has 
a 'throw' method), and then to the outer block containing the 'for' 
statement.  If the generator's 'throw' method returns a value, then the 
'for' statement would assign this value to 'input' and run its body 
again, proceeding normally (the exception has been taken care of).  If 
the generator's 'throw' method does not handle the exception, then it is 
re-raised in the outer block containing the 'for' statement.
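
The proposed semantics can be approximated today with a helper function.  This is only a sketch under stated assumptions, not how any Python version implements 'for': the name 'for_with_throw' is hypothetical, the loop body is passed in as a callable, and ending the loop when the generator finishes instead of yielding a replacement is my own guess at a reasonable rule.

```python
def for_with_throw(iterable, body):
    """Hypothetical helper emulating the proposed 'for' semantics.

    Exceptions raised by body(item) are first thrown into the iterator
    (if it has a 'throw' method).  A value returned by throw() replaces
    the item and the body is run again; otherwise the exception
    propagates to the caller.
    """
    it = iter(iterable)
    outputs = []
    for item in it:
        while True:
            try:
                outputs.append(body(item))
                break                    # body succeeded; fetch next item
            except Exception as exc:
                throw = getattr(it, "throw", None)
                if throw is None:
                    raise                # plain iterator: nothing to consult
                try:
                    item = throw(exc)    # generator may supply a fixed input
                except StopIteration:
                    return outputs       # assumption: generator is finished
    return outputs


def foo(values):
    # Illustrative generator that repairs inputs rejected with ValueError.
    for v in values:
        try:
            yield v
        except ValueError:
            yield abs(v)


def bar(x):
    # Illustrative consumer that dislikes negative input.
    if x < 0:
        raise ValueError("negative input")
    return x * 10


results = for_with_throw(foo([4, -9, 16]), bar)   # [40, 90, 160]
```

With a plain list instead of a generator, the same call lets the ValueError reach the caller unchanged, which matches the "if 'foo' has a 'throw' method" condition above.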

Second explanation:

Reading the definition of the 'throw' method for generators in PEP 342, 
I naturally thought that the 'for' statement would abide by this new 
protocol.  I was surprised to learn that it didn't.  Since generators 
are nearly always used in a 'for' statement, how is this new method to 
be utilized?  This isn't easily done.  The code ends up looking like:

    import sys

    g = foo(...)
    for input in g:
        while True:
            try:
                output = bar(input)
                break   # from 'while'; can't easily break from 'for' anymore...
            except Exception:
                input = g.throw(*sys.exc_info())

Yikes!
>
>
>
> To get __enter__ and __exit__ behavior for an iterator, can't you just 
> wrap it in a class that provides that capability and calls close?
Sure, contextlib.closing.  But, just as it's nice that files support 
__enter__ and __exit__, it would be nice if other objects that need to 
be closed (sockets, generators, etc) did too.  And, with the example set 
by 'file', one is led to expect this support in these other cases...
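
For reference, the contextlib.closing workaround looks roughly like this.  The generator 'foo' and the 'log' list are made up for illustration; the point is only that leaving the 'with' block calls g.close(), which raises GeneratorExit at the suspended 'yield' and so runs the 'finally' clause.

```python
from contextlib import closing

log = []


def foo():
    # Illustrative generator that owns a resource needing cleanup.
    try:
        yield 1
        yield 2
    finally:
        log.append("cleaned up")


with closing(foo()) as g:
    first = next(g)      # consume only part of the generator
# On exit, closing() has called g.close(), so the finally clause has run.
```

If generators supported __enter__ and __exit__ directly, the closing() wrapper would be unnecessary.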

Since there is no need to clean up after iterators in general, but only 
after generators specifically, and since the BDFL has nixed my point #5, 
it makes sense to add __enter__ and __exit__ only to generators (and, by 
extension, to itertools).

> You might need itertools to have some support that extended iterator 
> class but that seems simpler.
I don't follow you here.


-bruce


