[Python-ideas] should there be a difference between generators and iterators?
dangyogi at gmail.com
Fri Sep 5 17:24:14 CEST 2008
Bruce Leban wrote:
> I don't think I like the other suggestions. Having an exception in
> some random part of a loop throw *into* the iterator of the loop, just
> seems weird. For the examples, you give, couldn't break_ do the throw
First, the easy part. Yes break could also do a throw (see my response
There are two ways to try to visualize this.
A function, say 'bar', has input and produces output. When it raises an
exception, there could be two reasons for this:
1. It doesn't like its input (in which case the input might be fixed
and new values provided), or
2. It's unable to produce its output.
With traditional functions, the caller of 'bar' is responsible for
providing its input and also receives its output:
    input = foo(...)
    output = bar(input)
or, equivalently:
    output = bar(foo(...))
Since the function called to produce the input ('foo') is no longer
around when 'bar' is called, the distinction above hasn't been
important, because either way the caller has to deal with the problem.
In the first case, the caller might produce some other input value and
call 'bar' again. In the second case, the caller must proceed without
But when generators provide input to a function, the generator is still
around when the function is run. So it makes sense, in the first case,
to raise the exception in the generator and give it a chance to fix the
input value. And this is exactly how the new (in 2.5) 'throw' method is
defined to act on the generator side (in PEP 342).
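
As a minimal sketch of the generator-side behaviour described above (written for Python 3; the names `tokens`, `first`, and `repaired` are made up for illustration and are not from the original discussion), a generator can catch an exception thrown at its 'yield' and answer with a corrected value:

```python
def tokens(raw_items):
    """Yield each item; if the consumer throws ValueError back, yield a repaired one."""
    for item in raw_items:
        try:
            yield item
        except ValueError:
            # The consumer rejected `item`; offer a cleaned-up value instead.
            yield item.replace(",", "")


g = tokens(["1,000"])
first = next(g)                  # the raw item, which int() will reject
try:
    number = int(first)
except ValueError as exc:
    # throw() raises exc at the paused 'yield' and returns the next yielded value.
    repaired = g.throw(exc)
    number = int(repaired)
```

Here the caller has to drive the generator by hand with next() and throw(); a plain 'for' loop never calls throw().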
If we knew which exceptions meant "bad input" vs "output not possible",
we could only raise the first kind in the generator. But we don't know
So it makes sense to first raise all exceptions on the input side in the
generator. If the generator recognizes the exception (i.e., as an
'input error' exception) and can fix the problem, then 'bar' may still
be able to produce output. If not, then forward the exception on to the
output side of 'bar' (as an 'output not possible' exception).
Applying this logic to the 'for' statement is what leads to my point #2:
    for input in foo(...):
        output = bar(input)
If 'bar' raises an exception, it should first go to 'foo' (if 'foo' has
a 'throw' method), and then to the outer block containing the 'for'
statement. If the generator's 'throw' method returns a value, then the
'for' statement would assign this value to 'input' and run its body
again, proceeding normally (the exception has been taken care of). If
the generator's 'throw' method does not handle the exception, then it is
re-raised in the outer block containing the 'for' statement.
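
The semantics proposed above can be emulated with a helper function. This is only a sketch under stated assumptions: `for_with_throw` and `repairing` are hypothetical names, the loop body is passed in as a callable, and ending the loop when throw() exhausts the generator is my assumption, since the text does not cover that case.

```python
def for_with_throw(iterable, body):
    """Hypothetical emulation of a 'for' loop that throws body errors into the iterator."""
    it = iter(iterable)
    try:
        item = next(it)
    except StopIteration:
        return
    while True:
        try:
            body(item)
        except Exception as exc:
            throw = getattr(it, "throw", None)
            if throw is None:
                raise                    # plain iterator: behave like a normal 'for'
            try:
                # If the generator does not handle exc, throw() re-raises it
                # here and it propagates to the caller of this function.
                item = throw(exc)
            except StopIteration:
                return                   # assumption: an exhausted generator ends the loop
            continue                     # run the body again with the replacement value
        try:
            item = next(it)
        except StopIteration:
            return


def repairing(items):
    """Hypothetical generator that repairs an item when ValueError is thrown in."""
    for item in items:
        try:
            yield item
        except ValueError:
            yield item.replace(",", "")


results = []
for_with_throw(repairing(["7", "1,000", "42"]), lambda s: results.append(int(s)))
```

With this helper the body sees "1,000", fails, and is then re-run with the replacement "1000" supplied by the generator, so the loop finishes normally.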
Reading the definition of the 'throw' method for generators in PEP 342,
I naturally thought that the 'for' statement would abide by this new
protocol. I was surprised to learn that it didn't. Since generators
are nearly always used in a 'for' statement, how is this new method to
be utilized? This isn't easily done. The code ends up looking like:
    import sys
    g = foo(...)
    while True:
        try:
            for input in g:
                output = bar(input)
            break   # from 'while', can't easily break from 'for'
        except Exception:
            input = g.throw(*sys.exc_info())
> To get __enter__ and __exit__ behavior for an iterator, can't you just
> wrap it in a class that provides that capability and calls close?
Sure, contextlib.closing. But, just as it's nice that files support
__enter__ and __exit__, it would be nice if other objects that need to
be closed (sockets, generators, etc) did too. And, with the example set
by 'file', one is led to expect this support in these other cases...
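
For reference, the contextlib.closing approach mentioned above looks roughly like this (a small sketch for Python 3; the generator `lines` and the `log` list are illustrative only):

```python
from contextlib import closing

log = []


def lines():
    """Generator with cleanup that must run even if it is not exhausted."""
    try:
        yield "first"
        yield "second"
    finally:
        log.append("cleaned up")


with closing(lines()) as g:
    head = next(g)   # consume only part of the generator

# On leaving the 'with' block, closing() calls g.close(), which raises
# GeneratorExit at the paused 'yield' and so runs the 'finally' clause.
```

The wrapper is needed because the generator object itself does not provide __enter__ and __exit__.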
Iterators in general need no cleanup; only generators do. Given that,
and since the BDFL has nixed my point #5, it makes sense to add
__enter__ and __exit__ only to generators (and, by extension, to
itertools).
> You might need itertools to have some support that extended iterator
> class but that seems simpler.
I don't follow you here.