[Python-ideas] Proto-PEP on a 'yield from' statement
Bruce Frederiksen
dangyogi at gmail.com
Fri Feb 13 07:27:20 CET 2009
Raymond Hettinger wrote:
>> I would think that in addition to forwarding send values to the
>> subgenerator, that throw exceptions sent to the delegating generator
>> also be forwarded to the subgenerator. If the subgenerator does not
>> handle the exception, then it should be re-raised in the delegating
>> generator. Also, the subgenerator close method should be called by
>> the delegating generator.
>
> I recommend dropping the notion of forwarding from the proposal.
> The idea is use-case challenged, complicated, and should not be
> hidden behind new syntax.
>
> Would hate for this to become a trojan horse proposal
> when most folks just want a fast iterator pass-through mechanism:
I don't really understand your objection. How does adding the ability
to forward send/throw values and closing the subgenerator in any way
whatsoever get in the way of you using this as a fast iterator
pass-through mechanism?
I agree that 98% of the time the simple pass-through mechanism is all
that will be required of this new feature. And I agree that this alone
is sufficient motivation to want to see this feature added. But I have
done quite a bit of work with nested generators and end up having to use
itertools.chain, which also doesn't support the full generator
behavior. Specifically, in my case, I needed itertools.chain to close
the subgenerator so that finally clauses in the subgenerator get run
when they should on jython and ironpython.
I put in a request for this and was turned down. I found an alternative
way to do it, but it's somewhat ugly:
    import itertools

    class chain_context(object):
        def __init__(self, outer_it):
            self.outer_it = outer_iterable(outer_it)
        def __enter__(self):
            return itertools.chain.from_iterable(self.outer_it)
        # Leaving the with block closes whatever is still open.
        def __exit__(self, type, value, tb): self.outer_it.close()

    class outer_iterable(object):
        def __init__(self, outer_it):
            self.outer_it = iter(outer_it)
            self.inner_it = None    # most recent inner iterable handed out
        def __iter__(self): return self
        def close(self):
            # Close the current inner iterable first, then the outer one.
            if hasattr(self.inner_it, '__exit__'):
                self.inner_it.__exit__(None, None, None)
            elif hasattr(self.inner_it, 'close'): self.inner_it.close()
            if hasattr(self.outer_it, 'close'): self.outer_it.close()
        def next(self):    # Python 2 iterator protocol
            ans = self.outer_it.next()
            if hasattr(ans, '__enter__'):
                self.inner_it = ans
                return ans.__enter__()
            ans = iter(ans)
            self.inner_it = ans
            return ans
and then use as:
    with chain_context(gen(x) for x in iterable) as it:
        for y in it:
            ...
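On current Python 3 the same effect can be approximated with a small generator and contextlib.closing. This is a minimal sketch rather than the code from this post: closing_chain and numbers are hypothetical names, and it assumes an inner iterator should be closed whenever it happens to have a close method.

```python
from contextlib import closing

def closing_chain(iterables):
    """Yield from each iterable in turn; close the active one on early exit."""
    for inner in iterables:
        inner = iter(inner)
        try:
            for item in inner:
                yield item
        finally:
            # Runs on normal exhaustion and when closing_chain itself is closed.
            close = getattr(inner, "close", None)
            if close is not None:
                close()

log = []

def numbers(tag):
    # Hypothetical subgenerator with cleanup that must run promptly.
    try:
        for i in range(3):
            yield (tag, i)
    finally:
        log.append("closed " + tag)

seen = []
with closing(closing_chain(numbers(t) for t in "ab")) as it:
    for item in it:
        seen.append(item)
        if item == ("a", 1):
            break
# Leaving the with block closes closing_chain, which closes numbers("a").
```

The finally clause in numbers runs when the with block exits, without relying on the garbage collector to reclaim the abandoned generator.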
So from my own experience, I would strongly argue that the new yield
from should at least honor the generator close method. Perhaps some
people here have never run python with a different garbage collector
that doesn't immediately reclaim garbage objects, so they don't
understand the need for this. Jython and ironpython are both just
coming out with their 2.5 support; so expect to hear more of these
complaints in the not too distant future from that crowd...
But I am baffled why the python community adopts these extra methods on
generators and then refuses to support them anywhere else (for loops,
itertools)? Is this a case of "well, I didn't vote for them, so I'm not
going to play ball"? If that's the case, then perhaps send and throw
should be retracted. I know that close is necessary when you move away
from the reference counting collector, so I'll fight to keep that; as
well as fight to get the rest of python to play ball with it. I haven't
seen a need for send or throw myself. I've played a lot with send and
it always seems to get too complicated, so I wouldn't fight for that
one. I can imagine possible uses for throw, but haven't hit them yet
myself in actual practice; so I'd only fight somewhat for throw. If
send/throw were mistakes, let's document that and urge people not to use
them and make a plan for deprecating them and removing them from the
language; and figure out what the right answers are.
But if send/throw/close were not mistakes and are done deals, then let's
support them! In all of these cases, adding full support for
send/throw/close does not require that you use any of them. It does not
prevent using simple iterators rather than full blown generators. It
does not diminish in any way the current capabilities of these other
language features. It simply supports and allows the use of
send/throw/close when needed. Otherwise, why did we put
send/throw/close into the language in the first place?
I would dearly love to see the for statement fully support close and
throw, since that's where you use generators 99% of the time. Maybe
this one needs different syntax to not break existing code. I'm not
very good with clever syntax, so you may be able to improve on these:
for i from gen(x):
for i finally in gen(x):
for i in gen(x) closing throwing:
for i in final gen(x):
for gen(x) yielding i:
for gen(x) as i:
The idea is that close should be called when the for loop terminates
(for any reason), and uncaught exceptions in the for body should be sent
to the generator using throw, and then only propagated outside of the
for statement if they are not handled by throw. And, yes, the for
statement should not do these things if a simple iterator is used rather
than a generator.
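The behaviour described above can be approximated today with a helper function. This is only a sketch under stated assumptions: for_each is a hypothetical name, the loop body is passed as a callable, "generator" is detected by the presence of throw and close, and when the generator handles a thrown exception and yields again, the loop is assumed to continue with that value.

```python
def for_each(iterable, body):
    """Run body(value) per item; forward body errors with throw, always close."""
    it = iter(iterable)
    # Assumption: anything with throw and close is treated as a generator.
    is_gen = hasattr(it, "throw") and hasattr(it, "close")
    try:
        try:
            value = next(it)
        except StopIteration:
            return
        while True:
            try:
                body(value)
            except Exception as exc:
                if not is_gen:
                    raise
                try:
                    # An unhandled exception propagates out of throw().
                    value = it.throw(exc)
                except StopIteration:
                    return
            else:
                try:
                    value = next(it)
                except StopIteration:
                    return
    finally:
        if is_gen:
            it.close()

log = []

def source():
    try:
        for i in range(4):
            try:
                yield i
            except ValueError:
                log.append("handled %d" % i)
    finally:
        log.append("closed")

def body(v):
    log.append("body %d" % v)
    if v == 1:
        raise ValueError("bad value")

for_each(source(), body)
```

Here the ValueError raised by the loop body is handled inside source, so iteration continues, and the generator's finally clause runs before for_each returns.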
If you wanted to support the send method too, then maybe something like:
for gen1(x) | gen2(y) as i:
where the values yielded by gen1 are sent to gen2 with send, and then
the values yielded by gen2 are bound to i.
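A rough equivalent of this pipe in existing Python might look as follows. The names pipe and running_total are hypothetical, and the sketch assumes the second generator is primed with one next call whose yielded value is discarded, a detail the proposal does not specify.

```python
def pipe(source, stage):
    """Send each value from source into stage and yield what stage yields."""
    next(stage)  # assumption: prime the stage; its first yielded value is dropped
    try:
        for value in source:
            try:
                result = stage.send(value)
            except StopIteration:
                return  # the stage finished early
            yield result
    finally:
        stage.close()

def running_total():
    # Example stage: receives numbers through send, yields the running sum.
    total = 0
    value = yield
    while True:
        total += value
        value = yield total

totals = list(pipe([1, 2, 3], running_total()))
```

Each send returns exactly one value, which is why a stage written this way cannot skip inputs or emit several outputs per input.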
If this were adopted, I would also recommend that if gen2 were a
function rather than a generator, then the function be called on each
value yielded by gen1 and the results of the function bound to i. Then
for gen(x) | fun as i:
would be like:
for map(fun, gen(x)) as i:
Of course, this leads to simply using map rather than | to combine generators
by making map use send if passed a generator as its first argument:
for map(gen2(y), gen1(x)) as i:
But this doesn't scale as well syntactically when you want to chain
several generators together.
for map(gen3(z), map(gen2(y), gen1(x))) as i:
vs
for gen1(x) | gen2(y) | gen3(z) as i:
Unfortunately, the way that send is currently defined, gen2 can't skip
values to act as a filter or generate multiple values for one value sent
in. To do this would require that the operations of getting another
value sent in and yielding values be separated, rather than combined as
they are for send. One way to do this is to use callbacks for getting
another value. This could be done using the current next semantics by
simply treating the callback as an iterator and passing it as another
parameter to the generator:
for gen2(y, gen1(x)) as i:
This is exactly what's currently being done by the itertools functions.
But this also doesn't scale well syntactically when stacking up several
generators.
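For comparison, the iterator-as-parameter style works in Python as it stands. In this small sketch keep_if, repeat_each and is_even are hypothetical names; the point is the inside-out nesting needed when stages are stacked.

```python
def keep_if(pred, source):
    """Filter stage: pulls from source and may skip values."""
    for value in source:
        if pred(value):
            yield value

def repeat_each(n, source):
    """Expanding stage: yields each value from source n times."""
    for value in source:
        for _ in range(n):
            yield value

def is_even(value):
    return value % 2 == 0

# Stages nest inside out: the last stage applied is written first.
result = list(repeat_each(2, keep_if(is_even, range(5))))
```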
A better way would be to allow send and next to raise a new NextValue
exception when the generator wants another value sent in. Then a new
receive expression would be used in the generator to get the value.
This would act like an iterator within the generator:
    def filter(pred):
        for var in receive:
            if pred(var):
                yield var
which would be used like this down at the basic iterator level:
    it = filter(some_pred)
    for x in some_iterable:
        try:
            value = it.send(x)
            while True:
                process(value)
                value = next(it)
        except NextValue:
            pass
and this would be done automatically by the new for statement:
    for some_iterable | filter(some_pred) as value:
        process(value)
this also allows generators to generate multiple values for each value
received:
    def repeat(n):
        for var in receive:
            for i in range(n):
                yield var

    for some_iterable | repeat(3) as value:
        process(value)
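Neither receive nor NextValue exists in Python, but the separation of "ask for input" from "yield output" can be emulated. In the sketch below a generator yields a sentinel when it wants another value; NEED_INPUT, drive, filter_stage and repeat_stage are all hypothetical names, and the sentinel stands in for the proposed exception.

```python
NEED_INPUT = object()  # sentinel standing in for the proposed NextValue signal

def drive(source, stage):
    """Feed source into stage; yield everything stage produces."""
    source = iter(source)
    try:
        out = next(stage)
        while True:
            if out is NEED_INPUT:
                # The stage asked for input; stop once the source is exhausted.
                out = stage.send(next(source))
            else:
                yield out
                out = next(stage)
    except StopIteration:
        return
    finally:
        stage.close()

def filter_stage(pred):
    while True:
        var = yield NEED_INPUT      # plays the role of "for var in receive"
        if pred(var):
            yield var

def repeat_stage(n):
    while True:
        var = yield NEED_INPUT
        for _ in range(n):
            yield var

evens = list(drive(range(6), filter_stage(lambda v: v % 2 == 0)))
tripled = list(drive(range(2), repeat_stage(3)))
```

Because requesting input and producing output are separate yields, one stage can skip values and another can emit several per input, which plain send cannot express.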
With the new yield from syntax, your threesomes example becomes:
    def threesomes():
        yield from receive | repeat(3)
Or even just:
    def threesomes():
        return repeat(3)
Other functions can be done in this style too:
    def map(fn):
        for var in receive:
            yield fn(var)
So that stacking these all up is much more readable syntactically:
for gen1(x) | filter(some_pred) | map(add_1) | threesomes() as i:
You have to admit that this is much more readable than:
for threesomes(map(add_1, filter(some_pred, gen1(x)))) as i:
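The left-to-right reading can be tried out in existing Python by overloading the | operator. This is only an illustration of the readability argument, not the proposed syntax: Stage, where, apply and the other helper names are hypothetical, and each stage simply wraps a function from iterator to iterator.

```python
class Stage:
    """Wraps a function from iterator to iterator so it can follow a '|'."""
    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, source):
        # Called for "source | stage" when source does not handle '|' itself.
        return self.fn(iter(source))

def where(pred):
    return Stage(lambda it: (v for v in it if pred(v)))

def apply(fn):
    return Stage(lambda it: (fn(v) for v in it))

def repeat(n):
    return Stage(lambda it: (v for v in it for _ in range(n)))

def threesomes():
    return repeat(3)

def is_odd(v):
    return v % 2 == 1

def add_1(v):
    return v + 1

# Reads in the order the data flows: source, filter, map, repeat.
result = list(range(4) | where(is_odd) | apply(add_1) | threesomes())
```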
-bruce frederiksen