[Python-ideas] PEP 380 alternative: A yielding function
Anders J. Munch
2010 at jmunch.dk
Tue Jul 27 19:18:30 CEST 2010
Looking at PEP 380 (http://www.python.org/dev/peps/pep-0380/), the
need for yield forwarding syntax comes from the inability to delegate
yielding functionality in the usual way. For example, if you have a
recurring pattern including yields, like this (this is a toy example,
please don't take it for more than that):
    if a:
        yield a
    if b:
        yield b
you cannot do the extract function refactoring in the usual way:
    def yield_if_true(x):
        if x:
            yield x

    yield_if_true(a)
    yield_if_true(b)
because yield_if_true would become a generator.
PEP 380 addresses this by making the workaround - "for x in
yield_if_true(a): yield x" - easier to write.
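
To make the contrast concrete, here is a minimal sketch. The wrapper names naive, forwarding and delegating are made up for illustration and do not come from PEP 380. It shows the direct call silently doing nothing, the explicit loop workaround, and the 'yield from' spelling that PEP 380 proposes, which runs on Python 3.3 and later:

```python
def yield_if_true(x):
    # Contains a yield, so calling it returns a generator object
    # instead of running the body.
    if x:
        yield x


def naive(a, b):
    # The helper calls only create generator objects that are never
    # iterated, so nothing from the helper is ever yielded here.
    yield_if_true(a)
    yield_if_true(b)
    if False:
        yield None  # never runs; only makes naive() itself a generator


def forwarding(a, b):
    # The workaround: re-yield every value by hand.
    for x in yield_if_true(a):
        yield x
    for x in yield_if_true(b):
        yield x


def delegating(a, b):
    # The PEP 380 spelling of the same forwarding.
    yield from yield_if_true(a)
    yield from yield_if_true(b)
```

With this, list(naive(1, 2)) is empty, while list(forwarding(1, 2)) and list(delegating(1, 2)) both produce the two values.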
But suppose you could address the source instead? Suppose you could
write yield_if_true in such a way that it did not become a generator
despite yielding?
Syntactically this could be done with a yielding *function* in
addition to the yield statement/expression, either as a builtin or a
function in the sys module. Let's call it 'yield_', for lack of a
better name. Calling it would make the nearest generator on the
call stack yield the given value.
Now the example would work with a slight modification:
    def yield_if_true(x):
        if x:
            yield_(x)

    yield_if_true(a)
    yield_if_true(b)
The real benefits from a yield_ function come with recursive
functions. A recursive tree traversal that yields from the leaf nodes
currently suffers from a performance penalty: Every yield is repeated
as many times as the depth of the tree, turning an O(n) traversal
algorithm into an O(n lg(n)) algorithm for a balanced tree. PEP 380
does not change that. But a yield_ function could be O(1) per yielded
value, regardless of the forwarding depth.
To be fair, a clever implementation might be able to short-circuit a
'yield from' chain and achieve the same thing.
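
The depth-dependent forwarding cost described above can be made visible by counting re-yields. This is only a sketch under assumptions: the Node class, the build helper and the stats dictionary are hypothetical names introduced for the illustration, and the tree is a complete binary tree so that every leaf sits at the same depth.

```python
import itertools


class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def build(depth, counter):
    # Build a complete binary tree; leaves are numbered left to right.
    if depth == 0:
        return Node(value=next(counter))
    left = build(depth - 1, counter)
    right = build(depth - 1, counter)
    return Node(left=left, right=right)


def leaves(node, stats):
    # Recursive generator: every inner node re-yields each value
    # produced by its subtrees, one forwarding step per level.
    if node.left is None and node.right is None:
        yield node.value
        return
    for child in (node.left, node.right):
        if child is not None:
            for value in leaves(child, stats):
                stats["forwarded"] += 1  # one re-yield at this level
                yield value


stats = {"forwarded": 0}
tree = build(3, itertools.count())
values = list(leaves(tree, stats))
# 8 leaves at depth 3, each re-yielded once per level above it: 8 * 3 = 24
print(values, stats["forwarded"])
```

Each leaf value is yielded once at the leaf and then once more at every level above it, which is where the extra lg(n) factor comes from on a balanced tree.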
Two main drawbacks of yield_:
- Difficulty of implementation. Generators would need to keep an
entire stack branch alive, instead of merely a single frame, and if
that somehow affects the efficiency of simple generators, that would
be bad.
- 'the nearest generator on the call stack' is sort of a dynamic
scoping thing, which has its problems. For example, if you forget
to make the relevant function a generator (the "if 0: yield None"
trick might have been needed but was forgotten), then the yield
would trickle up to some random generator higher up, with confusing
results. And if you use yield_ in a callback, well, let's just say
that's an interesting case too.
All the same, if a yield_ function is practical, I think it is a
better way to address the problems that motivate PEP 380.
I'm guessing you could implement 'yield from' as a pure-Python
function using yield_, making yield_ strictly more powerful, although
I couldn't say for sure as I haven't studied the enhanced generator
protocol.
regards, Anders