[Python-ideas] Revised**5 PEP on yield-from
Jacob Holm
jh at improva.dk
Sat Feb 21 03:00:51 CET 2009
Hi Greg
A few comments. First, I think the suggested expansion is not quite right:
> _i = iter(expr)
> try:
>     _u = _i.next()
>     while 1:
>         try:
>             _v = yield _u
>         except Exception, _e:
>             if hasattr(_i, 'throw'):
>                 _i.throw(_e)
Shouldn't this be "_u = _i.throw(_e)" instead?
>             else:
>                 raise
>         else:
>             if hasattr(_i, 'send'):
>                 _u = _i.send(_v)
>             else:
>                 _u = _i.next()
> except StopIteration, _e:
>     _a = _e.args
>     if len(_a) > 0:
>         result = _a[0]
>     else:
>         result = None
> finally:
>     if hasattr(_i, 'close'):
>         _i.close()
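To see why, here is the expansion transcribed into a standalone Python 3 generator (a sketch only: the `delegate` name is mine, `next()`/`except ... as` replace the 2.x spellings, and `return result` stands in for inlining the expansion at the yield-from site). With `_u = _i.throw(_e)` the value the subiterator yields after a throw is propagated; without the assignment, the stale `_u` would be yielded again:

```python
def delegate(expr):
    # Sketch of the proposed expansion in Python 3 syntax (hypothetical
    # helper; in the PEP the expansion is inlined, with 'result' becoming
    # the value of the yield-from expression).
    _i = iter(expr)
    try:
        _u = next(_i)
        while 1:
            try:
                _v = yield _u
            except Exception as _e:
                if hasattr(_i, 'throw'):
                    _u = _i.throw(_e)   # corrected: keep the new value
                else:
                    raise
            else:
                if hasattr(_i, 'send'):
                    _u = _i.send(_v)
                else:
                    _u = next(_i)
    except StopIteration as _e:
        _a = _e.args
        result = _a[0] if len(_a) > 0 else None
    finally:
        if hasattr(_i, 'close'):
            _i.close()
    return result

def inner():
    try:
        yield 1
    except ValueError:
        yield 'caught'
    return 'done'

g = delegate(inner())
assert next(g) == 1
assert g.throw(ValueError()) == 'caught'  # needs the corrected line
```

Without the assignment, `g.throw(ValueError())` would yield the stale value 1 instead of 'caught'.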
Second, I spent a looong time today thinking about how to make this fast.
>
> Optimisations
> -------------
>
> Using a specialised syntax opens up possibilities for optimisation
> when there is a long chain of generators. Such chains can arise, for
> instance, when recursively traversing a tree structure. The overhead
> of passing ``next()`` calls and yielded values down and up the chain
> can cause what ought to be an O(n) operation to become O(n\*\*2).
True, but as long as it is only chains of generators, I can sketch a
solution that would give amortized constant time for finding the generator
to resume. Unfortunately it is easy to construct a situation where what we
have is actually a tree of generators, each using "yield from" on its
parent. Simple example:
def foo():
    yield 1
    yield 2
    yield 3

def bar(it):
    yield from it

root = foo()
child1 = bar(root)
child2 = bar(root)

print child1.next()   # prints 1, yielded from root
print child2.next()   # prints 2, yielded from root
In this example, both "child1" and "child2" are yielding from "root".
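(The same example transcribed to the Python 3.3+ syntax eventually adopted, runnable as-is, shows the interleaving directly -- both children drain the single shared root:)

```python
def foo():
    yield 1
    yield 2
    yield 3

def bar(it):
    yield from it  # Python 3.3+ syntax

root = foo()
child1 = bar(root)
child2 = bar(root)

assert next(child1) == 1  # yielded from root
assert next(child2) == 2  # both children share the one root
assert next(child1) == 3
```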
So what we are dealing with is a set of dynamic trees under the operations
"find_root" (to find the target when calling the generator methods), "link"
(when doing yield from) and "delete_root" (each time a generator is
exhausted).
I strongly suspect that the lower bound for this problem is
O(log n/log log n) per operation, and I can see a number of ways to do it
in O(log n) time. I am not sure any of them are fast enough to beat the
simple O(n) solution in real code though. (I will see if I can find a
version that has a chance.)
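For the chain-only case, the amortized-constant-time claim can be illustrated with union-find-style path compression (a toy model: the hypothetical `Node` objects stand in for generator frames, and "link"/"delete_root" are ignored):

```python
class Node:
    """Toy stand-in for a generator frame; 'target' points at the
    generator this one is yielding from, if any."""
    def __init__(self):
        self.target = None

def find_root(node):
    # Walk the delegation chain, then compress the path as in union-find:
    # every visited node is re-pointed directly at the root, so repeated
    # lookups on a chain become amortized nearly constant time.
    root = node
    while root.target is not None:
        root = root.target
    while node.target is not None:
        node.target, node = root, node.target
    return root
```

After one `find_root(a)` on a chain a -> b -> c, both a and b point straight at c, so subsequent lookups are a single hop.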
A thought just occurred to me... would it be acceptable to disallow
"yield from" with a generator that is not "fresh"? That would make
"child2.next()" above illegal and remove the problem of dealing with
trees of generators.
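That check can be sketched in Python 3 terms with `inspect.getgeneratorstate` (the `bar_fresh` wrapper is hypothetical; the real restriction would live in the interpreter at the point of delegation):

```python
import inspect

def bar_fresh(it):
    # Hypothetical runtime check: refuse to delegate to a generator
    # that has already been started (i.e. is not "fresh").
    if inspect.getgeneratorstate(it) != inspect.GEN_CREATED:
        raise ValueError("yield from requires a fresh generator")
    yield from it

def foo():
    yield 1
    yield 2

root = foo()
child1 = bar_fresh(root)
assert next(child1) == 1   # ok: root was fresh when child1 started
child2 = bar_fresh(root)   # root is now suspended, so advancing
                           # child2 raises ValueError
```

With this rule each generator has at most one delegator, so the structure stays a set of chains rather than trees.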
>
> A possible strategy is to add a slot to generator objects to hold a
> generator being delegated to. When a ``next()`` or ``send()`` call is
> made on the generator, this slot is checked first, and if it is
> nonempty, the generator that it references is resumed instead. If it
> raises StopIteration, the slot is cleared and the main generator is
> resumed.
>
> This would reduce the delegation overhead to a chain of C function
> calls involving no Python code execution. A possible enhancement would
> be to traverse the whole chain of generators in a loop and directly
> resume the one at the end, although the handling of StopIteration is
> more complicated then.
I hope you can get that second version to work. All further optimization
would probably have to start with something like that anyway.
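As a pure-Python model of that second version (a toy wrapper with an explicit `delegate` slot; the real change would be at the C level, and the StopIteration handling here is the simplest thing that works):

```python
class DGen:
    """Toy model of the proposed optimisation: each wrapper carries a
    'delegate' slot; __next__ walks the chain in a loop and resumes the
    generator at the end directly, with no Python-level trampolining."""
    def __init__(self, gen):
        self.gen = gen
        self.delegate = None   # the slot

    def __iter__(self):
        return self

    def __next__(self):
        # Walk to the end of the delegation chain.
        chain = [self]
        while chain[-1].delegate is not None:
            chain.append(chain[-1].delegate)
        while chain:
            try:
                return next(chain[-1].gen)
            except StopIteration:
                # Tail exhausted: clear the parent's slot and resume it.
                chain.pop()
                if chain:
                    chain[-1].delegate = None
        raise StopIteration

def inner():
    yield 1
    yield 2

def tail():
    yield 3

outer = DGen(tail())
outer.delegate = DGen(inner())   # stands in for 'yield from inner()'
assert list(outer) == [1, 2, 3]
```

Note this only models a chain; the tree case discussed above is exactly what it cannot handle without a find_root-style structure.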
Best regards
Jacob