[Python-ideas] Revised**5 PEP on yield-from

Jacob Holm jh at improva.dk
Sat Feb 21 03:00:51 CET 2009


Hi Greg

A few comments.  First, I think the suggested expansion is not quite right:
>     _i = iter(expr)
>     try:
>         _u = _i.next()
>         while 1:
>             try:
>                 _v = yield _u
>             except Exception, _e:
>                 if hasattr(_i, 'throw'):
>                     _i.throw(_e)
Shouldn't this be  "_u = _i.throw(e)" instead?

>                 else:
>                     raise
>             else:
>                 if hasattr(_i, 'send'):
>                     _u = _i.send(_v)
>                 else:
>                     _u = _i.next()
>     except StopIteration, _e:
>         _a = _e.args
>         if len(_a) > 0:
>             result = _a[0]
>         else:
>             result = None
>     finally:
>         if hasattr(_i, 'close'):
>             _i.close()


Second, I spend a looong time today thinking about how to make this fast.
>
> Optimisations
> -------------
>
> Using a specialised syntax opens up possibilities for optimisation
> when there is a long chain of generators.  Such chains can arise, for
> instance, when recursively traversing a tree structure.  The overhead
> of passing ``next()`` calls and yielded values down and up the chain
> can cause what ought to be an O(n) operation to become O(n\*\*2).
True, but as long as it is only chains of generators I can sketch a 
solution
that would give amortized constant time for finding the generator to resume.
Unfortunately it is easy to construct a situation where what we have is 
actually
a tree of generators, each using "yield from" its parent.  Simple example:

    def foo():
        yield 1
        yield 2
        yield 3

    def bar(it):
        yield from it

    root = foo()
    child1 = bar(root)
    child2 = bar(root)
    print child1.next() # prints 1, yielded from root
    print child2.next() # prints 2, yielded from root

In this example, both "child1" and "child2" are yielding from "root".

So what we are dealing with is a set of dynamic trees under the operations
"find_root" (to find the target when calling the generator methods), "link"
(when doing yield from) and "delete_root" (each time a generator is 
exhausted).
I strongly suspect that the lower bound for this problem is O(logn/loglogn)
per operation, and I can see a number of ways to do it in O(logn) time. 
I am
not sure any of them are fast enough to beat the simple O(n) solution in 
real
code though. (I will see if I can find a version that has a chance).

A thought just occurred to me...  would it be acceptable to disallow 
"yield from"
with a generator that is not "fresh"?  That would make "child2.next()" 
above illegal
and remove the problem with dealing with trees of generators.

>
> A possible strategy is to add a slot to generator objects to hold a
> generator being delegated to.  When a ``next()`` or ``send()`` call is
> made on the generator, this slot is checked first, and if it is
> nonempty, the generator that it references is resumed instead.  If it
> raises StopIteration, the slot is cleared and the main generator is
> resumed.
>
> This would reduce the delegation overhead to a chain of C function
> calls involving no Python code execution.  A possible enhancement would
> be to traverse the whole chain of generators in a loop and directly
> resume the one at the end, although the handling of StopIteration is
> more complicated then.
I hope you can get that second version to work.  All further 
optimization would
probably have to start with something like that anyway.

Best regards

  Jacob





More information about the Python-ideas mailing list