[Python-ideas] Revised**5 PEP on yield-from

Jacob Holm jh at improva.dk
Mon Mar 2 01:09:14 CET 2009


Greg Ewing wrote:
> Jacob Holm wrote:
>>
>> Arnaud Delobelle wrote:
>>
>>> * What you are measuring the time complexity of.
>>
>> The time for a single 'next', 'send', 'throw' or 'close' call to a 
>> generator or a single "yield from" expression,
>
> How are you measuring 'time', though? Usually when discussing
> time complexities we're talking about the number of some
> fundamental operation being performed, and assuming that
> all such operations take the same time. What operations
> are you counting here?
All operations are simple attribute lookups, assignments, additions and 
the like. No hashing, no dynamic arrays, no magic, just plain ol' 
fundamental operations as they would be in a C program :)

>> One interesting thing to note is that all non-contrived examples
>> I have seen only build simple chains of iterators, and only do
>> that by adding one at a time in the deepest nested one. This is
>> the best possible case for my algorithm. If you stick to that,
>> the time per operation is O(1).
>
> I don't understand that limitation. If you keep a stack of
> active generators, you always have constant-time access to the
> one to be resumed next. 
But that is exactly the point. This is not a simple stack. After doing 
"yield from R" in generator A, you can go ahead and do "yield from R" in 
generator B as well. If R is also using "yield from" you will have the 
following situation:

A
  \
    R --- (whatever R is waiting for)
  /
B

Yes, A sees it as a stack [A, R, ...], and B sees it as a stack [B, R, ...], 
but the part of the stack following R is shared between the two.
If R does a "yield from", the new iterator must appear at the top of 
both "stacks". Efficiently getting to the top of this "shared stack", 
and doing the equivalent of push, pop, etc., is what I am trying to 
achieve. And I think I have succeeded.
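
To make the sharing concrete, here is a toy sketch (just an illustration 
of the structure, with made-up names -- NOT my actual algorithm): each 
suspended generator links to the iterator it is delegating to, so 
everything from R down is shared between A's and B's views, and naively 
finding the frame to resume costs a walk of the whole chain on every call:

```python
# Toy illustration of the shared structure (not the O(log N) algorithm):
# each frame links to the iterator it is currently "yield from"-ing.
class Frame:
    def __init__(self, name):
        self.name = name
        self.delegate = None  # set while this frame is delegating

rest = Frame('whatever R is waiting for')
r = Frame('R'); r.delegate = rest
a = Frame('A'); a.delegate = r
b = Frame('B'); b.delegate = r    # A and B share everything from R down

def top(frame):
    # The frame to resume is the deepest delegate; with naive links
    # this walk costs O(depth) on every next()/send().
    while frame.delegate is not None:
        frame = frame.delegate
    return frame

# If the innermost iterator starts a new delegation, it appears at the
# "top" of both A's and B's stacks automatically, since the tail is shared:
x = Frame('new iterator')
rest.delegate = x
assert top(a) is top(b) is x
```

Making push/pop at the top of this shared tail fast, without walking the 
chain each time, is the whole game.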

> There's some overhead for pushing and
> popping the stack, but it follows exactly the same pattern
> as the recursive calls you'd be making if you weren't using
> some kind of yield-from, so it's irrelevant when comparing the
> two.
>
This would be correct if it were somehow forbidden to create the scenario 
I sketched above. As long as that scenario is possible, I can construct 
an example where treating it as a simple stack will either do the wrong 
thing, or do the right thing but more slowly than a standard "for v in it: 
yield v".

Of course, you could argue that these examples are contrived and we 
don't have to care about them. I just think that we should do better if 
we can.

FWIW, here is the example from above in more detail...

def R():
    yield 'A'
    yield 'B'
    yield from xrange(3)
    # we could do anything here really...

r = R()

def A():
    yield from r

a = A()

def B():
    yield from r

b = B()

a.next() # returns 'A'
b.next() # returns 'B'
r.next() # returns 0
a.next() # returns 1
b.next() # returns 2
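
(For anyone reading this later: PEP 380's "yield from" eventually landed 
in Python 3.3, so the example runs unchanged there apart from range() 
replacing xrange() and next(g) replacing g.next():)

```python
# Runnable Python 3 version of the example above.
def R():
    yield 'A'
    yield 'B'
    yield from range(3)
    # we could do anything here really...

r = R()

def A():
    yield from r

def B():
    yield from r

a = A()
b = B()

print(next(a))  # 'A'
print(next(b))  # 'B'
print(next(r))  # 0
print(next(a))  # 1
print(next(b))  # 2
```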


This is clearly legal based on the PEP, and it generalizes to a way of 
building an arbitrary tree with suspended generators as nodes. What my 
algorithm does is break this tree down into what is essentially a set of 
stacks, managed such that the common case uses just a single stack, and, 
no matter how twisted your program is, the path from any generator to 
the root is split across at most O(log N) stacks (where N is the size of 
the tree).

I hope this helps to explain what I think I have done...

Best regards

Jacob
