[Python-ideas] Revised**5 PEP on yield-from

Sun Mar 1 13:30:36 CET 2009

Hi Arnaud

Arnaud Delobelle wrote:
>> I have now worked out the details, and it is indeed possible to get 
>> O(1) for simple cases and amortized O(logN)
>> in general, all with fairly low constants.
>
> I'm sorry if I'm missing something obvious, but there are two things I 
> can't work out:
I am glad you asked. Reading it again, I can see that this is definitely 
not obvious.

>
> * What you are measuring the time complexity of.
The time for a single 'next', 'send', 'throw' or 'close' call to a 
generator or a single "yield from" expression, excluding the time spent 
running the user-defined code in it. (Does that make sense?) In other 
words, the total overhead of finding the actual user code to run and 
handling the return values/exceptions according to the PEP.

> * What N stands for.
>
> I suspect that N is the 'delegation depth': the number of yield-from 
> that have to be gone through. I imagine that you are measuring the 
> time it takes to get the next element in the generator. These are 
> guesses - can you correct me?
N is the total number of suspended generators in the tree(s) of 
generators involved in the operation. Remember that it is possible to 
have multiple generators yield from the same 'parent' generator. The 
'delegation depth' would be the height of that tree.

One interesting thing to note is that all non-contrived examples I have 
seen only build simple chains of iterators, and only do that by adding 
one at a time in the deepest nested one. This is the best possible case 
for my algorithm. If you stick to that, the time per operation is O(1). 
If we decided to only allow that, the algorithm can be simplified 
significantly.

>
>> I have implemented the tree structure as a python module and added a 
>> trampoline-based pure-python implementation of "yield-from" to try it 
>> out.
>>
>> It seems that this version beats a normal "for v in it: yield v" when 
>> the delegation chains get around 90 generators deep.
>
> Can you give an example?
>
Sure, here is the simple code I used for timing the 'next' call

import itertools, time
from yieldfrom import uses_from, from_ # my module...

@uses_from
def child(it):
yield from_(it)

def child2(it):
for i in it:
yield i

def longchain(N):
it = itertools.count()
for i in xrange(N):
it = child(it) # replace this with child2 to test the current
# "for v in it: yield v" pattern.
it.next() # we are timing the 'next' calls (not the setup of
# the chain) so skip the setup by calling next once.
return it

it = longchain(90)

times = []
for i in xrange(10):
t1 = time.time()
for i in xrange(100000):
it.next()
t2 = time.time()
times.append(t2-t1)
print min(times)

This version takes about the same time whether you use child or child2 
to build the chain. However, the version using yield..from takes the 
same time no matter how long the chain is, while the time for the "for v 
in it: yield v" version is linear in the length of the chain, and so 
would lose big-time for longer chains.

I hope this answers your questions.

Regards

Jacob