[Python-ideas] Revised**5 PEP on yield-from
Jacob Holm
jh at improva.dk
Sun Mar 1 13:30:36 CET 2009
Hi Arnaud
Arnaud Delobelle wrote:
>> I have now worked out the details, and it is indeed possible to get
>> O(1) for simple cases and amortized O(logN)
>> in general, all with fairly low constants.
>
> I'm sorry if I'm missing something obvious, but there are two things I
> can't work out:
I am glad you asked. Reading it again, I can see that this is definitely
not obvious.
>
> * What you are measuring the time complexity of.
The time for a single 'next', 'send', 'throw' or 'close' call to a
generator or a single "yield from" expression, excluding the time spent
running the user-defined code in it. (Does that make sense?) In other
words, the total overhead of finding the actual user code to run and
handling the return values/exceptions according to the PEP.
> * What N stands for.
>
> I suspect that N is the 'delegation depth': the number of yield-from
> that have to be gone through. I imagine that you are measuring the
> time it takes to get the next element in the generator. These are
> guesses - can you correct me?
N is the total number of suspended generators in the tree(s) of
generators involved in the operation. Remember that it is possible to
have multiple generators yield from the same 'parent' generator. The
'delegation depth' would be the height of that tree.
One interesting thing to note is that all non-contrived examples I have
seen only build simple chains of iterators, and only do that by adding
one at a time in the deepest nested one. This is the best possible case
for my algorithm. If you stick to that, the time per operation is O(1).
If we decided to only allow that, the algorithm can be simplified
significantly.
>
>> I have implemented the tree structure as a python module and added a
>> trampoline-based pure-python implementation of "yield-from" to try it
>> out.
>>
>> It seems that this version beats a normal "for v in it: yield v" when
>> the delegation chains get around 90 generators deep.
>
> Can you give an example?
>
Sure, here is the simple code I used for timing the 'next' call
import itertools, time
from yieldfrom import uses_from, from_ # my module...
@uses_from
def child(it):
yield from_(it)
def child2(it):
for i in it:
yield i
def longchain(N):
it = itertools.count()
for i in xrange(N):
it = child(it) # replace this with child2 to test the current
# "for v in it: yield v" pattern.
it.next() # we are timing the 'next' calls (not the setup of
# the chain) so skip the setup by calling next once.
return it
it = longchain(90)
times = []
for i in xrange(10):
t1 = time.time()
for i in xrange(100000):
it.next()
t2 = time.time()
times.append(t2-t1)
print min(times)
This version takes about the same time whether you use child or child2
to build the chain. However, the version using yield..from takes the
same time no matter how long the chain is, while the time for the "for v
in it: yield v" version is linear in the length of the chain, and so
would lose big-time for longer chains.
I hope this answers your questions.
Regards
Jacob
More information about the Python-ideas
mailing list