On Fri, Oct 19, 2012 at 2:15 PM, Mark Shannon <mark@hotpy.org> wrote:
On 19/10/12 13:55, Christian Tismer wrote:
Hi Nick,
On 16.10.12 03:49, Nick Coghlan wrote:
On Tue, Oct 16, 2012 at 10:44 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
My original implementation of yield-from actually *did* avoid this, by keeping a C-level pointer chain of yielding-from frames. But that part was ripped out at the last minute when someone discovered that it had a detrimental effect on tracebacks.
There are probably other ways the traceback problem could be fixed, so maybe we will get this optimisation back one day.
Ah, I thought I remembered something along those lines. IIRC, it was a bug report on one of the alphas that prompted us to change it.
I was curious and searched quite a lot. It was v3.3.0a1 from 2012-03-15, as a reaction to #14230 and #14220 from Mark Shannon, patched by Benjamin.
Now I have found the original implementation. It looks very much like what I think it should be.
Quite a dramatic change which works well, but really seems to remove what I would call "now I can emulate most of Stackless" efficiently.
Maybe I should just pretend it is still implemented as before, build an abstraction on top, and use that for now.
I will spend my time at PyCon DE sprinting on "yield from".
The current implementation may not be much slower than Greg's original version. One of the main costs of making a call is the creation of a new frame. But calling into a generator does not need a new frame, so the cost will be reduced. Unless anyone has evidence to the contrary :)
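To make the frame argument concrete, here is a minimal sketch, assuming CPython (sys._getframe is an implementation detail). It only shows that a generator body keeps running in one and the same frame across resumptions, while every ordinary call gets a frame of its own; it says nothing about timings. The names plain_call and gen are made up for illustration.

```python
import sys

def plain_call():
    # Return the frame this particular invocation is executing in.
    return sys._getframe()

def gen():
    # On every resumption, hand back the frame the generator body runs in.
    while True:
        yield sys._getframe()

# Keep references to all frames so none can be freed and recycled.
plain_frames = [plain_call() for _ in range(3)]

g = gen()
gen_frames = [next(g) for _ in range(3)]

# Each ordinary call had its own frame object.
distinct_plain = len({id(f) for f in plain_frames}) == 3
# Every resumption of the generator reused the same frame object.
same_gen_frame = all(f is gen_frames[0] for f in gen_frames)

print("plain calls use distinct frames:", distinct_plain)
print("generator resumptions share one frame:", same_gen_frame)
```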
Rather than increasing the performance of this special case, I would suggest that improving the performance of calls & returns in general would be a more worthwhile goal. Calls and returns ought to be cheap.
I did a basic timing test using a simple recursive function and a recursive PEP-380 coroutine computing the same value (see attachment). The coroutine version is a little over twice as slow as the function version. I find that acceptable. This went 20 deep, making 2 recursive calls at each level (except at the deepest level).

Output on my MacBook Pro:

plain 2097151 0.5880069732666016
coro. 2097151 1.2958409786224365

This was a Python 3.3 built a few days ago from the 3.3 branch.

-- 
--Guido van Rossum (python.org/~guido)
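
For readers without the attachment, a test of this shape might look roughly like the sketch below. This is a reconstruction under assumptions, not the attached script: the names plain, coro, run and timed are hypothetical, and the only things taken from the message are the depth of 20, the two recursive calls per level, and the reported value 2097151 (which equals 2**21 - 1, the number of calls in such a tree). Timings will differ by machine and Python version.

```python
import time

def plain(n):
    # Count this call plus the calls in two subtrees of depth n - 1.
    if n <= 0:
        return 1
    return 1 + plain(n - 1) + plain(n - 1)

def coro(n):
    # Same computation, with each recursive step delegated via yield from.
    # The presence of "yield from" makes this a generator function even on
    # the path that returns immediately.
    if n <= 0:
        return 1
    left = yield from coro(n - 1)
    right = yield from coro(n - 1)
    return 1 + left + right

def run(gen):
    # Drive a generator to completion and return its return value,
    # which PEP 380 delivers as StopIteration.value.
    try:
        while True:
            next(gen)
    except StopIteration as exc:
        return exc.value

def timed(label, thunk):
    # Print label, result and elapsed wall-clock seconds.
    start = time.time()
    result = thunk()
    print(label, result, time.time() - start)
    return result

if __name__ == "__main__":
    depth = 20  # assumption: matches "20 deep" in the message
    timed("plain", lambda: plain(depth))
    timed("coro.", lambda: run(coro(depth)))
```

Since coro never actually yields a value, run() finishes on the first next() call; the loop is only there so the helper also works for coroutines that do yield.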