[Python-Dev] Re: Speed of test_sort.py
Thu, 01 Aug 2002 15:05:16 -0400
[Guido, pins the blame on PyFrame_New -- cool!]
> Suggestion: doesn't test_longexp create some frames with a very large
> number of local variables? Then PyFrame_New could spend a lot of time
> in this loop:
> while (--extras >= 0)
> f->f_localsplus[extras] = NULL;
In my poor man's profiling <wink>, I ran the self-contained test case posted
earlier under the debugger with REPS=120000, and since the "sort" part takes
20 seconds then, there was lots of opportunity to break at random times (the
MSVC debugger lets you do that, i.e. click a button that means "I don't care
where you are, break *now*"). It was always in that loop when it broke, and
extras always started life at 120000 before that loop. Yikes!
> There's a free list of frames, and PyFrame_New picks the first frame
> on the free list. It grows the space for locals if necessary, but it
> never shrinks it.
> Back to Tim -- does this make sense? Should we attempt to fix it?
I can't make sufficient time to think about this, but I suspect a principled
fix is simply to delete this assignment:
extras = f->ob_size;
The number of extras the code object actually needs was already computed
correctly earlier, via
extras = code->co_stacksize + code->co_nlocals + ncells + nfrees;
and there's no point clearing any more than that original value. IOW, I
don't think it hurts to have a big old frame left on the freelist, the pain
comes from clearing out more slots in it than the *current* code object
needs.
A quick test of this showed it cured the test_longexp + test_sort speed
problem, and the regression suite ran without problems.
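The effect can be sketched with a toy model of the freelist (this is
illustrative C, not the real CPython code; the names ToyFrame, toyframe_new
and toyframe_free are invented for the demo, and the `buggy` flag selects
between clearing f->ob_size slots versus clearing only what the current code
object asked for):

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the frame freelist.  A recycled frame keeps its
 * largest-ever capacity (ob_size); the reported slowdown comes from
 * clearing all ob_size slots on every reuse instead of just the
 * slots the current code object needs. */
typedef struct toyframe {
    int ob_size;              /* allocated slot capacity; never shrinks */
    struct toyframe *f_back;  /* freelist link (reused for chaining) */
    void **f_localsplus;      /* locals + value-stack slots */
} ToyFrame;

static ToyFrame *free_list = NULL;

/* Allocate or recycle a frame needing `extras` slots.  `cleared_out`
 * reports how many slots the clearing loop touched, so the cost of the
 * buggy path is directly observable. */
static ToyFrame *toyframe_new(int extras, int buggy, int *cleared_out) {
    ToyFrame *f;
    if (free_list == NULL) {
        f = malloc(sizeof(ToyFrame));
        f->f_localsplus = malloc((size_t)extras * sizeof(void *));
        f->ob_size = extras;
    } else {
        f = free_list;
        free_list = f->f_back;
        if (f->ob_size < extras) {
            f->f_localsplus = realloc(f->f_localsplus,
                                      (size_t)extras * sizeof(void *));
            f->ob_size = extras;
        } else if (buggy) {
            extras = f->ob_size;  /* the assignment the fix deletes */
        }
    }
    *cleared_out = 0;
    while (--extras >= 0) {       /* the loop Guido fingered */
        f->f_localsplus[extras] = NULL;
        (*cleared_out)++;
    }
    return f;
}

static void toyframe_free(ToyFrame *f) {
    f->f_back = free_list;        /* push back onto the freelist */
    free_list = f;
}
```

After one frame with 100000 slots (a la test_longexp) has been recycled,
the buggy path clears 100000 slots even for a frame that needs only 10;
with the assignment deleted it clears just 10.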
If someone understands this code well enough to finish thinking about
whether that's a correct thing to do, please do!