[Python-Dev] Store x Load x --> DupStore

Sun Feb 20 20:00:13 CET 2005

"Phillip J. Eby" <pje at telecommunity.com> writes:

> At 08:17 AM 2/20/05 -0800, Guido van Rossum wrote:
>>Where are the attempts to speed up function/method calls? That's an
>>area where we could *really* use a breakthrough...
>
> Amen!
>
> So what happened to Armin's pre-allocated frame patch?  Did that get into 2.4?

No, because it slows down recursive function calls, or functions that
happen to be called at the same time in different threads.  Fixing
*that* would require things like code specific frame free-lists and
that's getting a bit convoluted and might waste quite a lot of memory.

Eliminating the blockstack would be nice (esp. if it's enough to get
frames small enough that they get allocated by PyMalloc) but this
seemed to be tricky too (or at least Armin, Samuele and I spent a
cuple of hours yakking about it on IRC and didn't come up with a clear
approach).  Dynamically allocating the blockstack would be simpler,
and might acheive a similar win.  (This is all from memory, I haven't
thought about specifics in a while).

> Also, does anybody know where all the time goes in a function call,
> anyway?

I did once...

> I assume that some of the pieces are:
>
> * tuple/dict allocation for arguments (but some of this is bypassed on
>   the fast branch for Python-to-Python calls, right?)

All of it, in easy cases.  ISTR that the fast path could be a little
wider -- it bails when the called function has default arguments, but
I think this case could be handled easily enough.

> * frame allocation and setup (but Armin's patch was supposed to
>   eliminate most of this whenever a function isn't being used
>   re-entrantly)

Ah, you remember the wart :) I think even with the patch, frame setup
is a significant amount of work.  Why are frames so big?

> * argument "parsing" (check number of args, map kwargs to their
>   positions, etc.; but isn't some of this already fast-pathed for
>   Python-to-Python calls?)

Yes.  With some effort you could probably avoid a copy (and incref) of
the arguments from the callers to the callees stack area.  BFD.

> I suppose the fast branch fixes don't help special methods like
> __getitem__ et al, since those don't go through the fast branch, but I
> don't think those are the majority of function calls.

Indeed.  I suspect this fails the effort/benefit test, but I could be
wrong.

> And whatever happened to CALL_METHOD?

It didn't work as an optimization, as far as I remember.  I think the
patch is on SF somewhere.  Or is a branch in CVS?  Oh, it's patch
#709744.

> Do we need a tp_callmethod that takes an argument array, length, and
> keywords, so that we can skip instancemethod allocation in the
> common case of calling a method directly?

Hmm, didn't think of that, and I don't think it's how the CALL_ATTR
attempt worked.  I presume it would need to take a method name too :)

I already have a patch that does this for regular function calls (it's
a rearrangement/refactoring not an optimization though).

Cheers,
mwh

-- 
  I think perhaps we should have electoral collages and construct
  our representatives entirely of little bits of cloth and papier 
  mache.          -- Owen Dunn, ucam.chat, from his review of the year