[Python-Dev] speeding function calls (a little)

Michael Hudson mwh at python.net
Thu Dec 18 10:45:52 EST 2003

On Thursday, Oct 30, 2003, at 05:24 Europe/Stockholm, Jeremy Hylton wrote:

[function calls]

> There is an optimization that depends on having no default arguments
> (or keyword arguments or free variables).

Why does it depend on not having default arguments?  If you supply the 
right number of arguments (something that's obviously already checked) 
why does the function having defaults make a jot of difference?
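To make the question concrete, here is a rough Python model of the eligibility test under discussion. It is only a sketch: the helper names (`fast_path_strict`, `fast_path_relaxed`, `_simple_signature`) are made up for illustration, and the real check is written in C inside fast_function() and may differ in detail.

```python
import inspect

_VARIADIC = inspect.CO_VARARGS | inspect.CO_VARKEYWORDS


def _simple_signature(func, nargs, nkwargs):
    """Conditions shared by both variants (hypothetical helper)."""
    co = func.__code__
    return (
        nkwargs == 0                        # no keyword arguments at the call site
        and co.co_argcount == nargs         # every positional slot is supplied
        and not (co.co_flags & _VARIADIC)   # no *args / **kwargs
        and co.co_kwonlyargcount == 0       # no keyword-only parameters
        and not co.co_freevars              # no free variables
        and not co.co_cellvars              # no cells to set up
    )


def fast_path_strict(func, nargs, nkwargs=0):
    """Model of the check as quoted: also refuses functions with defaults."""
    return func.__defaults__ is None and _simple_signature(func, nargs, nkwargs)


def fast_path_relaxed(func, nargs, nkwargs=0):
    """Model of the suggestion: defaults are irrelevant if all slots are filled."""
    return _simple_signature(func, nargs, nkwargs)


def plain(a, b):
    return a + b


def with_default(a, b=2):
    return a + b
```

Under this model, a call such as with_default(1, 5) fails the strict test but passes the relaxed one, while with_default(1) fails both because a default would have to be filled in.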

>  It copies the arguments directly
> from the caller's frame into the callee's frame without creating an
> argument tuple.
> It's interesting to avoid the copy from caller to callee, but I don't
> think it's a big cost relative to everything else we're doing to set up
> a frame for calling.  (I expect the number of arguments is usually
> small.)  You would need some way to encode what variables are loaded
> from the caller stack and what variables are loaded from the current
> frame.  Either a different opcode or some kind of flag in the current
> LOAD/STORE argument.
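As a toy illustration of the copy described above, the two functions below model a caller's value stack as a Python list and the callee's local slots as another list. The names `call_via_tuple` and `call_direct` are hypothetical, and this is a simplified picture rather than what ceval.c literally does.

```python
def call_via_tuple(stack, nargs, nlocals):
    """Generic path: gather the arguments into a temporary tuple first."""
    base = len(stack) - nargs
    args = tuple(stack[base:])      # the extra allocation the fast path avoids
    del stack[base:]                # pop the arguments off the caller's stack
    fastlocals = [None] * nlocals
    fastlocals[:nargs] = args       # unpack the tuple into the callee's slots
    return fastlocals


def call_direct(stack, nargs, nlocals):
    """Fast path: copy from the caller's stack straight into the callee."""
    base = len(stack) - nargs
    fastlocals = [None] * nlocals
    for i in range(nargs):
        fastlocals[i] = stack[base + i]
    del stack[base:]                # pop the arguments off the caller's stack
    return fastlocals
```

Both paths still copy each argument once into the callee; the direct path only saves building and tearing down the intermediate tuple.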

As I think Phillip managed to convince himself recently, some kind of 
JIT functionality seems to be needed to do function calls really 

I wonder if libffi does enough... it would be nice if the body of 
CALL_FUNCTION could look a bit like this:

  x = POP();
  PUSH(((some_cast_or_other)x)(f, stack_pointer, oparg));

Gah, this doesn't really seem to work out, on thinking about it.  Wins 
seem more likely to come from knowing with some certainty at the call 
site that you've not messed the arguments up (and so we're back to 
wanting a JIT, it seems to me).
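For what it's worth, the shape of that idea can be modelled in a few lines of Python. This is a toy under stated assumptions: the callee is taken to sit on top of the stack (in the real interpreter it sits below its arguments), and `call_function` and `add_entry` are hypothetical names.

```python
def call_function(frame, stack, oparg):
    """Toy CALL_FUNCTION: pop the callee and let it consume its own arguments."""
    x = stack.pop()                        # x = POP()
    stack.append(x(frame, stack, oparg))   # PUSH(x(f, stack_pointer, oparg))


def add_entry(frame, stack, oparg):
    """Hypothetical entry point for a two-argument 'add'."""
    # The callee still has to check the argument count at run time,
    # because the call site knows nothing about what it is calling.
    if oparg != 2:
        raise TypeError("add expects 2 arguments, got %d" % oparg)
    b = stack.pop()
    a = stack.pop()
    return a + b
```

The argument-count check inside `add_entry` is the part that does not go away: the dispatch becomes uniform, but the validation only moves into the callee.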

> One other possibility for optimization is this XXX comment in
> fast_function():
> 		/* XXX Perhaps we should create a specialized
> 		   PyFrame_New() that doesn't take locals, but does
> 		   take builtins without sanity checking them.
> 		*/
> 		f = PyFrame_New(tstate, co, globals, NULL);
> PyFrame_New() does a fair amount of work that is unnecessary in the
> common case.

Fair amount?  I have a patch that gets ~1.5% on pystone along these 
lines, but it's a bit scary (makes a "lightweight-frame" subclass that 
assumes more about things on its freelist, and so on).  I'm not sure 
the modest gains are worth the complexity, but I'll post it to SF...
