[Python-Dev] Idea for a fast calling convention

Phillip J. Eby pje at telecommunity.com
Sat Feb 28 10:53:54 EST 2004


At 12:39 PM 2/28/04 +0000, Michael Hudson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> writes:
>
> > Maybe there should instead be a tp_call_stack slot.  Then the various
> > CALL opcodes would call that slot instead of tp_call.  C API calls
> > would still go through tp_call.
>
>People *really* should look at the patch I mentioned...

Ah, I see you've implemented this already.  :)


> > In practice, though, I expect it would be faster to do as Jython and
> > IronPython have done, and define a set of tp_call1, tp_call2,
> > etc. slots that are optimized for specific calling situations,
> > allowing C API calls to be sped up as well, provided you used things
> > like PyObject_Call1(ob,arg), PyObject_Call2(ob,arg1,arg2), and so on.
>
>I think this only really helps when you have a JIT compiler of some
>sort?

No, all you need is a mechanism similar to the one in your patch, where the 
creation of a method or function object includes the selecting of a 
function pointer for each of the call signatures.  So, when creating these 
objects, you'd select a call0 pointer, a call1 pointer, and so on.  If the 
function requires more arguments than provided for that slot, you just 
point it to an "insufficient args" version of the function.  If the 
function has defaults that would be filled in, you point it to a version 
that fills in the defaults, and so on.
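
To make the idea concrete, here is a minimal sketch of choosing the per-signature pointers once, at creation time. It is a toy model and not CPython code: long stands in for PyObject *, a -1 return stands in for raising TypeError, and every name in it (Func, func_init, call0_fill_default, and so on) is made up for illustration. It only covers a one-parameter function with an optional default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types: long stands in for PyObject *, and none of
   these names are part of the CPython API. */
typedef struct Func Func;
typedef int (*call0_fn)(Func *f, long *result);
typedef int (*call1_fn)(Func *f, long arg, long *result);

struct Func {
    long (*body)(long);   /* the one-parameter function being wrapped */
    long default_arg;     /* used only when has_default is set */
    int has_default;
    call0_fn call0;       /* chosen once, at creation time */
    call1_fn call1;
};

/* call0 variant for a function with no default: always an error. */
static int call0_insufficient_args(Func *f, long *result) {
    (void)f;
    (void)result;
    return -1;            /* stands in for raising TypeError */
}

/* call0 variant that fills in the default before calling the body. */
static int call0_fill_default(Func *f, long *result) {
    *result = f->body(f->default_arg);
    return 0;
}

/* call1 variant: the argument count matches exactly. */
static int call1_exact(Func *f, long arg, long *result) {
    *result = f->body(arg);
    return 0;
}

/* Select one pointer per call signature when the object is created,
   so no argument-count check is needed at call time. */
void func_init(Func *f, long (*body)(long), int has_default, long default_arg) {
    assert(body != NULL);
    f->body = body;
    f->has_default = has_default;
    f->default_arg = default_arg;
    f->call0 = has_default ? call0_fill_default : call0_insufficient_args;
    f->call1 = call1_exact;
}

/* Example body used to exercise the slots. */
long twice(long x) { return 2 * x; }
```

A caller then just does f->call0(f, &result) or f->call1(f, arg, &result); which variant runs was decided when func_init was called.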

So, there's not a JIT required, but making the patch could be tedious, 
repetitive and error prone.  Macros could possibly help a bit with the 
repetition and tedium, although not necessarily the error-prone part.  :)


> > Perhaps there is some information that can be gleaned from the Jython
> > research as to what are the most common number of positional
> > parameters for calls.
>
>That's easy: 0 then 1 then 2 then 3 then insignificant.

Actually, I'd find that a bit surprising, as I would expect 1-argument calls 
to be more popular than 0-argument calls.  But if you look at a method call, 
a 0-argument call to the method maps to a 1-argument call on the underlying 
function, so that would at least put 0-argument calls in the running.

Anyway, I guess a 0-argument call for an instancemethod would look 
something like:

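PyObject *
method_call0(PyObject *self) {
     /* self is an instancemethod.  PyObject_Call0 and PyObject_Call1 are
        the proposed fast-call APIs, not existing ones. */
     PyObject *im_self = PyMethod_GET_SELF(self);
     PyObject *im_func = PyMethod_GET_FUNCTION(self);
     if (im_self) {
         /* bound: pass the instance as the first argument */
         return PyObject_Call1(im_func, im_self);
     } else {
         return PyObject_Call0(im_func);
     }
}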

And then you'd repeat this pattern for call1 and call2, but call3 would 
have to fall back to creating an argument tuple if we're only going up to 
call3 slots, since a bound method's call3 has to pass four arguments (self 
plus three) to the function.
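
As a rough sketch of that fallback, again as a toy model and not CPython code: long stands in for PyObject *, a plain array stands in for the argument tuple, and all names here (Func, Method, call_tuple, sum_func, and so on) are hypothetical. It assumes slots exist only up to call3.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types: long stands in for PyObject *, and a plain
   array stands in for the argument tuple.  Not the CPython API. */
typedef struct {
    long (*call1)(long);
    long (*call2)(long, long);
    long (*call3)(long, long, long);
    long (*call_tuple)(const long *args, size_t nargs);  /* general path */
} Func;

typedef struct {
    const Func *im_func;
    int bound;       /* nonzero when im_self is meaningful */
    long im_self;
} Method;

/* Two arguments plus self still fits the function's call3 slot. */
long method_call2(const Method *m, long a, long b) {
    if (m->bound)
        return m->im_func->call3(m->im_self, a, b);
    return m->im_func->call2(a, b);
}

/* Three arguments plus self makes four, and there is no call4 slot,
   so the bound case has to pack the arguments and take the slow path. */
long method_call3(const Method *m, long a, long b, long c) {
    if (m->bound) {
        long args[4];
        args[0] = m->im_self;
        args[1] = a;
        args[2] = b;
        args[3] = c;
        return m->im_func->call_tuple(args, 4);
    }
    return m->im_func->call3(a, b, c);
}

/* Example target: sums its arguments and counts slow-path calls. */
int tuple_calls = 0;
long sum1(long a) { return a; }
long sum2(long a, long b) { return a + b; }
long sum3(long a, long b, long c) { return a + b + c; }
long sum_tuple(const long *args, size_t nargs) {
    long total = 0;
    size_t i;
    assert(args != NULL);
    tuple_calls++;
    for (i = 0; i < nargs; i++)
        total += args[i];
    return total;
}
const Func sum_func = { sum1, sum2, sum3, sum_tuple };
```

Only the bound three-argument case pays for packing; an unbound method_call3 still goes straight to the function's call3 slot.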

Interestingly, the patterns involved in making these functions are 
basically left currying (methods) and right currying (default args).  It 
may be that there's some way to generalize this, perhaps by using wrapping 
objects.  On the other hand, that would tend to go against the intended 
speedup.



>   ARTHUR:  Don't ask me how it works or I'll start to whimper.
>                    -- The Hitch-Hikers Guide to the Galaxy, Episode 11

Heh.  Did you pick this quote intentionally?  :)



