[Python-3000] Function call speed (Was: Cleaning up argument list parsing)

Guido van Rossum guido at python.org
Wed Apr 19 12:35:03 CEST 2006


[I don't want to lose this message, but I don't have the time to read
it and do the complementary source research right now. So I'm
recommending you do some more research yourself.]

You're right that the API is very complex. However, have you tried
stepping through some typical calls? There's a "fast path" that avoids
most of the overhead, at least when no keywords or */** notation are
used, and I believe there's even a semi-fast path for when there are
keyword args or defaults but no */** in either the call or the def.

I recommend profiling. I seem to recall that last time we did this, we
found that creating the frame object is a significant cost factor.
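
As a starting point for that profiling, here is a minimal sketch that
times the three call forms mentioned above: positional only, keyword
args, and */** notation. The names target, time_call, CALL_FORMS and
measure are made up for illustration; nothing here comes from the
interpreter source. It only measures end-to-end call time, so it cannot
by itself show how much of the cost is frame creation, and the numbers
will vary with interpreter version and machine.

```python
import timeit


def target(a, b, c=3):
    """Hypothetical callee with two positional parameters and one default."""
    return a


# The three call forms discussed in this thread, as statements for timeit.
CALL_FORMS = {
    "positional only": "target(1, 2)",
    "keyword args": "target(1, 2, c=4)",
    "*/** notation": "target(*args, **kwargs)",
}


def time_call(stmt, number=200_000):
    """Return the best of three timings (in seconds) for `number` runs of stmt."""
    env = {"target": target, "args": (1, 2), "kwargs": {"c": 4}}
    return min(timeit.repeat(stmt, globals=env, number=number, repeat=3))


def measure(number=200_000):
    """Time every call form and return a {label: seconds} dict."""
    return {label: time_call(stmt, number) for label, stmt in CALL_FORMS.items()}


if __name__ == "__main__":
    for label, seconds in measure().items():
        print(f"{label:16s} {seconds:.4f} s")
```

Comparing the three lines of output gives a rough idea of how much the
keyword and */** forms add over the plain positional call on a given
build. Attributing the time to specific steps inside the interpreter
would still need a C-level profiler.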

--Guido

On 4/19/06, Talin <talin at acm.org> wrote:
> I wanted to follow up a bit on the issue of speeding up function calls with
> some more specifics.
>
> Currently, the function call mechanism has to do a lot of conversion
> operations on the function arguments. Specifically, it takes the various
> positional arguments, keyword arguments, varargs argument, and kwargs, and
> combines them into a single tuple and a single dict. It then calls the target
> function via PyObject_Call.
>
> On the receiving side, a different conversion takes place, depending on the
> type of the callable. For user-defined functions, it maps the various
> arguments in the tuple and dict into the function's parameter slots,
> along with considerations of default values and such.
>
> One way to speed this up would be to determine cases where the conversion
> steps can be skipped. There is already a method for this in the case where
> there are only positional arguments, no varargs or keywords.
>
> Ideally, we would like to be able to pass the caller's argument stack pretty
> much as-is to the receiver. At the moment, that consists of an array of
> positional arguments, followed by an array of key/value pairs for the keyword
> arguments, followed by an optional vararg or kwarg.
>
> Assume for a moment, then, that we had a variation of PyObject_Call which had
> a function signature that was exactly that:
>
>     PyObject *PyObject_CallFast(
>         PyObject *func,
>         PyObject **args, int numArgs,
>         PyObject **kargs, int numKwArgs,
>         PyObject *starargs,
>         PyObject *kwarg );
>
> The receiver would have to do extra work, because now it has to check two
> places for each argument. For a keyword argument, for example, it has to check
> both the key/value array and the kwargs dict argument.
>
> However, this is far less work than building a new dictionary. In addition, we
> know that the keywords are interned strings, not arbitrary objects, so we can
> speed up the comparison operation by avoiding type tests.
>
> Also, because the keyword args are in a linear list, they will exhibit better
> cache behavior, and the lookup will be very fast - a simple loop and test. (If
> you want to write optimal code for today's processors, you pretty much have to
> throw out everything you learned about unrolling loops and such, and learn to
> think in terms of cache lines, not bytes and words.)
>
> Now, it may be that in some cases, the receiver really does require a tuple
> and a dict. However, in such a case, the onus should be on the receiver to
> convert the argument list into the preferred form.
>
> So in other words, it would be the receiver's job to convert the arguments
> into whatever data structure the receiver required. For receivers with
> relatively simple calling signatures, it may mean no conversion is done at
> all, and the caller's argument list can be used directly. The most complex
> part would be validating the arguments - making sure that a keyword argument
> didn't get assigned more than once, and so on.
>
> What about older code? Well, I suppose you could add an additional field to
> the receiving object, indicating whether it accepted the traditional
> PyObject_Call-style args, or the newer format. Essentially you would have two
> function pointers in the object, one for newstyle calling, and one for
> oldstyle. If the newstyle was present, then the interpreter would use it,
> otherwise it would fall back to the older means of transforming the arguments
> into a tuple and a dict.
>
> In any case, the reason why this is "Py3K" material is that it involves
> busting the existing C API to some extent.
>
> -- Talin


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

