[Python-Dev] Speed up function calls
Raymond Hettinger
python at rcn.com
Mon Jan 24 09:11:05 CET 2005
[Neal Norwitz]
> I would like feedback on whether the approach is desirable.
>
> The patch adds a new method type (flags) METH_ARGS that is used in
> PyMethodDef. METH_ARGS means the min and max # of arguments are
> specified in the PyMethodDef by adding 2 new fields. This information
> can be used in ceval to call the method. No tuple packing/unpacking is
> required since the C stack is used.
>
> The benefits are:
> * faster function calls
> * simplify function call machinery by removing METH_NOARGS, METH_O,
> and possibly METH_VARARGS
> * more introspection info for C functions (i.e., min/max arg count)
> (not implemented)
An additional benefit would be improving the C-API by allowing C calls
without creating temporary argument tuples. Also, some small degree of
introspection becomes possible when a method knows its own arity.
Replacing METH_O and METH_NOARGS seems straightforward, but
METH_VARARGS has much broader capabilities. How would you handle the
simple case of "O|OO"? How could you determine useful default values
(NULL, 0, -1, -909, etc.)?
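To make the default-value question concrete, here is a rough Python model of a min/max-arity call. It is only a sketch, not CPython's C code: `call_unpacked`, `MISSING`, and `find_from` are hypothetical names, and `MISSING` merely stands in for a NULL pointer in an unused C argument slot.

```python
# Hypothetical model of a min/max-arity call; not CPython's implementation.
MISSING = object()  # stands in for a NULL pointer in a C argument slot


def call_unpacked(func, min_args, max_args, *args):
    """Check the arity, then pad absent optional arguments with MISSING."""
    if not (min_args <= len(args) <= max_args):
        raise TypeError(
            "expected %d to %d arguments, got %d"
            % (min_args, max_args, len(args))
        )
    padded = args + (MISSING,) * (max_args - len(args))
    return func(*padded)


def find_from(seq, start, stop):
    """Models an "O|OO" function: each optional slot needs its own default."""
    if start is MISSING:
        start = 0            # default chosen by the function itself
    if stop is MISSING:
        stop = len(seq)      # a default that depends on another argument
    return seq[start:stop]


result_all = call_unpacked(find_from, 1, 3, "abcdef")
result_some = call_unpacked(find_from, 1, 3, "abcdef", 2)
```

In this model the caller can only supply one generic "absent" marker; the meaningful defaults (0, the sequence length, and so on) still have to be worked out inside each function.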
If you solve the default value problem, then please also try to come up
with a better flag name than METH_ARGS which I find to be indistinct
from METH_VARARGS and also not very descriptive of its functionality.
Perhaps something like METH_UNPACKED would be an improvement.
> The drawbacks are:
> * the defn of the MethodDef (# args) is separate from the function defn
> * potentially more error prone to write C methods???
No worse than with METH_O or METH_NOARGS.
> I've measured between 13-22% speed improvement (debug build on
> Opteron) when doing simple tests like:
>
> ./python ./Lib/timeit.py -v 'pow(3, 5)'
>
> I think the difference tends to be fairly constant at about .3 usec
> per loop.
If speed is the main advantage being sought, it would be worthwhile to
conduct more extensive timing tests with a variety of code and not using
a debug build. Running test.test_decimal would be a useful overall
benchmark.
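A somewhat broader timing run could be sketched with the standard timeit module along these lines. The statement list is an illustrative assumption rather than a benchmark suite from this thread, `best_usec_per_loop` is a hypothetical helper, and the numbers only mean something on a non-debug build.

```python
import timeit

# Illustrative statements only; each one exercises a call into C code.
STATEMENTS = [
    "pow(3, 5)",
    "len(s)",
    "s.find('c')",
    "divmod(17, 5)",
]


def best_usec_per_loop(stmt, setup="s = 'abcdef'", number=20000, repeat=3):
    """Return the best-of-`repeat` time per loop in microseconds."""
    timer = timeit.Timer(stmt, setup=setup)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best * 1e6 / number


timings = {stmt: best_usec_per_loop(stmt) for stmt in STATEMENTS}
for stmt, usec in timings.items():
    print("%-16s %.3f usec per loop" % (stmt, usec))
```

Running the same script against patched and unpatched builds would show whether the difference holds across call shapes and not just for pow().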
In theory, I don't see how you could improve on METH_O and METH_NOARGS.
The only saving is the time for the flag test (a predictable branch).
Offsetting that saving is the additional time for checking min/max args
and for constructing a C call with the appropriate number of args. I
suspect there is no saving here and that the timings will get worse.
In all likelihood, the only real opportunity for savings is replacing
METH_VARARGS in cases that have already been sped up using
PyTuple_Unpack(). Those can be further improved by eliminating the time
to build and unpack the temporary argument tuple.
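The two call paths being compared can be modelled in Python as follows. This is only an analogy, not CPython's implementation, and all function names are hypothetical: one path packs the arguments into a temporary tuple that the callee unpacks again, while the other hands the values over directly.

```python
# Hypothetical model of the two call paths; not CPython's implementation.

def add3_varargs(args):
    """METH_VARARGS style: the callee receives one tuple and unpacks it."""
    a, b, c = args           # plays the role of PyTuple_Unpack()
    return a + b + c


def add3_unpacked(a, b, c):
    """Proposed style: the callee receives its arguments directly."""
    return a + b + c


def call_via_tuple(func, stack):
    """The caller allocates a temporary tuple just to pass the arguments."""
    temp = tuple(stack)      # the allocation the proposal would avoid
    return func(temp)


def call_direct(func, stack):
    """The caller passes the stack values straight through."""
    return func(stack[0], stack[1], stack[2])


stack = [1, 2, 3]
via_tuple = call_via_tuple(add3_varargs, stack)
direct = call_direct(add3_unpacked, stack)
```

Both paths compute the same result; the only difference is the temporary tuple, which is where any real saving would have to come from.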
Even then, I don't see how to overcome the need to set useful default
values for optional object arguments.
Raymond Hettinger