[capi-sig] running C method, benchmarks
python_capi at behnel.de
Wed Dec 24 07:46:58 CET 2008
Campbell Barton wrote:
> Discussing if it's worth moving Py/C functions from METH_VARARGS to
> METH_O when they only receive 1 argument on the PyGame mailing list.
> Tested different ways to evaluate args to see how much speed
> difference there was
> * 10,000,000 tests, python 2.6 on 32bit arch linux
> * included a pass and NOARGS method to see the difference in overhead
> of the loop and parsing an arg compared to running a method with no
> args
> ---- output
> pass 1.85659885406
> METH_NOARGS 3.24079704285
> METH_O 3.66321516037
> METH_VARARGS 6.09881997108
> METH_KEYWORDS 6.037307024
> METH_KEYWORDS (as keyword) 10.9263861179
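To put the totals quoted above on the same scale as per-call timeit figures, here is a rough sketch that subtracts the "pass" loop time and divides by the 10,000,000 iterations. It assumes the "pass" row measures pure loop overhead that is identical for every row, which is only an approximation; the names ITERATIONS, totals and net_usec_per_call are made up for this illustration.

```python
# Totals in seconds, copied from the quoted output above.
ITERATIONS = 10000000

totals = {
    "pass": 1.85659885406,
    "METH_NOARGS": 3.24079704285,
    "METH_O": 3.66321516037,
    "METH_VARARGS": 6.09881997108,
    "METH_KEYWORDS": 6.037307024,
    "METH_KEYWORDS (as keyword)": 10.9263861179,
}


def net_usec_per_call(total_seconds, baseline_seconds, iterations=ITERATIONS):
    """Per-call cost in microseconds after removing the empty-loop baseline.

    Assumption: the baseline loop overhead is the same for every measurement.
    """
    return (total_seconds - baseline_seconds) / iterations * 1e6


baseline = totals["pass"]
net = {
    name: net_usec_per_call(seconds, baseline)
    for name, seconds in totals.items()
    if name != "pass"
}

if __name__ == "__main__":
    for name, usec in net.items():
        print("%-28s %.3f usec per call" % (name, usec))
```

These per-call figures still come from a different Python version, machine and loop construct than the timeit runs, so they are only a ballpark comparison.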
I tried doing something similar in Cython, but it's not directly
comparable. Cython uses optimised code instead of a generic call to
ParseTupleAndKeywords and it will always give you a METH_O function when
you only use one argument. Anyway, here are the numbers. I used the latest
Cython developer version with gcc 4.1.3 on Linux.
def f0(): pass # METH_NOARGS
def f1(a): pass # METH_O
def f1opt(a=1): pass # METH_VARARGS|METH_KEYWORDS
def f2(a,b): pass # METH_VARARGS|METH_KEYWORDS
def f2opt(a=1,b=2): pass # METH_VARARGS|METH_KEYWORDS
$ python2.5 -m timeit -s '...; from calltest import f0' 'f0()'
10000000 loops, best of 3: 0.126 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt()'
10000000 loops, best of 3: 0.14 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt()'
10000000 loops, best of 3: 0.141 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
10000000 loops, best of 3: 0.145 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,2)'
1000000 loops, best of 3: 0.225 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,b=2)'
1000000 loops, best of 3: 0.489 usec per loop
I used Python 2.5.1 as the ihooks module in 2.6.1 is still broken, so
pyximport doesn't work (and I was too lazy to build the module by hand).
Note how f2opt is not much slower than f1opt (both METH_KEYWORDS), which in
turn is still faster than the METH_O function f1.
So my suggestion is that the main reasons for your METH_O function being
faster above are a) that you actually /pass/ arguments, i.e. Python's own
argument passing overhead, and b) the use of ParseTupleAndKeywords() in
your other functions above, which is very fast, but also very generic.
Cython's dedicated argument parsing code is a lot faster in most cases.
Could you repeat your benchmarks using timeit on 2.5 as I do above? That
would give us comparable numbers.
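For anyone who wants to rerun this kind of comparison from a script rather than the command line, a minimal sketch using the timeit module might look like the following. It is written for a current Python 3 (the globals argument to timeit.repeat does not exist in 2.5), and the functions are plain-Python stand-ins with the same signatures as the calltest functions above, so the absolute numbers will not match a compiled Cython or C module. The names NAMESPACE, CALLS and bench are made up for this illustration.

```python
import timeit


# Plain-Python stand-ins with the same signatures as the calltest functions.
# Assumption: replace these with imports from your own compiled module.
def f0(): pass
def f1(a): pass
def f1opt(a=1): pass
def f2(a, b): pass
def f2opt(a=1, b=2): pass


NAMESPACE = {"f0": f0, "f1": f1, "f1opt": f1opt, "f2": f2, "f2opt": f2opt}

# The same call expressions as in the timeit runs above.
CALLS = ["f0()", "f1opt()", "f2opt()", "f1(1)", "f2(1,2)", "f2(1,b=2)"]


def bench(stmt, number=1000000, repeat=3):
    """Return the best-of-`repeat` time per call, in microseconds."""
    timings = timeit.repeat(stmt, globals=NAMESPACE, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for stmt in CALLS:
        print("%-12s %.3f usec per loop" % (stmt, bench(stmt)))
```

Taking the minimum over the repeats mirrors the "best of 3" figure that the timeit command line reports.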