
Hi,
Campbell Barton wrote:
Discussing on the PyGame mailing list whether it's worth moving Py/C functions from METH_VARARGS to METH_O when they only receive 1 argument.
Tested different ways to evaluate args to see how much speed difference there was:
- 10,000,000 tests, python 2.6 on 32bit arch linux
- included a pass and a NOARGS method to see the difference in overhead of the loop and parsing an arg compared to running a method with no args.
---- output
pass                        1.85659885406
METH_NOARGS                 3.24079704285
METH_O                      3.66321516037
METH_VARARGS                6.09881997108
METH_KEYWORDS               6.037307024
METH_KEYWORDS (as keyword)  10.9263861179
I tried doing something similar in Cython, but it's not directly comparable. Cython uses optimised code instead of a generic call to ParseTupleAndKeywords and it will always give you a METH_O function when you only use one argument. Anyway, here are the numbers. I used the latest Cython developer version with gcc 4.1.3 on Linux.
Benchmarked code:
    def f0(): pass              # METH_NOARGS
    def f1(a): pass             # METH_O
    def f1opt(a=1): pass        # METH_VARARGS|METH_KEYWORDS
    def f2(a,b): pass           # METH_VARARGS|METH_KEYWORDS
    def f2opt(a=1,b=2): pass    # METH_VARARGS|METH_KEYWORDS
Benchmarks:
    $ python2.5 -m timeit -s '...; from calltest import f0' 'f0()'
    10000000 loops, best of 3: 0.126 usec per loop
    $ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt()'
    10000000 loops, best of 3: 0.14 usec per loop
    $ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt()'
    10000000 loops, best of 3: 0.141 usec per loop
    $ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
    10000000 loops, best of 3: 0.145 usec per loop
    $ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,2)'
    1000000 loops, best of 3: 0.225 usec per loop
    $ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,b=2)'
    1000000 loops, best of 3: 0.489 usec per loop
I used Python 2.5.1 as the ihooks module in 2.6.1 is still broken, so pyximport doesn't work (and I was too lazy to build the module by hand).
Note how f2opt is not much slower than f1opt (both METH_KEYWORDS), which in turn is still faster than the METH_O function f1.
So my suggestion is that the main reasons for your METH_O function being faster above are a) that you actually /pass/ arguments, i.e. Python's own argument passing overhead, and b) the use of ParseTupleAndKeywords() in your other functions above, which is very fast, but also very generic. Cython's dedicated argument parsing code is a lot faster in most cases.
Could you repeat your benchmarks using timeit on 2.5 as I do above? That would give us comparable numbers.
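If the command line is inconvenient, the same kind of measurement can be scripted. What follows is only a rough sketch, not the benchmark I ran: the functions are pure-Python stand-ins (an assumption, swap in the ones from your extension module), the names CALLS, NAMESPACE and bench are made up for illustration, and the globals argument to timeit.Timer needs a much newer Python (3.5 or later) than the 2.5 used above.

```python
import timeit

# Pure-Python stand-ins (assumption): replace these with the functions
# imported from the C or Cython extension module under test.
def f0(): pass
def f1(a): pass
def f2(a, b): pass

# Namespace in which timeit evaluates the statements.
NAMESPACE = {"f0": f0, "f1": f1, "f2": f2}

# (label, statement) pairs, mirroring the call styles compared above.
CALLS = [
    ("no args", "f0()"),
    ("one positional", "f1(1)"),
    ("two positional", "f2(1, 2)"),
    ("one as keyword", "f2(1, b=2)"),
]

def bench(stmt, number=1000000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs."""
    timer = timeit.Timer(stmt, globals=NAMESPACE)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6

if __name__ == "__main__":
    for label, stmt in CALLS:
        print("%-16s %-12s %.3f usec per call" % (label, stmt, bench(stmt)))
```

Taking the minimum over the repeats is what "best of 3" in the timeit output means, so numbers produced this way should be read the same way as the ones above.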
Stefan