running C method, benchmarks

Discussing on the PyGame mailing list whether it's worth moving Py/C functions from METH_VARARGS to METH_O when they only receive one argument. Crossposting here since some of you might be interested in the results.
I tested different ways of evaluating args to see how much speed difference there was:
- 10,000,000 tests, Python 2.6 on 32-bit Arch Linux
- included a pass loop and a METH_NOARGS method to show the overhead of the loop and of parsing an arg, compared to running a method with no args.
---- output
pass 1.85659885406
METH_NOARGS 3.24079704285
METH_O 3.66321516037
METH_VARARGS 6.09881997108
METH_KEYWORDS 6.037307024
METH_KEYWORDS (as keyword) 10.9263861179
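Since the 'pass' loop is there to show the loop overhead, it can be subtracted from the other totals to estimate the cost of the call itself. A rough sketch of that arithmetic, written for Python 3 rather than the 2.6 used in the test; the numbers are copied from the output above, and the helper name net_usec_per_call is made up for illustration:

```python
# Timings copied from the output above (seconds for 10,000,000 calls).
RUN = 10000000
timings = {
    "pass": 1.85659885406,
    "METH_NOARGS": 3.24079704285,
    "METH_O": 3.66321516037,
    "METH_VARARGS": 6.09881997108,
    "METH_KEYWORDS": 6.037307024,
    "METH_KEYWORDS (as keyword)": 10.9263861179,
}


def net_usec_per_call(timings, runs, baseline="pass"):
    """Subtract the empty-loop time and return microseconds per call."""
    base = timings[baseline]
    return {
        name: (seconds - base) / runs * 1e6
        for name, seconds in timings.items()
        if name != baseline
    }


net = net_usec_per_call(timings, RUN)
for name, usec in net.items():
    print("%-28s %.3f usec per call" % (name, usec))
```

This only removes the cost of the bare loop; the name lookup and the call machinery on the Python side are still included in each figure.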
------- Python Script, (used blender module for testing)
import time
from Blender.sys import test_METHO, test_METH_VARARGS, test_METH_KEYWORDS, test_METH_NOARGS
RUN = 10000000

# Baseline: cost of the empty loop
t = time.time()
for i in xrange(RUN):
    pass
print 'pass', time.time()-t

t = time.time()
for i in xrange(RUN):
    test_METH_NOARGS()
print 'METH_NOARGS', time.time()-t

t = time.time()
for i in xrange(RUN):
    test_METHO(1)
print 'METH_O', time.time()-t

t = time.time()
for i in xrange(RUN):
    test_METH_VARARGS(1)
print 'METH_VARARGS', time.time()-t

t = time.time()
for i in xrange(RUN):
    test_METH_KEYWORDS(1)
print 'METH_KEYWORDS', time.time()-t

t = time.time()
for i in xrange(RUN):
    test_METH_KEYWORDS(val=1)
print 'METH_KEYWORDS (as keyword)', time.time()-t
------------------- C functions
/* METH_O: the single argument is passed directly, no tuple to unpack */
static PyObject *test_METHO( PyObject * self, PyObject * value )
{
    int val = (int)PyInt_AsLong(value);

    if( val==-1 && PyErr_Occurred() ) {
        PyErr_SetString(PyExc_AttributeError, "not an int");
        return NULL;
    }
    Py_RETURN_NONE;
}

/* METH_VARARGS: arguments arrive as a tuple and are parsed by format string */
static PyObject *test_METH_VARARGS( PyObject * self, PyObject * args )
{
    int val;

    if( !PyArg_ParseTuple( args, "i", &val ) )
        return NULL;
    Py_RETURN_NONE;
}

/* METH_KEYWORDS: positional tuple plus keyword dict */
static PyObject *test_METH_KEYWORDS( PyObject * self, PyObject * args, PyObject *kwd)
{
    int val;
    static char *kwlist[] = {"val", NULL};

    if( !PyArg_ParseTupleAndKeywords(args, kwd, "i", kwlist, &val) )
        return NULL;
    Py_RETURN_NONE;
}

/* METH_NOARGS: the second parameter is unused (always NULL) */
static PyObject *test_METH_NOARGS( PyObject * self, PyObject * args )
{
    Py_RETURN_NONE;
}

struct PyMethodDef M_sys_methods[] = {
    {"test_METHO", test_METHO, METH_O, ""},
    /* cast needed: the keyword variant takes three parameters */
    {"test_METH_KEYWORDS", (PyCFunction)test_METH_KEYWORDS, METH_KEYWORDS, ""},
    {"test_METH_NOARGS", test_METH_NOARGS, METH_NOARGS, ""},
    {"test_METH_VARARGS", test_METH_VARARGS, METH_VARARGS, ""},
    {NULL, NULL, 0, NULL}
};
--
- Campbell

Hi,
Campbell Barton wrote:
> Discussing if it's worth moving Py/C functions from METH_VARARGS to METH_O when they only receive 1 argument on the PyGame mailing list.
> tested different ways to evaluate args to see how much speed difference there was
> - 10,000,000 tests, python 2.6 on 32bit arch linux
> - included a pass and NOARGS method to see the difference in overhead of the loop and parsing an arg compared to running a method with no args.
> ---- output
> pass 1.85659885406
> METH_NOARGS 3.24079704285
> METH_O 3.66321516037
> METH_VARARGS 6.09881997108
> METH_KEYWORDS 6.037307024
> METH_KEYWORDS (as keyword) 10.9263861179
I tried doing something similar in Cython, but it's not directly comparable. Cython uses optimised code instead of a generic call to ParseTupleAndKeywords and it will always give you a METH_O function when you only use one argument. Anyway, here are the numbers. I used the latest Cython developer version with gcc 4.1.3 on Linux.
Benchmarked code:
def f0(): pass              # METH_NOARGS
def f1(a): pass             # METH_O
def f1opt(a=1): pass        # METH_VARARGS|METH_KEYWORDS
def f2(a,b): pass           # METH_VARARGS|METH_KEYWORDS
def f2opt(a=1,b=2): pass    # METH_VARARGS|METH_KEYWORDS
Benchmarks:
$ python2.5 -m timeit -s '...; from calltest import f0' 'f0()'
10000000 loops, best of 3: 0.126 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt()'
10000000 loops, best of 3: 0.14 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt()'
10000000 loops, best of 3: 0.141 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
10000000 loops, best of 3: 0.145 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,2)'
1000000 loops, best of 3: 0.225 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,b=2)'
1000000 loops, best of 3: 0.489 usec per loop
I used Python 2.5.1 as the ihooks module in 2.6.1 is still broken, so pyximport doesn't work (and I was too lazy to build the module by hand).
Note how f2opt is not much slower than f1opt (both METH_KEYWORDS), which in turn is still faster than the METH_O function f1.
So my suggestion is that the main reasons for your METH_O function being faster above are a) that you actually /pass/ arguments, i.e. Python's own argument passing overhead, and b) the use of ParseTupleAndKeywords() in your other functions above, which is very fast, but also very generic. Cython's dedicated argument parsing code is a lot faster in most cases.
Could you repeat your benchmarks using timeit on 2.5 as I do above? That would give us comparable numbers.
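For anyone who prefers a script over the command line, the same kind of measurement can be made with the timeit module's Python API. A minimal sketch, written for Python 3 rather than the 2.5/2.6 used in this thread; the three functions are pure-Python placeholders invented for illustration, and the C functions under test (e.g. test_METHO) would be swapped in for them:

```python
import timeit


# Placeholder callables; replace with the extension functions under test.
def noargs():
    pass


def one_arg(a):
    pass


def one_opt(a=1):
    pass


def bench(stmt, func, number=100000, repeat=3):
    """Best of `repeat` runs, in microseconds per call (like timeit's CLI)."""
    timer = timeit.Timer(stmt, globals={"f": func})
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


results = {
    "noargs()": bench("f()", noargs),
    "one_arg(1)": bench("f(1)", one_arg),
    "one_opt()": bench("f()", one_opt),
    "one_opt(a=1)": bench("f(a=1)", one_opt),
}
for label, usec in results.items():
    print("%-14s %.3f usec per loop" % (label, usec))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that the command-line tool reports.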
Stefan

I noticed that I forgot to run one interesting test, which is calling f2opt() with a single argument in comparison to calling f1() with one argument:
$ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
10000000 loops, best of 3: 0.145 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt(1)'
1000000 loops, best of 3: 0.204 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt(1)'
1000000 loops, best of 3: 0.204 usec per loop
So, yes, this actually is slower than the METH_O function f1(). It's not 66% as in your example, more like 40%, but it definitely is a lot slower. So I would say that the overhead of calling a METH_VARARGS|METH_KEYWORDS function versus a METH_O function is somewhere on the order of 40% when only positional arguments are involved.
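The 66% and 40% figures follow directly from the timings quoted in this thread. A small sketch of the arithmetic in Python 3; the helper name slowdown_percent is made up for illustration, and both ratios are taken from the raw totals, so loop overhead is still included:

```python
def slowdown_percent(slow, fast):
    """How much slower `slow` is than `fast`, in percent."""
    return (slow / fast - 1.0) * 100.0


# Hand-written C extension, total seconds: METH_VARARGS vs METH_O.
c_ext = slowdown_percent(6.09881997108, 3.66321516037)

# Cython, usec per loop from timeit: f1opt(1) vs f1(1).
cython = slowdown_percent(0.204, 0.145)

print("C extension: %.1f%% slower" % c_ext)
print("Cython:      %.1f%% slower" % cython)
```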
Stefan
participants (2)
- Campbell Barton
- Stefan Behnel