[New-bugs-announce] [issue27128] Add _PyObject_FastCall()
STINNER Victor
report at bugs.python.org
Thu May 26 06:15:56 EDT 2016
New submission from STINNER Victor:
Since issue #26814 showed that avoiding the creation of temporary tuples when calling Python and C functions makes Python faster (between 2% and 29%, depending on the benchmark), I extracted a first "minimal" patch to start merging this work.
The first patch adds new functions:
* PyObject_CallNoArg(func) and PyObject_CallArg1(func, arg): public functions
* _PyObject_FastCall(func, args, nargs, kwargs): private function
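To make the shape of these functions concrete, here is a minimal, self-contained sketch of the calling convention in plain C. It does not use the CPython headers: toy_obj, toy_func, toy_fastcall(), toy_call_noarg() and toy_call_arg1() are hypothetical stand-ins for PyObject, a callable, _PyObject_FastCall(), PyObject_CallNoArg() and PyObject_CallArg1(), and keyword arguments are left out. It only illustrates the idea of passing a C array plus a count instead of a tuple; it is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for PyObject (assumption): just an integer payload. */
typedef struct { long value; } toy_obj;

/* Toy callable (assumption): receives a C array of arguments and its
   length instead of a heap-allocated tuple. */
typedef long (*toy_func)(toy_obj *const *args, ptrdiff_t nargs);

/* Models _PyObject_FastCall(func, args, nargs, kwargs), without kwargs. */
static long toy_fastcall(toy_func func, toy_obj *const *args, ptrdiff_t nargs)
{
    assert(nargs >= 0);
    assert(nargs == 0 || args != NULL);
    return func(args, nargs);
}

/* Models PyObject_CallNoArg(func): no argument array is needed at all. */
static long toy_call_noarg(toy_func func)
{
    return toy_fastcall(func, NULL, 0);
}

/* Models PyObject_CallArg1(func, arg): the single argument lives in a
   one-element array on the C stack, so nothing is allocated. */
static long toy_call_arg1(toy_func func, toy_obj *arg)
{
    toy_obj *stack[1] = { arg };
    return toy_fastcall(func, stack, 1);
}

/* Example callable: sums the payloads of its arguments. */
static long toy_sum(toy_obj *const *args, ptrdiff_t nargs)
{
    long total = 0;
    for (ptrdiff_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}
```

In this model, the two convenience wrappers are thin layers over the array-based call, which is the relationship the list above suggests between the public and private functions.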
I hesitate between the C types "int" and "Py_ssize_t" for nargs. I read once that using "int" can cause performance issues in a loop using "i++" and "data[i]" because the compiler has to handle integer overflow of the int type.
The "int" type is also annoying on 64-bit Windows: it causes compiler warnings on downcasts, like PyTuple_GET_SIZE(co->co_argcount) stored into a C int.
_PyObject_FastCall() avoids creating a tuple for:
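The two concerns can be illustrated with a small sketch. It assumes ptrdiff_t as a stand-in for Py_ssize_t (both are signed, pointer-sized integers on common platforms), and sum_items() and narrow_to_int() are hypothetical helper names. The sketch does not measure performance; it only shows a loop indexed with a pointer-sized type and the explicit range check that a downcast to int would otherwise need.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Loop using "i++" and "data[i]" with a pointer-sized signed index,
   so no conversion between int and the array's natural index width
   is involved. */
static long sum_items(const long *data, ptrdiff_t n)
{
    long total = 0;
    for (ptrdiff_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Storing a pointer-sized count into an int is a narrowing conversion
   on 64-bit platforms. Doing it safely needs an explicit range check.
   Returns 0 on success, -1 if the value does not fit in an int. */
static int narrow_to_int(ptrdiff_t n, int *out)
{
    assert(out != NULL);
    if (n < INT_MIN || n > INT_MAX) {
        return -1;
    }
    *out = (int)n;
    return 0;
}
```

Using the wider type for nargs throughout would make helpers like narrow_to_int() unnecessary at the call sites.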
* All Python functions (PyFunction_Check)
* C functions using METH_NOARGS or METH_O
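The dispatch described above can be modelled in plain C as follows. This is a toy sketch, not the patch: toy_callable, the TOY_* flags and toy_tuple are hypothetical stand-ins for a C function object, the METH_NOARGS / METH_O / METH_VARARGS conventions and a tuple. The counter toy_tuple_allocs exists only to show which paths allocate a temporary tuple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { long value; } toy_obj;                      /* toy PyObject */
typedef struct { toy_obj **items; ptrdiff_t size; } toy_tuple; /* toy tuple  */

enum toy_flags { TOY_NOARGS, TOY_O, TOY_VARARGS };

typedef struct {
    enum toy_flags flags;
    long (*noargs)(void);                   /* used when flags == TOY_NOARGS  */
    long (*one)(toy_obj *arg);              /* used when flags == TOY_O       */
    long (*varargs)(const toy_tuple *args); /* used when flags == TOY_VARARGS */
} toy_callable;

static int toy_tuple_allocs = 0; /* number of temporary tuples created */

/* Returns 0 and stores the result on success, -1 on error
   (wrong argument count or allocation failure). */
static int toy_fastcall(const toy_callable *f, toy_obj **args,
                        ptrdiff_t nargs, long *result)
{
    assert(nargs >= 0);
    switch (f->flags) {
    case TOY_NOARGS:                 /* fast path: no tuple */
        if (nargs != 0) return -1;
        *result = f->noargs();
        return 0;
    case TOY_O:                      /* fast path: no tuple */
        if (nargs != 1) return -1;
        *result = f->one(args[0]);
        return 0;
    case TOY_VARARGS: {              /* slow path: build a temporary tuple */
        toy_tuple t;
        t.size = nargs;
        t.items = malloc((size_t)(nargs > 0 ? nargs : 1) * sizeof *t.items);
        if (t.items == NULL) return -1;
        if (nargs > 0) memcpy(t.items, args, (size_t)nargs * sizeof *t.items);
        toy_tuple_allocs++;
        *result = f->varargs(&t);
        free(t.items);
        return 0;
    }
    }
    return -1;
}

/* Example callables for the three conventions. */
static long toy_answer(void) { return 42; }
static long toy_twice(toy_obj *arg) { return 2 * arg->value; }
static long toy_count(const toy_tuple *args) { return (long)args->size; }
```

Only the last convention pays for a temporary tuple in this model, which is the case a later calling convention for C functions would be meant to cover.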
The patch removes the "cache tuple" optimization from property_descr_get() and uses PyObject_CallArg1() instead. As a result, the optimization is (currently) missed in some cases compared to the current code, but the code is safer and simpler.
The patch adds Python/pystack.c which currently only contains _PyStack_AsTuple(), but will contain more code later.
I tried to write the smallest possible patch, but I did start using PyObject_CallNoArg() and PyObject_CallArg1() where the code already created a tuple at each call: PyObject_CallObject(), call_function_tail() and PyEval_CallObjectWithKeywords().
In the patch, keywords are not used in fast calls, but they will be used later. I prefer to start with keywords directly rather than change the calling convention once again later.
--
Later, I will propose other patches to:
* add METH_FASTCALL calling convention for C functions
* modify Argument Clinic to use METH_FASTCALL
So the fast call will be taken in more cases.
--
The long-term plan is to slowly use the new FASTCALL calling convention "everywhere". The tricky points are the tp_new, tp_init and tp_call attributes of type objects. In issue #26814, I wrote a patch adding Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT and Py_TPFLAGS_FASTCALL flags to use the FASTCALL calling convention for tp_new, tp_init and tp_call. The problem is that calling these methods directly looks common. If we change the calling convention of these methods, it will break the C API. I propose to discuss that later ;-)
An alternative is to add a tp_fastcall method to PyTypeObject and use a wrapper for tp_call for backward compatibility. This option also has drawbacks. Again, I propose to discuss this later, and to focus first on the changes that don't break anything ;-)
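The wrapper alternative can be sketched with a toy model. Everything here is hypothetical: toy_type stands in for PyTypeObject, its call and fastcall members stand in for tp_call and the proposed tp_fastcall, and toy_call_wrapper() is one possible shape for the compatibility wrapper. It is not taken from any patch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long value; } toy_obj;                        /* toy PyObject */
typedef struct { toy_obj **items; ptrdiff_t size; } toy_tuple; /* toy tuple    */

typedef struct toy_type toy_type;
struct toy_type {
    /* Legacy slot (models tp_call): takes a tuple of arguments. */
    long (*call)(const toy_type *type, const toy_tuple *args);
    /* New slot (models the proposed tp_fastcall): array plus count. */
    long (*fastcall)(const toy_type *type, toy_obj *const *args,
                     ptrdiff_t nargs);
};

/* Compatibility wrapper installed in the legacy slot: code that calls
   type->call(...) directly keeps working, and the tuple's item array
   is forwarded as-is, without any copy. */
static long toy_call_wrapper(const toy_type *type, const toy_tuple *args)
{
    assert(type->fastcall != NULL);
    return type->fastcall(type, args->items, args->size);
}

/* Example implementation of the new slot: sums the argument payloads. */
static long toy_sum_fastcall(const toy_type *type, toy_obj *const *args,
                             ptrdiff_t nargs)
{
    (void)type; /* unused in this toy example */
    long total = 0;
    for (ptrdiff_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}
```

A type defined as `{ toy_call_wrapper, toy_sum_fastcall }` can then be called through either slot with the same result, at the cost of one extra indirection on the legacy path.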
----------
files: fastcall.patch
keywords: patch
messages: 266422
nosy: haypo, serhiy.storchaka, yselivanov
priority: normal
severity: normal
status: open
title: Add _PyObject_FastCall()
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file43011/fastcall.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27128>
_______________________________________