[New-bugs-announce] [issue29259] Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects

STINNER Victor report at bugs.python.org
Fri Jan 13 07:24:34 EST 2017

New submission from STINNER Victor:

A new FASTCALL calling convention was added to Python 3.6. It avoids creating a temporary tuple to pass positional arguments and a temporary dictionary to pass keyword arguments. A new METH_FASTCALL calling convention was added for C functions. Most functions now support fastcall, except objects with a __call__() method: these have to go through slot_tp_call(), which still requires a tuple and a dictionary.
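
To make the difference concrete, here is a toy Python model of the two conventions. It is only a sketch: the function names and signatures are hypothetical stand-ins, not the real C signatures, and a Python list stands in for the caller's C array of arguments.

```python
# Toy model of the two calling conventions (hypothetical names, not the C API).

def sum_regular(args, kwargs):
    """Regular convention: positional args in a tuple, keywords in a dict (or None)."""
    assert isinstance(args, tuple)
    assert kwargs is None or isinstance(kwargs, dict)
    return sum(args)


def sum_fast(stack, nargs, kwargs):
    """Fast convention: read nargs values straight from the caller's array."""
    return sum(stack[i] for i in range(nargs))


def call_regular(func, stack, nargs):
    """Calling a regular-convention function costs one temporary tuple."""
    args = tuple(stack[:nargs])  # the temporary tuple that FASTCALL avoids
    return func(args, None)


def call_fast(func, stack, nargs):
    """Calling a fast-convention function needs no temporary container."""
    return func(stack, nargs, None)


stack = [10, 20, 30, 99]  # the caller's array; only the first 3 slots are arguments
result_regular = call_regular(sum_regular, stack, 3)
result_fast = call_fast(sum_fast, stack, 3)
```

Both calls compute the same value; the only difference in this model is whether an intermediate tuple has to be built first.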

I tried multiple implementations to support fast calls to the __call__() method, but I ran into practical and technical issues.

First, I tried to reuse the tp_call field of PyTypeObject: it could be either a regular call (tuple/dict for arguments) or a fast call. I added a flag to the tp_flags field. It was tricky to support class inheritance and to decide when to set or clear the flag. But the real blocker is that it obviously breaks backward compatibility: existing code calling tp_call directly with the regular calling convention will crash immediately, and the error is not caught at compile time, even if the code is recompiled.
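
The compatibility problem can be illustrated with a toy Python model. All names here are hypothetical, and the real failure would be a crash in C rather than an exception. A single slot holds either kind of function, and only a flag says which one. A legacy caller that never checks the flag passes a tuple and a dict to a function that expects an array and a count:

```python
# Toy model: one slot, two possible conventions, selected by a flag.
# All names are hypothetical; this is not CPython code.

class ToyType:
    def __init__(self, tp_call, uses_fastcall):
        self.tp_call = tp_call
        self.uses_fastcall = uses_fastcall  # stands in for a tp_flags bit


def fast_impl(self, stack, nargs, kwargs):
    # Fast convention: expects an array and an explicit argument count.
    return sum(stack[:nargs])


def legacy_caller(tp, obj, args, kwargs):
    # Pre-existing code: calls tp_call directly and never looks at the flag.
    return tp.tp_call(obj, args, kwargs)


new_type = ToyType(fast_impl, uses_fastcall=True)

try:
    legacy_caller(new_type, object(), (1, 2), None)
    outcome = "ok"
except TypeError:
    # In this model the mismatch surfaces as a TypeError at call time.
    # In C, both conventions would hide behind the same field, so the
    # compiler could not flag the call.
    outcome = "mismatch"
```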

I propose a different design: add a new tp_fastcall field to PyTypeObject and use a wrapper for tp_call when tp_fastcall is defined. If a type defines tp_fastcall but not tp_call, the tp_call wrapper "simply" calls tp_fastcall. Advantages:

* The wrapper is trivial
* Minor changes to PyType_Ready() to support inheritance (simple logic)
* Fully backward compatible
* If tp_call is called directly without keyword arguments, there is no overhead but a speedup!
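
The wrapper idea can be sketched as a toy Python model. The names `fastcall_wrapper`, `ToyType` and `ToyObject`, and the `(self, stack, nargs, kwargs)` signature, are assumptions made for illustration; the actual patch is C code whose details are not shown here.

```python
# Toy model of "tp_call is a wrapper around tp_fastcall".
# Hypothetical names and signatures; not the real C implementation.

def fastcall_wrapper(obj, args, kwargs):
    """Regular-convention entry point that forwards to tp_fastcall.

    In this model the caller's tuple is reused as the argument array,
    so no new container is built for the positional arguments.
    """
    return obj.type.tp_fastcall(obj, args, len(args), kwargs)


class ToyType:
    def __init__(self, tp_call=None, tp_fastcall=None):
        self.tp_fastcall = tp_fastcall
        if tp_call is None and tp_fastcall is not None:
            # Only tp_fastcall is defined: install the wrapper as tp_call.
            tp_call = fastcall_wrapper
        self.tp_call = tp_call


class ToyObject:
    def __init__(self, tp):
        self.type = tp


def join_fast(obj, stack, nargs, kwargs):
    # Fast-convention implementation: joins its positional arguments.
    sep = (kwargs or {}).get("sep", " ")
    return sep.join(stack[i] for i in range(nargs))


joiner_type = ToyType(tp_fastcall=join_fast)
joiner = ToyObject(joiner_type)

# Legacy code that calls tp_call directly keeps working.
legacy_result = joiner_type.tp_call(joiner, ("a", "b"), None)
# New code can go straight to tp_fastcall.
fast_result = joiner_type.tp_fastcall(joiner, ["a", "b"], 2, {"sep": "-"})
```

The point of the design is visible in the last two calls: old callers of tp_call and new callers of tp_fastcall both reach the same implementation.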


Inheritance rules:

* If a type only defines tp_call, tp_fastcall is not inherited from the parent: tp_fastcall is set to NULL.
* If a type only defines tp_fastcall, tp_fastcall is always used (tp_call uses the wrapper).
* If a type defines tp_call and tp_fastcall, PyObject_Call() uses tp_call whereas _PyObject_FastCallDict() uses tp_fastcall.
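
These three rules can be written down as a small decision function. This is a toy Python model, not the PyType_Ready() change itself: `resolve_slots` and `WRAPPER` are hypothetical names, slots are represented by strings, and the last branch (a type defining neither slot inherits both) is an assumption that the rules above do not state.

```python
# Toy model of the three inheritance rules (hypothetical names).

WRAPPER = "tp_call wrapper around tp_fastcall"  # placeholder for the wrapper function


def resolve_slots(own_call=None, own_fastcall=None, parent=(None, None)):
    """Return the (tp_call, tp_fastcall) pair a type ends up with."""
    if own_call is not None and own_fastcall is None:
        # Rule 1: only tp_call defined -> tp_fastcall is NOT inherited.
        return own_call, None
    if own_call is None and own_fastcall is not None:
        # Rule 2: only tp_fastcall defined -> tp_call becomes the wrapper.
        return WRAPPER, own_fastcall
    if own_call is not None and own_fastcall is not None:
        # Rule 3: both defined -> each API uses its own slot.
        return own_call, own_fastcall
    # Neither defined: assumed here to inherit both slots from the parent.
    return parent


base = resolve_slots(own_fastcall="base_fast")
only_call = resolve_slots(own_call="child_call", parent=base)
only_fast = resolve_slots(own_fastcall="child_fast", parent=base)
both = resolve_slots(own_call="child_call", own_fastcall="child_fast", parent=base)
plain = resolve_slots(parent=base)
```

Rule 1 is the important one for correctness: a subclass that overrides tp_call must not keep the parent's tp_fastcall, otherwise fast callers would bypass the override.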

Functions of the C API will be modified to use tp_fastcall if available.

The plan is then to patch most Python types to replace their tp_call with tp_fastcall. First, the most important (common) types, such as Python and C functions, descriptors, and the various kinds of wrappers, should be patched. Later, we can discuss the remaining types case by case to decide whether it's worth it.

I will try to run benchmarks before making any kind of change.

messages: 285388
nosy: haypo, inada.naoki, serhiy.storchaka
priority: normal
pull_requests: 17
severity: normal
status: open
title: Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects
type: performance
versions: Python 3.7

Python tracker <report at bugs.python.org>
