[Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)
vstinner at redhat.com
Fri Nov 9 19:30:37 EST 2018
The current C API of Python is both a strength and a weakness of the
Python ecosystem as a whole. It's a strength because it allows to
quickly reuse a huge number of existing libraries by writing a glue
for them. It made numpy possible and this project is a big sucess!
It's a weakness because of its cost on the maintenance, it prevents
optimizations, and more generally it prevents to experiment modifying
For example, CPython cannot use tagged pointers, because the existing
C API is heavily based on the ability to dereference a PyObject*
object and access directly members of objects (like PyTupleObject).
For example, Py_INCREF() modifies *directly* PyObject.ob_refcnt. It's
not possible neither to use a Python compiled in debug mode on C
extensions (compiled in release mode), because the ABI is different in
debug mode. As a consequence, nobody uses the debug mode, whereas it
is very helpful to develop C extensions and investigate bugs.
I also consider that the C API gives too much work to PyPy (for their
"cpyext" module). A better C API (not leaking implementation) details
would make PyPy more efficient (and simplify its implementation in the
long term, when the support for the old C API can be removed). For
example, PyList_GetItem(list, 0) currently converts all items of the
list to PyObject* in PyPy, it can waste memory if only the first item
of the list is needed. PyPy has much more efficient storage than an
array of PyObject* for lists.
I wrote a website to explain all these issues with much more details:
I identified "bad APIs" like using borrowed references or giving
access to PyObject** (ex: PySequence_Fast_ITEMS).
I already wrote an (incomplete) implementation of a new C API which
doesn't leak implementation details:
It uses an opt-in option (Py_NEWCAPI define -- I'm not sure about the
name) to get the new API. The current C API is unchanged.
Ah, important points. I don't want to touch the current C API nor make
it less efficient. And compatibility in both directions (current C API
<=> new C API) is very important for me. There is no such plan as
"Python 4" which would break the world and *force* everybody to
upgrade to the new C API, or stay to Python 3 forever. No. The new C
API must be an opt-in option, and current C API remains the default
and not be changed.
I have different ideas for the compatibility part, but I'm not sure of
what are the best options yet.
My short term for the new C API would be to ease the experimentation
of projects like tagged pointers. Currently, I have to maintain the
implementation of a new C API which is not really convenient.
Today I tried to abuse the Py_DEBUG define for the new C API, but it
seems to be a bad idea:
A *new* define is needed to opt-in for the new C API.
More information about the Python-Dev