I converted the Py_TYPE() and Py_SIZE() macros to static inline functions
in the upcoming Python 3.11. It's a backward incompatible change. For
example, "Py_TYPE(obj) = type;" must be replaced with "Py_SET_TYPE(obj, type);".
You can use the upgrade_pythoncapi.py script of my pythoncapi_compat
project which does these changes for you: you just have to copy
pythoncapi_compat.h to your project. This header file provides new C
API functions like Py_NewRef() and Py_SET_TYPE() to old Python
versions, Python 2.7-3.11.
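For example, after running the script (or updating the code by hand), the
code uses the new functions like this (a minimal sketch with made-up
variable names):

    /* Setting an object's type and size: Py_TYPE()/Py_SIZE() can no longer
       be used as l-values in Python 3.11. */
    Py_SET_TYPE(obj, type);   /* instead of: Py_TYPE(obj) = type; */
    Py_SET_SIZE(obj, size);   /* instead of: Py_SIZE(obj) = size; */

    /* Py_NewRef() replaces the common "Py_INCREF(obj); return obj;" pattern
       (Python 3.10+, or pythoncapi_compat.h on older versions): */
    return Py_NewRef(obj);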
I already converted Py_TYPE() and Py_SIZE() macros in Python 3.10, but
it broke too many C extensions and so I had to revert the change. In
the meanwhile, I updated many C extensions and created the
pythoncapi_compat project. For example, Cython and numpy have been
updated to use Py_SET_TYPE() and Py_SET_SIZE(). Mercurial and
immutables projects now use pythoncapi_compat.
I'm interested in feedback on my pythoncapi_compat project ;-)
Tell me if you need help updating your project for the Python 3.11 C API changes:
I would like to change the Python C API. I failed to write a single
document listing all constraints and proposing all the changes that I
would like to do. For example, my previous PEP 620 contains too many
changes and is too long.
Here is my attempt to focus on the bare minimum and (what I consider
as) the least controversial part: list the current usages of the C API and
the constraints of these usages. This *informal* PEP should be the base of
future PEPs changing the C API.
The current draft lives at:
My PEP is based on the HPy Next Level Manifesto written by Simon Cross:
To reach most users of the C API, I cross-posted this email to
python-dev, capi-sig and hpy-dev.
Taking the Python C API to the Next Level
Title: Taking the Python C API to the Next Level
Author: Victor Stinner <vstinner(a)python.org>
While the C API is key to Python's popularity, it causes multiple
subtle and complex issues. There are different ways to use the C API;
each usage has its own constraints, and some constraints are mutually exclusive.
This document lists constraints but doesn't propose changes; it only
gives vague ideas of how to solve some issues. More concrete C API changes
will require writing separate PEPs.
C extensions are a key component of Python's popularity
Python's popularity comes from its great programming language and from
its wide collection of modules freely available on PyPI. Many of the
most popular Python modules rely directly or indirectly on C extensions
written with the C API. The Python C API is a key component of this
ecosystem.
For example, the numpy project is now a common dependency of many
scientific projects, and a large part of it is written by hand
with the C API.
**Abandoning or removing the C API** is out of the question. Years ago,
incomplete C API support was the main drawback of PyPy, since PyPy only
supported a minority of C extensions.
Today, CPython still has a similar issue. **When Cython or numpy don't
support a new Python version** (because of incompatible C API changes),
many Python projects depending on them cannot be installed,
especially during the development phase of the next Python version.
Backward compatibility and unmaintained C extensions
One important property of the C API is backward compatibility.
Developers expect that if their C extension works on Python 3.10, it
will work unmodified in Python 3.11: building the C extension with
Python 3.11 should be enough.
This property is even more important for unmaintained C extensions.
Sometimes, unmaintained just means that the only maintainer is busy or
overwhelmed for a few months. Sometimes, the project has no activity for
longer than 5 years.
When an incompatible change is introduced in the C API, like removing a
function or changing a function behavior, there is a **risk of breaking
an unknown number of C extensions**.
One option could be to update old C extensions when they are built with
recent Python versions, to adapt them to incompatible changes. This
conversion is non-trivial and cannot handle all kinds of incompatible changes.
Migration plan for incompatible changes
There should be a **sensible migration path** for large C extensions
(e.g. numpy) when incompatible changes are introduced. Whenever
possible, it should be possible to write a **single code base** compatible
with old and new Python versions.
A **compatibility layer** can be maintained externally. Cython and
numpy have their own internal compatibility layer.
There should be a way to easily pick up common errors introduced by
incompatible changes.
One practical way to **minimize the number of broken projects** is to
attempt to check in advance if an incompatible change is going to break
popular C extensions. For broken C extensions, propose a fix and wait
until a new release includes the fix, before introducing the change in
Python. Obviously, it doesn't solve the problem of less popular C
extensions and private C extensions.
Obtain the best possible performance
There are two main reasons for writing a C extension: to implement a
function which cannot be written in pure Python, or to write a **C
accelerator**: rewrite the 10% of an application in C where 90% of the
CPU time is spent. In the latter use case, the intent is to obtain
the best possible performance. Tradeoffs are made with portability: it
is acceptable to only support a limited number of Python versions and to
only support a limited number of Python implementations (usually only
CPython).
Cython is a good example of an accelerator. It is able to support a large
number of Python versions and multiple Python implementations with
compatibility layers and ``#ifdef``. The main drawback is that Cython is
commonly **broken by incompatible changes made at each
Python release**. This happens because Cython relies on many CPython
implementation details.
On the other hand, the **limited C API** is as small as possible,
excludes implementation details on purpose, and provides a stable ABI.
Building a C extension with the limited C API only once produces a
binary wheel package usable on many Python versions, but each platform
still requires its own binary wheel package.
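For example, a C extension can opt into the limited C API by defining
``Py_LIMITED_API`` before including ``Python.h`` (a minimal sketch; the
value selects the oldest Python version whose stable ABI the wheel must
support):

    /* Request the limited C API of Python 3.6 and newer: the resulting
       binary only uses the stable ABI. */
    #define Py_LIMITED_API 0x03060000
    #include <Python.h>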
Emulating the current C API is inefficient
The PyPy project is a Python implementation written from scratch; it was
not created as a CPython fork. It made many implementation choices
different from CPython's: no reference counting, a moving garbage collector,
a JIT compiler, etc.
To support C extensions, PyPy emulates the Python C API in its cpyext
module. When the C API accesses an object, cpyext has to convert the PyPy
object to a CPython object (``PyObject``). CPython objects are less
efficient than PyPy objects with the PyPy JIT compiler, and conversions
from PyPy objects to CPython objects are also inefficient. PyPy has to
reimplement every single detail of the CPython implementation to be as
compatible as possible.
The C API exposes multiple implementation details:
* Reference counting, borrowed references, stealing references.
* Objects location in memory.
* Rely on pointers for object identity: Python 3.10 adds the ``Py_Is()``
  function to solve this problem (see the example after this list).
* Expose the memory layout of Python objects as part of the API.
* Expose static types.
* Implicit execution context.
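For example, the identity issue shows up in code like the following (a
small illustration, not taken from any particular project):

    /* Comparing pointers bakes in the assumption that objects have a fixed
       location in memory: */
    if (obj == Py_None) { /* ... */ }

    /* Python 3.10's Py_Is() expresses identity without relying on object
       addresses: */
    if (Py_Is(obj, Py_None)) { /* ... */ }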
The C API of Python 3.10 is made of around 15 000 lines of C header
files, 1500 functions and 100 structures. Supporting the full C API is a
significant amount of work.
**Freezing the C API** for a few Python releases would help other Python
implementations to catch up with the latest Python version, but it
doesn't solve the efficiency problem. Moreover, it is common that adding
a new feature to Python requires changing the C API, even if it is just
to add new functions. Not adding new features to Python for a few Python
releases is out of the question.
The C API prevents optimizing CPython
It is challenging to evolve the C API to optimize CPython without
breaking backward compatibility. Emulating the old C API is an
option, but it is inefficient.
If everything above is achievable -- and we believe it is! -- we'll
arrive in a wonderful new future where Python implementations can
experiment with all sorts of amazing new features:
* tracing garbage collectors;
* nurseries for short-lived objects;
* sub-interpreters with separate contexts;
* specialised implementations of lists;
* removing the GIL;
* avoiding the boxing of primitive types;
* just-in-time compilation;
* ... and many other things you can imagine that we haven't!
No one can guarantee that a particular new idea will work out, but
exposing fewer implementation details via the C API will make it
possible to try many new things.
I'm trying to understand what the best/safest/recommended way is to
implement tp_dealloc on a heap type created by PyType_FromSpec.
The official docs don't say much on the topic. The only note I could find
is this:
> Finally, if the type is heap allocated (Py_TPFLAGS_HEAPTYPE), the
> deallocator should decrement the reference count for its type object after
> calling the type deallocator.
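For reference, here is a minimal sketch of what I understand that note to
mean in practice (the names are mine, not from the docs):

    static void
    MyObject_dealloc(PyObject *self)
    {
        PyTypeObject *tp = Py_TYPE(self);
        /* release any resources owned by the instance here */
        tp->tp_free(self);
        /* heap type: drop the reference the instance holds on its type */
        Py_DECREF(tp);
    }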
My doubts came after reading the source code. If I don't specify a
tp_dealloc, its default value depends on heap vs static types:
- for static types, the default is object_dealloc, which simply calls
tp_free;
- for heap types created by PyType_FromSpecWithBases, the default
value is subtype_dealloc, which seems to do a lot of complex logic which
I don't fully understand.
This means that if I create a heap type with a custom tp_dealloc, all the
logic implemented by subtype_dealloc will not be executed, and my type
will probably behave subtly differently. I also found BPO 26979, where
Christian Tismer claims that "The default of PyType_FromSpec for tp_dealloc
is wrong!", but it seems that nothing has been done about it.
Another interesting data point is that PyType_FromSpec+tp_dealloc does not
seem to be used a lot in the wild. I tried to grep for Py_tp_dealloc in the
top 4000 PyPI packages and I found only one match, in Cython-generated code;
but it's code which is behind an "#if
CYTHON_COMPILING_IN_LIMITED_API", which makes me suspect that it is not
actually used a lot in practice.
So, back to my original problem:
1. Is the default value of tp_dealloc actually correct?
2. Is it actually possible to write a custom tp_dealloc which behaves correctly?
3. If (2) is true, what is the simplest way to do it?
At the language summit many people told me that the HPy team should try to
communicate more with the CPython developers, so let's try :).
In HPy we want to design an API to build bytes/str objects in two steps, to
avoid the problem that, currently in CPython, they are not really immutable.
Before making any proposal, I spent quite a lot of time researching how
the current APIs are used to construct bytes/str objects, and I summarized
my results here:
I think that my survey could be interesting for the people on this ML,
independently of HPy.
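(As a small illustration of the mutability problem: the usual CPython idiom
allocates the bytes object first and then writes into its internal buffer,
i.e. the "immutable" object is mutated after creation. Minimal sketch:)

    /* Allocate an uninitialized bytes object of the wanted size... */
    PyObject *b = PyBytes_FromStringAndSize(NULL, size);
    if (b == NULL)
        return NULL;
    /* ...then fill its buffer in place before handing it to anyone else. */
    memcpy(PyBytes_AS_STRING(b), data, size);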
That said, I also opened an issue to discuss concrete proposals for
the HPy API to do that: https://github.com/hpyproject/hpy/issues/214
I would be glad to receive comments and suggestions about that, and
especially to know whether I missed some important use case in my analysis.
Also, if you think that this kind of mail is off-topic on this ML,
please let me know and I'll stop.
For my research, IBM wrote a tracing GC for CPython, and I was trying out
some ideas on how we would support the C API.
I know about the handles used in HPy, but I felt they can incur
allocation overhead and use more memory.
Instead, I thought of changing the semantics of the union type (PyObject)
so that it does not point to internal structures, and of using a stack for
sharing data between Python and C. There would be one push function for each
Python type with a direct representation in C: Py_pushInteger for ints, etc.
When a C function returns, all values on the stack are returned to Python as
the results of the C function, assuming we have a way of returning multiple
values to Python.
Instead of:

    typedef struct Object *PyObject;

we would have:

    typedef unsigned int PyObject;

where PyObject becomes an index into an internal array that stores all the
values that have to be shared with C. While a value is in that array, it is
not collected by Python. When the C function returns, its whole array is
erased and the values used by the function can be collected.
This setup gets us a reliable union type (PyObject), and the garbage
collector can also move objects. I think that backward compatibility can
easily be implemented using macros.
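To make it a bit more concrete, here is a very rough sketch of what I have
in mind (all names are invented, this is not code from our prototype):

    /* PyObject becomes an index into a per-call array of rooted values, so a
       moving collector never hands raw pointers to C code. */
    typedef unsigned int PyObject;

    #define MAX_ROOTS 256
    static struct Object *roots[MAX_ROOTS];  /* values pinned for the current C call */
    static unsigned int n_roots;

    /* One push function per Python type with a direct C representation: */
    PyObject Py_pushInteger(long value)
    {
        roots[n_roots] = box_long(value);  /* box_long: assumed runtime helper */
        return n_roots++;
    }

    /* When the C function returns, the runtime hands roots[0..n_roots) back to
       Python as the call's results and resets n_roots, so those objects can be
       collected (or moved) again. */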
What feedback do you have on this approach, and am I overconfident about
having reasonable backward compatibility? Also, can this experiment uncover
any insights that CPython would find useful?
*"You think you know when you learn, are more sure when you can write, even
more when you can teach, but certain when you can program." Alan J. Perlis*
I tried getting Py_FrozenMain from ctypes.pythonapi, but I get
"undefined symbol". This is the only symbol from the Stable ABI that
behaves this way. (It was re-added in [bpo-42591].)
So far, I haven't found what makes the symbol different, but I assume
this is deliberate, since it happens on Windows as well (see a [test
PR]). It seems to be included when Python is compiled as a shared library.
Is this something we want as part of the stable ABI? If we do, it would
be good to always export it.
AFAICS, Py_FrozenMain is only used with freeze.py for a custom Python
build with frozen modules included.
Python 3.11.0a0 (heads/pep652-ctypes:315d97b64aa, May 5 2021, 11:49:17)
[GCC 11.1.1 20210428 (Red Hat 11.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> ctypes.pythonapi.Py_FrozenMain
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pviktori/dev/cpython/Lib/ctypes/__init__.py", line 387,
    func = self.__getitem__(name)
  File "/home/pviktori/dev/cpython/Lib/ctypes/__init__.py", line 392,
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: ./python: undefined symbol: Py_FrozenMain
I'm working on a C extension interface for Python. I want to create a new
interpreter by using the function Py_NewInterpreter() in a new thread,
which is created by pthread_create (my test files are attached), but
there are always errors when calling Py_NewInterpreter(), such as "failed:
object already tracked by the garbage collector".
I would like to ask how to solve this problem and create a new interpreter
in a new thread from a Python C extension.
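In outline, my thread function looks roughly like this (heavily simplified;
the full code is in the attached test files):

    static void *interp_thread(void *arg)
    {
        PyInterpreterState *main_interp = (PyInterpreterState *)arg;

        /* Give this new OS thread a thread state and take the GIL, since
           Py_NewInterpreter() must be called with the GIL held. */
        PyThreadState *tstate = PyThreadState_New(main_interp);
        PyEval_RestoreThread(tstate);

        /* This is the call that fails with "object already tracked by the
           garbage collector". */
        PyThreadState *sub = Py_NewInterpreter();
        if (sub != NULL) {
            PyRun_SimpleString("print('hello from a sub-interpreter')");
            Py_EndInterpreter(sub);
        }

        /* Switch back to this thread's main-interpreter state and clean up. */
        PyThreadState_Swap(tstate);
        PyThreadState_Clear(tstate);
        PyThreadState_DeleteCurrent();  /* also releases the GIL */
        return NULL;
    }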
Since it is very relevant to the topic of this mailing list, I'd like to
announce the new HPy blog (and its first post):
Any feedback (both on the blog post and on hpy itself) is appreciated!
After the e-mails from the previous week, I set up a call with Eric to
sync on Limited API/Stable ABI issues.
The higher-bandwidth (and more emotion-friendly) medium worked great for
us, but unfortunately, everyone else was excluded.
Here are some rather unstructured notes from my point of view. They might
be good discussion starters :)
Please read PEP 652 “Maintaining the Stable ABI” for my thoughts, plans
and rationales for the short term.
For the long term, there are some more vague plans and ideas:
There's not a 1:1 mapping between “Limited API”, “Stable ABI” and
“C-API, the Good Parts™”, *but* unless you're deep in this space, it
makes sense to conflate them. I aim to make them converge.
I intend to focus my CPython time on Limited API/Stable ABI for the
next... few years, probably.
The work I'm doing (extension isolation & now stable ABI) is similar to
the subinterpreters effort in several ways:
- They are both incomplete, but useful *today* for some limited cases,
and have lots of potential
- They need long-term effort, but most of the many steps are useful in
their own right.
In addition to the use cases in PEP 652, the stable ABI *should* ideally
be useful for:
- bindings for non-Python software, where shoehorning Python into the
buildsystem is not straightforward and building for several Python
versions at once is not practical. Also, raw speed tends to not matter.
- GUI apps, whose scripting plugins could use any Python the
user/scripter has installed. (Think Blender, if it was a smaller project
that couldn't afford to bundle Python.)
And here's some more concrete stuff. (Remember that these are still
vague plans and ideas.)
Static `PyObject`s and `PyObject*` are the main things in the Limited API
that block subinterpreters. We see no other blockers.
A possible way to solve this is to make isolated subinterpreters support
*opt-in* for extension modules:
- Introduce a macro similar to the old PY_SSIZE_T_CLEAN that removes the
problematic items from the headers.
- This macro will give you access to a flag you can use to tell the
runtime the extension is subinterp-safe.
- Python will support both the current, GIL-sharing subinterpreters and
isolated subinterpreters. (All signs say implementing this will be easy,
relative to making subinterpreters always isolated and breaking existing code.)
- The macro will affect both the full API and the limited subset.
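To make the opt-in idea above more concrete, it could look roughly like this
from the extension author's side (everything here is hypothetical, including
the macro name; none of it exists today):

    /* Hypothetical macro, by analogy with PY_SSIZE_T_CLEAN: hide the items
       that block isolated subinterpreters (static PyObject, etc.). */
    #define Py_ISOLATED_SUBINTERPRETERS_CLEAN
    #include <Python.h>

    static struct PyModuleDef example_module = {
        PyModuleDef_HEAD_INIT,
        .m_name = "example",
        .m_size = 0,
        /* ...plus some yet-to-be-designed slot or flag declaring the module
           safe for isolated subinterpreters. */
    };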