Over the last few months, I was busy filling the
https://pythoncapi.readthedocs.io/ website with random notes. Many
discussions occurred on this list and python-dev, but I was only able to
make the simplest and least controversial changes in Python upstream. I
didn't write a PEP because CPython had a governance crisis. Now that a
new Steering Committee has been elected, it's time to see how a concrete
PEP can be written.
IMHO we need to split the giant "C API" problem into multiple PEPs. I
propose 4 PEPs:
PEP A: Ecosystem of C extensions in 2019
PEP B: Define good and bad C APIs
PEP C: Plan to enhance the existing C API
PEP D: Completely new C API
There is also an ongoing discussion about embedded Python and Python
initialization API, but I'm scared by this topic so I don't even
propose to write a new PEP which would supersede PEP 432 :-)
== PEP A: Ecosystem of C extensions in 2019 ==
Discuss cffi, Cython, PyQt usage of the stable ABI, CPython C API,
etc. The goal is not to solve any problem, mostly to list existing
It sounds like an unusual PEP, but I think that a PEP is needed since
the same discussions happened multiple times.
This PEP can describe the different kinds of "C extensions" and maybe
suggest which tools are the best depending on the kind. cffi doesn't
cover all cases, the C API isn't always the right answer, etc.
== PEP B: Define good and bad C APIs ==
https://pythoncapi.readthedocs.io/bad_api.html can be used as a
starting point. It should be an informal PEP which evolves as PEP 7
and PEP 8 are evolving.
== PEP C: Plan to enhance the existing C API ==
This one sounds like the most controversial PEP :-) I see different things:
* Plan to deprecate and remove bad APIs
* Plan to help C extension maintainers move away from these bad APIs
* Plan to test the stability of the API
* Plan to test the stability of the ABI
The even more controversial idea: provide multiple Python runtimes for
CPython, not only one:
== PEP D: Completely new C API ==
Well, that's the obvious alternative to PEP C.
Armin Rigo's PyHandle idea may be a good start?
Night gathers, and now my watch begins. It shall not end until my death.
I've had enough ideas bouncing around in my head that I had to get them
written up :)
So I'm proposing to produce an informational PEP to describe what a
"good" C API looks like and act as guidance as we implement new APIs or
change existing ones.
This is a rough, incomplete first draft that nonetheless I think is
enough to trigger useful discussions. It's a brain dump, but I've
already dumped most of this before.
They're in the text below, but I'll repeat here:
* this is NOT a brand-new API
* this is NOT exactly what we currently have implemented
* this is NOT a proposal to stop shipping half the standard library
* this IS meant to provide context for discussing both the issues with
our current API and to help drive discussions of any new API or API changes
I don't have any particular desire to own the entire doc, so if anyone
wants to become a co-author I'm very open to that. However, I do have
strong opinions on this topic after a number of years working with
*excellent* API designs, designers and processes. If you want to propose
a _totally_ different vision from this, please consider writing an
alternative rather than trying to co-opt this one :)
(Doc in approximate Markdown, automatically wrapped to 72 cols for email
and I haven't checked if that broke stuff. Sorry if it did)
CPython C API Design Guidelines
This document is intended to be a set of guiding principles for
development of the current CPython C API. Future additions and
enhancements to the CPython C API should follow, or at least be
influenced by, the principles described here. At a minimum, any new or
modified C APIs should be able to be categorised according to the
terminology defined here, even if exceptions have to be made.
# Things this document is NOT
This document is NOT a design of a completely new API (though a
hypothetical new API should follow this design).
This document is NOT documentation of the current API (though the
current API should come to resemble it over time).
This document is NOT a set of binding rules in the same sense as PEP 7
and PEP 8 (though designs should be tested against it and exceptions
should be rare).
This document is NOT permission to make backwards-incompatible
modifications to the current API (though backwards-incompatible
modifications should still be made where warranted).
A common understanding of certain terms is necessary for talking about
the CPython C API. This section has two goals: to clarify existing
common terminology, and to introduce new terminology. Terms are
presented in a logical order, rather than alphabetically.
## Existing terms
**Application**: Any independent program that can be launched directly.
Compare and contrast with *extension*. CPython is normally considered an
application.
**Extension**: A program that integrates into an application, and cannot
be launched directly but must be loaded by that application. Python
modules, native or otherwise, are considered extensions. When embedded
into another application, CPython is considered an extension.
**Native extension**: A subset of all extensions that are compiled to
the same language as the application they integrate with. When embedded
into an application that is written in C or uses C-compatible
conventions, CPython is considered a native extension.
**API**: Application Programming Interface. The set of interactions
defined by an application to allow extensions to extend, control, and
interact with the first. Typically refers to OOP objects and functions
in the abstract. CPython has one API that applies for all scenarios in
all contexts, though each scenario will likely only use a subset of this
API.
**ABI**: Application Binary Interface. The implementation of an API such
that its interactions can be realized by a digital computer. Typically
includes memory layouts and binary representations, and is a function of
the build tools used to compile CPython. CPython has different ABIs in
different contexts, and a different ABI for native extensions compared
**Stdlib**: Standard library. Components that build upon the Python
language in order to provide useful building blocks and pre-written
functionality for users.
## New terms
These terms are introduced briefly here and described in much greater
detail below.
**API ring**: One subset of an API for the purpose of extension
compatibility. Extensions to CPython care about rings. Extensions choose
to target a particular ring to trade off between deeper integration and
tighter coupling. Targeting one ring includes access to all rings
outside of that one. Rings are orthogonal to layers.
**API layer**: One subset of an API for the purpose of application and
internal compatibility. Applications that embed CPython, and the CPython
implementation itself, care about layers. Applications choose to adopt
or implement a particular layer, implicitly including all lower layers.
Layers are orthogonal to rings.
# Quick Overview
For context as you continue reading, these are the API **rings**
provided by CPython:
* Python ring (equivalent of the Python language)
* CPython ring (CPython-specific APIs)
* Internal ring (intended for internal use only)
These are the API **layers** provided by CPython:
* Optional stdlib layer (dependencies that must be explicitly required)
* Required stdlib layer (dependencies that can be assumed)
* Platform adaptation layer (ability to interact with the platform)
* Core layer ("pure" mode with no platform interactivity)
(Reminder that this document does not reflect the current state of
CPython, but is both aspirational and defining terms for the purposes of
discussion. This is not a proposal to remove anything from the standard
library.)
# API Rings
CPython provides three API rings, listed here from outermost to
innermost:
* Python ring
* CPython ring
* Internal ring
An extension that targets the Python ring does not have access to the
CPython or Internal rings. Likewise, an extension that targets the
CPython ring does not have access to the Internal ring, but does use the
Python ring.
When CPython is an extension of another application, that application
can also select which ring to target.
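The nesting rule (targeting one ring includes access to all rings outside of it) can be modelled in a few lines. This is only an illustrative sketch: the `Ring` enum and `accessible_rings` helper are hypothetical names, not part of any CPython API.

```python
from enum import IntEnum


class Ring(IntEnum):
    """API rings, ordered from outermost (0) to innermost (hypothetical model)."""
    PYTHON = 0
    CPYTHON = 1
    INTERNAL = 2


def accessible_rings(target):
    """Rings usable by an extension targeting `target`: that ring plus all outer ones."""
    return [ring for ring in Ring if ring <= target]


# An extension targeting the CPython ring can use the Python and CPython rings,
# but not the Internal ring.
cpython_access = accessible_rings(Ring.CPYTHON)
```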
The expectation is that all Python implementations can provide an
equivalent Python ring, CPython officially supports extensions using the
CPython ring when targeting CPython, and the Internal ring is available
## Python API ring
The Python ring provides functionality that should be equivalent across
all Python implementations - in essence, the Python language itself
defines this ring.
The C implementation of the Python API allows native code to interact
with Python objects as if it were written in Python. The Python API
supports duck-typing and should correctly handle the substitution of
For a concrete example, `PyObject_GetItem` is part of the Python ring
while `PyDict_GetItem` is in the CPython ring.
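A rough Python-level analogy of this distinction, as a sketch only: `operator.getitem` stands in for generic, protocol-based access, and the unbound `dict.__getitem__` stands in for access tied to the concrete dict type. The C functions differ in details such as error handling, and `ReadOnlyConfig` is a hypothetical class invented for the example.

```python
import operator
from collections.abc import Mapping


class ReadOnlyConfig(Mapping):
    """Hypothetical duck-typed mapping that is not a dict subclass."""

    def __init__(self, **values):
        self._values = dict(values)

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


config = ReadOnlyConfig(debug=True)

# Protocol-based access: works for any object that implements __getitem__.
generic_result = operator.getitem(config, "debug")

# Concrete-type access: only accepts real dict instances, so it rejects
# the duck-typed substitute with a TypeError.
try:
    dict.__getitem__(config, "debug")
    concrete_ok = True
except TypeError:
    concrete_ok = False
```

Code written against the generic form keeps working when a different mapping type is substituted, which is the property the Python ring is meant to guarantee.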
Compatibility requirements for the Python API match the language
version. Specifically, code relying on the Python API should only break
or change behaviour if the equivalent code written in Python would also
break or change behaviour.
For CPython, including `Python.h` should only provide access to the
Python ring. Accessing any other rings should produce a compile error.
## CPython API ring
The CPython ring provides functionality that is specific to CPython.
Extensions that opt in to the CPython ring are tied directly to CPython,
but have access to functions that are specific to CPython.
Functions in the CPython ring may require the caller to be using C or be
able to provide C structures allocated in memory.
In general, most applications that embed CPython will use the CPython
ring. Also, native extensions in the Optional stdlib layer
For a concrete example, the `PyCapsule` type belongs in the CPython ring
(that is, other implementations are not required to provide this
particular way to smuggle C pointers through Python objects).
As a second concrete example, `PyType_FromSpec` belongs in the CPython
ring. (The equivalent in the Python ring would be to call the `type`
object, while the equivalent in the internal ring would be to define a
Compatibility requirements for the CPython API match the CPython
major.minor version. Specifically, code relying on the CPython API
should only break or change behaviour if the major.minor version
changes.
For CPython, as well as `Python.h`, also include `cpython/<header>.h` to
obtain access to APIs in the CPython ring.
## Internal API ring
The Internal ring provides functionality that is used to implement
CPython. Extensions that opt in to the Internal ring may need to rebuild
for every CPython build.
In general, most of the Required stdlib layer will use the Internal
ring.
For CPython, as well as `Python.h`, also include `internal/<header>.h`
to obtain access to APIs in the Internal ring.
# API Layers
CPython provides four API layers, listed here from top to bottom:
* Optional stdlib layer
* Required stdlib layer
* Platform adaptation layer
* Core layer
An application embedding Python targets one layer and all those below
it, which affects the functionality available in Python.
Higher layers may depend on the APIs provided by lower layers, but not
the other way around. In general, layers should aim to maximise
interaction with the next layer down and avoid skipping it, but this is
not a strict requirement.
Lower layers are required to maintain backwards compatibility more
strictly than the layers above them.
Components within a layer that depend on other components within that
layer must be treated as a single component for determining whether it
may be included or omitted.
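As a sketch of how the "higher may depend on lower, but not the other way around" rule could be checked mechanically: the layer assignments below follow the examples given in this document, while the dependency edges and the `find_upward_dependencies` helper are hypothetical and exist only for illustration.

```python
# Layers from bottom to top, as listed in this document.
LAYERS = ["core", "platform adaptation", "required stdlib", "optional stdlib"]
RANK = {name: index for index, name in enumerate(LAYERS)}


def find_upward_dependencies(layer_of, depends_on):
    """Return (component, dependency) pairs where a lower layer depends on a higher one."""
    problems = []
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            if RANK[layer_of[dependency]] > RANK[layer_of[component]]:
                problems.append((component, dependency))
    return problems


layer_of = {
    "exec": "core",
    "os": "platform adaptation",
    "enum": "required stdlib",
    "socket": "optional stdlib",
}

# Hypothetical dependency edges, not a description of the real stdlib.
depends_on = {
    "socket": ["os", "enum"],  # higher layer using lower layers: allowed
    "os": ["exec"],            # allowed
    "enum": ["socket"],        # lower layer reaching up: not allowed
}

problems = find_upward_dependencies(layer_of, depends_on)
```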
Standard Python distributions (that is, anything that may be launched
with the `python` command) will depend upon most components in the
Optional stdlib layer, and hence will require _everything_ from the
Required stdlib layer and below. Only embedders and potentially
deployment tools will use reduced layers.
(Reminder: this document does not present the current state of CPython.)
## Core layer
This layer is the core language and evaluation engine. By adopting this
layer, an application can provide platform-independent Python execution.
However, it may require providing implementations of a number of
callbacks in order to be functional (e.g. for dynamic memory
allocation).
Examples of current components that fit into the core layer:
* Most of the functionality of most built-in types (str, int, list, dict, etc.)
* compile, exec, eval
* read-only members of the sys module
Important but potentially non-obvious implications of relying only on
the core layer:
* Dynamic memory allocation/deallocation is part of the Platform
adaptation layer, but there is no way to avoid it here. So any user of
the core API will need to provide allocators and deallocators. The
CPython Platform adaptation layer provides the "default"
implementations, but if an embedder does not want to use these then
targeting the Core layer will omit them.
* File system and standard streams are part of the Platform adaptation
layer, which leaves `open` and `sys.stdout` (among others) without a
default implementation. An application that wants to support these
without adding more layers needs to provide its own implementations
* The core layer only exposes UTF-8 APIs. Encoding and decoding for the
current platform requires the Platform adaptation layer, while arbitrary
encoding and decoding requires the Optional stdlib layer.
* Imports in the core layer are satisfied by a "blind" callback. The
Platform adaptation layer provides the support for frozen, bytecode and
natively-encoded source imports, while the Optional stdlib layer is
required for arbitrary encodings in source files
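To give a feel for what "imports satisfied by a callback" could mean, here is a Python-level sketch built on the existing `sys.meta_path` machinery. It is an analogy, not the proposed core-layer mechanism: `CallbackFinder`, `HOST_MODULES` and `host_greeting` are hypothetical names, and the callback simply returns source text or `None`.

```python
import importlib.abc
import importlib.util
import sys


class CallbackFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder that asks a host-supplied callback for module source."""

    def __init__(self, get_source):
        self._get_source = get_source

    def find_spec(self, fullname, path=None, target=None):
        # Returning None lets the remaining finders on sys.meta_path try.
        if self._get_source(fullname) is None:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to default module creation

    def exec_module(self, module):
        source = self._get_source(module.__name__)
        code = compile(source, f"<host:{module.__name__}>", "exec")
        exec(code, module.__dict__)


# The "host application" decides what each module name resolves to.
HOST_MODULES = {"host_greeting": "MESSAGE = 'hello from the host'\n"}

finder = CallbackFinder(HOST_MODULES.get)
sys.meta_path.insert(0, finder)
try:
    import host_greeting
finally:
    sys.meta_path.remove(finder)
```

The interpreter side knows nothing about where the source comes from, which is the sense in which the callback is "blind".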
## Platform adaptation layer
This layer provides the CPython implementation of platform-specific
adapters to support the core layer.
* Memory allocation/deallocation
* File system access
* Standard input/output streams
* Cryptographic random number generation
* os module
* CPython imports
Important but potentially non-obvious implications of relying only on
the platform adaptation layer:
* File system access generally requires text encodings, but the full set
of codecs is in the Optional stdlib layer. To fully separate these
layers, an implementation of the current file system encoding would be
required in the Platform adaptation layer. (But arbitrarily
encoding/decoding the _contents_ of a file may require higher layers.)
* Importing from source code may also require arbitrary encodings, but
imports that can be fully satisfied without this are provided here (e.g.
native extension modules, precompiled bytecode, frozen modules, natively
encoded source files)
## Required stdlib layer
This layer provides common APIs for interactions between other modules.
All components in the Optional stdlib layer may assume that if _they_
are present, everything in this layer is also present.
* standard ABCs
* compiler services (e.g. `copy`, `functools`, `traceback`)
* standard interop types (e.g. `pathlib`, `enum`, `dataclasses`)
## Optional stdlib layer
This layer provides modules that fundamentally stand alone. None of the
lower levels may depend on these components being present, and
components in this layer should explicitly declare dependencies on
others in the same layer.
This layer is valuable for embedders and distributors that want to omit
certain functionality. For example, omitting `socket` should be possible
when that functionality is not required, as it is in the Optional stdlib
layer, and omitting it should only affect those components in the
Optional stdlib layer that have explicitly required it.
* platform-independent algorithms (e.g. `itertools`, `statistics`)
* application-specific functionality (e.g. `email`, `socket`, `ftplib`,
* additional compiler services (e.g. `ast`)
* text codecs (e.g. `base64`, `codecs`, `encodings`)
* Python-level FFI (e.g. `ctypes`)
* tools (e.g. `idlelib`, `pynche`, `distutils`, `msilib`)
* configuration/information (e.g. `site`, `sysconfig`, `platform`)
Components in the Optional stdlib layer may be independently versioned.
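The `socket` example above can be sketched as a small dependency walk: given explicitly declared requirements, work out which components are lost when one is omitted. The `requires` table is a hypothetical declaration format and the helper name is made up; only the idea of explicit declarations comes from the text.

```python
def affected_by_omitting(omitted, requires):
    """Return the omitted component plus everything that (transitively) requires it."""
    affected = {omitted}
    changed = True
    while changed:
        changed = False
        for component, dependencies in requires.items():
            if component not in affected and dependencies & affected:
                affected.add(component)
                changed = True
    return affected


# Hypothetical explicit declarations within the Optional stdlib layer.
requires = {
    "socket": set(),
    "ftplib": {"socket"},
    "statistics": set(),
    "ast": set(),
}

lost = affected_by_omitting("socket", requires)
```

Components that never declared a requirement on `socket`, such as `statistics` in this made-up table, are unaffected.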
Not sure this is the right forum for this, but on the Cython bug
tracker, the problem of managing to use the C API for importing modules
If Stefan Behnel, who is one of the most experienced people in the
world with CPython's C API, has trouble getting it working, then
probably it deserves improving. ;-)
(I also realize that the C API is more or less married to the dubious
semantics of the standard "__import__" function, which is itself
difficult to change, so perhaps a better C API isn't very easy)
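One example of `__import__` behaviour that is easy to trip over (whether this is the specific problem meant here is an assumption on my part): for a dotted name it returns the top-level package unless a non-empty `fromlist` is passed, whereas `importlib.import_module` returns the named submodule.

```python
import importlib
import os

# Without a fromlist, __import__ returns the top-level package ("os"),
# not the submodule that was named.
top_level = __import__("os.path")

# With a non-empty fromlist, __import__ returns the named submodule.
with_fromlist = __import__("os.path", fromlist=["join"])

# importlib.import_module always returns the module that was named.
submodule = importlib.import_module("os.path")
```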
CC'ing Stefan in case he isn't subscribed and wants to chime in.
I modified Include/ header files to have a way more explicit
separation between the different levels of Python C API:
* Include/*.h is the "portable Python API", which is supposed to be
efficient on CPython and PyPy, and ideally supports a maximum number of
Python implementations.
* Include/cpython/*.h is the "portable CPython API" which should only
be available on CPython (but PyPy had to emulate it for best
compatibility with CPython) and so should be avoided if possible.
* Include/internal/*.h is the "CPython internal API" which should not
be used outside CPython code base. In practice, Cython wants to use it
to emit the most efficient code. Debuggers and profilers also want to
use it to be able to *inspect* a Python process without executing code
(don't modify the process). You need to access all structures,
especially internal ones, for that.
"Portable" here means "should work on multiple Python versions" like
supporting Python 3.6, 3.7 and 3.8.
... There is also an unclear "non-portable CPython API" which is
somewhere between "portable CPython API" and "CPython internal API". I
would prefer that this weird API simply goes away if possible: move it
into the internal API?
A big change of Python 3.8 is that the "CPython internal API" is now
installed by "make install", but accessing it requires defining
Py_BUILD_CORE (you have to opt in to this API). This change makes it
possible to move APIs to the internal API, since they remain available,
whereas in Python
3.7 this API was not installed at all. It's a small step towards
making more APIs internal.
For the 3 levels of API: I'm writing what we should get **in the long
term**. Right now... it's a mess. Previously, we never seriously
looked at how a function should be exposed.
For example, some APIs are exposed as "portable Python API" whereas
they really must belong to the "CPython internal API". I recently
moved "PyXXX_Fini()" APIs to the "CPython internal API". Outside
CPython, it doesn't make any sense to call these functions directly:
you really must call Py_Finalize() or Py_FinalizeEx(). In Python 3.7,
some of these functions are surrounded by "#ifdef Py_BUILD_CORE", some
others are surrounded by "#ifndef Py_LIMITED_API" and their names are
not prefixed by "_Py", as if they were public functions... Extract of
Python 3.7 Include/pylifecycle.h:
#endif /* Py_BUILD_CORE */
#endif /* !Py_LIMITED_API */
The overall project is still very much a work in progress. At least
it's moving slowly and steadily!
For example, Eric Snow is working on making more structures opaque in
the "portable CPython API":
[Python-Dev] Making PyInterpreterState an opaque type