Hello. I am writing a PEP to describe how I'd like to "resurrect" and maintain the Stable ABI & Limited API going forward. I assume this'll be my main focus for CPython in 2021. I'm not as far along as I hoped to be at the end of 2020, but I guess it's time to request comments.
If you have any thoughts, arguments or improvements I'd be happy to hear them!
I'm including the text below; a rendered and forkable/pull-request-able version is at https://github.com/encukou/abi3/blob/main/PEP.rst
Have a great start of 2021!
PEP: 9999
Title: Maintaining the Stable ABI
Author: Petr Viktorin <encukou@gmail.com>
Discussions-To:
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Dec-2020

Abstract
========

[A short (~200 word) description of the technical issue being addressed.]

XXX: Abstract should be written last
Motivation
==========

:pep:`384` defined a limited API and stable ABI, which allow extenders and
embedders of CPython to compile extension modules that are binary-compatible
with any subsequent version of 3.x.
In theory, this brings many advantages:

- A module can be built only once per platform and support multiple versions
  of Python, reducing time, power and maintainer attention needed for builds.
- Binary wheels using the stable ABI work with new versions of CPython
  throughout the pre-release period, and can be tested in environments where
  building from source is not practical.

A welcome side effect of the stable ABI's hiding of implementation details is
that it is becoming a viable target for alternate Python implementations that
need to implement (parts of) the C API.

However, in hindsight, PEP 384 and its implementation have several issues:

- There is no process to keep the ABI up to date.
- Contents of the limited API are not listed explicitly, making it unclear
  if a particular member (e.g. function, structure) is a part of it.
- There is no way to deprecate parts of the limited API.

This PEP defines the limited API more clearly and introduces a process
designed to make the API more useful.

Additionally, PEP 384 defines a *limited API* as a way to build against the
stable ABI. This PEP defines the limited API more robustly.
Rationale
=========

This PEP contains a lot of clarifications and definitions, but just one big
technical change: the stable ABI will be explicitly listed in a
human-maintained “manifest” file.

There have been efforts to collect such lists automatically, e.g. by scanning
the symbols exported from Python. Such automation might seem easier for our
volunteer team to maintain.

However, designing a future-proof API is not a trivial task. The cost of
updating an explicit manifest is small compared to the overall work that
should go into changing an API that will need to be supported forever (or
until Python 3 reaches end of life, if that comes sooner).

This PEP proposes automatically generating things *from* the manifest:
initially documentation and DLL contents, with later possibilities for also
automating tests.
Stable ABI vs. Limited API
==========================

:pep:`384` and this document deal with the *Limited API* and the *Stable ABI*,
two related but distinct concepts.
This section clarifies what they mean and defines some of their semantics
(either pre-existing or newly proposed here).

The word “Extensions” is used as a shorthand for all code that uses the
Python API, e.g. extension modules or software that embeds Python.
Stable ABI
----------

The CPython *Stable ABI* is a promise that extensions built with a specific
CPython version will be usable with any newer interpreter of the same major
version, on the same platform and with the same compiler & settings.
For example, an extension built with the CPython 3.10 Stable ABI will be
usable with CPython 3.11, 3.12, and so on, but not necessarily with 4.0.

The Stable ABI is not generally forward-compatible: an extension built and
tested with CPython 3.10 will not generally be compatible with CPython 3.9.

.. note::
   For example, starting in Python 3.10, the ``Py_tp_doc`` slot may be set to
   ``NULL``, while in older versions, a ``NULL`` value will likely crash the
   interpreter.

The Stable ABI trades performance for its stability. For example, many
functions in the stable ABI are available as faster macros to extensions that
are built for a specific CPython version.

Future Python versions may deprecate some members of the Stable ABI. Such
members will still work, but may suffer from issues like reduced performance
or, in the most extreme cases, memory/resource leaks.
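The compatibility promise in this section can be written down as a small predicate. This is only an illustrative sketch: the function name is made up for this example, and it assumes that platform, compiler and build settings already match, since those cannot be checked from version numbers alone. A ``False`` result means "not promised", not "guaranteed to fail".

```python
def stable_abi_compatible(built_with, running_on):
    """Return True if an extension built for the Stable ABI of
    ``built_with`` is promised to be usable on ``running_on``.

    Both arguments are (major, minor) tuples. Platform, compiler and
    build settings are assumed to match (an assumption of this sketch).
    """
    built_major, built_minor = built_with
    run_major, run_minor = running_on
    # The promise only covers interpreters of the same major version.
    if built_major != run_major:
        return False
    # It covers the build version and newer ones, not older ones.
    return run_minor >= built_minor
```

For instance, ``stable_abi_compatible((3, 10), (3, 12))`` is true, while both ``((3, 10), (3, 9))`` and ``((3, 10), (4, 0))`` fall outside the promise.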
Limited API
-----------

The Stable ABI guarantee holds for extensions compiled from code that
restricts itself to the *Limited API*, a subset of CPython's C API.
The limited API is used when the preprocessor macro ``Py_LIMITED_API`` is
defined to either ``3`` or the current ``PYTHON_API_VERSION``.

The Limited API is not guaranteed to be *stable*.
In the future, parts of the limited API may be deprecated.
They may even be removed, as long as the *stable ABI* is kept
stable and Python's general backwards compatibility policy, :pep:`387`,
is followed.

.. note::
   For example, a function declaration might be removed from public header
   files but kept in the library.
   This is currently a possibility for the future; this PEP does not
   propose a concrete process for deprecations and removals.

The goal is for the limited API to cover everything needed to interact with
the interpreter. The main reason not to include a public API in the limited
subset should be that it needs implementation details that change between
CPython versions, like struct memory layouts, for performance reasons.

The limited API is not limited to CPython; other implementations are
encouraged to implement it and help drive its design.
Specification
=============

To make the stable ABI more useful and stable, the following changes
are proposed.

Stable ABI Manifest
-------------------

All members of the stable ABI – functions, typedefs, structs, struct fields,
data values etc. – will be explicitly listed in a single "manifest" file,
along with the Limited API version they were added in.
Struct fields that users of the stable ABI are allowed to access will be
listed explicitly.
Members that are not part of the Limited API, but are part of the Stable ABI
(e.g. ``PyObject.ob_type``, which is accessible by the ``Py_TYPE`` macro),
will be annotated as such.

Notes saying “Part of the stable ABI” will be added to Python's documentation
automatically, in a way similar to the notes on functions that return borrowed
references.

Source for the Windows shared library ``python3.dll`` will be generated from
the stable ABI definition.

The format of the manifest will be subject to change whenever needed.
It should be consumed only by scripts in the CPython repository.
If a more public list is needed, a script can be added to generate it.
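To make the idea of generating things *from* the manifest more concrete, here is a rough sketch. The PEP deliberately leaves the manifest format open, so everything below is hypothetical: the ``Entry`` record, its field names, the sample entries and their version numbers, and the wording of the version suffix in the note are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One member of the stable ABI (hypothetical record layout)."""
    name: str               # function, struct, field or data name
    kind: str               # "function", "struct", "field", "data", ...
    added: tuple            # Limited API version the member was added in
    abi_only: bool = False  # in the Stable ABI but not the Limited API


# Sample entries for illustration only; this is not a real manifest.
MANIFEST = [
    Entry("PyLong_FromLong", "function", (3, 2)),
    Entry("PyObject.ob_type", "field", (3, 2), abi_only=True),
]


def doc_note(entry):
    """Build the documentation note for one manifest entry."""
    major, minor = entry.added
    note = f"Part of the stable ABI since version {major}.{minor}."
    if entry.abi_only:
        note += " Not part of the limited API."
    return note


def dll_exports(manifest):
    """List the symbol names a generated python3.dll source would export."""
    return sorted(e.name for e in manifest if e.kind in ("function", "data"))
```

Keeping a single list and deriving documentation notes and export lists from it is the point of the design: a member cannot be documented as stable without also being exported, or the other way around.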
Contents of the Stable ABI
--------------------------

The initial stable ABI manifest will include:

- The Stable ABI specified in :pep:`384`.
- All functions listed in ``PC/python3dll.c``.
- All structs (struct typedefs) which these functions return or take as
  arguments. (Fields of such structs will not necessarily be added.)
- New type slots, such as ``Py_am_aiter``.
- The type flags ``Py_TPFLAGS_DEFAULT``, ``Py_TPFLAGS_BASETYPE``,
  ``Py_TPFLAGS_HAVE_GC``, ``Py_TPFLAGS_METHOD_DESCRIPTOR``.
- The calling conventions ``METH_*`` (except deprecated ones).
- All API needed by macros in the stable ABI (usually annotated as not being
  part of the limited API).

Additional items may be added to the initial manifest according to
the checklist below.
Testing the Stable ABI
----------------------

An automatically generated test module will be added to ensure that all members
of the stable ABI are available at compile time.

For each function in the stable ABI, a test will be added that calls the
function using ``ctypes``. (Where calling is not practical, such as with
functions related to interpreter initialization and shutdown, the test will
only look the function up.)
This should prevent regressions when a function is converted to a macro,
which keeps the same API but breaks the ABI.

A check will be added to ensure all functions in the stable ABI are tested
this way.
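As a rough sketch of what such tests might look like, the snippet below calls one function through ``ctypes.pythonapi``, merely looks up another, and computes which listed functions still lack a test. The helper names are invented for this example, and the real tests are expected to be generated rather than written by hand.

```python
import ctypes


def call_pylong_roundtrip(value):
    """Call PyLong_FromLong through ctypes and return the resulting object."""
    func = ctypes.pythonapi.PyLong_FromLong
    func.argtypes = [ctypes.c_long]
    # py_object tells ctypes the C function returns a new PyObject reference.
    func.restype = ctypes.py_object
    return func(value)


def is_exported(name):
    """Look a symbol up in the running interpreter without calling it."""
    try:
        getattr(ctypes.pythonapi, name)
    except AttributeError:
        # ctypes raises AttributeError when the library lacks the symbol.
        return False
    return True


def missing_tests(abi_functions, tested):
    """Return stable ABI functions that have no ctypes test yet."""
    return sorted(set(abi_functions) - set(tested))
```

The lookup-only path is what would catch a function that was turned into a macro: the macro still compiles in C code, but the symbol is no longer exported, so ``is_exported`` would return ``False``.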
Changing the Limited API
------------------------

A checklist for changing the limited API, including adding new members
(structs, functions or values), will be added to the `Devguide`_.
The checklist will 1) mention best practices and common pitfalls in Python
C API design and 2) guide the developer around the files that need changing
and scripts that need running when the limited API is changed.

Below is the initial proposal for the checklist. After the PEP is accepted,
see the Devguide for the current version.

Note that the checklist applies to new additions, not the existing
limited API.

Design considerations:

- Make sure the change does not break the Stable ABI of any version of Python
  since 3.5.
- Make sure no exposed names are private (i.e. begin with an underscore).
- Make sure the new API is well documented.
- Make sure the types of all parameters and return values of the added
  function(s) and all fields of the added struct(s) are part of the
  limited API (or standard C).
- Make sure the new API and its intended use follow standard C, not just
  features of currently supported platforms.

  - Do not cast a function pointer to ``void*`` (a data pointer) or vice versa.

- Make sure the new API follows reference counting conventions. (Following
  them makes the API easier to reason about, and easier to use in other
  Python implementations.)

  - Do not return borrowed references from functions.
  - Do not steal references to function arguments.

- Make sure the ownership rules and lifetimes of all applicable struct fields,
  arguments and return values are well defined.
- Think about ease of use for the user. (In C, ease of use itself is not very
  important; what *is* important is reducing boilerplate code needed to use
  the API. Bugs like to hide in boilerplate.)

  - If a function will often be called with a specific value for an argument,
    consider making it the default (assumed when ``NULL`` is passed in).

- Think about future extensions: for example, if it's possible that future
  Python versions will need to add a new field to your struct,
  how will that be done?
- Make as few assumptions as possible about details that might change in
  future CPython versions or differ across C API implementations:

  - The GIL
  - Garbage collection
  - Layout of PyObject and other structs

If following these guidelines would hurt performance, add a fast function
(or macro) to the non-limited API and a stable equivalent to the limited API.

If anything is unclear, or you have a good reason to break the guidelines,
consider discussing the change on the `capi-sig`_ mailing list.

.. _capi-sig: https://mail.python.org/mailman3/lists/capi-sig.python.org/
Procedure:

- Move the declaration to a header file directly under ``Include/`` and
  guard it with
  ``#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x03yy0000``
  (with the ``yy`` corresponding to the Python version).
- Make an entry in the stable ABI list. (XXX: mention filename)
- For functions, add a test that calls the function using ctypes
  (XXX: mention filename).
- Regenerate the autogenerated files. (XXX: specific instructions)
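The ``0x03yy0000`` value in the guard follows the layout of ``PY_VERSION_HEX``: the major version sits in the top byte and the minor version, in hexadecimal, in the next one. The sketch below computes that value and the matching guard line; the helper names are made up for this example and are not part of any CPython tooling.

```python
import sys


def limited_api_hex(minor, major=3):
    """Return the 0x03yy0000-style value for Python ``major.minor``."""
    return (major << 24) | (minor << 16)


def guard_line(minor):
    """Build the preprocessor guard for API added in Python 3.``minor``."""
    value = limited_api_hex(minor)
    return (
        "#if !defined(Py_LIMITED_API) || "
        f"Py_LIMITED_API+0 >= 0x{value:08x}"
    )


def running_version_hex():
    """Same value for the running interpreter, with micro/level zeroed."""
    return limited_api_hex(sys.version_info.minor, sys.version_info.major)
```

Note that ``yy`` is hexadecimal, so Python 3.10 corresponds to ``0x030a0000`` rather than ``0x03100000``.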
Advice for Extenders and Embedders
----------------------------------

The following notes will be added to documentation.

Extension authors should test with all Python versions they support,
and preferably build with the lowest such version.

Compiling with ``Py_LIMITED_API`` defined is *not* a guarantee that your code
conforms to the limited API or the stable ABI.
It only covers definitions, but an API also includes other issues,
such as expected semantics.

Examples of issues that ``Py_LIMITED_API`` does not guard against are:

- Calling a function with invalid arguments
- A function that started accepting ``NULL`` values for an argument
  in Python 3.9 will fail if ``NULL`` is passed to it under Python 3.8.
  Only testing with 3.8 (or lower versions) will uncover this issue.
- Some structs include a few fields that are part of the stable ABI and other
  fields that aren't.
  ``Py_LIMITED_API`` does not filter out such “private” fields.
- Using something that is not documented as part of the stable ABI,
  but exposed even with ``Py_LIMITED_API`` defined.
  Despite the team's best efforts, such issues may happen.
Backwards Compatibility
=======================

The PEP aims at full compatibility with the existing stable ABI and limited
API, but defines these terms more explicitly.
It might not be consistent with some interpretations of what the existing
stable ABI/limited API is.

Security Implications
=====================

None known.

How to Teach This
=================

Technical documentation will be provided.
It will be aimed at experienced users familiar with C.

Reference Implementation
========================

None so far.

Rejected Ideas
==============

While this PEP acknowledges that parts of the limited API might be deprecated
or removed in the future, a process to do this is not in scope, and is left
to a possible future PEP.

Open Issues
===========

None so far.

References
==========

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

.. _Devguide: https://devguide.python.org/

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
Hi,
Recently I had started some effort to make sure the symbols are listed somewhere and that they are exported (as some of them were mistakenly removed in the past). You can see my initial PR here:
https://github.com/python/cpython/pull/23616
This is currently active in the CPython CI.
Have a great start of 2021!
On Thu, 31 Dec 2020, 18:18 Petr Viktorin, <encukou@gmail.com> wrote:
[...]
Thanks! I'll mention it in the next draft.
A big reason I started writing this is to get everyone on the same page and to coordinate, as I'm having trouble keeping track of everything happening around the API. I'm glad to learn that you're one step ahead of me in the automated checking!
On 12/31/20 7:23 PM, Pablo Galindo Salgado wrote:
Hi,
Recently I had started some effort to make sure the symbols are listed somewhere and that they are exported (as some of them were mistakenly removed in the past). You can see my initial PR here:
https://github.com/python/cpython/pull/23616
This is currently active in the CPython CI.
Have a great start of 2021!
On Thu, 31 Dec 2020, 18:18 Petr Viktorin, <encukou@gmail.com <mailto:encukou@gmail.com>> wrote:
Hello. I am writing a PEP to describe how I'd like to "resurrect" and maintain the Stable ABI & Limited API going forward. I assume this'll be my main focus for CPython in 2021. I'm not as far along as I hoped to be at the end of 2020, but I guess it's time to request comments. If you have any thoughts, arguments or improvements I'd be happy to hear them! I'm including the text below; a rendered and forkable/pull-request-able version is at https://github.com/encukou/abi3/blob/main/PEP.rst <https://github.com/encukou/abi3/blob/main/PEP.rst> Have a great start of 2021! --- PEP: 9999 Title: Maintaining the Stable ABI Author: Petr Viktorin <encukou@gmail.com <mailto:encukou@gmail.com>> Discussions-To: Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 08-Dec-2020 Abstract ======== [A short (~200 word) description of the technical issue being addressed.] XXX: Abstract should be written last Motivation ========== :pep:`384` defined a limited API and stable ABI, which allows extenders and embedders of CPython to compile extension modules that are binary-compatible with any subsequent version of 3.x. In theory, this brings many advantages: * A module can be built only once per platform and support multiple versions of Python, reducing time, power and maintainer attention needed for builds. * Binary wheels using the stable ABI work with new versions of CPython throughout the pre-release period, and can be tested in environments where building from source is not practical. As a welcome side effect of the stable ABI's hiding of implementation details is that it is becoming a viable target for alternate Python implementations that need to implement (parts of) the C API. However, in hindsignt, PEP 384 and its implementation has several issues: * There is no process keep the ABI up to date. * Contents of the limited API are not listed explicitly, making it unclear if a particular member (e.g. function, structure) is a part of it. 
* There is no way to deprecate parts of the limited API. This PEP defines the limited API more clearly and introducess process designed to make the API more useful. Additionally, PEP 384 defines a *limited API* as a way to build against the stable ABI. This PEP defines the limited API more robustly. Rationale ========= This PEP contains a lot of clarifications and definitions, but just one big technical change: the stable ABI will be explicitly listed in a human-maintained “manifest” file. There have been efforts to collect such lists automatically, e.g. by scanning the symbols exported from Python. This might seem to be easier to maintain by our volunteer team. However, designing a future-proof API is not a trivial task. The cost of updating an explicit manifest is small compared to the overall work that should go into changing API that will need to be suppported forever (or until Python 3 reaches end of life, if that comes sooner). This PEP proposes automatically generating things *from* the manifest: initially documentation and DLL contents, with later possibilities for also automating tests. Stable ABI vs. Limited API ========================== :pep:`384` and this document deal with the *Limited API* and the *Stable ABI*, two related but distinct concepts. This section clarifies what they mean and defines some of their semantics (either pre-existing or newly proposed here). The word “Extensions” is used as a shorthand for all code that uses the Python API, e.g. extension modules or software that embeds Python. Stable ABI ---------- The CPython *Stable ABI* is a promise that extensions built with a specific Cpython version will be usable with any newer interpreter of the same major version, on the same platform and with the same compiler & settings. For example, a extension built with CPython 3.10 Stable ABI will be usable with CPython 3.11, 3.12, and so on, but not necessarily with 4.0. 
The Stable ABI is not generally forward-compatible: an extension built and tested with CPython 3.10 will not generally be compatible with CPython 3.9. .. note:: For example, starting in Python 3.10, the `Py_tp_doc` slot may be set to `NULL`, while in older versions, a `NULL` value will likely crash the interpreter. The Stable ABI trades performance for its stability. For example, many functions in the stable ABI are available as faster macros to extensions that are built for a specific CPython version. Future Python sversions may deprecate some members of the Stable ABI. Such members will still work, but may suffer from issues like reduced performance or, in the most extreme cases, memory/resource leaks. Limited API ----------- Stable ABI guarantee holds for extensions compiled from code that restricts itself to the *Limited API*, a subset of CPython's C API. The limited API is used when preprocessor macro `Py_LIMITED_API` is defined to either `3` or the current `PYTHON_API_VERSION`. The Limited API is not guaranteed to be *stable*. In the future, parts of the limited API may be deprecated. They may even be removed, as long as the *stable ABI* is kept stable and Python's general backwards compatibility policy, :pep:`387`, is followed. .. note:: For example, a function declaration might be removed from public header files but kept in the library. This is currently a possibility for the future; this PEP does not to propose a concrete process for deprecations and removals. The goal is for the limited API to cover everything needed to interact with the interpreter. There main reasons to not include a public API in the limited subset should be that it needs implementation details that change between CPython versions, like struct memory layouts, for performance reasons. The limited API is not limited to CPython; other implementations are encouraged to implement it and help drive its design. 
Specification
=============

To make the stable ABI more useful and stable, the following changes
are proposed.


Stable ABI Manifest
-------------------

All members of the stable ABI – functions, typedefs, structs, struct fields,
data values etc. – will be explicitly listed in a single "manifest" file,
along with the Limited API version they were added in.

Struct fields that users of the stable ABI are allowed to access will be
listed explicitly.

Members that are not part of the Limited API, but are part of the Stable ABI
(e.g. ``PyObject.ob_type``, which is accessible by the ``Py_TYPE`` macro),
will be annotated as such.

Notes saying “Part of the stable ABI” will be added to Python's documentation
automatically, in a way similar to the notes on functions that return borrowed
references.

Source for the Windows shared library `python3.dll` will be generated
from the stable ABI definition.

The format of the manifest will be subject to change whenever needed.
It should be consumed only by scripts in the CPython repository.
If a more public list is needed, a script can be added to generate it.


Contents of the Stable ABI
--------------------------

The initial stable ABI manifest will include:

* The Stable ABI specified in :pep:`384`.
* All functions listed in ``PC/python3dll.c``.
* All structs (struct typedefs) which these functions return or take as
  arguments. (Fields of such structs will not necessarily be added.)
* New type slots, such as ``Py_am_aiter``.
* The type flags ``Py_TPFLAGS_DEFAULT``, ``Py_TPFLAGS_BASETYPE``,
  ``Py_TPFLAGS_HAVE_GC``, ``Py_TPFLAGS_METHOD_DESCRIPTOR``.
* The calling conventions ``METH_*`` (except deprecated ones).
* All API needed by macros that are part of the limited API (usually
  annotated as not being part of the limited API itself).

Additional items may be added to the initial manifest according to
the checklist below.
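To illustrate how documentation notes and DLL contents could be generated *from* a manifest, here is a minimal sketch. The manifest format is deliberately left open by this PEP, so everything here is an assumption made for illustration: the ``ManifestEntry`` fields, the sample entries and their version strings, and the ``EXPORT_FUNC(...)`` line shape of the generated output.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ManifestEntry:
    """One member of the stable ABI (hypothetical record layout)."""
    name: str
    kind: str               # e.g. "function", "struct", "struct_field"
    added: str              # Limited API version the member was added in
    abi_only: bool = False  # in the Stable ABI but not in the Limited API


# Illustrative sample data only; not the real manifest.
MANIFEST: List[ManifestEntry] = [
    ManifestEntry("PyList_New", "function", "3.2"),
    ManifestEntry("PyType_FromSpec", "function", "3.2"),
    ManifestEntry("PyObject.ob_type", "struct_field", "3.2", abi_only=True),
]


def doc_note(entry: ManifestEntry) -> str:
    """Build the note that would be attached to the member's documentation."""
    note = f"Part of the stable ABI since version {entry.added}."
    if entry.abi_only:
        note += " (Not part of the limited API.)"
    return note


def dll_export_lines(manifest: List[ManifestEntry]) -> List[str]:
    """Build one export line per function for a generated DLL source file."""
    return [f"EXPORT_FUNC({e.name})" for e in manifest if e.kind == "function"]


if __name__ == "__main__":
    for entry in MANIFEST:
        print(f"{entry.name}: {doc_note(entry)}")
    print("\n".join(dll_export_lines(MANIFEST)))
```

Keeping the manifest as the single source and deriving both outputs from it means a new member only has to be recorded once.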
Testing the Stable ABI
----------------------

An automatically generated test module will be added to ensure that all
members of the stable ABI are available at compile time.

For each function in the stable ABI, a test will be added that calls the
function using `ctypes`. (Where calling is not practical, such as with
functions related to interpreter initialization and shutdown, the test will
only look the function up.)
This should prevent regressions when a function is converted to a macro,
which keeps the same API but breaks the ABI.

A check will be added to ensure all functions in the stable ABI are tested
this way.


Changing the Limited API
------------------------

A checklist for changing the limited API, including new members (structs,
functions or values), will be added to the `Devguide`_.
The checklist will 1) mention best practices and common pitfalls in Python
C API design and 2) guide the developer around the files that need changing
and scripts that need running when the limited API is changed.

Below is the initial proposal for the checklist. After the PEP is accepted,
see the Devguide for the current version.

Note that the checklist applies to new additions, not the existing
limited API.

Design considerations:

* Make sure the change does not break the Stable ABI of any version of
  Python since 3.5.
* Make sure no exposed names are private (i.e. begin with an underscore).
* Make sure the new API is well documented.
* Make sure the types of all parameters and return values of the added
  function(s) and all fields of the added struct(s) are part of the
  limited API (or standard C).
* Make sure the new API and its intended use follow standard C, not just
  features of currently supported platforms.

  * Do not cast a function pointer to ``void*`` (a data pointer) or
    vice versa.

* Make sure the new API follows reference counting conventions. (Following
  them makes the API easier to reason about, and easier to use in other
  Python implementations.)
  * Do not return borrowed references from functions.
  * Do not steal references to function arguments.

* Make sure the ownership rules and lifetimes of all applicable struct
  fields, arguments and return values are well defined.
* Think about ease of use for the user. (In C, ease of use itself is not
  very important; what *is* important is reducing boilerplate code needed to
  use the API. Bugs like to hide in boilerplate.)

  * If a function will often be called with a specific value for an
    argument, consider making it the default (assumed when ``NULL`` is
    passed in).

* Think about future extensions: for example, if it's possible that future
  Python versions will need to add a new field to your struct, how will that
  be done?
* Make as few assumptions as possible about details that might change in
  future CPython versions or differ across C API implementations:

  * The GIL
  * Garbage collection
  * Layout of PyObject and other structs

If following these guidelines would hurt performance, add a fast function
(or macro) to the non-limited API and a stable equivalent to the limited API.

If anything is unclear, or you have a good reason to break the guidelines,
consider discussing the change at the `capi-sig`_ mailing list.

.. _capi-sig: https://mail.python.org/mailman3/lists/capi-sig.python.org/

Procedure:

* Move the declaration to a header file directly under ``Include/`` and
  wrap it in an
  ``#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x03yy0000`` block
  (with the ``yy`` corresponding to the Python version).
* Make an entry in the stable ABI list. (XXX: mention filename)
* For functions, add a test that calls the function using ctypes
  (XXX: mention filename).
* Regenerate the autogenerated files. (XXX: specific instructions)


Advice for Extenders and Embedders
----------------------------------

The following notes will be added to documentation.
Extension authors should test with all Python versions they support,
and preferably build with the lowest such version.

Compiling with ``Py_LIMITED_API`` defined is *not* a guarantee that your code
conforms to the limited API or the stable ABI.
It only covers definitions, but an API also includes other issues, such as
expected semantics.

Examples of issues that ``Py_LIMITED_API`` does not guard against are:

* Calling a function with invalid arguments.
* A function that started accepting ``NULL`` values for an argument
  in Python 3.9 will fail if ``NULL`` is passed to it under Python 3.8.
  Only testing with 3.8 (or lower versions) will uncover this issue.
* Some structs include a few fields that are part of the stable ABI and other
  fields that aren't. ``Py_LIMITED_API`` does not filter out such “private”
  fields.
* Using something that is not documented as part of the stable ABI,
  but exposed even with ``Py_LIMITED_API`` defined.
  Despite the team's best efforts, such issues may happen.


Backwards Compatibility
=======================

The PEP aims at full compatibility with the existing stable ABI and limited
API, but defines these terms more explicitly.
It might not be consistent with some interpretations of what the existing
stable ABI/limited API is.


Security Implications
=====================

None known.


How to Teach This
=================

Technical documentation will be provided.
It will be aimed at experienced users familiar with C.


Reference Implementation
========================

None so far.


Rejected Ideas
==============

While this PEP acknowledges that parts of the limited API might be deprecated
or removed in the future, a process to do this is not in scope, and is left
to a possible future PEP.


Open Issues
===========

None so far.


References
==========


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

.. _Devguide: https://devguide.python.org/

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

_______________________________________________
capi-sig mailing list -- capi-sig@python.org
To unsubscribe send an email to capi-sig-leave@python.org
https://mail.python.org/mailman3/lists/capi-sig.python.org/
Member address: pablogsal@gmail.com
Thanks for putting the proposal together. I will read it carefully as well to see if I can be of help.
One note regarding the current checks: the current script was mainly aimed at preventing symbols from being removed by mistake, which has already happened 3 or 4 times, but there are many things it does not check. For instance, we don't currently check that the API or the ABI is fully preserved, so changing the type of a struct field or of a function parameter will not be detected by this initial check. For the CI to fail, you need to remove an entire symbol. That's why I think your proposal of getting something automated using ctypes is much better and will certainly be the future of these checks.
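For readers following along, a ctypes-based check of the kind discussed here might look roughly like the sketch below. It is only an illustration under assumptions: the helper names are hypothetical, and ``PyLong_FromLong`` is picked merely as a function that is safe to call from a running interpreter. Unlike a pure symbol listing, declaring the argument and result types and actually calling the function exercises the exported symbol itself.

```python
import ctypes


def is_exported(name: str) -> bool:
    """Return True if the running interpreter exports `name` as a symbol.

    A function that has been turned into a macro has no exported symbol,
    so the attribute lookup on ctypes.pythonapi fails.
    """
    try:
        getattr(ctypes.pythonapi, name)
    except AttributeError:
        return False
    return True


def call_pylong_fromlong(value: int) -> int:
    """Call PyLong_FromLong through the exported symbol, not a header."""
    func = ctypes.pythonapi.PyLong_FromLong
    func.argtypes = [ctypes.c_long]
    func.restype = ctypes.py_object  # result is converted to a Python object
    return func(value)


if __name__ == "__main__":
    print(is_exported("PyLong_FromLong"))
    print(call_pylong_fromlong(42))
```

Functions tied to interpreter startup and shutdown would only get the lookup step, as the PEP draft suggests.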
On Thu, 31 Dec 2020, 18:29 Petr Viktorin, <encukou@gmail.com> wrote:
Thanks! I'll mention it in the next draft.
A big reason I started writing this is to get everyone on the same page and coordinate, as I'm having trouble keeping track of everything happening around the API. I'm glad to learn that you're one step ahead of me in the automated checking!
On 12/31/20 7:23 PM, Pablo Galindo Salgado wrote:
Hi,
Recently I had started some effort to make sure the symbols are listed somewhere and that they are exported (as some of them were mistakenly removed in the past). You can see my initial PR here:
https://github.com/python/cpython/pull/23616
This is currently active in the CPython CI.
Have a great start of 2021!
On Fri, 1 Jan 2021 at 04:18, Petr Viktorin <encukou@gmail.com> wrote:
Hello. I am writing a PEP to describe how I'd like to "resurrect" and maintain the Stable ABI & Limited API going forward. I assume this'll be my main focus for CPython in 2021. I'm not as far along as I hoped to be at the end of 2020, but I guess it's time to request comments.
It looks like a well-thought-out and justified proposal to me.
I did have some questions while reading the early sections, but you had answered them by the end of the PEP (mainly the point about why code compiled and tested with an older "Py_LIMITED_API" value on a newer Python may still crash when run on an older version of Python, even if that version exports all the symbols referenced).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Hi,
Yeah, the stable ABI is not tested, it is maintained manually, and errors are very common :-( Something should be done to simplify its maintenance and reduce the risk of manual errors. This PEP looks like a great step forward!
If we want to optimize CPython (which has remained an open question for at least 5 years!), the limited C API and the stable ABI must evolve. I tried to write down technical reasons to explain why: https://pythoncapi.readthedocs.io/optimize_python.html
For example, the ability to dereference a PyObject* to access PyObject members prevents implementing a moving garbage collector.
"Evolving" here means "introducing incompatible changes on purpose" which goes against the principle of a "stable" API :-( If we break the API once, maybe we should try to take into account all currently known issues and try to address all of them at once. The problem is that the current C API is based on "PyObject*". Moving away from PyObject* would be a radical change (more or less rewriting the whole C API). So it doesn't seem possible to fix the API "at once". (I'm not talking about the stable ABI here, but the API.)
The HPy project tries to design a new C API from scratch which doesn't inherit these design issues. Extensions written with HPy are faster on PyPy than the same extensions written with the Python C API, and HPy is as fast as the Python C API on CPython. The HPy C API is incompatible with the existing Python C API but the migration should not be "too hard" (I didn't try, so I cannot testify). The HPy design should also provide a stable ABI (but the project is not mature yet). One advantage is that HPy doesn't require touching CPython; it's developed externally.
HPy is still young, so it's risky to bet only on it. That's why I'm modifying the C API in CPython (hide implementation details: PEP 620) in the meantime to make it "less broken" (prepare it for future optimizations).
It would be great if optimizing Python and making the stable ABI would not be two "incompatible" goals. But the practical question is more "do we want to fix a few issues with minor incompatible changes in the short term?" or "do we want to fix all issues with major incompatible changes at once for the long term?". Since HPy is developed externally, it doesn't seem incompatible to fix the limited C API and the stable ABI in CPython and develop HPy in parallel.
About the PEP, maybe it would be interesting to explain that it doesn't try to solve all issues at once for practical reasons, and that projects like HPy can be solutions for that.
Victor
On 1/29/21 7:43 PM, Victor Stinner wrote:
Hi,
Yeah, the stable ABI is not tested, it is maintained manually, and errors are very common :-( Something should be done to simplify its maintenance and reduce the risk of manual errors. This PEP looks like a great step forward!
If we want to optimize CPython (which has remained an open question for at least 5 years!), the limited C API and the stable ABI must evolve. I tried to write down technical reasons to explain why: https://pythoncapi.readthedocs.io/optimize_python.html
For example, the ability to dereference a PyObject* to access PyObject members prevents implementing a moving garbage collector.
"Evolving" here means "introducing incompatible changes on purpose" which goes against the principle of a "stable" API :-( If we break the API once, maybe we should try to take into account all currently known issues and try to address all of them at once. The problem is that the current C API is based on "PyObject*". Moving away from PyObject* would be a radical change (more or less rewriting the whole C API). So it doesn't seem possible to fix the API "at once". (I'm not talking about the stable ABI here, but the API.)
I think "introducing incompatible changes on purpose" makes Python worse without much benefit. Have any of the recent incompatible changes actually made Python faster?
Besides the non-opaque PyObject*, the worst thing that prevents "evolution" now is IMO that built-in static types like PyList_Type are exposed in the API and stable ABI as global symbols. This prevents making the GIL per-interpreter. Here's how stuff like that can be changed:
1. Formalize, test and document the stable ABI
2. Introduce new API (e.g. PyList_GetListType())
3. Deprecate the old API
4. Create abi4, a new, incompatible version of the stable ABI, which doesn't include deprecated stuff. Create a flag to mark extension modules that use it.
5. Make it possible to create subinterpreters with a separate GIL. These can only load abi4 extension modules.
6. Work with various third-party projects to switch to abi4 so we can start getting its benefits, and improve abi4 along the way.
?. In Python 4.0, drop abi3.
The HPy project tries to design a new C API from scratch which doesn't inherit these design issues. Extensions written with HPy are faster on PyPy than the same extensions written with the Python C API, and HPy is as fast as the Python C API on CPython. The HPy C API is incompatible with the existing Python C API but the migration should not be "too hard" (I didn't try, so I cannot testify). The HPy design should also provide a stable ABI (but the project is not mature yet). One advantage is that HPy doesn't require touching CPython; it's developed externally.
HPy is still young, so it's risky to bet only on it. That's why I'm modifying the C API in CPython (hide implementation details: PEP 620) in the meantime to make it "less broken" (prepare it for future optimizations).
PEP 620 has many things, some of which conflict with the accepted PEP 387. I don't think all of it is a good plan (and I guess that's why it wasn't accepted).
It would be great if optimizing Python and making the stable ABI would not be two "incompatible" goals. But the practical question is more "do we want to fix a few issues with minor incompatible changes in the short term?" or "do we want to fix all issues with major incompatible changes at once for the long term?". Since HPy is developed externally, it doesn't seem incompatible to fix the limited C API and the stable ABI in CPython and develop HPy in parallel.
About the PEP, maybe it would be interesting to explain that it doesn't try to solve all issues at once for practical reasons, and that projects like HPy can be solutions for that.
I don't think it's worth it to mention that the PEP doesn't try to solve all issues at once. Of course it doesn't :)
Hello!
I'm replying to clarify a few things about HPy since it was mentioned:
On Mon, Feb 1, 2021 at 12:07 PM Petr Viktorin <encukou@gmail.com> wrote:
Besides the non-opaque PyObject*, the worst thing that prevents "evolution" now is IMO that built-in static types like PyList_Type are exposed in the API
These are the two core things that HPy addresses. Making PyObject* opaque is the bigger of the two changes, because it affects the majority of API methods. HPy attempts to address both of these issues in a "minimalist" way in the sense that although it is a whole new API, it tries to be the obvious conversion from the old API.
PEP 620 has many things, some of which conflict with the accepted PEP 387. I don't think all of it is a good plan (and I guess that's why it wasn't accepted).
HPy and PEP 620 are largely unrelated. They are inspired by some of the same issues, but are completely different things. HPy can be (and in the case of CPython is being) implemented on top of the existing C API, and so doesn't require a PEP.
Perhaps once HPy is more mature, CPython's API will evolve to support HPy a little better or to make porting from the old API to HPy easier, but it's likely still slightly premature to be trying to imagine what those changes are, although generally hiding parts of PyObject* or global state like static types is likely to be helpful.
Yours sincerely, Simon Cross
On Mon, Feb 1, 2021 at 2:16 PM Simon Cross <hodgestar@gmail.com> wrote:
HPy and PEP 620 are largely unrelated. They are inspired by some of the same issues, but are completely different things. HPy can (and in the case of CPython is being) implemented on top of the existing C API, and so doesn't require a PEP.
In my opinion, the main advantage of HPy over the existing C API is that it's faster on PyPy. It doesn't provide an advantage for CPython or for existing extensions using the C API.
The PEP 620 prepares CPython code base for future optimizations. It has a different scope.
HPy and PEP 620 projects can evolve in parallel, they are not incompatible ;-) But maybe if the PEP 620 can be fully implemented, it might become easier to port extension modules to HPy. But right now, it's not the case.
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On Mon, Feb 1, 2021 at 4:00 PM Victor Stinner <vstinner@python.org> wrote:
In my opinion, the main advantage of HPy over the existing C API is that it's faster on PyPy. It doesn't provide an advantage for CPython or for existing extensions using the C API.
I think this is true right now, but I suspect it will be possible to make HPy extensions run faster on CPython too, for the same reason that they can run faster on PyPy now -- i.e. the execution context is explicit and implementation details are not exposed. Exploring this requires HPy to be a bit more mature though so that CPython itself could consider how to better support HPy extensions.
The PEP 620 prepares CPython code base for future optimizations. It has a different scope.
Agreed!
HPy and PEP 620 projects can evolve in parallel, they are not incompatible ;-) But maybe if the PEP 620 can be fully implemented, it might become easier to port extension modules to HPy. But right now, it's not the case.
Also agreed.
On Mon, Feb 1, 2021 at 11:06 AM Petr Viktorin <encukou@gmail.com> wrote:
Besides the non-opaque PyObject*, the worst thing that prevents "evolution" now is IMO that built-in static types like PyList_Type are exposed in the API and stable ABI as global symbols. This prevents making the GIL per-interpreter.
I created https://bugs.python.org/issue40601 to discuss the problem of static types "exposed" (indirectly) in the limited C API.
I wrote an experimental PR to see how we could do that: https://github.com/python/cpython/pull/24146
- Add Py_GetXXXType()
- Replace &XXX_Type with Py_GetXXXType()
These heap types must be created as early as possible, and destroyed as late as possible.
Py_GetXXXType() functions access the current interpreter. Currently, _PyInterpreterState_GET() returns NULL when the GIL is released. It means that accessing types is no longer possible when the GIL is released. I consider that it was never officially supported and so it's not an incompatible change, but some extensions do that and will crash with these changes (we can fix them and document the issue).
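The constraint described above can be modelled without the real CPython headers. The following is only a minimal sketch using stand-in types: every `demo_` name is hypothetical, and the "current interpreter" is a plain static pointer rather than real thread state. It shows why a getter that goes through the current interpreter has nothing to return once no interpreter is attached, and why two interpreters end up with distinct type objects.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a type object; NOT the real PyTypeObject. */
typedef struct {
    const char *name;
} demo_type;

/* Stand-in for per-interpreter state that owns its own copy of each type. */
typedef struct {
    demo_type list_type;
} demo_interp;

/* Interpreter attached to the running thread. A real runtime would keep
   this per thread; a single static pointer keeps the sketch short. */
static demo_interp *demo_current = NULL;

/* Model of acquiring the GIL: the thread now has a current interpreter. */
static void demo_attach(demo_interp *interp)
{
    assert(interp != NULL);
    demo_current = interp;
}

/* Model of releasing the GIL: there is no current interpreter any more. */
static void demo_detach(void)
{
    demo_current = NULL;
}

/* Model of a Py_GetXXXType() style getter: the type is looked up through
   the current interpreter instead of through a global symbol. It returns
   NULL when no interpreter is attached, so callers must not rely on it
   in that state. */
static demo_type *demo_get_list_type(void)
{
    if (demo_current == NULL) {
        return NULL;
    }
    return &demo_current->list_type;
}
```

In this model, code that calls `demo_get_list_type()` after `demo_detach()` receives NULL and would crash on dereference, which mirrors the failure mode mentioned for extensions that touch types while the GIL is released.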
Here's how stuff like that can be changed:
- Formalize, test and document the stable ABI
- Introduce new API (e.g. PyList_GetListType())
- Deprecate the old API
I suggest using something like https://github.com/pythoncapi/pythoncapi_compat to provide the new "get" functions on old Python versions (where they would access static types) and to prepare extension modules for the future incompatible changes, without losing support for old Python versions.
For example:

    static inline PyTypeObject* Py_GetListType(void)
    { return &PyList_Type; }
A flag-day migration, as we did with Python 2 => Python 3, is usually a disaster and annoys everybody.
We should automate the migration as much as we can.
Victor
Hi Petr,
On Mon, Feb 1, 2021 at 11:06 AM Petr Viktorin <encukou@gmail.com> wrote:
I think "introducing incompatible changes on purpose" makes Python worse without much benefit. Has any of the recent incompatible changes actually made Python faster?
Optimizing Python is a hard problem. Look back at the 30 years of CPython optimizations. There were some nice optimizations, but the CPython design has been the same for 30 years. If it were easy, it would already be done. There are big companies that need a faster CPython. A significant speedup (e.g. 2x faster) requires deeply changing the CPython design.
It will only become possible to *start* *experimenting* with optimizing CPython when most of PEP 620 is implemented. Some people disagree, but so far, I haven't seen any concrete working optimization proposed and implemented in CPython (without PEP 620 being implemented). So everyone is free to have their own "beliefs" in terms of optimization possibilities (with or without incompatible C API changes) ;-)
To be clear, the PEP 620 brings zero optimization. It should be done separately. The PEP 620 only prepares CPython for future optimizations.
Besides the non-opaque PyObject*, the worst thing that prevents "evolution" now is IMO that built-in static types like PyList_Type are exposed in the API and stable ABI as global symbols. This prevents making the GIL per-interpreter. Here's how stuff like that can be changed:
1. Formalize, test and document the stable ABI
2. Introduce new API (e.g. PyList_GetListType())
3. Deprecate the old API
4. Create abi4, a new, incompatible version of the stable ABI, which doesn't include deprecated stuff. Create a flag to mark extension modules that use it.
5. Make it possible to create subinterpreters with a separate GIL. These can only load abi4 extension modules.
Step 4 introduces a backward incompatible change without much benefit :-)
Step 5 will only be safe once all static types will be removed, which is an incompatible change.
For major CPython design changes (like running multiple interpreters in parallel), we *need* to introduce incompatible changes.
There are two options: consider that CPython must no longer evolve (and so die slowly as a dead language ;-)), or we should find a way to minimize the number of unhappy people when we introduce incompatible changes :-)
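The gating proposed in steps 4 and 5 (a flag that marks abi4 modules, and subinterpreters with a separate GIL that only load such modules) could look roughly like the check below. This is an illustrative sketch only: no such flag or loader check exists in CPython, and all `demo_` and `DEMO_` names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ABI tags an extension module could be marked with. */
enum demo_abi { DEMO_ABI3 = 3, DEMO_ABI4 = 4 };

/* Stand-in for an extension module's metadata. */
typedef struct {
    const char *name;
    enum demo_abi abi;   /* the "flag" marking which stable ABI it uses */
} demo_module;

/* Stand-in for interpreter configuration. */
typedef struct {
    int has_own_gil;     /* nonzero for a subinterpreter with a separate GIL */
} demo_interp_config;

/* Returns 1 if the interpreter may load the module, 0 otherwise.
   An interpreter that shares the GIL accepts everything; one with its
   own GIL accepts only modules marked as abi4 or newer. */
static int demo_can_load(const demo_interp_config *cfg, const demo_module *mod)
{
    assert(cfg != NULL && mod != NULL);
    if (cfg->has_own_gil) {
        return mod->abi >= DEMO_ABI4;
    }
    return 1;
}
```

Under this model the main interpreter keeps loading abi3 modules, so nothing breaks for existing users. The objection raised in the thread still applies: the check only helps once the abi4 side really has no shared static types left.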
6. Work with various third-party projects to switch to abi4 so we can start getting its benefits, and improve abi4 along the way.
Over the last years, I proposed different PEP drafts with an opt-in choice for a faster Python which is backward incompatible. Nobody liked this option (having two Pythons). Look at the Python 2 => Python 3 migration and "CPython vs PyPy". "Faster Python" is not enough to migrate the critical mass of users to the "new" Python, the old Python never goes away, and then you have to maintain two Python runtimes instead of one.
My new approach is to do all changes directly in Python master branch (single Python) and slowly introduce incompatible changes. When too many people complain, I revert the change and help to migrate extensions to the new API. I'm doing that for Py_SET_TYPE() and Py_SET_SIZE() using my pythoncapi_compat project for example (I'm waiting for my Mercurial change to be merged upstream ;-)).
https://github.com/pythoncapi/pythoncapi_compat makes it possible to make existing extensions compatible with the future incompatible changes, without dropping support for old Python versions. The general plan is:
(a) Introduce a new API and deprecate the old one
(b) Run pythoncapi_compat on enough extension modules (and get changes accepted)
(c) Once enough extensions are ported, drop the old API
I suggest to have at least one Python release between (a) and (c). For example, I introduced Py_SET_REFCNT() in Python 3.9 and I disallowed "Py_REFCNT(obj) = refcnt;" syntax in Python 3.10.
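The compatibility-header idea behind this plan can be shown in miniature. The sketch below is not the actual pythoncapi_compat code: it uses a stand-in object struct and a hypothetical `DEMO_VERSION_HEX` macro (encoded in the style of `PY_VERSION_HEX`) so that it compiles on its own. It only illustrates the pattern of a version-gated setter that lets extension code use the new spelling on old versions too.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an object header; NOT the real PyObject layout. */
typedef struct {
    long refcnt;
} demo_object;

/* Hypothetical version of the headers being compiled against,
   encoded like PY_VERSION_HEX (0x0308... stands for "3.8"). */
#ifndef DEMO_VERSION_HEX
#define DEMO_VERSION_HEX 0x03080000
#endif

/* Read accessor, available in every version. */
static inline long demo_refcnt(const demo_object *ob)
{
    return ob->refcnt;
}

/* Compatibility shim: versions older than "3.9" have no setter, so the
   shim header supplies one. Newer versions would ship their own, and
   the guard keeps the two definitions from clashing. */
#if DEMO_VERSION_HEX < 0x03090000 && !defined(DEMO_SET_REFCNT)
static inline void demo_set_refcnt_impl(demo_object *ob, long refcnt)
{
    assert(ob != NULL);
    ob->refcnt = refcnt;
}
#define DEMO_SET_REFCNT(ob, refcnt) demo_set_refcnt_impl((ob), (refcnt))
#endif

/* Extension code written once against the new spelling. It never
   assigns through the read accessor, so it keeps compiling after the
   old lvalue form is dropped in step (c). */
static void demo_reset(demo_object *ob)
{
    DEMO_SET_REFCNT(ob, 1);
}
```

Because `demo_reset()` only uses the setter, the same source builds against both the old and the new headers, which is the property steps (a) to (c) rely on.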
During (b) phase, we can communicate on the future incompatible changes in What's New in Python X.Y, communicate on capi-sig list and/or python-dev, contact extensions maintainers, etc.
I tried to explain my failures and my new approach on C API incompatible changes in an article: https://vstinner.github.io/hide-implementation-details-python-c-api.html
?. In Python 4.0, drop abi3.
I don't see how you can run interpreters in parallel if static types are still in use. In my experience, it immediately crashes in various ways: https://bugs.python.org/issue40512#msg383830
PEP 620 has many things, some of which conflict with the accepted PEP 387. I don't think all of it is a good plan (and I guess that's why it wasn't accepted).
PEP 387 is a generic guideline on how to introduce incompatible changes. It remains possible to introduce incompatible changes differently (e.g. without a deprecation warning or with no deprecation period) with a dedicated PEP. That's why I wrote PEP 620.
Victor
On Mon, Feb 1, 2021 at 3:41 PM Victor Stinner <vstinner@python.org> wrote:
4. Create abi4, a new, incompatible version of the stable ABI, which doesn't include deprecated stuff. Create a flag to mark extension modules that use it.
5. Make it possible to create subinterpreters with a separate GIL. These can only load abi4 extension modules.
Step 4 introduces a backward incompatible change without much benefit :-)
Step 5 will only be safe once all static types will be removed, which is an incompatible change.
(...)
?. In Python 4.0, drop abi3.
I don't see how you can run interpreters in parallel if static types are still in use. In my experience, it immediately crashes in various ways: https://bugs.python.org/issue40512#msg383830
Ah, I forgot to mention that removing static types from the C API is not enough to run multiple interpreters in parallel. Removing static types would be an incompatible change with no benefit in the short term. It's the same for PEP 620.
Other incompatible changes are required. For example, reject imports of extensions which don't implement PEP 489 multi-phase initialization API.
See https://bugs.python.org/issue40512 "Meta issue: per-interpreter GIL" for the current status and future required changes.
I also wrote an article about this project progress and what should be done next: https://vstinner.github.io/isolate-subinterpreters.html
Victor
On 01.02.2021 15:41, Victor Stinner wrote:
My new approach is to do all changes directly in Python master branch (single Python) and slowly introduce incompatible changes.
I have a feeling that this will drive away extension writers from Python.
The maintenance burden of having to constantly update the code base to accommodate for changes in the Python C API will be too much for them to handle (I know it's too much for me, which is why I stopped doing extension development).
On top of that, those projects will have to keep compatibility with multiple Python versions, which makes things even more difficult.
When too many people complain, I revert the change and help to migrate extensions to the new API. I'm doing that for Py_SET_TYPE() and Py_SET_SIZE() using my pythoncapi_compat project for example (I'm waiting for my Mercurial change to be merged upstream ;-)).
https://github.com/pythoncapi/pythoncapi_compat makes it possible to make existing extensions compatible with the future incompatible changes, without dropping support for old Python versions. The general plan is:
(a) Introduce a new API and deprecate the old one
(b) Run pythoncapi_compat on enough extension modules (and get changes accepted)
(c) Once enough extensions are ported, drop the old API
I suggest to have at least one Python release between (a) and (c). For example, I introduced Py_SET_REFCNT() in Python 3.9 and I disallowed "Py_REFCNT(obj) = refcnt;" syntax in Python 3.10.
During (b) phase, we can communicate on the future incompatible changes in What's New in Python X.Y, communicate on capi-sig list and/or python-dev, contact extensions maintainers, etc.
Other vendors introduce shims to handle breaking changes (e.g. MS on Windows) or even provide complete emulations (e.g. Apple for macOS).
I believe that's the only reasonable way to avoid another Python 2 to 3 15+ year transition process or losing the crowd to other languages. (*)
Your tool is already going in that direction, but rather than having to run the tool over and over again for every release, which adds testing and verification overhead every single time, it would be better to have something like HPy readily working and then use a tool like pythoncapi_compat to migrate over the code once, inserting a shim between the extension using the old style Python C API and the new HPy Python C API.
That shim could also be had at a higher level by e.g. using Cython as layer between the extension and Python; and using a tool to help migrate towards this approach.
The latter is what I'm currently considering, but have not checked the performance hit such an approach would have for low level data type implementations. Could be that it's not feasible.
I tried to explain my failures and my new approach on C API incompatible changes in an article: https://vstinner.github.io/hide-implementation-details-python-c-api.html
(*) Also note that the Python GIL scalability problem is not really that urgent anymore these days. People are moving to distributed computing, splitting workloads across processes, containers, VMs, GPUs and other specialized hardware. Together with async code, the GIL no longer prevents Python applications from scaling easily.
It would still be nice to move away from a global lock, but such a change will also require changes in the extensions, since many are not written in a re-entrant way, so the Python C API modifications to remove the GIL are only part of the solution to free-threading.
What I want to say is that we can take our time to avoid disruption :-)
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Experts (#1, Feb 01 2021)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/
On 2/1/2021 10:17 AM, M.-A. Lemburg wrote:
On 01.02.2021 15:41, Victor Stinner wrote:
My new approach is to do all changes directly in Python master branch (single Python) and slowly introduce incompatible changes.
I have a feeling that this will drive away extension writers from Python.
The maintenance burden of having to constantly update the code base to accommodate for changes in the Python C API will be too much for them to handle (I know it's too much for me, which is why I stopped doing extension development).
On top of that, those projects will have to keep compatibility with multiple Python versions, which makes things even more difficult.
When too many people complain, I revert the change and help to migrate extensions to the new API. I'm doing that for Py_SET_TYPE() and Py_SET_SIZE() using my pythoncapi_compat project for example (I'm waiting for my Mercurial change to be merged upstream ;-)).
https://github.com/pythoncapi/pythoncapi_compat makes it possible to make existing extensions compatible with the future incompatible changes, without dropping support for old Python versions. The general plan is:
(a) Introduce a new API and deprecate the old one
(b) Run pythoncapi_compat on enough extension modules (and get changes accepted)
(c) Once enough extensions are ported, drop the old API
I suggest to have at least one Python release between (a) and (c). For example, I introduced Py_SET_REFCNT() in Python 3.9 and I disallowed "Py_REFCNT(obj) = refcnt;" syntax in Python 3.10.
During (b) phase, we can communicate on the future incompatible changes in What's New in Python X.Y, communicate on capi-sig list and/or python-dev, contact extensions maintainers, etc.
Other vendors introduce shims to handle breaking changes (e.g. MS on Windows) or even provide complete emulations (e.g. Apple for macOS).
I believe that's the only reasonable way to avoid another Python 2 to 3 15+ year transition process or losing the crowd to other languages. (*)
Your tool is already going in that direction, but rather than having to run the tool over and over again for every release, which adds testing and verification overhead every single time, it would be better to have something like HPy readily working and then use a tool like pythoncapi_compat to migrate over the code once, inserting a shim between the extension using the old style Python C API and the new HPy Python C API.
That shim could also be had at a higher level by e.g. using Cython as layer between the extension and Python; and using a tool to help migrate towards this approach.
The latter is what I'm currently considering, but have not checked the performance hit such an approach would have for low level data type implementations. Could be that it's not feasible.
I agree with MAL here.
I guess my question is: if we go with "slowly introduce incompatible changes", is there some point (let's call it an inflection point) at which we think we can get some performance improvements before we drop the C API entirely? If there's not, I don't think there's any benefit to slowly introducing incompatible changes. And if there is, let's identify that inflection point now, so we can sell the benefits. And also let's consider not making incompatible changes slowly, but rather just saving all of the changes up for this inflection point.
Eric
On Mon, Feb 1, 2021 at 4:17 PM M.-A. Lemburg <mal@egenix.com> wrote:
I have a feeling that this will drive away extension writers from Python.
The maintenance burden of having to constantly update the code base to accommodate for changes in the Python C API will be too much for them to handle (I know it's too much for me, which is why I stopped doing extension development).
https://docs.python.org/dev/extending/ starts with:
"This guide only covers the basic tools for creating extensions provided as part of this version of CPython. Third party tools like Cython, cffi, SWIG and Numba offer both simpler and more sophisticated approaches to creating C and C++ extensions for Python."
I suggest to stop writing new extensions directly using the C API. The problem is more about maintaining existing extensions written with the C API.
On top of that, those projects will have to keep compatibility with multiple Python versions, which makes things even more difficult.
Cython does that for you.
Other vendors introduce shims to handle breaking changes (e.g. MS on Windows) or even provide complete emulations (e.g. Apple for macOS).
Some people discussed writing a "cpyext"-like layer (similar to the PyPy module) for CPython, to provide backward compatibility. But so far, nobody came up with an implementation, or even a design.
I believe that's the only reasonable way to avoid another Python 2 to 3 15+ year transition process or losing the crowd to other languages. (*)
The Python 2 => Python 3 migration required tons of changes at once, and it took something like 5 years to get convenient solutions for having a single code base compatible with Python 2 and Python 3.
I'm proposing something different: providing, from the start, concrete solutions for keeping a single code base compatible with both the old and the new Python versions. Also, I propose to only make tiny incompatible changes. I started around Python 3.7. If you didn't notice, it means that my incompatible changes only impact a minority of users ;-) It's nothing like the Python str type becoming Unicode in Python 3.
Your tool is already going in that direction, but rather than having to run the tool over and over again for every release, which adds testing and verification overhead every single time, it would be better to have something like HPy readily working and then use a tool like pythoncapi_compat to migrate over the code once, inserting a shim between the extension using the old style Python C API and the new HPy Python C API.
I agree that HPy is a better solution for the long term. But HPy is not mature yet: https://hpy.readthedocs.io/en/latest/overview.html#current-status-and-roadma...
I don't want to have to wait years until HPy is mature. We can already make a few changes which only impact a minority of users.
That shim could also be had at a higher level by e.g. using Cython as layer between the extension and Python; and using a tool to help migrate towards this approach.
Right. Don't use the C API, but Cython ;-) Cython can be modified to emit HPy code.
What I want to say is that we can take our time to avoid disruption :-)
In short, you are against CPython evolutions and consider that CPython is good as it is (it's not worth it to optimize CPython). Well, that's one of the two choices in my previous email.
Victor
On 01.02.2021 17:40, Victor Stinner wrote:
What I want to say is that we can take our time to avoid disruption :-)
In short, you are against CPython evolutions and consider that CPython is good as it is (it's not worth it to optimize CPython). Well, that's one of the two choices in my previous email.
Sorry, but that's a gross misrepresentation of what I said.
I said that we do have the time to wait for HPy to mature and can then make the switch-over to the new API an easy one step solution, rather than introducing breaking changes with every single Python release. The latter causes churn and harms Python's adoption more than it helps.
Python's success is in large part built on the ease of using its C API and the thousands of extensions out there to integrate with other software. People use Python to make other software accessible. That's the key advantage we have over other languages.
-- Marc-Andre Lemburg eGenix.com
On Mon, Feb 1, 2021 at 6:08 PM M.-A. Lemburg <mal@egenix.com> wrote:
I said that we do have the time to wait for HPy to mature and can then make the switch-over to the new API an easy one step solution, rather than introducing breaking changes with every single Python release. The latter causes churn and harms Python's adoption more than it helps.
As I wrote in a previous email, HPy by itself doesn't solve any CPython design issue, since we want to continue supporting C extension modules written with the C API.
For example, are you ready to drop support for numpy? There is a project to start rewriting some parts of numpy with HPy, but a full conversion will take months or years.
Victor
On 01.02.2021 18:16, Victor Stinner wrote:
On Mon, Feb 1, 2021 at 6:08 PM M.-A. Lemburg <mal@egenix.com> wrote:
I said that we do have the time to wait for HPy to mature and can then make the switch-over to the new API an easy one step solution, rather than introducing breaking changes with every single Python release. The latter causes churn and harms Python's adoption more than it helps.
As I wrote in a previous email, HPy by itself doesn't solve any CPython design issue, since we want to continue supporting C extension modules written with the C API.
For example, are you ready to drop support for numpy? There is a project to start rewriting some parts of numpy with HPy, but a full conversion will take months or years.
As I said: we have those years to wait. The GIL is not causing people to run away from Python anymore, since compute has moved on and no longer relies on threads for scaling up.
I think that's a very positive development and one we should use for our benefit.
-- Marc-Andre Lemburg eGenix.com
As I said: we have those years to wait. The GIL is not causing people to run away from Python anymore, since compute has moved on and no longer relies on threads for scaling up.
As a developer who uses threads for NumPy computations in a multi-platform GUI Python application, I can say that we have definitely *not* moved on from threads. The prospect of using subinterpreters in the same process space is something we're looking forward to in future Python versions and hopefully we do not have to wait years.
On Mon, Feb 01, 2021 at 06:34:04PM +0100, M.-A. Lemburg wrote:
As I said: we have those years to wait. The GIL is not causing people to run away from Python anymore, since compute has moved on and no longer relies on threads for scaling up.
Where has it moved to? I see people choosing Go and Rust for performance reasons.
-- Senthil
On 02.02.2021 00:37, Senthil Kumaran wrote:
On Mon, Feb 01, 2021 at 06:34:04PM +0100, M.-A. Lemburg wrote:
As I said: we have those years to wait. The GIL is not causing people to run away from Python anymore, since compute has moved on and no longer relies on threads for scaling up.
Where has it moved to? I see people choosing Go and Rust for performance reasons.
Compute has moved on to distributed computing, quoting an earlier email:
""" Also note that the Python GIL scalability problem is not really that urgent anymore these days. People are moving to distributed computing, splitting workloads across processes, containers, VMs, GPUs and other specialized hardware. Together with async code, the GIL no longer prevents Python applications from scaling easily. """
Have a look at Python Dask for example. There are plenty of others as well, e.g. Apache Airflow, Dagster, Prefect, Luigi. In the web app space many people are using Celery or a WSGI server for distributing the load or using memcached for inter-process communication and caching.
Apache Arrow sets out to become the new inter-process comm standard.
For low level computing, several Python libraries are turning to GPUs for speed, e.g. RAPIDS, dask-sql or BlazingSQL.
All this is available in Python, turning to lower level languages and implementations for speed in the same way numpy started this long ago.
The reason for this is simple: threads only work locally, they don't scale beyond the number of cores you have available on your machine.
For many applications I/O has become the main bottleneck and again the only way to scale this up beyond the number of I/O channels you have in the server hardware is by using multiple machines.
Go and Rust are alternatives to get better local performance, but they have the same scalability problems.
Python is well positioned in the distributed computing space, because the C API makes it so easy to interface to other tools out there, even bridging to the Java world, which still dominates a lot of these areas.
-- Marc-Andre Lemburg eGenix.com
On 02/02/2021 at 10:14, M.-A. Lemburg wrote:
On 02.02.2021 00:37, Senthil Kumaran wrote:
On Mon, Feb 01, 2021 at 06:34:04PM +0100, M.-A. Lemburg wrote:
As I said: we have those years to wait. The GIL is not causing people to run away from Python anymore, since compute has moved on and no longer relies on threads for scaling up.
Where has it moved to? I see people choosing Go and Rust for performance reasons.
Compute has moved on to distributed computing, quoting an earlier email:
""" Also note that the Python GIL scalability problem is not really that urgent anymore these days. People are moving to distributed computing, splitting workloads across processes, containers, VMs, GPUs and other specialized hardware. Together with async code, the GIL no longer prevents Python applications from scaling easily. """
Have a look at Python Dask for example.
Speaking as one of the Dask core developers, the Python GIL is very much a problem for some parts of Dask (ironically, for the part that distributes work over multiple computers, since distributed scheduling is a difficult task that takes a lot of CPU).
See for example https://github.com/dask/distributed/issues/4443
Regards
Antoine.
On 02.02.2021 10:27, Antoine Pitrou wrote:
Speaking as one of the Dask core developers, the Python GIL is very much a problem for some parts of Dask (ironically, for the part that distributes work over multiple computers, since distributed scheduling is a difficult task that takes a lot of CPU).
See for example https://github.com/dask/distributed/issues/4443
Reading through that ticket, it looks more like a socket problem than a GIL problem:
https://github.com/dask/distributed/issues/4443#issuecomment-765241186
Note that I'm not saying that the GIL cannot be a problem for some applications. The situation is less of a mainstream issue, though, due to applications being more often written in a distributed way rather than relying on a single process.
-- Marc-Andre Lemburg eGenix.com
On 02/02/2021 at 10:50, M.-A. Lemburg wrote:
Reading through that ticket, it looks more like a socket problem than a GIL problem:
https://github.com/dask/distributed/issues/4443#issuecomment-765241186
The concerns in that issue are a bit commingled. But the underlying issue is that the main scheduler thread competes for CPU resources with other threads where other tasks are offloaded. The scheduler by design works on shared state and cannot be rewritten to use several processes.
Regards
Antoine.
On 2/1/21 3:41 PM, Victor Stinner wrote:
Hi Petr,
On Mon, Feb 1, 2021 at 11:06 AM Petr Viktorin <encukou@gmail.com> wrote:
I think "introducing incompatible changes on purpose" makes Python worse without much benefit. Have any of the recent incompatible changes actually made Python faster?
Optimizing Python is a hard problem. Look back at the 30 years of CPython optimizations. There were some nice optimizations, but the CPython design has been the same for 30 years. If it were easy, it would already be done. There are big companies that would need a faster CPython. A significant speedup (e.g. 2x faster) requires deeply changing the CPython design.
It will only become possible to *start* *experimenting* with optimizing CPython once most of PEP 620 is implemented. Some people disagree, but so far, I haven't seen any concrete working optimization proposed and implemented in CPython (without PEP 620 being implemented). So everyone is free to have their own "beliefs" in terms of optimization possibilities (with or without incompatible C API changes) ;-)
To be clear, PEP 620 brings zero optimization. Optimization should be done separately. PEP 620 only prepares CPython for future optimizations.
Besides the non-opaque PyObject*, the worst thing that prevents "evolution" now is IMO that built-in static types like PyList_Type are exposed in the API and stable ABI as global symbols. This prevents making the GIL per-interpreter. Here's how stuff like that can be changed:
- Formalize, test and document the stable ABI
- Introduce new API (e.g. PyList_GetListType())
- Deprecate the old API
- Create abi4, a new, incompatible version of the stable ABI, which doesn't include deprecated stuff. Create a flag to mark extension modules that use it.
- Make it possible to create subinterpreters with a separate GIL. These can only load abi4 extension modules.
Step 4 introduces a backward incompatible change without much benefit :-)
It does not. All existing modules would continue to work.
Step 5 will only be safe once all static types have been removed, which is an incompatible change.
Again, no – all existing code would continue to work.
I see that I didn't explain one aspect of the idea: there would be two kinds of subinterpreters – one that shares the GIL with the "main" interpreter, and a new opt-in kind that does not share the GIL but only allows loading abi4 modules. I don't think this would be hard to do in practice: GIL must already be shared across thread states belonging to a single interpreter; it doesn't seem hard to have some interpreters sharing the GIL and some not.
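As a rough illustration of the two kinds of subinterpreters described above, here is a toy Python model of the loading rule. It is only a sketch: the class names, the `abi` tag on modules and the `load()` method are all hypothetical and do not correspond to any existing CPython API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExtensionModule:
    """Hypothetical stand-in for a compiled extension module."""
    name: str
    abi: str  # e.g. "abi3" or "abi4" (labels assumed for this sketch)


@dataclass
class Interpreter:
    """Hypothetical model of a (sub)interpreter and its GIL policy."""
    name: str
    own_gil: bool = False  # False: shares the GIL with the main interpreter
    loaded: list = field(default_factory=list)

    def load(self, module: ExtensionModule) -> None:
        # Only the opt-in kind (own GIL) restricts what can be loaded.
        if self.own_gil and module.abi != "abi4":
            raise ImportError(
                f"{module.name} ({module.abi}) cannot be loaded in {self.name}: "
                "interpreters with their own GIL accept only abi4 modules"
            )
        self.loaded.append(module.name)


legacy = ExtensionModule("legacy_ext", abi="abi3")
modern = ExtensionModule("modern_ext", abi="abi4")

shared = Interpreter("shared-GIL subinterpreter")
isolated = Interpreter("own-GIL subinterpreter", own_gil=True)

shared.load(legacy)    # existing modules keep working here
shared.load(modern)    # abi4 modules are accepted as well
isolated.load(modern)  # the opt-in kind accepts abi4 modules only
```

In this model nothing changes for existing code: only an interpreter explicitly created with its own GIL refuses non-abi4 modules.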
For major CPython design changes (like running multiple interpreters in parallel), we *need* to introduce incompatible changes.
Sure, and we have PEP 387 for that. A backwards-incompatible change can be done after at least two years of warnings (and if warnings are not possible, then with an accepted PEP). Incompatible changes should be the last resort, and there needs to be enough time to think about approaches that don't break backwards compatibility.
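For readers less familiar with what "warnings" means in practice, a deprecation period at the Python level usually looks roughly like the sketch below. The names `old_api` and `new_api` are made up for illustration; on the C side the analogous tools would be things like the `Py_DEPRECATED` macro or `PyErr_WarnEx()`.

```python
import warnings


def new_api(x):
    """Hypothetical replacement API."""
    return x * 2


def old_api(x):
    """Hypothetical deprecated API, kept working during the warning period."""
    warnings.warn(
        "old_api() is deprecated; use new_api() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to old_api
    )
    return new_api(x)


# Record warnings explicitly so the deprecation is observable in this sketch.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(21)
```

The old entry point keeps returning the same result; callers only see a `DeprecationWarning` until the removal actually happens.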
There are two options: consider that CPython must no longer evolve (and so die slowly as a dead language ;-)), or we should find a way to minimize the number of unhappy people when we introduce incompatible changes :-)
This is a false dichotomy. It is also possible to make progress without breaking so much existing code. It's harder and takes more time, but it's much better for existing software.
- Work with various third-party projects to switch to abi4 so we can start getting its benefits, and improve abi4 along the way.
Over the last few years, I proposed different PEP drafts with an opt-in choice for a faster Python which is backward incompatible. Nobody liked this option (having two Pythons). Look at the Python 2 => Python 3 migration and "CPython vs PyPy". "Faster Python" is not enough to migrate the critical mass of users to the "new" Python, the old Python never goes away, and then you have to maintain two Python runtimes instead of one.
My new approach is to do all changes directly in Python master branch (single Python) and slowly introduce incompatible changes. When too many people complain, I revert the change and help to migrate extensions to the new API. I'm doing that for Py_SET_TYPE() and Py_SET_SIZE() using my pythoncapi_compat project for example (I'm waiting for my Mercurial change to be merged upstream ;-)).
https://github.com/pythoncapi/pythoncapi_compat makes it possible to make existing extensions compatible with future incompatible changes, without dropping support for old Python versions. The general plan is:
(a) Introduce a new API and deprecate the old one
(b) Run pythoncapi_compat on enough extension modules (and get changes accepted)
(c) Once enough extensions are ported, drop the old API
I suggest having at least one Python release between (a) and (c). For example, I introduced Py_SET_REFCNT() in Python 3.9 and I disallowed the "Py_REFCNT(obj) = refcnt;" syntax in Python 3.10.
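To make the kind of mechanical porting mentioned above more concrete, here is a minimal Python sketch that rewrites `Py_REFCNT(obj) = refcnt;` into `Py_SET_REFCNT(obj, refcnt);` in a string of C source. This is not the code from pythoncapi_compat; the helper name `port_refcnt_assignments` and the regular expression are assumptions made for illustration, and the pattern deliberately handles only simple arguments without nested parentheses.

```python
import re

# Matches statements such as "Py_REFCNT(obj) = value;".
# Simplifying assumption: the macro argument contains no nested parentheses.
_REFCNT_ASSIGN = re.compile(r"\bPy_REFCNT\(([^()]+)\)\s*=\s*([^;=]+);")


def port_refcnt_assignments(c_source: str) -> str:
    """Rewrite 'Py_REFCNT(x) = y;' as 'Py_SET_REFCNT(x, y);' (hypothetical helper)."""
    return _REFCNT_ASSIGN.sub(
        lambda m: f"Py_SET_REFCNT({m.group(1).strip()}, {m.group(2).strip()});",
        c_source,
    )


before = "Py_REFCNT(obj) = refcnt;"
after = port_refcnt_assignments(before)

# Reads and comparisons are left alone; only assignments are rewritten.
comparison = "if (Py_REFCNT(o) == 1) { reuse(o); }"
unchanged = port_refcnt_assignments(comparison)
```

A rewrite like this only helps on Python versions that provide `Py_SET_REFCNT()`; supporting older versions additionally needs a fallback definition, which is the role of a compatibility header.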
PEP 387 is very clear that this should be at least *two* releases. And only if the old behavior raises warnings for those two releases.
On top of that, I don't think plans to drop some API should be made until the old API is *actually* incompatible with a real improvement (e.g. speedup). Until then, any experiments can be done on a subset of extension modules, without requiring everyone in the ecosystem to make changes. If people end up needing to run pythoncapi_compat, it would be much nicer to let them run
During phase (b), we can communicate about the future incompatible changes in What's New in Python X.Y, communicate on the capi-sig list and/or python-dev, contact extension maintainers, etc.
I tried to explain my failures and my new approach on C API incompatible changes in an article: https://vstinner.github.io/hide-implementation-details-python-c-api.html
?. In Python 4.0, drop abi3.
I don't see how you can run interpreters in parallel if static types are still in use. In my experience, it immediately crashes in various ways: https://bugs.python.org/issue40512#msg383830
PEP 620 has many things, some of which conflict with the accepted PEP 387. I don't think all of it is a good plan (and I guess that's why it wasn't accepted).
PEP 387 is a set of generic guidelines on how to introduce incompatible changes. It remains possible to introduce incompatible changes differently (e.g. without a deprecation warning or with no deprecation period) with a dedicated PEP. That's why I wrote PEP 620.
The worrying thing is that PEP 620 is still a Draft, even though most action points from it are "Completed". Why do we even have a PEP approval process if it's not being used before changes are done?
On the items from PEP 620 that are not completed yet:
Make structures opaque
Avoid functions returning PyObject**
I support these ideas, but again, the old API should stay available until the underlying implementation actually changes to make them impossible. Improvements that these functions are preventing can still be tested – they just need to be tested with modules that avoid the old functions.
pythoncapi_compat.h header file
I don't think this is necessary if we're more careful about backwards compatibility.
I've put an updated version on Discourse: https://discuss.python.org/t/pre-pep-maintaining-the-stable-abi/6986 (and https://github.com/encukou/abi3/blob/main/PEP.rst is also up to date.)
I plan to turn it into a PEP next week.