[capi-sig] capi roadmap and refactoring goals
There is a roadmap for improving the C-API at https://pythoncapi.readthedocs.io/roadmap.html
From my reading of the information there, it seems there are a number of goals to the refactoring:
- allow compile-once C extensions that are binary compatible with more than one version or flavor of Python
- hide Python internals
  - which would make writing C extensions less bug-prone (refcounting semantics, breaking reference cycles)
  - which could unlock performance gains
Are these also goals?
- allow introspection for JIT or AOT compilation of extension code as part of a larger unit, like Numba, Dask, or PyPy do
- enhance tooling like SIP, Cython, pybind11
What do you think? Is this an accurate summary of the goals?
Matti
Hi Matti,
From the C API point of view, I see two main use cases:
- Extend Python: performance doesn't really matter, because the hot code is not in the glue code but in the external code written in your favorite language (usually C). My plan is to use a portable ABI here, for various reasons.
- Performance, "C accelerator": abuse "implementation details", call private functions, access structure fields directly, etc. This code can be written by hand or generated by a tool like Cython.
These two use cases are not mutually exclusive. You can easily imagine producing two binaries (dynamic libraries) from the same source code, depending on a compiler option for example. As I understand it, Cython is already able to produce different code depending on options.
My goal is to make the "portable ABI" usable by more people. Currently, too few developers use the Py_LIMITED_API define (PEP 384, stable ABI).
On Tue, 18 Dec 2018 at 11:48, <matti.picus@gmail.com> wrote:
> Are these also goals?
> - allow introspection for JIT or AOT compilation of extension code as part of a larger unit, like Numba, Dask, or PyPy do
I only plan to add a *new* C API; the old one remains available for people who need the best performance.
Cython will still be able to access CPython internals, make assumptions about the implementation and use private functions.
You can already rely on PyPy internals to emit more efficient code. A new C API is not incompatible, it's just a different use case.
> - enhance tooling like SIP, Cython, pybind11
Would you mind elaborating? What kind of problem are you trying to solve here?
Victor
Hi Victor,
please do consider the fact that the "Limited ABI" has essentially failed to meet expectations. It's hardly used and not really needed anymore.
The main reason for the limited ABI was the desire to have extensions work without recompilation on systems where access to compilers was difficult, mainly Windows at the time.
This reason has dissolved in recent years, with MS making it possible to get free compilers as well. So instead of adding yet more layers of protection for something which no one uses in real life, I think it's better to focus on ways to make the C API more consistent and attractive for gaining performance wins - both for the CPython interpreter and the extensions.
In all this, please remember that C extensions play a major role in why Python has become so popular. They need to be able to tap directly into how the interpreter works without being required to go through layers of copying data over and over again, to sustain this popularity.
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Experts (#1, Dec 18 2018)
Python Projects, Coaching and Consulting ... http://www.egenix.com/ Python Database Interfaces ... http://products.egenix.com/ Plone/Zope Database Interfaces ... http://zope.egenix.com/
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
On Tue, 18 Dec 2018 at 22:34, M.-A. Lemburg <mal@egenix.com> wrote:
> Hi Victor,
> please do consider the fact that the "Limited ABI" has essentially failed to meet expectations. It's hardly used and not really needed anymore.
This simply isn't true - the last time this was asserted, the SIP developers pointed out that PyQt binary extensions rely on the stable ABI to maintain compatibility across Python 3 versions, with only SIP itself needing to be recompiled each release.
Remember that we don't hear from happy users of a feature - we only hear from users for whom the feature is incomplete.
> The main reason for the limited ABI was the desire to have extensions work without recompilation on systems where access to compilers was difficult, mainly Windows at the time.
Err, no - it was so that folks didn't have to ship umpteen quadrillion versions of their binary extensions to support users on multiple different versions of Python.
It turns out that CPython's habit of embedding version numbers in filesystem paths causes additional problems in that regard, and the long life of Python 2.7 meant that folks were needing to build for at least Py2 and Py3 anyway, and hence the opportunity for simplification offered by the stable ABI was limited in practice.
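One place where such version tags show up is in the filenames an interpreter accepts for extension modules. The following is only a small sketch using the standard library to inspect the running interpreter; the assumption that stable-ABI builds are recognisable by an "abi3" tag holds for CPython on Unix-like systems, and the list may well be empty elsewhere.

```python
import importlib.machinery
import sysconfig

# Filename suffixes this interpreter accepts for extension modules.
suffixes = list(importlib.machinery.EXTENSION_SUFFIXES)

# Suffixes carrying the "abi3" tag mark stable-ABI builds; this may be
# empty (for example on Windows or on non-CPython implementations).
abi3_suffixes = [s for s in suffixes if "abi3" in s]

# Directory for platform-specific packages; on many installs its path
# also contains the major.minor version number.
platlib = sysconfig.get_path("platlib")

print("accepted suffixes:", suffixes)
print("stable-ABI suffixes:", abi3_suffixes)
print("platlib:", platlib)
```

Comparing the output across two installed Python versions shows which parts of the names change from release to release.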
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 19.12.2018 12:18, Nick Coghlan wrote:
> On Tue, 18 Dec 2018 at 22:34, M.-A. Lemburg <mal@egenix.com> wrote:
>> Hi Victor,
>> please do consider the fact that the "Limited ABI" has essentially failed to meet expectations. It's hardly used and not really needed anymore.
> This simply isn't true - the last time this was asserted, the SIP developers pointed out that PyQt binary extensions rely on the stable ABI to maintain compatibility across Python 3 versions, with only SIP itself needing to be recompiled each release.
> Remember that we don't hear from happy users of a feature - we only hear from users for whom the feature is incomplete.
I did not say that it isn't used, but the effort it takes to maintain it and keep it working does not match up to the few gains it has for a few extension writers.
If you search for "Py_LIMITED_API" on Google, you get fewer than 4000 hits. Many of those are related to issues with the limited API, e.g.
https://bitbucket.org/cffi/cffi/issues/350/issue-with-py_limited_api-on-wind...
If you search for '"#define Py_LIMITED_API"' you get around 100 hits.
It is not a widely used feature of the Python C API.
>> The main reason for the limited ABI was the desire to have extensions work without recompilation on systems where access to compilers was difficult, mainly Windows at the time.
> Err, no - it was so that folks didn't have to ship umpteen quadrillion versions of their binary extensions to support users on multiple different versions of Python.
Nope. Please see PEP 384 regarding the original motivation back in 2009:
https://www.python.org/dev/peps/pep-0384/
""" On Linux, changes to the ABI are often not much of a problem: the system will provide a default Python installation, and many extension modules are already provided pre-compiled for that version. If additional modules are needed, or additional Python versions, users can typically compile them themselves on the system, resulting in modules that use the right ABI.
On Windows, multiple simultaneous installations of different Python versions are common, and extension modules are compiled by their authors, not by end users. To reduce the risk of ABI incompatibilities, Python currently introduces a new DLL name pythonXY.dll for each feature release, whether or not ABI incompatibilities actually exist. """
As mentioned, the situation has changed since then, so it is possible for users to compile extensions on Windows as well. Additionally, the number of needed binaries has gone down a lot, since we no longer have UCS2/UCS4 builds and most Windows systems run 64-bit now. So things have gotten easier for extension writers as well.
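As a rough illustration of the arithmetic behind this point, the number of binaries to ship is the product of the options along each axis. The version lists and axis labels below are hypothetical and chosen only to show how dropping two axes shrinks the matrix; they are not taken from any real release process.

```python
from itertools import product


def build_matrix(versions, string_builds, architectures):
    """Return every (version, string build, architecture) combination."""
    return list(product(versions, string_builds, architectures))


# Hypothetical "before": two Python versions, narrow and wide Unicode
# builds, and both 32-bit and 64-bit targets.
before = build_matrix(["2.6", "2.7"], ["ucs2", "ucs4"], ["32-bit", "64-bit"])

# Hypothetical "after": two Python versions, a single string
# representation, and 64-bit only.
after = build_matrix(["3.6", "3.7"], ["single"], ["64-bit"])

print(len(before), "binaries before,", len(after), "after")
```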
You are also forgetting that using the limited ABI significantly restricts which parts of the CPython API you can use as an extension writer. It does work for simple extensions or ones which only do basic operations such as providing call level APIs (sip, cffi, go bridges).
Even SIP ships with pre-compiled versions for each Python release:
https://pypi.org/project/SIP/#files
and so does cffi:
https://pypi.org/project/cffi/#files
SIP also doesn't use the limited ABI by default; it only supports using it. cffi at least tries to use it where possible.
> It turns out that CPython's habit of embedding version numbers in filesystem paths causes additional problems in that regard, and the long life of Python 2.7 meant that folks were needing to build for at least Py2 and Py3 anyway, and hence the opportunity for simplification offered by the stable ABI was limited in practice.
I've been building binaries for several systems and several of my extensions for years, but never found it attractive to have limitations in APIs in return for perhaps being able to leave out a few compiler runs in the release process (most of those ran in parallel anyway, so were not causing major issues).
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Experts (#1, Dec 19 2018)
M.-A. Lemburg wrote on 19.12.18 at 23:17:
> On 19.12.2018 12:18, Nick Coghlan wrote:
>> On Tue, 18 Dec 2018 at 22:34, M.-A. Lemburg wrote:
>>> please do consider the fact that the "Limited ABI" has essentially failed to meet expectations. It's hardly used and not really needed anymore.
>> This simply isn't true - the last time this was asserted, the SIP developers pointed out that PyQt binary extensions rely on the stable ABI to maintain compatibility across Python 3 versions, with only SIP itself needing to be recompiled each release.
>> Remember that we don't hear from happy users of a feature - we only hear from users for whom the feature is incomplete.
> I did not say that it isn't used, but the effort it takes to maintain it and keep it working does not match up to the few gains it has for a few extension writers.
That makes me wonder if it could be maintained as a separate shared library on top of CPython's own API. Not sure if that's beneficial in terms of maintenance work-load, but it could at least make it easier to detect accidental breakages. And it would introduce a well defined API layer between both implementations that is less likely to break for either side.
>> It turns out that CPython's habit of embedding version numbers in filesystem paths causes additional problems in that regard, and the long life of Python 2.7 meant that folks were needing to build for at least Py2 and Py3 anyway, and hence the opportunity for simplification offered by the stable ABI was limited in practice.
> I've been building binaries for several systems and several of my extensions for years, but never found it attractive to have limitations in APIs in return for perhaps being able to leave out a few compiler runs in the release process (most of those ran in parallel anyway, so were not causing major issues).
Ditto. Nick, I owe you (and others) a lot for your efforts to invent wheels, and then bring them to Linux. Since then, compiling and distributing binaries is really a solved problem. As CI testing with all supported CPython versions is a development time requirement anyway, building separate release wheels for them comes almost for free.
Stefan
On Wed, Dec 19, 2018 at 2:18 PM M.-A. Lemburg <mal@egenix.com> wrote:
> I did not say that it isn't used, but the effort it takes to maintain it and keep it working does not match up to the few gains it has for a few extension writers.
> If you search for "Py_LIMITED_API" on Google, you get fewer than 4000 hits. Many of those are related to issues with the limited API, e.g.
> https://bitbucket.org/cffi/cffi/issues/350/issue-with-py_limited_api-on-wind...
> If you search for '"#define Py_LIMITED_API"' you get around 100 hits.
> It is not a widely used feature of the Python C API.
From one perspective, the limited API is just a subset of the existing API. It includes most of the symbols in the API and excludes a few like PyType_Ready. When there is a replacement, like PyType_FromSpec is for PyType_Ready, the user does not have to define Py_LIMITED_API in order to use it.
Many extensions are effectively clients of the limited API by using the new APIs it provides or by not using symbols that are not part of the limited API. As a consequence, measuring the usage of the API by the definition of Py_LIMITED_API seems to exclude such clients of the limited API.
I am curious if you could elaborate on what aspect of the limited API you are referring to in these statements, especially regarding maintenance. Are you concerned about the effort to keep that macro in the API headers, or some other maintenance cost beyond maintaining the macro?
On Fri, 21 Dec 2018 at 17:25, Carl Shapiro <carl.shapiro@gmail.com> wrote:
> I am curious if you could elaborate on what aspect of the limited API you are referring to in these statements, especially regarding maintenance. Are you concerned about the effort to keep that macro in the API headers, or some other maintenance cost beyond maintaining the macro?
As originally implemented, the limited API was defined as a collection of "#ifndef ..." macros in the existing rich API header files, so by default, any new API declarations would end up in the limited API, even if they didn't belong there. Since our CI suite wasn't set up to detect and prevent stable ABI violations (and still isn't), we unfortunately haven't been maintaining the stable ABI properly.
However, Victor recently went through and rearranged the C header files such that there are 3 different include directories:
- https://github.com/python/cpython/tree/master/Include -> the limited API/stable ABI
- https://github.com/python/cpython/tree/master/Include/cpython -> the full traditional version-specific CPython API/ABI (implicitly included when Py_LIMITED_API is not set)
- https://github.com/python/cpython/tree/master/Include/internal -> CPython runtime internal APIs that even the standard library's extension modules shouldn't be using
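To see whether a given installation exposes this three-way split, something like the following sketch can be used. It assumes the installed header directory mirrors the source tree layout described above (top-level headers plus "cpython" and "internal" subdirectories); older versions, or systems without the development headers installed, will simply report "not found". The helper name header_layers is made up for this example.

```python
import os
import sysconfig


def header_layers(include_dir):
    """Report which of the three header layers exist under include_dir."""
    return {
        "top level (Python.h)": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "CPython-specific (cpython/)": os.path.isdir(os.path.join(include_dir, "cpython")),
        "internal (internal/)": os.path.isdir(os.path.join(include_dir, "internal")),
    }


# Header directory of the running interpreter, as reported by sysconfig.
include_dir = sysconfig.get_path("include")
for layer, present in header_layers(include_dir).items():
    print(f"{layer}: {'found' if present else 'not found'}")
```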
This makes the stable ABI a lot easier to maintain, since we only need to remember two guidelines:
- unless we're explicitly intending to expand the stable ABI, new C function declarations go into "Include/cpython"
- if a change does get made directly in an "Include/*.h" file, then it needs a #ifdef guard specifying the version of the stable ABI where it first appeared
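The version guard in the second guideline usually takes a form such as "#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x03030000" in the headers. Below is a small Python model of that preprocessor logic, meant only to show how the version number gates visibility; the function names and the ADDED_IN_3_3 declaration are hypothetical.

```python
def version_hex(major, minor):
    """Encode a feature release as 0xMMmm0000, the form Py_LIMITED_API takes."""
    return (major << 24) | (minor << 16)


def is_declared(added_in, limited_api=None):
    """Model of the guard
        #if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= added_in
    limited_api=None stands for "Py_LIMITED_API is not defined".
    """
    return limited_api is None or limited_api >= added_in


# Hypothetical declaration that first appeared in the 3.3 stable ABI.
ADDED_IN_3_3 = version_hex(3, 3)

print(hex(ADDED_IN_3_3))                             # 0x3030000
print(is_declared(ADDED_IN_3_3))                     # True: full API build
print(is_declared(ADDED_IN_3_3, version_hex(3, 2)))  # False: targets 3.2
print(is_declared(ADDED_IN_3_3, version_hex(3, 4)))  # True: targets 3.4
```

An extension that requests the 3.2 stable ABI therefore never sees declarations added later, which is what keeps its binary loadable on every release from 3.2 onwards.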
There are probably some updates that should be made to PEP 7 and/or the developer's guide around this...
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (6)
- Carl Shapiro
- M.-A. Lemburg
- matti.picus@gmail.com
- Nick Coghlan
- Stefan Behnel
- Victor Stinner