I'm sorry, I was overwhelmed and didn't find the time until now to answer this. A lot has already been said about this, so I'll just briefly reply below (inline).
On Sat, Jan 29, 2022 at 2:38 AM Victor Stinner email@example.com wrote:
On Fri, Jan 28, 2022 at 6:28 PM Guido van Rossum firstname.lastname@example.org wrote:
I think we will get *one* chance in the next decade to get it right.
Whether that's HPy or evolution of the C API I'm not sure.
Would you mind elaborating? What risks do you expect from switching to HPy and from fixing the C API (introducing incompatible C API changes)?
IMO users would benefit if we recommended one solution and started deprecating the rest. We currently have too many choices: Stable ABI, limited API (not everybody sees those two as the same thing), CPython API (C API), Cython (for many this is how they interact with the interpreter), HPy... And I think you have another thing in the works, a library that "backfills" (I think that's the word) APIs for older CPython versions so that users can pretend to use the latest C API but are able to compile/link for older versions.
To me, that's too many choices -- at the very least it should be clearer how these relate to each other (e.g. the C API is a superset of the Limited API, the Stable ABI is based on the Limited API (explain how), and HPy is a wrapper around the C API -- or is it?).
Such an explanation (of the relationships) would help users understand the consequences of choosing one or the other for their code -- how will future CPython versions affect them, how portable is their code to other Python implementations (PyPy, GraalPython, Jython). Users can't be expected to understand these consequences without a lot of help (honestly, many of these I couldn't explain myself :-( ).
For me, promoting HPy and evolving the C API are complementary: they can and must be done in parallel. As I explained in PEP 674, while HPy does help C extension writers, it doesn't solve any problem for CPython right now. CPython is still blocked by implementation details leaked through the C API that we must still maintain for a few more years.
I understand that CPython is stuck supporting the de-facto standard C API for a long time. But unless we pick a "north star" (as people call it nowadays) of what we want to support in say 5-10 years, the situation will never improve.
My point about "getting one chance to get it right in the next decade" is that we have to pick that north star, so we can tell users which horse to bet on. If the north star we pick is HPy, things will be clear. If it is evolving the C API things will also be clear. But I think we have to pick one, and stick to it so users (i.e., package maintainers/developers) have clarity.
I understand that HPy is currently implemented on top of the C API, but hopefully it's not stuck on that. And it only helps a small group of extension writers -- those who don't need the functionality that HPy is still missing (they keep saying they're not ready for prime time), who value portability to other Python implementations, and for whom the existing C API hacks in PyPy aren't sufficient. So it's mostly aspirational. But if it stays that way for too long, it will just die for lack of motivation.
Victor, am I right that (some form of) the stable ABI will remain important because projects don't have the resources to build wheels for every Python release? If a project does R releases per year for P platforms and needs to support V versions of Python, it would normally have to build R * P * V wheels. With a stable ABI, it could reduce that to R * P. That's the key point, right?
There are different use cases.
- First, my main worry is that we put high pressure on the maintainers of the most important Python dependencies before the release of a new Python version, because we want them to handle the flow of incompatible C API changes before the final Python 3.x version is released, so that updated packages are available when Python 3.x final ships.
Hm, maybe we should reduce the flow. And e.g. reject PEP 674...
It annoys core developers who cannot change things in Python without getting an increasing number of complaints about a large number of broken packages, sometimes with a request to revert.
You are mostly talking about yourself here, right? Since the revert requests were mostly aimed at you. :-)
It annoys C extensions maintainers who have to care about Python alpha and beta releases which are not convenient to use (ex: not available in Linux distributions).
I don't use Linux much, so I am not familiar with the inconvenience of Python alpha/beta releases being unavailable. I thought that the Linux philosophy was that you could always just build from source?
Moreover, it has become common to ask for multiple changes and multiple releases before a Python final release, since more incompatible changes are introduced in Python (before beta1).
Sorry, your grammar confuses me. Who is asking whom to do what here?
Is the complaint just that things change between alphas? Maybe we should just give up on alphas and instead do nightlies (fully automated)?
- Second, as you said, the stable ABI reduces the number of binary packages which have to be built. Small projects with a small team (e.g. a single person) don't have the resources to set up and maintain a CI to build all these packages. It's doable, but it isn't free.
Maybe we need to help there. For example IIRC conda-forge will build conda packages -- maybe we should offer a service like that for wheels?
The irony of the situation is that we must break the C API (hiding structures is technically an incompatible change)... to make the C API stable. Breaking it now to make it stable later.
The question is whether that will ever be enough. Unless we manage to get rid of the INCREF/DECREF macros completely (from the public C API anyway) we still can't change object layout.
We already broke the C API many times in the past. The difference here is that the changes are made with the purpose of bending it towards the limited C API and the stable ABI.
My expectation is that replacing frame->f_code with PyFrame_GetCode() only has to be done exactly once: this API is not going to change. Sadly, the changes are not limited to frame->f_code; more changes are needed. For example, for PyFrameObject, access to every structure member will have to go through a function call (a getter or setter function). Hopefully, only a small number of members are used by C extensions.
Is this worth it? Maybe we should just declare those structs and APIs *unstable* and tell people who use them that they can expect to be broken by each alpha release. As you say, hopefully this doesn't affect most people. Likely it'll affect Cython dramatically but Cython is such a special case that trying to evolve the C API will never satisfy them. We'll have to deal with it separately. (Debuggers are a more serious concern. We may need to provide higher-level APIs for debuggers to do the things they need to do. Mark's PEP 669 should help here.)
The tricky part is to think about the high-level API ("use cases") rather than just adding functions doing "return struct->member" and "struct->member = new_value". The PyThreadState_EnterTracing() and PyThreadState_LeaveTracing() functions added to Python 3.11 are a good example: the API is "generic" and the implementation changes two structure members, not a single one.
In practice, what I did since Python 3.8 is to introduce a small number of C API changes per Python version. We tried the "fix all the things at once" approach (!!!) with Python 3, and it... didn't go well. All C extensions suddenly had to write their own compatibility layer for a large number of C API functions (ex: replace PyInt_xxx with PyLong_xxx, without losing Python 2 support!). The changes that I'm introducing in the C API usually impact fewer than 100 extensions in total (usually, I would say between 10 and 25 per Python version, but it's hard to measure exactly).
How *do* you count this? Try to compile the top 5000 PyPI packages? That might severely undercount a long tail of proprietary extensions.
Can HPy do that?
I wish more projects were incrementally rewritten with Cython, cffi, pybind11 and HPy, and so slowly moved away from using the C API directly.
Yes, HPy supports a "universal build" mode which allows building a C extension only once, and using it on multiple *CPython* versions *and* (that's the big news!) multiple *PyPy* versions! I even heard that it also brings GraalPython support for free ;-)
Night gathers, and now my watch begins. It shall not end until my death.