
On 2/1/2021 10:17 AM, M.-A. Lemburg wrote:
On 01.02.2021 15:41, Victor Stinner wrote:
My new approach is to do all changes directly in Python master branch (single Python) and slowly introduce incompatible changes.

I have a feeling that this will drive away extension writers from Python.
The maintenance burden of having to constantly update the code base to accommodate changes in the Python C API will be too much for them to handle (I know it's too much for me, which is why I stopped doing extension development).
On top of that, those projects will have to keep compatibility with multiple Python versions, which makes things even more difficult.
When too many people complain, I revert the change and help to migrate extensions to the new API. I'm doing that for Py_SET_TYPE() and Py_SET_SIZE() using my pythoncapi_compat project for example (I'm waiting for my Mercurial change to be merged upstream ;-)).
https://github.com/pythoncapi/pythoncapi_compat makes it possible to make existing extensions compatible with future incompatible changes, without dropping support for old Python versions. The general plan is:
(a) Introduce a new API and deprecate the old one
(b) Run pythoncapi_compat on enough extension modules (and get changes accepted)
(c) Once enough extensions are ported, drop the old API
I suggest having at least one Python release between (a) and (c). For example, I introduced Py_SET_REFCNT() in Python 3.9 and I disallowed the "Py_REFCNT(obj) = refcnt;" syntax in Python 3.10.
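To illustrate the kind of change discussed here, the following is a minimal sketch in plain C that does not include Python.h. The names mock_object, MOCK_REFCNT_LVALUE, mock_refcnt, mock_set_refcnt and demo_port are all hypothetical stand-ins, not part of CPython or pythoncapi_compat. It only shows why code that assigns through an lvalue macro stops compiling once the accessor no longer expands to an lvalue, while code using an explicit setter keeps working in both cases.

```c
#include <stddef.h>

/* Hypothetical stand-in for an object header; the real PyObject
 * is defined by Python.h and is not used here. */
typedef struct {
    ptrdiff_t refcnt;
} mock_object;

/* Old style (assumed for illustration): the accessor macro expands to an
 * lvalue, so callers can write "MOCK_REFCNT_LVALUE(ob) = n;". */
#define MOCK_REFCNT_LVALUE(ob) ((ob)->refcnt)

/* New style: the getter is a function, so its result cannot be assigned to;
 * writes must go through the explicit setter. */
static inline ptrdiff_t mock_refcnt(const mock_object *ob)
{
    return ob->refcnt;
}

static inline void mock_set_refcnt(mock_object *ob, ptrdiff_t refcnt)
{
    ob->refcnt = refcnt;
}

/* Exercises both styles and returns the final count. */
static ptrdiff_t demo_port(void)
{
    mock_object ob = { 1 };

    /* Old style: compiles only while the accessor is an lvalue macro. */
    MOCK_REFCNT_LVALUE(&ob) = 2;

    /* Ported style: does not depend on how the field is exposed. */
    mock_set_refcnt(&ob, mock_refcnt(&ob) + 1);

    return mock_refcnt(&ob);
}
```

An extension ported to the setter form only needs a small fallback definition of the setter on older versions, which is roughly the role a compatibility header plays.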
During phase (b), we can communicate about the future incompatible changes in What's New in Python X.Y, communicate on the capi-sig list and/or python-dev, contact extension maintainers, etc.

Other vendors introduce shims to handle breaking changes (e.g. MS on Windows) or even provide complete emulations (e.g. Apple for macOS).
I believe that's the only reasonable way to avoid another Python 2 to 3 15+ year transition process or losing the crowd to other languages. (*)
Your tool is already going in that direction, but rather than having to run the tool over and over again for every release, which adds testing and verification overhead every single time, it would be better to have something like HPy readily working and then use a tool like pythoncapi_compat to migrate over the code once, inserting a shim between the extension using the old style Python C API and the new HPy Python C API.
That shim could also be had at a higher level by e.g. using Cython as layer between the extension and Python; and using a tool to help migrate towards this approach.
The latter is what I'm currently considering, but have not checked the performance hit such an approach would have for low level data type implementations. Could be that it's not feasible.
I agree with MAL here.
I guess my question is: if we go with "slowly introduce incompatible changes", is there some point (let's call it an inflection point) at which we think we can get some performance improvements before we drop the C-API entirely? If there's not, I don't think there's any benefit to slowly introducing incompatible changes. And if there is, let's identify that inflection point now, so we can sell the benefits. And let's also consider not introducing incompatible changes slowly, but rather saving all of the changes up for this inflection point.
Eric
I tried to explain my failures and my new approach on C API incompatible changes in an article: https://vstinner.github.io/hide-implementation-details-python-c-api.html

(*) Also note that the Python GIL scalability problem is not really that urgent anymore these days. People are moving to distributed computing, splitting workloads across processes, containers, VMs, GPUs and other specialized hardware. Together with async code, the GIL no longer prevents Python applications from scaling easily.
It would still be nice to move away from a global lock, but such a change will also require changes in the extensions, since many are not written in a re-entrant way, so the Python C API modifications to remove the GIL are only part of the solution to free-threading.
What I want to say is that we can take our time to avoid disruption :-)