On 31. 01. 22 16:14, Victor Stinner wrote:
On Mon, Jan 31, 2022 at 4:03 PM Petr Viktorin firstname.lastname@example.org wrote:
If we change the stable ABI, I would prefer to fix multiple issues at once. Examples:
- No longer return borrowed references (ex: PyDict_GetItem is part of the stable ABI) and no longer steal references (ex: PyModule_AddObject)
- Disallow getting direct access to an object's data without a function to "release" the data. For example, PyBytes_AsString() gives direct access to the string's contents, but Python doesn't know when the C extension is done with it, and when it's safe to delete the object. Such an API prevents moving Python objects in memory (that is, implementing a moving garbage collector in Python).
- Disallow dereferencing a PyObject* pointer: most structures must be opaque. This indirectly means that directly accessing structure members must also be disallowed. PEP 670 and PEP 674 partially fix these issues.
(...) fixing these in the API first is probably the way to go.
That's what I already did in the past and what I plan to do in the future.
I see a problem in the subject of this thread: "Slowly bend the C API towards the limited API to get a stable ABI for everyone" is not a good summary of the proposed changes -- that is, bend *all* API (both the general public API and the limited API) to make most structs opaque, etc.
If the summary doesn't match what's actually proposed, it's hard to discuss the proposal. Especially if the concrete plan changes often.
Anyway, I propose a different plan than what I think you are proposing:
- Add new "good" API where the current API is lacking. For example, PyModule_AddObjectRef is the "good" alternative to PyModule_AddObject -- it does not steal a reference to the value. You've done a lot of great work here, and the API is much better for it.
- "Soft"-deprecate the "bad" API: document that it's only there for existing working code. Why not remove it? The issue is that it *is* possible to use the existing API correctly, and many extension authors have spent a lot of time and effort to do just that. If we force them to use a new API that makes writing correct code easier, it won't actually make their job easier if they've already found the caveats and fixed them.
- Remove the "bad" API from newer versions of the limited API. Extension authors can "opt in" to this new version, gaining new features of the limited API but losing the deprecated parts. (Here is the part where we should make sure the removals are well-documented, provide tools to "modernize" code, etc.)
- Proactively work with popular/important projects (top PyPI packages, distro packages) to move to the latest API. The benefit for CPython devs here is that we can see the pain points and unforeseen use cases, test any modernization tools, get a "reality check" on how invasive the changes actually are, and help HPy & other implementations succeed even if they don't implement deprecated API.
- Agree with HPy and other implementations of the limited API that it's not necessary for them to support the deprecated parts.
- When (and only when) a deprecated API is actually harmful -- i.e. it blocks new improvements that benefit actual users in the short term -- it should be deprecated and removed. (Even better: if, instead of being removed, it could be replaced by e.g. a function that's 3x slower or leaks memory on exit, then it should be.)
Basically, instead of "We'll remove this API now because it prevents moving to a hypothetical moving garbage collector", it should be "Here is a moving garbage collector that speeds Python up by 30%, but to add it we need to remove these 30 deprecated APIs". The deprecation can be proactive, but not the removal.