On Tue, Jun 23, 2020 at 03:47, Neil Schemenauer email@example.com wrote:
Thanks for putting work into this.
You're welcome, I took some ideas from your tagged pointer proof of concept ;-) I recall that we ran into the same C API issues in our experiments ;-)
Changes must be made for well founded reasons and not just because we think it makes a "cleaner" API. I believe you are following those principles.
I mostly used the tagged pointer as a concrete goal to decide which changes are required and which are not. PyPy and HPy developers also gave me a list of APIs that they would like to see disappear :-)
One aspect of the API that could be improved is memory management for PyObjects. The current API is quite a mess, for no good reason except legacy, IMHO. The original API design allowed extension types to use their own memory allocator: e.g. they could call their own malloc()/free() implementation and the rest of the CPython runtime would handle that. One consequence is that Py_DECREF() cannot call PyObject_Free() but instead has to call tp_dealloc(). There were supposed to be multiple layers of allocators, PyMem vs PyObject, but since the layering was not enforced, we ended up with a bunch of aliases to the same underlying function.
I vaguely recall someone explaining that the Python memory allocator created high memory fragmentation, and that using a dedicated memory allocator was way more efficient. But I concur that the majority of people never override the default tp_new and tp_free functions.
By the way, in Python 3.8, instances of heap types started to increment their type's reference count at creation, but decrementing it remains the responsibility of the tp_dealloc function, and we failed to find a way to automate it.
More info on this issue:
C extension maintainers now have to update their tp_dealloc method, or their applications will never be able to destroy their heap types.
Perhaps there are a few cases where the flexibility to use a custom object allocator is useful, but in practice it is very rare that an extension needs to manage memory itself. To achieve something similar, a PyObject could hold a reference to some externally managed resource and let the tp_del method take care of freeing it. IMHO, the Python runtime should be in charge of allocating and freeing PyObject memory.
Do you think that it should be in PEP 620 or can it be done independently? I don't know how to implement it, I have no idea how many C extensions would be broken, etc.
I don't see an obvious relationship between hiding tp_del/tp_free and interoperability with other Python implementations or the stable ABI.
While making object allocation and deallocation simpler would be nice, it doesn't seem "required" in PEP 620 for now. What do you think?
Another place for improvement is that the C API is unnecessarily large. E.g. we don't really need PyList_GetItem(), PyTuple_GetItem(), and PyObject_GetItem(). Every extra API is a potential leak of implementation details and a burden for alternative VMs. Maybe we should introduce something like WIN32_LEAN_AND_MEAN that hides all the extra stuff. The Py_LIMITED_API define doesn't really mean the same thing since it tries to give ABI compatibility. It would make sense to cooperate with the HPy project on deciding what parts are unnecessary. Things like Cython might still want to use the larger API, to extract every bit of performance. The vast majority of C extensions don't require that.
At the beginning, I had a plan to remove all functions and only keep "abstract" functions like PyObject_GetItem().
Then someone asked what the performance overhead of only using abstract functions would be. I couldn't reply. Also, I didn't see a need to use only abstract functions for now, so I abandoned this idea.
PyTuple_GetItem() returns a borrowed reference, which is bad, whereas PyObject_GetItem() returns a strong reference. Since PyPy cpyext already solved this problem, I chose to leave the borrowed references problem aside for now. Trying to fix all issues at once doesn't work :-)
One issue with calling PyTuple_GetItem() or PyDict_GetItem() is that it doesn't take into account the ability to override __getitem__() in a subclass. Few developers write the correct code:
    if (PyDict_CheckExact(ns))
        err = PyDict_SetItem(ns, name, v);
    else
        err = PyObject_SetItem(ns, name, v);
PEP 620 is already quite long and introduces many incompatible changes. I tried to make the PEP as short as possible and to minimize the number of incompatible C API changes.
Using Py_LIMITED_API provides a stable ABI, but it doesn't reduce the Python maintenance burden, and other Python implementations must continue to implement the full C API since C extensions actually use it.
Unless Py_LIMITED_API (or another new macro) actually reduces the C API size, there is no benefit for CPython nor for other Python implementations. Also, only a very few extensions use Py_LIMITED_API, even though it has existed since Python 3.2 (released in 2011).
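For reference, opting in is a one-line change: an extension defines Py_LIMITED_API before the first include of Python.h. The sketch below shows a minimal stable-ABI module skeleton (the module name "example" and the 3.8 ABI version value are hypothetical choices for illustration):

```c
/* Opt in to the limited C API / stable ABI. Must be defined before
   the first #include <Python.h>; 0x03080000 targets the 3.8 ABI. */
#define Py_LIMITED_API 0x03080000
#include <Python.h>

static struct PyModuleDef example_module = {
    PyModuleDef_HEAD_INIT,
    "example",  /* hypothetical module name */
    NULL,       /* no docstring */
    0,          /* no per-module state */
};

PyMODINIT_FUNC
PyInit_example(void)
{
    return PyModuleDef_Init(&example_module);
}
```

With this define, any use of an API outside the limited set fails at compile time, which is the "lean" behaviour discussed above, but the macro also freezes the ABI, which is the part that doesn't by itself reduce the maintenance burden.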
As I wrote in the introduction, PEP 620 is my third attempt. Previous attempts tried to keep backward compatibility and were based on an "opt-in" option (I want to use the new limited C API because the carrot looks delicious!). But IMO there is a high risk that developers don't opt in (the carrot isn't as good as I expected :-( ) if there is little benefit in the short term, and it doesn't reduce the maintenance burden. Also, having two C APIs may explode the test matrix, and some people didn't like that.