[capi-sig] Split the big C API problem into multiple PEPs
Hi,
In recent months, I have been busy filling the https://pythoncapi.readthedocs.io/ website with random notes. Many discussions occurred on this list and on python-dev, but I was only able to make the simplest and least controversial changes in Python upstream. I didn't write a PEP because CPython had a governance crisis. Now that a new Steering Committee has been elected, it's time to see how concrete PEPs can be written.
IMHO we need to split the giant "C API" problem into multiple PEPs. I propose 4 PEPs:
PEP A: Ecosystem of C extensions in 2019
PEP B: Define good and bad C APIs
PEP C: Plan to enhance the existing C API
PEP D: Completely new C API
There is also an ongoing discussion about embedded Python and the Python initialization API, but I'm scared by this topic, so I don't even propose writing a new PEP which would supersede PEP 432 :-) https://bugs.python.org/issue22213
== PEP A: Ecosystem of C extensions in 2019 ==
Discuss cffi, Cython, PyQt's usage of the stable ABI, the CPython C API, etc. The goal is not to solve any problem, but mostly to list the existing options.
It sounds like an unusual PEP, but I think that a PEP is needed since the same discussions have happened multiple times.
This PEP can describe the different kinds of "C extensions" and maybe suggest which tools are best for each kind. cffi doesn't cover all cases, the C API isn't always the right answer, etc.
== PEP B: Define good and bad C APIs ==
https://pythoncapi.readthedocs.io/bad_api.html can be used as a starting point. It should be an informational PEP which evolves over time, just as PEP 7 and PEP 8 do.
== PEP C: Plan to enhance the existing C API ==
This one sounds like the most controversial PEP :-) I see different things:
- Plan to deprecate and remove bad APIs
- Plan to help C extensions maintainers to move away from these bad APIs
- Plan to test the stability of the API
- Plan to test the stability of the ABI
The even more controversial idea: provide multiple Python runtimes for CPython, not only one: https://pythoncapi.readthedocs.io/runtimes.html
== PEP D: Completely new C API ==
Well, that's the obvious alternative to PEP C.
Armin Rigo's PyHandle idea may be a good start? https://pythoncapi.readthedocs.io/pyhandle.html
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On 22/02/2019 at 18:17, Victor Stinner wrote:
Hi,
In recent months, I have been busy filling the https://pythoncapi.readthedocs.io/ website with random notes. Many discussions occurred on this list and on python-dev, but I was only able to make the simplest and least controversial changes in Python upstream. I didn't write a PEP because CPython had a governance crisis. Now that a new Steering Committee has been elected, it's time to see how concrete PEPs can be written.
IMHO we need to split the giant "C API" problem into multiple PEPs. I propose 4 PEPs:
PEP A: Ecosystem of C extensions in 2019
+1. Sounds like a good start.
PEP B: Define good and bad C APIs
I'm not sure the "good" vs "bad" categorization will lead to very productive discussions. I think Steve's categorization proposal is more likely to bridge the various opinions on the subject.
PEP C: Plan to enhance the existing C API
Sounds ok, though of course it depends on whatever the resulting proposal looks like ;-)
PEP D: Completely new C API
Sounds interesting too, if more adventurous.
Regards
Antoine.
On 22Feb2019 1008, Antoine Pitrou wrote:
On 22/02/2019 at 18:17, Victor Stinner wrote:
IMHO we need to split the giant "C API" problem into multiple PEPs. I propose 4 PEPs:
PEP A: Ecosystem of C extensions in 2019
+1. Sounds like a good start.
PEP B: Define good and bad C APIs
I'm not sure the "good" vs "bad" categorization will lead to very productive discussions. I think Steve's categorization proposal is more likely to bridge the various opinions on the subject.
PEP C: Plan to enhance the existing C API
Sounds ok, though of course it depends on whatever the resulting proposal looks like ;-)
PEP D: Completely new C API
Sounds interesting too, if more adventurous.
I agree, these all sound like worthwhile PEPs.
Hopefully once we have some categorisations for where the current APIs are at, it will be easier to discuss the various proposals.
In particular, I think PEP D will have a lot of different ideas, and being able to compare them equivalently will be very important.
(And specifically on PEP A - I hear a lot of people say "check the top 10/20/30 C extensions", but I don't actually know what they are? Even just a list of them would be great! And I bet 10 of them are the ones included in CPython ;) )
Cheers, Steve
On Sat, 23 Feb 2019 at 10:08, Steve Dower <steve.dower@python.org> wrote:
(And specifically on PEP A - I hear a lot of people say "check the top 10/20/30 C extensions", but I don't actually know what they are? Even just a list of them would be great! And I bet 10 of them are the ones included in CPython ;) )
While PyPI download stats have their issues, they're mostly sufficient for this purpose: https://hugovk.github.io/top-pypi-packages/
A dozen or so entries from the top 100 that include binary extensions:
- simplejson
- numpy
- lxml
- pyyaml
- cryptography
- protobuf
- pandas
- pyopenssl
- scipy
- psycopg2
- grpcio
- wrapt
- matplotlib
- pynacl
And some entries worth mentioning that don't show up in frequent download lists (suggesting something about the nature of automated CI environments for rich client GUI applications):
- sip
- pyqt5
- pyside
- pygobject
- wxpython
- pyobjc-core
- pywin32
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2019-02-26 11:46, Nick Coghlan wrote:
A dozen or so entries from the top 100 that include binary extensions:
We really should distinguish between projects using the C API only indirectly through some other tool like Cython and packages using the C API directly. Although the split is not perfect, some Cython projects still use C API calls or implement some functionality in manually-written C extensions.
On Tue, 26 Feb 2019 at 21:17, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2019-02-26 11:46, Nick Coghlan wrote:
A dozen or so entries from the top 100 that include binary extensions:
We really should distinguish between projects using the C API only indirectly through some other tool like Cython and packages using the C API directly. Although the split is not perfect, some Cython projects still use C API calls or implement some functionality in manually-written C extensions.
Indeed, but even doing that level of investigation requires picking a set of popular packages to investigate :)
I'll also note that this will only pick up projects using the C API that are themselves distributed using PyPI. It won't pick up:
- Linux projects that are only shipped as distro packages (I'm squatting solv, rpm, and dnf on PyPI because they're not pip-installable, but it would be potentially disastrous if "sudo pip install solv rpm dnf" actually did anything).
- applications embedding the Python runtime (Blender, Maya, ArcGIS, etc)
- other projects targeting the Python C API via a predefined platform like https://www.vfxplatform.com/
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 26.02.2019 06:44, Nick Coghlan wrote:
On Tue, 26 Feb 2019 at 21:17, Jeroen Demeyer <J.Demeyer@ugent.be> wrote:
On 2019-02-26 11:46, Nick Coghlan wrote:
A dozen or so entries from the top 100 that include binary extensions:
We really should distinguish between projects using the C API only indirectly through some other tool like Cython and packages using the C API directly. Although the split is not perfect, some Cython projects still use C API calls or implement some functionality in manually-written C extensions.
Indeed, but even doing that level of investigation requires picking a set of popular packages to investigate :)
I'll also note that this will only pick up projects using the C API that are themselves distributed using PyPI. It won't pick up:
- Linux projects that are only shipped as distro packages (I'm squatting solv, rpm, and dnf on PyPI because they're not pip-installable, but it would be potentially disastrous if "sudo pip install solv rpm dnf" actually did anything).
- applications embedding the Python runtime (Blender, Maya, ArcGIS, etc)
- other projects targeting the Python C API via a predefined platform like https://www.vfxplatform.com/
... plus it won't spot projects that are not available to the general public.
Still, we have to start somewhere, so I think any list is good as long as we don't take it as the ultimate list :-)
-- Marc-Andre Lemburg eGenix.com
On Sat, 23 Feb 2019 at 03:19, Victor Stinner <vstinner@redhat.com> wrote:
Hi,
In recent months, I have been busy filling the https://pythoncapi.readthedocs.io/ website with random notes. Many discussions occurred on this list and on python-dev, but I was only able to make the simplest and least controversial changes in Python upstream. I didn't write a PEP because CPython had a governance crisis. Now that a new Steering Committee has been elected, it's time to see how concrete PEPs can be written.
IMHO we need to split the giant "C API" problem into multiple PEPs. I propose 4 PEPs:
PEP A: Ecosystem of C extensions in 2019
PEP B: Define good and bad C APIs
PEP C: Plan to enhance the existing C API
PEP D: Completely new C API
There is also an ongoing discussion about embedded Python and the Python initialization API, but I'm scared by this topic, so I don't even propose writing a new PEP which would supersede PEP 432 :-) https://bugs.python.org/issue22213
== PEP A: Ecosystem of C extensions in 2019 ==
Discuss cffi, Cython, PyQt's usage of the stable ABI, the CPython C API, etc. The goal is not to solve any problem, but mostly to list the existing options.
It sounds like an unusual PEP, but I think that a PEP is needed since the same discussions have happened multiple times.
This PEP can describe the different kinds of "C extensions" and maybe suggest which tools are best for each kind. cffi doesn't cover all cases, the C API isn't always the right answer, etc.
For tooling recommendations, https://packaging.python.org/guides/packaging-binary-extensions/ would be a better place to contribute updates than a snapshot-in-time PEP.
An informational PEP could potentially cover options that we've currently omitted from the packaging guide as being overly niche or experimental, though.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2019-02-22, Victor Stinner wrote:
== PEP D: Completely new C API ==
Well, that's the obvious alternative to PEP C.
Armin Rigo's PyHandle idea may be a good start? https://pythoncapi.readthedocs.io/pyhandle.html
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
The PyHandle layer can be implemented as a separate project. That gives the freedom to tinker without upsetting people. It will take some missteps and revisions until the API becomes polished. You don't want to make those mistakes inside the CPython repo.
As the PyHandle API evolves, I would imagine we would have a lot of ideas about what PEP C should entail. Ideally the APIs defined by PEP C would be the ones you need to implement PyHandle for CPython. I think trying to do PEP C before PEP D is the wrong way around.
An initial goal would be to make the PyHandle layer be a replacement for the limited API. I.e. make it so that any extension currently using the limited API could switch to it.
Regards,
Neil
On Wed, Feb 27, 2019 at 10:39 AM Neil Schemenauer <nas-python@arctrix.com> wrote:
On 2019-02-22, Victor Stinner wrote:
== PEP D: Completely new C API ==
Well, that's the obvious alternative to PEP C.
Armin Rigo's PyHandle idea may be a good start? https://pythoncapi.readthedocs.io/pyhandle.html
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
+1
The PyHandle layer can be implemented as a separate project. That gives the freedom to tinker without upsetting people. It will take some missteps and revisions until the API becomes polished. You don't want to make those mistakes inside the CPython repo.
+1
As the PyHandle API evolves, I would imagine we would have a lot of ideas about what PEP C should entail. Ideally the APIs defined by PEP C would be the ones you need to implement PyHandle for CPython. I think trying to do PEP C before PEP D is the wrong way around.
I guess it depends on what "enhancements" people are thinking about. But there definitely is a chance that PEP D can help inform PEP C.
An initial goal would be to make the PyHandle layer be a replacement for the limited API. I.e. make it so that any extension currently using the limited API could switch to it.
As long as people don't expect a 1:1 correspondence between the APIs, the idea that extensions currently using the limited API should be able to switch to the PEP D API with a rewrite does sound like a good goal to me.
Neil Schemenauer wrote on 27.02.19 at 19:38:
On 2019-02-22, Victor Stinner wrote:
== PEP D: Completely new C API ==
Well, that's the obvious alternative to PEP C.
Armin Rigo's PyHandle idea may be a good start? https://pythoncapi.readthedocs.io/pyhandle.html
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
Why do you think so? It looks like a relatively simple change to me, mostly just replacing "Py_INCREF(obj)" with "obj = Py_INCREF(obj)" in user code, and then cleaning up a couple of corner cases here and there.
It would obviously break the world, but that's the whole point, right?
Stefan
On 2019-02-28, Stefan Behnel wrote:
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
Why do you think so? It looks like a relatively simple change to me, mostly just replacing "Py_INCREF(obj)" with "obj = Py_INCREF(obj)" in user code, and then cleaning up a couple of corner cases here and there.
I don't understand why you think Py_INCREF() would have to return a new pointer. That doesn't seem necessary to me or too relevant to making PyObject and PyTypeObject opaque types. I think the challenge is for all the extension module code that looks inside those structs, e.g.
ob->ob_type
or
Py_TYPE(ob)->tp_something
You have to provide APIs that replace all those struct member accesses. When I said "nearly impossible", maybe that's overstating the effort. Making PyObject* opaque actually doesn't seem too bad. ob_refcnt is only accessed in a few places in the CPython source code. ob_type is accessed in a lot more places, but it is not too terrible to replace those references with Py_TP(ob) (a version of Py_TYPE that casts the arg to PyObject *). I did that for my tagged pointer experiment and it wasn't too bad.
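For reference, a one-line sketch of the Py_TP() helper mentioned above (the name comes from this message, not from an existing CPython header); it simply folds the cast into Py_TYPE():

#define Py_TP(ob) Py_TYPE((PyObject *)(ob))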
Making PyTypeObject opaque seems vastly more difficult. Almost every extension type is defined as a static PyTypeObject structure. They would have to convert to using something like PyType_FromSpec(). To make things easier, I think you could have a function like PyType_FromSpec() that took a traditional PyTypeObject static structure and returned an opaque PyTypeObject pointer. It would copy over the information from the static structure. That way, you decouple the layout of the internal PyTypeObject from what was allocated statically by the extension module. Doing it that way, the module source doesn't have to change much. Basically, change
if (PyType_Ready(&MyType)) {
...
}
to:
if ((MyType = PyType_CreateFromDef(&MyTypeDef)) == NULL) {
...
}
That also solves the problem with some types being heap allocated and some stack allocated. PyType_CreateFromDef() would always return a heap allocated object.
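To illustrate, here is a minimal sketch of what PyType_CreateFromDef() could look like (the name is the hypothetical one used above; only a few representative slots are copied, and a real version would have to translate every populated field of the static struct):

#include <Python.h>

static PyObject *
PyType_CreateFromDef(PyTypeObject *def)
{
    PyType_Slot slots[8];
    int n = 0;
    /* Copy only the slots that are actually set in the static struct. */
    if (def->tp_dealloc) slots[n++] = (PyType_Slot){Py_tp_dealloc, (void *)def->tp_dealloc};
    if (def->tp_new)     slots[n++] = (PyType_Slot){Py_tp_new,     (void *)def->tp_new};
    if (def->tp_init)    slots[n++] = (PyType_Slot){Py_tp_init,    (void *)def->tp_init};
    if (def->tp_methods) slots[n++] = (PyType_Slot){Py_tp_methods, (void *)def->tp_methods};
    slots[n] = (PyType_Slot){0, NULL};

    PyType_Spec spec = {
        .name      = def->tp_name,
        .basicsize = (int)def->tp_basicsize,
        .itemsize  = (int)def->tp_itemsize,
        .flags     = (unsigned int)def->tp_flags,
        .slots     = slots,
    };
    /* Always returns a heap-allocated type; the static struct is only read. */
    return PyType_FromSpec(&spec);
}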
Regards,
Neil
Neil Schemenauer wrote on 28.02.19 at 19:43:
On 2019-02-28, Stefan Behnel wrote:
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
Why do you think so? It looks like a relatively simple change to me, mostly just replacing "Py_INCREF(obj)" with "obj = Py_INCREF(obj)" in user code, and then cleaning up a couple of corner cases here and there.
I don't understand why you think Py_INCREF() would have to return a new pointer. That doesn't seem necessary to me or too relevant to making PyObject and PyTypeObject opaque types.
Interesting. I actually don't understand what PyTypeObject has to do with this. :)
Py_INCREF() needs to return a new handle (whether that's a pointer or not is an unimportant detail for now).
I think the challenge is for all the extension module code that looks inside those structs, e.g.
ob->ob_type
or
Py_TYPE(ob)->tp_something
You have to provide APIs that replace all those struct member accesses.
I understand that "obj->ob_type" might be problematic (although that does not even seem sure yet), but what is the problem with "tp->tp_something"?
ob_type is accessed in a lot more places, but it is not too terrible to replace those references with Py_TP(ob) (a version of Py_TYPE that casts the arg to PyObject *). I did that for my tagged pointer experiment and it wasn't too bad.
I think it would still be nice to allow at least fast type checks.
Stefan
On 2019-02-28, Stefan Behnel wrote:
Interesting. I actually don't understand what PyTypeObject has to do with this. :)
In CPython, Py_TYPE() is fast because it returns a borrowed reference to the PyTypeObject. With the PyObject API that is built on top of the pyref handle API, you can't have borrowed references. So, extension modules instead of:
Py_TYPE(o)->tp_something(...);
they have to do:
PyObject *tp = PyObject_GetType(o); // returns new reference
tp->tp_something(...);
Py_DECREF(tp);
that's a fair bit slower for something that is done very often. Also, the PyTypeObject struct for each object might not exist in the runtime, so PyObject_GetType() has to allocate memory for it and all the sub-structures, and fill in the slots with appropriate data.
In CPython, Py_TYPE() is super cheap because the PyTypeObject structure is already there and all filled in. Making other Python runtimes emulate that PyTypeObject structure could be burdensome.
To relieve that, we can provide APIs that do the same things without requiring a PyTypeObject structure with a specific layout. E.g. to check the type:
pyref_t *r = pyref_get_some_object();
pyref_t *tp = pyref_something_that_returns_a_type();
if (pyref_is_instance(r, tp)) {
....
}
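On CPython's side such a check could stay cheap. A sketch, assuming that a pyref_t handle is just the PyObject pointer itself (all pyref_* names are hypothetical):

#include <Python.h>

typedef struct _object pyref_t;   /* CPython: a handle is just the object pointer */

static int
pyref_is_instance(pyref_t *r, pyref_t *tp)
{
    /* Reduces to the existing subtype check; another runtime could
       implement this with a type-ID compare instead. */
    return PyObject_TypeCheck((PyObject *)r, (PyTypeObject *)tp);
}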
Py_INCREF() needs to return a new handle (whether that's a pointer or not is an unimportant detail for now).
I don't follow. Why does it have to return a new handle? Py_INCREF should mutate the object. Are you thinking of some kind of immutable handle type?
I understand that "obj->ob_type" might be problematic (although that does not even seem sure yet), but what is the problem with "tp->tp_something"?
You are forcing all Python runtimes that want to support the C API to have the same memory layouts for type objects. It is a poor design that the source code for extensions ties the implementation to a certain structure layout. They should be decoupled.
I think it would still be nice to allow at least fast type checks.
Certainly we would still have fast type checks. They just won't be done by making the extension module assume a PyObject is a structure that has an ob_type pointer at a certain offset and that the ob_type pointer is the same for every single object of that type. Other runtimes might implement things differently. It is possible to build an API that abstracts over that. Since we can use C99 inline functions now, that would be an obvious way to do it.
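For example, a sketch of what such an inline helper could look like on the CPython side (the function name is made up for illustration):

#include <Python.h>

/* Extensions call this instead of peeking at ob_type themselves;
   in CPython it compiles down to a single pointer comparison. */
static inline int
pyref_type_is_exact(PyObject *obj, PyTypeObject *type)
{
    return Py_TYPE(obj) == type;
}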
Regards,
Neil
Neil Schemenauer wrote on 28.02.19 at 21:42:
On 2019-02-28, Stefan Behnel wrote:
Py_INCREF() needs to return a new handle (whether that's a pointer or not is an unimportant detail for now).
I don't follow. Why does it have to return a new handle? Py_INCREF should mutate the object. Are you thinking of some kind of immutable handle type?
Something that doesn't require refcounting, yes. Could be a pointer or an index into some object ID mapping array. It would mean that pointer inequality doesn't rule out object identity anymore, because multiple handles could point to the same object, but it would provide an alternative to reference counting because each handle would be a single unique reference.
Your proposal of making it a pointer to a refcount would be a way to keep it backwards compatible – if that's wanted. It's not the only option, though.
I understand that "obj->ob_type" might be problematic (although that does not even seem sure yet), but what is the problem with "tp->tp_something"?
You are forcing all Python runtimes that want to support the C API to have the same memory layouts for type objects.
Not necessarily. I'm just suggesting to keep the current vtable set (a.k.a. slots) to allow for fast protocol usages. That doesn't mean it's the "memory layout for type objects", especially not the one that CPython itself is tied to internally. PyTypeObject currently mingles multiple things, at least a) being an object itself, b) allowing for pointer type tests, c) describing the type/configuration/behaviour of an object and d) providing access to protocols. The different use cases could be separated.
Stefan
On Thu, Feb 28, 2019 at 1:10 PM Stefan Behnel <python_capi@behnel.de> wrote:
Not necessarily. I'm just suggesting to keep the current vtable set (a.k.a. slots) to allow for fast protocol usages. That doesn't mean it's the "memory layout for type objects", especially not the one that CPython itself is tied to internally. PyTypeObject currently mingles multiple things, at least a) being an object itself, b) allowing for pointer type tests, c) describing the type/configuration/behaviour of an object and d) providing access to protocols. The different use cases could be separated.
Regarding d, I am curious to know, given that the set of entities in protocols is limited, do you think abstracting away the protocol access with API functions could provide the same properties of a vtable without having the vtable itself become API?
This would benefit systems that choose not to have a direct pointer from instances to their class or vtable. For them, as even big applications tend to have only on the order of tens of thousands of types, encoding the type of an object in a whole 64-bit pointer wastes space. Instead, the instance type is represented as a small integer ID, leaving the rest of the header (which, ideally, is no more than a word) for other metadata.
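A rough sketch of the kind of compact layout being described here (all names and sizes are illustrative, not a proposal for CPython's actual layout):

#include <stdint.h>

struct type_info;                              /* per-type vtable / metadata */
extern const struct type_info *type_table[];   /* indexed by the small type ID */

typedef struct {
    uint32_t type_id;          /* small integer instead of a 64-bit pointer */
    uint32_t gc_and_flags;     /* remaining header metadata */
} object_header_t;

static inline const struct type_info *
object_type_info(const object_header_t *ob)
{
    /* Protocol access goes through an accessor like this, so the vtable
       layout never becomes part of the extension ABI. */
    return type_table[ob->type_id];
}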
On Wed, Feb 27, 2019 at 19:38, Neil Schemenauer <nas-python@arctrix.com> wrote:
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
In https://pythoncapi.readthedocs.io/ I proposed solutions to get a smooth transition towards a better C API without starting from scratch or breaking backward compatibility.
I agree that it doesn't solve all problems, and that CPython would be the first one to benefit from this.
That's why I proposed to split the discussions into multiple PEPs.
The PyHandle layer can be implemented as a separate project. That gives the freedom to tinker without upsetting people. It will take some missteps and revisions until the API becomes polished. You don't want to make those mistakes inside the CPython repo.
I proposed to add a new opt-in C API (basically, the current C API with minor changes) directly in the master branch of Python, but I have been asked to write a PEP for that. It would be the PEP C.
Another option would be to work in a fork of CPython. I already did that: https://github.com/pythoncapi/cpython
That's not a good long-term solution. If we decide to enhance the API and add a new opt-in API, it should be easy to use and experiment with. There is always the problem of reaching the critical mass needed to make a project successful.
As the PyHandle API evolves, I would imagine we would have a lot of ideas about what PEP C should entail. Ideally the APIs defined by PEP C would be the ones you need to implement PyHandle for CPython. I think trying to do PEP C before PEP D is the wrong way around.
An initial goal would be to make the PyHandle layer be a replacement for the limited API. I.e. make it so that any extension currently using the limited API could switch to it.
Even if PEP D makes your applications 10x faster, I don't believe that we will ever be able to get rid of the current C API. Again, see the transition from Python 2 to Python 3. Ten years later, some people are just discussing how to start to migrate their code base. And a lot of code will stay on Python 2 forever.
That's why I consider that we need to work on PEP C and PEP D in parallel, but also on PEP A to show the other existing solutions ;-)
I'm not sure about the exact timeline. We can draft a PEP C right now, but wait until we get enough feedback to take a decision. We might wait until PEP D made progress if you prefer.
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On 2019-02-27, Neil Schemenauer wrote:
I think the PyHandle idea has the best chance of producing a good end result. I suspect PEP C doesn't go far enough to solve the problems for alternative Python implementations. They really want PyObject to be an opaque handle-like object. Trying to make the existing C-API work like that seems like a nearly impossible task.
I chatted with Armin a little about his PyHandle idea and we came up with a possible refinement. I hope I can explain it accurately.
One of the key problems with PyPy implementing the CPython API is that it doesn't have space for a reference count field inside its internal memory storage for objects. So, when a CPython API returns a PyObject*, where can PyPy store the reference count? The problem makes passing objects back and forth over the CPython API expensive for PyPy. That's my understanding anyhow.
You can define a new API using opaque object handles and that solves the problem for PyPy. They don't need to emulate reference counting and can just have a global table of open object handles. The problem is, how do you convert existing CPython extensions to use this new API? They still want to do reference counting.
Here is a sketch of the API. Introduce a new, lower level API that works with object handles. Call them pyref_t. The object handle API doesn't implement reference counting. So, it is cheap for PyPy to implement it. Passing objects back and forth using the handle API is cheaper. To make it easy for existing extension modules or ones that want to use reference counting, provide a PyObject layer on top of the handle API. E.g.
pyref_t *r = pyref_some_api_that_opens_a_handle();
PyObject *o = PyObject_FromRef(r);
Py_INCREF(o);
Py_DECREF(o);
Py_DECREF(o); // handle gets closed because refcnt goes to zero
The PyObject structure could be:
typedef struct {
size_t refcnt;
pyref_t *r;
} PyObject;
Calling PyObject_FromRef() allocates a new one of these structures. When the reference count goes to zero, the handle is closed and the PyObject memory is freed.
To solve the non-opaque PyObject/PyTypeObject issue, I think you could have a source code option that turns on opaque types. PyPy has already implemented (mostly) compatible PyObject and PyTypeObject structures. With the option off, they do what they do now. In that case, PyObject_FromRef() has to return a non-opaque PyObject structure and it needs to have an ob_type slot that points to a non-opaque PyTypeObject structure.
If you turn the source option for opaque types on, something like:
#define Py_OPAQUE_PYOBJECT 1
#include <Python.h>
Things could get more efficient when you use your extension with PyPy. I.e. PyObject_FromRef() is faster because it doesn't need to fill in the ob_type pointer. Obviously your extension would not compile if you are trying to look inside PyObject structs.
Implementing this handle layer for CPython should be quite easy. PyObject_FromRef can just be a typecast from pyref* to PyObject*. No extra piece of memory needs to be allocated because the pyref object already has space for the reference count. In debug builds, CPython should check that the handle API is used correctly (e.g. add a field to keep track that handles are properly closed).
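A sketch of what that CPython side could look like, using the hypothetical pyref names from this thread:

#include <Python.h>

typedef struct _object pyref_t;      /* in CPython, a handle is just the object */

static inline PyObject *
PyObject_FromRef(pyref_t *r)
{
    return (PyObject *)r;            /* a typecast: no extra allocation needed */
}

static inline void
pyref_close(pyref_t *r)
{
    Py_DECREF((PyObject *)r);        /* closing the handle drops its reference */
}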
Extensions can use a mix of the new pyref handle-based API and the old PyObject-based API, and get conversion functions between them. Additionally, this approach would work even when we don't support the complete details of all the old C API.
Regards,
Neil
On 28/02/2019 at 20:09, Neil Schemenauer wrote:
Here is a sketch of the API. Introduce a new, lower level API that works with object handles. Call them pyref_t. The object handle API doesn't implement reference counting. So, it is cheap for PyPy to implement it. Passing objects back and forth using the handle API is cheaper. To make it easy for existing extension modules or ones that want to use reference counting, provide a PyObject layer on top of the handle API. E.g.
pyref_t *r = pyref_some_api_that_opens_a_handle();
PyObject *o = PyObject_FromRef(r);
Py_INCREF(o);
Py_DECREF(o);
Py_DECREF(o); // handle gets closed because refcnt goes to zero
The PyObject structure could be:
typedef struct {
size_t refcnt;
pyref_t *r;
} PyObject;
Are you showing the PyPy implementation of PyObject?
To solve the non-opaque PyObject/PyTypeObject issue, I think you could have a source code option that turns on opaque types.
I don't understand what this means. Do you need a special CPython build with a different set of preprocessor options? Or is this when building an extension?
(if the latter, it sounds close to the existing "stable ABI")
Regards
Antoine.
On 2019-02-28, Antoine Pitrou wrote:
On 28/02/2019 at 20:09, Neil Schemenauer wrote:
The PyObject structure could be:
typedef struct {
size_t refcnt;
pyref_t *r;
} PyObject;
Are you showing the PyPy implementation of PyObject?
Yes, this is an example of what PyPy would allocate when you create a PyObject from a pyref. In this case, it is an example of what gets allocated if you have the opaque PyObject flag turned on. The non-opaque version would need to have a compatible structure layout, e.g. something like:
typedef struct {
ssize_t ob_refcnt;
PyObject *ob_type;
pyref_t *r;
} PyObject;
To solve the non-opaque PyObject/PyTypeObject issue, I think you could have a source code option that turns on opaque types.
I don't understand what this means. Do you need a special CPython build with a different set of preprocessor options? Or is this when building an extension?
It is a source option for the extension. You don't need a special CPython build. The extension module source code would turn on the opaque types before including the Python API headers, e.g.
#define Py_OPAQUE_PYOBJECT 1
#include <Python.h>
(if the latter, it sounds close to the existing "stable ABI")
I guess they are similar. As I understand it, if the limited API is turned on, PyTypeObject becomes opaque but PyObject does not. My Py_OPAQUE_PYOBJECT would also make PyObject opaque. I suppose that's not a big difference because PyObject is actually not too hard to make opaque; PyTypeObject is the tricky one.
If we are overhauling the API, I think there should be a separate option to toggle ABI stability. If you turn it off, functions can become inlined. As an extension author, I can just use PyList_GetSize(o). The person compiling the extension can decide they would rather get those small functions inlined and lose the ABI stability. If you are distributing a pre-built extension on PyPI, you probably want the ABI stability flag on (and pay the performance hit).
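A sketch of how a header could expose that toggle, reusing the hypothetical PyList_GetSize name from above (Py_STABLE_ABI is also a made-up macro):

#include <Python.h>

#ifdef Py_STABLE_ABI
/* Opaque call through the shared library: slower, but the binary keeps
   working across CPython versions. */
PyAPI_FUNC(Py_ssize_t) PyList_GetSize(PyObject *list);
#else
/* Inlined field access: faster, but ties the binary to this exact ABI. */
static inline Py_ssize_t
PyList_GetSize(PyObject *list)
{
    return ((PyVarObject *)list)->ob_size;
}
#endif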
Regards,
Neil
On Thu, Feb 28, 2019 at 11:11 AM Neil Schemenauer <nas-python@arctrix.com> wrote:
Implementing this handle layer for CPython should be quite easy. PyObject_FromRef can just be a typecast from pyref* to PyObject*. No extra piece of memory needs to be allocated because the pyref object already has space for the reference count. In debug builds, CPython should check that the handle API is used correctly (e.g. add a field to keep track that handles are properly closed).
It is possible, today, to treat PyObject* as an opaque handle if you do not stray far from the limited API. (PyPy is less restricted than that.) In my experience, this kind of handle can be a pair of an object pointer and a reference count, with a PyObject* pointing to that pair. These pairs would be stored together with other handles in a dense array, something that is easy to allocate from and for the garbage collector to visit. The reference count field does add a word of overhead, but that is offset by not storing reference counting metadata in the rest of your heap objects.
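A minimal sketch of that handle layout (all names are hypothetical; a real table would recycle free slots and be visited by the garbage collector):

#include <stddef.h>

typedef struct {
    void      *obj;       /* pointer into the runtime's own heap */
    ptrdiff_t  refcnt;    /* the count lives in the handle, not the object */
} handle_t;               /* a PyObject* would point at one of these pairs */

static handle_t handle_table[4096];   /* dense array, easy to allocate from */
static size_t   handle_top;

static handle_t *
handle_open(void *obj)
{
    handle_t *h = &handle_table[handle_top++];   /* naive bump allocation */
    h->obj = obj;
    h->refcnt = 1;
    return h;
}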
An interesting property of PyObject* relative to your proposal here is that PyObject* is a direct pointer to an object. This means code expects to be able to compare a PyObject* for identity equality using == as one would do for any other object in C. To ensure that every PyObject* has this property, a mapping from an object to its unique handle must be done when passing it between Python and C. Different implementation techniques for this mapping will make the lookup faster or slower.
This relates to an interesting consequence of something like PyHandle. If PyHandles are not mapped one-to-one to an object, identity comparisons will need to go through a function call. Furthermore, a compatibility scheme such as converting a PyHandle to a PyObject* would be more complicated than a simple wrapping of the PyHandle as the resulting PyObject* would not be identity equal to any other PyObject* referring to the same object.
There are a lot of design considerations and experience with handles in other languages that can inform a design for CPython. For example, references in Java's JNI are most commonly implemented as a handle that indirectly references an object. As such, a user of JNI must be careful to compare references using the IsSameObject predicate instead of an ordinary == compare in C. Despite JNI being >20 years old, this remains counterintuitive and is a common source of bugs, as you can infer from this Android SDK guide:
https://developer.android.com/training/articles/perf-jni
Another lesson we can learn from JNI is that all of the bugs associated with file descriptors apply to handles. Because references to things in memory are more common than file descriptors, these bugs become a lot more common. A good implementation of JNI will avoid using stack addresses or dense integers as a handle value because it is too hard to ensure those values are not stale and do not alias to something that shouldn’t belong to you. Therefore, a good implementation typically avoids recycling references and obfuscates their values using some form of encryption. This adds to the overhead of using a reference and the complexity of implementing JNI.
Because of all of the accumulated experience with handles in other systems, I think CPython is positioned to do much better than its predecessors. Having a PyHandle prototype as a third-party extension for experimentation purposes will go even further to help avoid making subtle mistakes that affect developers for decades to come.
On 2019-02-28, Carl Shapiro wrote:
It is possible, today, to treat PyObject* as an opaque handle if you do not stray far from the limited API.
Is this an argument for not introducing a PyHandle data type? Do you think it is better just to make PyObject work as handles?
If PyHandles are not mapped one-to-one to an object, identity comparisons will need to go through a function call. Furthermore, a compatibility scheme such as converting a PyHandle to a PyObject* would be more complicated than a simple wrapping of the PyHandle as the resulting PyObject* would not be identity equal to any other PyObject* referring to the same object.
Good point. I was thinking implicitly that PyHandles would not be mapped one-to-one. However, if CPython makes PyHandle just a type cast from PyObject, people are going to do pointer compares and then be surprised that their extension breaks with other runtimes.
There are a lot of design considerations and experience with handles in other languages that can inform a design for CPython. For example, references in Java's JNI are most commonly implemented as a handle that indirectly references an object.
Thank you for bringing up JNI. I was vaguely aware of it but after doing some reading last night, I see it solves many of the same problems we are trying to solve.
As such, a user of JNI must be careful to compare references using the IsSameObject predicate instead of an ordinary == compare in C. Despite JNI being >20 years old, this remains counterintuitive and is a common source of bugs, as you can infer from this Android SDK guide:
So, what's your opinion on that choice? Not requiring a one-to-one mapping for the handles makes things easier for the runtime but the API is harder to use correctly. Should we follow the JNI model or should we pay the cost to get one-to-one mapping?
If the runtime also pays the memory cost to keep a reference count in the handle table, I think I see how we could just make PyObject be the opaque handle.
A good implementation of JNI will avoid using stack addresses or dense integers as a handle value because it is too hard to ensure those values are not stale and do not alias to something that shouldn’t belong to you.
Interesting. If you give up on binary compatibility, you could have a debug build option that enables encryption of handle values. Disable that for better performance in release builds. Maybe that's poor software engineering though (like not having array bounds checking turned on by default).
Because of all of the accumulated experience with handles in other systems, I think CPython is positioned to do much better than its predecessors.
We better study those systems then. I think it would be best to not be too creative and stick to a design that has been proven to work. JNI looks to be a goldmine of ideas (not that we have to make all the same decisions).
Do you have suggestions for other native interfaces that should be studied? It looks like CoreCLR has something but it seems to require using C++. At least, the exception handling uses C++ features. E.g.
EX_TRY / EX_CATCH / EX_END_CATCH
https://github.com/dotnet/coreclr/blob/master/Documentation/botr/exceptions.md
Possible other sources of ideas: Common Lisp, LuaJIT, Smalltalk implementations, Erlang, Haskell implementations. Compared to those, I would suspect JNI is much more heavily used in practice.
Regards,
Neil
On Sat, 2 Mar 2019 at 04:11, Neil Schemenauer <nas-python@arctrix.com> wrote:
So, what's your opinion on that choice? Not requiring a one-to-one mapping for the handles makes things easier for the runtime but the API is harder to use correctly. Should we follow the JNI model or should we pay the cost to get one-to-one mapping?
If the runtime also pays the memory cost to keep a reference count in the handle table, I think I see how we could just make PyObject be the opaque handle.
Note that moving reference counts to a separate table is something we've previously discussed doing for CPython itself, with the two most notable problems being:
- the possible performance hit of the extra pointer dereference in Py_INCREF and Py_DECREF (et al)
- any third party code that's accessing ob_refcnt directly
The complete unknown from a performance perspective is the potential CPU cache management impact of switching from scattered writes to a lot of different memory blocks to frequent writes to a particularly hot memory page (although we know up front that the centralised table will be far more copy-on-write friendly without any need for gc.freeze() shenanigans).
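For illustration, a sketch of the layout being referred to (hypothetical names; today's CPython objects of course still embed ob_refcnt directly):

#include <stddef.h>

static ptrdiff_t refcnt_table[1 << 20];   /* one counter per live object, in one hot region */

typedef struct {
    size_t refcnt_index;                  /* replaces the embedded ob_refcnt */
    /* ... rest of the object header ... */
} object_head_t;

static inline void
incref_indirect(object_head_t *ob)
{
    /* The extra indirection Py_INCREF would pay; the upside is that object
       pages stay clean, which is copy-on-write friendly. */
    refcnt_table[ob->refcnt_index]++;
}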
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 2019-03-03, Nick Coghlan wrote:
Note that moving reference counts to a separate table is something we've previously discussed doing for CPython itself
I suspect as long as we are using reference counting GC for CPython (which is realistically probably forever), the sensible choice will be to have the counts located with the object. However, IMHO, the extension API should not force the VM into that design.
The complete unknown from a performance perspective is the potential CPU cache management impact of switching from scattered writes to a lot of different memory blocks to frequent writes to a particularly hot memory page (although we know up front that the centralised table will be far more copy-on-write friendly without any need for gc.freeze() shenanigans).
I was watching an interesting video today talking about the cost of scattered memory access on modern hardware:
https://youtu.be/TJHgp1ugKGM?t=1564
The benchmarks on the Samsung tablet are interesting. CPython is very sparse (like Java/C# in the presentation, but worse yet). If you want to look at the refcnt for many objects (e.g. during a cyclic GC pass), it would help to put them in a contiguous array. However, in normal execution, you are already going to have the object data in cache, so it makes sense to have the refcnt there too.
Regards,
Neil
On Fri, Mar 1, 2019 at 10:11 AM Neil Schemenauer <nas-python@arctrix.com> wrote:
Is this an argument for not introducing a PyHandle data type? Do you think it is better just to make PyObject work as handles?
To answer your question somewhat obliquely, I believe that it is possible to make the C-API more amenable to alternative Python implementations with incremental changes to the C-API that could be absorbed by third-party code over a series of releases. I also believe that improving PyObject* does not preclude providing a better abstraction like a PyHandle. I suspect much of the API work needed to make PyObject* better would be required to make even PyHandle possible.
So, what's your opinion on that choice? Not requiring a one-to-one mapping for the handles makes things easier for the runtime but the API is harder to use correctly. Should we follow the JNI model or should we pay the cost to get one-to-one mapping?
I think a lot of details would have to be considered to make the right decision. For example, instances of types created in the C-API can be allocated outside of the Python heap. Would that feature be preserved? You could keep it with PyObject* but it might be harder to do with PyHandle.
Possible other sources of ideas: Common Lisp, LuaJIT, Smalltalk implementations, Erlang, Haskel implementations. Compared to those, I would suspect JNI is much more heavily used in practice.
Those would be good starting places. I know that Common Lisp and Smalltalk systems typically provide an FFI which is more minimal compared to the C-API.
On Tue, Mar 5, 2019 at 07:51, Carl Shapiro <carl.shapiro@gmail.com> wrote:
On Fri, Mar 1, 2019 at 10:11 AM Neil Schemenauer <nas-python@arctrix.com> wrote:
Is this an argument for not introducing a PyHandle data type? Do you think it is better just to make PyObject work as handles?
To answer your question somewhat obliquely, I believe that it is possible to make the C-API more amenable to alternative Python implementations with incremental changes to the C-API that could be absorbed by third-party code over a series of releases. I also believe that improving PyObject* does not preclude providing a better abstraction like a PyHandle. I suspect much of the API work needed to make PyObject* better would be required to make even PyHandle possible.
IMHO we should attempt both approaches:
Make the PyObject structure "more" opaque. Either make it fully opaque and break the Python world in a flag day (haha, that would be funny!), or add a new opt-in C API (using a C #define or whatever). I worked on the opt-in approach. This approach doesn't work with old Python versions which cannot be modified anymore; it cannot start before Python 3.8.
Add a fully new PyHandle API which would be compatible with Python 2.7-3.8.
In parallel, more and more C API changes are pushed in Python 3.8 to make some structures opaque (PyInterpreterState). That's the first obvious option that I called "break the world", but the most risky.
It's way too early to bet which approach will work in the long term.
To be even more clear, all approaches have exactly the same goal: having a more opaque C API that hides as many implementation details as possible. The long term goal should be to hide "all" implementation details. I let you try to define what is an implementation detail and what is not. Is CPython's GC implementation an implementation detail (destroying an object as soon as its ref count reaches zero)? ... These are hard questions :-)
PyPy knows better than me that CPython is full of subtle implementation details. For example, I was very surprised to learn that Python creates a ".0" local variable for list comprehensions :)
>>> [locals() for i in range(1)]
[{'.0': <range_iterator object at 0x7fcf3bf29300>, 'i': 0}]
Fixing the C API cannot be done at once. It must be a process.
Victor
On 2019-03-04, Carl Shapiro wrote:
[...] instances of types created in the C-API can be allocated outside of the Python heap. Would that feature be preserved? You could keep it with PyObject* but it might be harder to do with PyHandle.
I can only speak for myself but I would like to kill off non-heap allocated types. That's not easy because nearly every extension module that defines a type does so using a static structure for the new type (not heap allocated). Some discussion here:
https://bugs.python.org/issue35810
We have PyType_FromSpec() but converting an extension module to use it is non-trivial. I was wondering if we can make an easier change. Can we just make a version of PyType_Ready() that copies the static structure into a heap allocated type and then returns that? Then, fixing the extension modules is pretty easy. Instead of:
PyType_Ready(&MyType);
you do:
MyType = PyType_FromStatic(&MyTypeDef);
The related thing I would like to change is to force all PyObject structures to be allocated by CPython memory allocators. Aside from statically allocated types, I believe that is already the case for objects with the GC flag. The object memory has to come from _PyObject_GC_New() or _PyObject_GC_NewVar(). That is not the case for non-GC objects, as far as I'm aware. At least, it was the case years ago that extension types could use their own malloc implementation, for example.
There are some legitimate reasons to want to use a special allocator. However, I don't think those reasons are worth what we would be giving up by supporting that. I suspect most people are not even aware that is a thing. I'm not sure it even works anymore. When we implemented obmalloc, it was a considerable challenge to keep it working, as I recall.
BTW, in the above example, "MyType" is nearly always a static variable in the extension module. All those static PyObject variables are a contributing factor to making CPython shutdown complicated and flaky. Look at Py_FinalizeEx() if you are brave. It is a pile of doggy dodo. Dirty hacks on hacks, slow, and it doesn't really work correctly. Instead of keeping a static variable reference to the type, you can add the new type to the globals of the extension module, e.g.
PyObject *mytype = PyType_FromStatic(&MyTypeDef);
PyModule_AddObject(module, "MyType", mytype);
That way, you don't have an extra ref to MyType keeping it alive longer than the module it is contained within. If you need quick access to the type object, it really should be stored in the per-interpreter module data.
Regards,
Neil
On 05Mar2019 1103, Neil Schemenauer wrote:
On 2019-03-04, Carl Shapiro wrote:
[...] instances of types created in the C-API can be allocated outside of the Python heap. Would that feature be preserved? You could keep it with PyObject* but it might be harder to do with PyHandle.
I can only speak for myself but I would like to kill off non-heap allocated types. That's not easy because nearly every extension module that defines a type does so using a static structure for the new type (not heap allocated). Some discussion here: [SNIP] The related thing I would like to change is to force all PyObject structures to be allocated by CPython memory allocators.
I don't agree.
To be at all useful, I think your last sentence needs to be "force all PyObject structures to be allocated by *the single CPython memory allocator for the current runtime*". That means we don't need to store the deallocator function for each object, and can simply pass the memory blocks to a known allocator (even if that's been switched out at runtime startup, it won't have changed in the meantime).
However, in the context of features like NVRAM, GPU/CPU contexts, and even subinterpreters and subprocesses, I think there's a huge advantage in having objects know how to deallocate themselves. Without this, there's no way to support these more advanced concepts transparently. IMHO, that would be missing a huge opportunity.
(Of course, if Py_DECREF somehow became a per-object/per-class virtual function, then this becomes trivial. Even now, the dealloc function is per-type, and I don't think we'd gain anything by removing that, while what we gain from increasing it to be per-object could be significant.)
Cheers, Steve
On 2019-03-05, Steve Dower wrote:
I don't agree.
To be at all useful, I think your last sentence needs to be "force all PyObject structures to be allocated by *the single CPython memory allocator for the current runtime*".
I think you don't need to have a single allocator. My vision is that the responsibility of allocating and deallocating PyObject memory belongs to the Python VM. It might use specialized allocators for different purposes, for example.
That means we don't need to store the deallocator function for each object, and can simply pass the memory blocks to a known allocator (even if that's been switched out at runtime startup, it won't have changed in the meantime).
It is up to the Python VM to decide how that's done. The VM might still store a deallocator function per type, like what is currently done.
However, in the context of features like NVRAM, GPU/CPU contexts, and even subinterpreters and subprocesses, I think there's a huge advantage in having objects know how to deallocate themselves. Without this, there's no way to support these more advanced concepts transparently. IMHO, that would be missing a huge opportunity.
Does it help if the PyObject can have a pointer to memory allocated in these different ways? It seems to me that allows most of the benefits but still allows the Python VM to GC PyObject memory in an efficient way. So a Python extension type can still allocate some extra memory associated with instances of it, and there is a dealloc method called by the VM to clean it up again. Just the memory for the PyObject itself must be allocated and deallocated by the VM itself.
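Something like this minimal sketch (a hypothetical type; the header is VM-allocated, the payload can live anywhere):

#include <Python.h>

typedef struct {
    PyObject_HEAD                       /* always allocated by the VM's allocators */
    void  *payload;                     /* e.g. NVRAM, GPU, or shared memory */
    void (*payload_free)(void *);       /* supplied by the extension type */
} ProxyObject;

static void
proxy_dealloc(PyObject *self)
{
    ProxyObject *p = (ProxyObject *)self;
    if (p->payload_free) {
        p->payload_free(p->payload);    /* the extension frees its own memory */
    }
    Py_TYPE(self)->tp_free(self);       /* the VM frees the PyObject itself */
}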
Maybe that is not flexible enough to do what you want. It adds another layer of indirection. I'm glad you bring up those cases because the new API should support those kinds of things.
Regards,
Neil
On Tue, Mar 5, 2019 at 1:59 PM Neil Schemenauer <nas-python@arctrix.com> wrote:
Does it help if the PyObject can have a pointer to memory allocated in these different ways? It seems to me that allows most of the benefits but still allows the Python VM to GC PyObject memory in an efficient way. So a Python extension type can still allocate some extra memory associated with instances of it, and there is a dealloc method called by the VM to clean it up again. Just the memory for the PyObject itself must be allocated and deallocated by the VM itself.
An advantage to keeping the PyObject_HEAD in the regular Python heap and having a separate pointer to off-heap memory is that memory ordering for the PyObject_HEAD always works as expected. For example, memory for frame buffers is often in a write-combining area that does not respect ordering, allowing the PyObject_HEAD fields to be observed in an inconsistent state.
This affects a Python that wishes to have better shared-memory concurrency support (even if it's just within its runtime), as not being able to make assumptions about writes to a PyObject_HEAD can slow down common operations.
participants (10): Antoine Pitrou, Brett Cannon, Carl Shapiro, Jeroen Demeyer, M.-A. Lemburg, Neil Schemenauer, Nick Coghlan, Stefan Behnel, Steve Dower, Victor Stinner