After an IRC discussion with Armin, we designed the following solution:
https://github.com/hpyproject/hpy/issues/83

If you have comments, please post them on the GitHub issue, to avoid splitting the discussion half here and half there :)

ciao,
Antonio

On Sat, Sep 19, 2020 at 11:50 PM Antonio Cuni <anto.cuni@gmail.com> wrote:
Consider the following snippet of code:
    typedef struct {
        HPyObject_HEAD
        long x;
        long y;
    } PointObject;

    void foo(HPyContext ctx, HPy h_point) {
        PointObject *p1 = HPy_CAST(ctx, PointObject, h_point);
        PyObject *py_point = HPy_AsPyObject(ctx, h_point);  // [1]
        PointObject *p2 = (PointObject*)py_point;
        ...
    }
[1] Note that it does not need to be a call to HPy_AsPyObject: the PyObject* might equally come from a legacy method which takes a PyObject *self, or from something similar.
It is obvious that HPy_CAST and HPy_AsPyObject need to return the very same address. This is straightforward to implement on CPython, but it poses some challenges on PyPy (and probably GraalPython).
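To make the requirement concrete, here is a minimal sketch in plain C that does not use the real HPy headers. The names ToyHead, ToyPoint, toy_cast and toy_as_object are hypothetical stand-ins for HPyObject_HEAD, PointObject, HPy_CAST and HPy_AsPyObject. It models only the CPython-like case, where both operations hand back the address of the single allocation, so the C cast in the snippet is valid.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical two-word header, standing in for HPyObject_HEAD. */
typedef struct {
    intptr_t word0;
    intptr_t word1;
} ToyHead;

/* Same layout idea as PointObject: header first, then the payload. */
typedef struct {
    ToyHead head;
    long x;
    long y;
} ToyPoint;

/* Stand-in for HPy_CAST: returns the address of the struct. */
ToyPoint *toy_cast(void *obj) {
    return (ToyPoint *)obj;
}

/* Stand-in for HPy_AsPyObject: returns the address of the header. */
ToyHead *toy_as_object(void *obj) {
    return (ToyHead *)obj;
}

/* Returns 1 if both views alias the same memory, 0 otherwise. */
int toy_views_agree(void) {
    ToyPoint *obj = calloc(1, sizeof *obj);
    if (obj == NULL) {
        return 0;
    }
    ToyPoint *p1 = toy_cast(obj);
    ToyHead *generic = toy_as_object(obj);
    ToyPoint *p2 = (ToyPoint *)generic;  /* the C cast from the snippet */
    p1->x = 7;                           /* write through one view ...   */
    int ok = (p1 == p2) && (p2->x == 7); /* ... and read through the other */
    free(obj);
    return ok;
}
```

The cast from ToyHead* back to ToyPoint* is well defined only because the header is the first member of the struct and both pointers refer to the same allocation.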
Things to consider:
1. Currently, in PyPy we allocate the PointObject at a non-movable address, but so far the API does not REQUIRE it. I think it would be reasonable to have an implementation in which objects are movable and HPy_CAST pins the memory until the originating handle is closed. OTOH, the only reasonable semantics is that multiple calls to HPy_AsPyObject always return the same address.
2. HPyObject_HEAD consists of two words which the implementation can use as it likes. On CPython, it is obviously mapped to PyObject_HEAD, but in PyPy we (usually) don't need these two extra words, so we allocate sizeof(PointObject)-16 bytes and return a pointer to malloc()-16, which works well since nobody accesses those two words. I think that GraalPython could use a similar approach.
3. On PyPy, PyObject_HEAD is *three words*, because it also contains ob_pypy_link. But, since the code uses *H*PyObject_HEAD, PointObject will contain only 2 extra words.
4. In real-world usage, there will be "pure hpy types" and "legacy hpy types", the latter being those which use legacy methods & co. It would be nice if pure hpy types did NOT have to pay any penalty when they are never cast to PyObject*.
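The pinning idea from point 1 can be sketched as follows. This is a toy model under stated assumptions, not PyPy's implementation: ToyObject, ToyHandle, toy_open, toy_pin_cast, toy_handle_close and toy_can_move are all hypothetical names, and the "GC" is reduced to a pin counter that says whether the object may currently be relocated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical movable object: it may be relocated only while unpinned. */
typedef struct {
    int pin_count;
    long x;
    long y;
} ToyObject;

/* Hypothetical handle: remembers whether it has pinned its object. */
typedef struct {
    ToyObject *obj;
    int has_pinned;
} ToyHandle;

ToyHandle toy_open(ToyObject *obj) {
    ToyHandle h = { obj, 0 };
    return h;
}

/* Stand-in for HPy_CAST: pin at most once per handle, return the address. */
ToyObject *toy_pin_cast(ToyHandle *h) {
    if (!h->has_pinned) {
        h->obj->pin_count++;
        h->has_pinned = 1;
    }
    return h->obj;
}

/* Stand-in for closing the handle: release the pin this handle took. */
void toy_handle_close(ToyHandle *h) {
    if (h->has_pinned) {
        h->obj->pin_count--;
        h->has_pinned = 0;
    }
    h->obj = NULL;
}

/* The object may be moved only when no open handle has pinned it. */
int toy_can_move(const ToyObject *obj) {
    return obj->pin_count == 0;
}
```

In this model the pointer returned by toy_pin_cast stays valid exactly as long as the handle is open, which is the lifetime the handle-based API makes explicit.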
With this in mind, how do we implement HPy_AsPyObject on PyPy? One easy way is:
1. we allocate sizeof(PointObject)+8
2. we tweak cpyext to find ob_pypy_link at p-8
3. we teach cpyext how to convert W_HPyObject into PyObject* and vice versa.
However, this means that we always need to allocate 24 extra bytes for each object (the 16 bytes of HPyObject_HEAD that PyPy currently skips, plus 8 for ob_pypy_link), even if nobody ever calls HPy_AsPyObject on it, which looks bad. Moreover, without changes in the API, the pin/unpin implementation of HPy_CAST becomes de facto impossible.
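The size accounting behind that figure can be checked with a small sketch. It assumes a header of two pointer-sized words and one extra pointer-sized word for ob_pypy_link. ToyHPyHead, ToyPoint and the toy_size_* helpers are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HPyObject_HEAD: two words. */
typedef struct {
    intptr_t word0;
    intptr_t word1;
} ToyHPyHead;

typedef struct {
    ToyHPyHead head;
    long x;
    long y;
} ToyPoint;

/* Bytes allocated when the two header words are skipped
 * (the sizeof(PointObject)-16 scheme from point 2). */
size_t toy_size_payload_only(void) {
    return sizeof(ToyPoint) - sizeof(ToyHPyHead);
}

/* Bytes allocated by the "easy way": the full struct plus one
 * extra word in front of it for ob_pypy_link. */
size_t toy_size_easy_way(void) {
    return sizeof(ToyPoint) + sizeof(void *);
}

/* Per-object overhead of the easy way over the current scheme. */
size_t toy_extra_bytes(void) {
    return toy_size_easy_way() - toy_size_payload_only();
}
```

With 8-byte words the overhead is 16 + 8 = 24 bytes per object, regardless of how large the payload is.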
So, my proposal is to distinguish between "legacy hpy types" and "pure hpy types". An HPyType_Spec is legacy if:
1. it uses .legacy_slots = ... OR
2. it sets .legacy = true (i.e., you can explicitly mark a type as legacy even if it no longer has any legacy method/slot. This is useful if you pass it to ANOTHER type which expects to be able to cast the PyObject* into the struct).
If a type is "legacy", the snippet shown above works as expected; if it's not legacy, it is still possible to call HPy_AsPyObject on it, but then you are no longer allowed to C-cast it to PointObject* (on PyPy, this means that you will get a "standard" PyObject* which is a proxy to W_HPyObject).
Ideally, in that case it would be nice to catch the invalid cast in the debug mode, but I don't think this is possible... too bad.
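As a rough illustration of the proposed split, the sketch below models the two criteria and the resulting behaviour. It is not the real HPyType_Spec: ToyTypeSpec carries only the two fields named in the proposal, and ToyProxy, toy_is_legacy and toy_as_object are hypothetical names. The proxy is passed in by the caller purely to keep the example free of allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical spec: only the two fields from the proposal are modelled. */
typedef struct {
    const void *legacy_slots;  /* non-NULL when legacy slots are given */
    bool legacy;               /* explicit opt-in */
} ToyTypeSpec;

/* A type is legacy if it has legacy slots OR is explicitly marked. */
bool toy_is_legacy(const ToyTypeSpec *spec) {
    return spec->legacy_slots != NULL || spec->legacy;
}

typedef struct {
    intptr_t word0;
    intptr_t word1;
} ToyHead;

typedef struct {
    ToyHead head;
    long x;
    long y;
} ToyPoint;

/* Hypothetical proxy: a separate generic object referring to the real one. */
typedef struct {
    ToyHead head;
    ToyPoint *target;
} ToyProxy;

/* Stand-in for HPy_AsPyObject.
 * Legacy type: returns the object's own address, so a C cast back to
 *              ToyPoint* is valid.
 * Pure type:   returns the address of a proxy, which must NOT be cast
 *              to ToyPoint*. */
ToyHead *toy_as_object(const ToyTypeSpec *spec, ToyPoint *obj, ToyProxy *proxy) {
    if (toy_is_legacy(spec)) {
        return &obj->head;
    }
    proxy->target = obj;
    return &proxy->head;
}
```

Nothing in the returned ToyHead* tells the caller which of the two cases applies, which is why an invalid cast on a pure type is hard to detect.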
What do you think?
ciao,
Anto
The plan looks good!
Participants (2): Antonio Cuni, Simon Cross