[pypy-dev] Interaction between HPy_CAST and HPy_AsPyObject

Antonio Cuni anto.cuni at gmail.com
Sat Sep 19 17:50:09 EDT 2020


Consider the following snippet of code:

typedef struct {
    HPyObject_HEAD
    long x;
    long y;
} PointObject;
void foo(HPyContext ctx, HPy h_point)
{
    PointObject *p1 = HPy_CAST(ctx, PointObject, h_point);
    PyObject *py_point = HPy_AsPyObject(ctx, h_point); // [1]
    PointObject *p2 = (PointObject*)py_point;
    ...
}

[1] Note that this does not need to be a call to HPy_AsPyObject: the
PyObject* might also come from a legacy method which takes a
PyObject *self, or from other similar sources.


It is obvious that HPy_CAST and HPy_AsPyObject need to return the very same
address. This is straightforward to implement on CPython, but it poses some
challenges on PyPy (and probably GraalPython).


Things to consider:

1. currently, in PyPy we allocate the PointObject at a non-movable address,
but so far the API does not REQUIRE it. I think it would be reasonable to
have an implementation in which objects are movable and HPy_CAST pins the
memory until the originating handle is closed. OTOH, the only reasonable
semantics is that multiple calls to HPy_AsPyObject always return the same
address.


2. HPyObject_HEAD consists of two words which the implementation can use
as it likes. On CPython, it is obviously mapped to
PyObject_HEAD, but in PyPy we (usually) don't need these two extra words,
so we allocate sizeof(PointObject)-16 and return a pointer to malloc()-16,
which works well since nobody is accessing those two words. I think that
GraalPython could use a similar approach.


3. On PyPy, PyObject_HEAD is *three words*, because it also contains
ob_pypy_link. But, since the code uses *H*PyObject_HEAD, PointObject will
contain only 2 extra words.
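
To make the difference between the two headers concrete, here is a minimal
sketch in plain C. The names SketchHPyHead, SketchPyPyHead and the two point
structs are hypothetical stand-ins that I made up for illustration: they are
NOT the real HPy or cpyext definitions, and the field order of the three-word
header is an assumption. The only thing it shows is that the first user field
ends up at a different offset depending on which header is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HPyObject_HEAD: two opaque words. */
typedef struct {
    void *reserved[2];
} SketchHPyHead;

/* Hypothetical stand-in for PyPy's PyObject_HEAD: three words,
   one of which is ob_pypy_link (field order is an assumption). */
typedef struct {
    intptr_t ob_refcnt;
    intptr_t ob_pypy_link;
    void *ob_type;
} SketchPyPyHead;

/* A point declared with the two-word header. */
typedef struct {
    SketchHPyHead head;
    long x;
    long y;
} SketchHPyPoint;

/* The same point declared with the three-word header. */
typedef struct {
    SketchPyPyHead head;
    long x;
    long y;
} SketchPyPyPoint;

/* Number of machine words occupied by a header of the given size. */
size_t sketch_head_words(size_t head_size)
{
    return head_size / sizeof(void *);
}
```

On the usual 32-bit and 64-bit ABIs, sketch_head_words(sizeof(SketchHPyHead))
is 2 and sketch_head_words(sizeof(SketchPyPyHead)) is 3, so x lives one word
later in the second struct than in the first.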


4. In real-world usage, there will be "pure hpy types" and "legacy hpy
types", the latter being the ones which use legacy methods & co. It would be
nice if the pure hpy types do NOT have to pay any penalty in case they are
never cast to PyObject*.



With this in mind, how do we implement HPy_AsPyObject on PyPy? One easy way
is:

1. we allocate sizeof(PointObject)+8

2. we tweak cpyext to find ob_pypy_link at p-8

3. we teach cpyext how to convert W_HPyObject into PyObject* and vice versa.


However, this means that we need to always allocate 24 extra bytes for each
object (the 16 bytes of HPyObject_HEAD which we currently skip, plus the 8
bytes for ob_pypy_link), even if nobody ever calls HPy_AsPyObject on it,
which looks bad. Moreover, without changes in the API, the pin/unpin
implementation of HPy_CAST becomes de facto impossible.
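
Steps 1 and 2 of the "easy way" can be sketched in plain C as follows. This
is only an illustration of the memory layout, under the assumption that the
hidden link is one intptr_t-sized word stored immediately before the struct.
SketchPoint, sketch_point_new, sketch_link_slot and sketch_point_free are
hypothetical names, not part of HPy or cpyext, and step 3 (the
W_HPyObject <-> PyObject* conversion) is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct which starts with HPyObject_HEAD. */
typedef struct {
    void *head[2];   /* the two implementation-defined words */
    long x;
    long y;
} SketchPoint;

/* Size of the hidden word placed in front of the struct
   (8 bytes on a 64-bit platform). */
#define SKETCH_LINK_SIZE sizeof(intptr_t)

/* The struct must still be correctly aligned after skipping the hidden word. */
_Static_assert(sizeof(intptr_t) % _Alignof(SketchPoint) == 0,
               "hidden word would misalign SketchPoint");

/* Step 1: allocate sizeof(SketchPoint) plus one extra word, and hand out
   a pointer which skips that extra word. Returns NULL on failure. */
SketchPoint *sketch_point_new(long x, long y)
{
    unsigned char *raw = calloc(1, SKETCH_LINK_SIZE + sizeof(SketchPoint));
    if (raw == NULL)
        return NULL;
    SketchPoint *p = (SketchPoint *)(raw + SKETCH_LINK_SIZE);
    p->x = x;
    p->y = y;
    return p;
}

/* Step 2: the link is found at "p minus one word". */
intptr_t *sketch_link_slot(SketchPoint *p)
{
    return (intptr_t *)((unsigned char *)p - SKETCH_LINK_SIZE);
}

/* Free the original allocation, which starts one word before p. */
void sketch_point_free(SketchPoint *p)
{
    if (p != NULL)
        free((unsigned char *)p - SKETCH_LINK_SIZE);
}
```

For example, after p = sketch_point_new(3, 4), writing to
*sketch_link_slot(p) does not disturb p->x or p->y, and the caller only ever
sees the address of the struct itself. The cost is visible in the calloc()
call: the extra word is paid for every object, whether or not anybody asks
for the link.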


So, my proposal is to distinguish between "legacy hpy types" and "pure hpy
types". An HPyType_Spec is legacy if:

1. it uses .legacy_slots = ... OR

2. it sets .legacy = true (i.e., you can explicitly mark a type as legacy
even if you no longer have any legacy method/slot. This is useful if you
pass it to ANOTHER type which expects to be able to cast the PyObject* into
the struct).
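
The rule above boils down to a simple predicate. Here is a minimal sketch,
assuming a simplified spec struct: SketchTypeSpec and sketch_spec_is_legacy
are hypothetical names which only mirror the two proposed fields
(.legacy_slots and .legacy), and are not the real HPyType_Spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified spec: only the two fields the rule looks at. */
typedef struct {
    const char *name;
    const void *legacy_slots;  /* NULL when the type has no legacy slots */
    bool legacy;               /* explicit opt-in, even without legacy slots */
} SketchTypeSpec;

/* A spec is legacy if it has legacy slots OR is explicitly marked as legacy. */
bool sketch_spec_is_legacy(const SketchTypeSpec *spec)
{
    return spec->legacy_slots != NULL || spec->legacy;
}
```

With this, a spec which sets neither field is "pure", while a spec which sets
only .legacy = true is treated exactly like one which has legacy slots.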


If a type is "legacy", the snippet shown above works as expected; if it's
not legacy, it is still possible to call HPy_AsPyObject on it, but then you
are no longer allowed to C-cast it to PointObject* (on pypy, this will mean
that you will get a "standard" PyObject* which is a proxy to W_HPyObject).

Ideally, it would be nice to catch the invalid cast in debug mode in that
case, but I don't think this is possible... too bad.


What do you think?

ciao,

Anto