Consider the following snippet of code:
typedef struct {
    HPyObject_HEAD
    long x;
    long y;
} PointObject;

void foo(HPyContext ctx, HPy h_point)
{
    PointObject *p1 = HPy_CAST(ctx, PointObject, h_point);
    PyObject *py_point = HPy_AsPyObject(ctx, h_point);  // [1]
    PointObject *p2 = (PointObject*)py_point;
    ...
}
[1] Note that it does not need to be a call to HPy_AsPyObject: it might be
a legacy method which takes a PyObject *self, or something similar.
It is obvious that HPy_CAST and HPy_AsPyObject need to return the very same
address. This is straightforward to implement on CPython, but it poses some
challenges on PyPy (and probably GraalPython).
Things to consider:
1. currently, in PyPy we allocate the PointObject at a non-movable address,
but so far the API does not REQUIRE it. I think it would be reasonable to
have an implementation in which objects are movable and HPy_CAST pins the
memory until the originating handle is closed. OTOH, the only reasonable
semantics is that multiple calls to HPy_AsPyObject always return the same
address.
2. HPyObject_HEAD consists of two words which the implementation can use
as it likes. On CPython, it is obviously mapped to PyObject_HEAD, but in
PyPy we (usually) don't need these two extra words, so we allocate
sizeof(PointObject)-16 bytes and return the pointer we got from malloc()
minus 16, which works well since nobody accesses those two words. I think
that GraalPython could use a similar approach.
3. On PyPy, PyObject_HEAD is *three words*, because it also contains
ob_pypy_link. But, since the code uses *H*PyObject_HEAD, PointObject will
contain only 2 extra words.
4. In real-world usage, there will be "pure hpy types" and "legacy hpy
types", which use legacy methods&co. It would be nice if the pure hpy
types do NOT have to pay penalties in case they are never cast to
PyObject*.
With this in mind, how do we implement HPy_AsPyObject on PyPy? One easy way
is:
1. we allocate sizeof(PointObject)+8
2. we tweak cpyext to find ob_pypy_link at p-8
3. we teach cpyext how to convert W_HPyObject into PyObject* and vice versa.
However, this means that we need to always allocate 24 extra bytes for each
object, even if nobody ever calls HPy_AsPyObject on it, which looks bad.
Moreover, without changes in the API, the pin/unpin implementation of
HPy_CAST becomes de facto impossible.
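
To make the arithmetic of the "easy way" concrete, here is a minimal sketch
of that layout in plain C. It is not PyPy or cpyext code: FakeHead is only a
stand-in for HPyObject_HEAD, and alloc_with_link, link_of, free_with_link and
overhead_bytes are hypothetical names invented for this illustration. It
assumes that intptr_t is one machine word and that its alignment equals its
size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for HPyObject_HEAD: two words the implementation may use. */
typedef struct { intptr_t reserved[2]; } FakeHead;

typedef struct {
    FakeHead head;
    long x;
    long y;
} PointObject;

/* Allocate one extra word in front of the struct (the "+8" on 64-bit)
   and return a pointer to the struct itself, not to the extra word. */
PointObject *alloc_with_link(void)
{
    unsigned char *base = calloc(1, sizeof(intptr_t) + sizeof(PointObject));
    if (base == NULL)
        return NULL;
    return (PointObject *)(base + sizeof(intptr_t));
}

/* The link word lives one word before the struct (the "p-8" on 64-bit). */
intptr_t *link_of(PointObject *p)
{
    return (intptr_t *)((unsigned char *)p - sizeof(intptr_t));
}

/* Release the whole allocation, starting from the hidden link word. */
void free_with_link(PointObject *p)
{
    if (p != NULL)
        free((unsigned char *)p - sizeof(intptr_t));
}

/* Bytes allocated beyond the user-visible fields: head plus link word. */
size_t overhead_bytes(void)
{
    size_t payload = sizeof(PointObject) - sizeof(FakeHead);
    return sizeof(intptr_t) + sizeof(PointObject) - payload;
}
```

On a 64-bit platform overhead_bytes() comes out as three words, i.e. the 24
extra bytes mentioned above, and every object pays for them whether or not
anybody ever asks for a PyObject*.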
So, my proposal is to distinguish between "legacy hpy types" and "pure hpy
types". An HPyType_Spec is legacy if:
1. it uses .legacy_slots = ... OR
2. it sets .legacy = true (i.e., you can explicitly mark a type as legacy
even if you no longer have any legacy method/slot. This is useful if you
pass it to ANOTHER type which expects to be able to cast the PyObject* into
the struct).
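
The rule above can be sketched as a small predicate. This is only an
illustration of the proposal: SpecSketch is a hypothetical, cut-down stand-in
for HPyType_Spec (the real struct has other fields), and spec_is_legacy is a
made-up helper name, not an existing HPy function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for HPyType_Spec. */
typedef struct {
    const char *name;
    void *legacy_slots;  /* non-NULL if the type declares legacy slots */
    bool legacy;         /* explicit opt-in, even without legacy slots */
} SpecSketch;

/* A type is legacy if it has legacy slots OR is explicitly marked legacy. */
bool spec_is_legacy(const SpecSketch *spec)
{
    return spec->legacy_slots != NULL || spec->legacy;
}
```

An implementation could evaluate such a predicate once, at type creation
time, and pick the object layout accordingly, so that pure types never pay
for the extra words.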
If a type is "legacy", the snippet shown above works as expected; if it's
not legacy, it is still possible to call HPy_AsPyObject on it, but then you
are no longer allowed to C-cast it to PointObject* (on PyPy, this will mean
that you will get a "standard" PyObject* which is a proxy to W_HPyObject).
Ideally, it would be nice to catch the invalid cast in debug mode in that
case, but I don't think this is possible... too bad.
What do you think?
ciao,
Anto