[pypy-dev] Interaction between HPy_CAST and HPy_AsPyObject

Antonio Cuni anto.cuni at gmail.com
Sun Sep 20 17:56:57 EDT 2020


After an IRC discussion with Armin, we designed the following solution:
https://github.com/hpyproject/hpy/issues/83

If you have comments, please post them on the GitHub issue, to avoid
splitting the discussion half here and half there :)

ciao,
Antonio

On Sat, Sep 19, 2020 at 11:50 PM Antonio Cuni <anto.cuni at gmail.com> wrote:

> Consider the following snippet of code:
>
> typedef struct {
>     HPyObject_HEAD
>     long x;
>     long y;
> } PointObject;
> void foo(HPyContext ctx, HPy h_point)
> {
>     PointObject *p1 = HPy_CAST(ctx, PointObject, h_point);
>     PyObject *py_point = HPy_AsPyObject(ctx, h_point); // [1]
>     PointObject *p2 = (PointObject*)py_point;
>     ...
> }
>
> [1] Note that it does not need to be a call to HPy_AsPyObject: it could
> be a legacy method which takes a PyObject *self, or something similar.
>
>
> It is obvious that HPy_CAST and HPy_AsPyObject need to return the very
> same address. This is straightforward to implement on CPython, but it poses
> some challenges on PyPy (and probably GraalPython).
>
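A minimal sketch of why the two addresses coincide on a CPython-like layout, where the header is the first member of the struct. DemoHead, demo_cast, demo_as_legacy and same_address are hypothetical stand-ins for illustration only; they are not the real HPy or CPython definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two-word object header
   (NOT the real HPyObject_HEAD or PyObject_HEAD). */
typedef struct {
    void *word0;
    void *word1;
} DemoHead;

typedef struct {
    DemoHead head;   /* plays the role of HPyObject_HEAD */
    long x;
    long y;
} PointObject;

/* Stand-in for HPy_CAST on a CPython-like layout: the struct pointer itself. */
static PointObject *demo_cast(PointObject *obj)
{
    return obj;
}

/* Stand-in for HPy_AsPyObject on a CPython-like layout: pointer to the header. */
static DemoHead *demo_as_legacy(PointObject *obj)
{
    return &obj->head;
}

/* Returns 1 if both views of the object yield the same address. */
static int same_address(PointObject *obj)
{
    return (void *)demo_cast(obj) == (void *)demo_as_legacy(obj);
}
```

Because C guarantees that a pointer to a struct and a pointer to its first member compare equal after conversion, the C-cast in the snippet is safe in this layout. The difficulty described next arises when the implementation does not store the header in front of the payload.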
>
> Things to consider:
>
> 1. currently, in PyPy we allocate the PointObject at a non-movable
> address, but so far the API does not REQUIRE it. I think it would be
> reasonable to have an implementation in which objects are movable and
> HPy_CAST pins the memory until the originating handle is closed. OTOH, the
> only reasonable semantics is that multiple calls to HPy_AsPyObject always
> return the same address.
>
>
> 2. HPyObject_HEAD consists of two words which the implementation can use
> as it likes. On CPython, it is obviously mapped to
> PyObject_HEAD, but in PyPy we (usually) don't need these two extra words,
> so we allocate sizeof(PointObject)-16 and return a pointer to malloc()-16,
> which works well since nobody is accessing those two words. I think that
> GraalPython could use a similar approach.
>
>
> 3. On PyPy, PyObject_HEAD is *three words*, because it also contains
> ob_pypy_link. But, since the code uses *H*PyObject_HEAD, PointObject will
> contain only 2 extra words.
>
>
> 4. In real-world usage, there will be "pure hpy types" and "legacy hpy
> types", which use legacy methods & co. It would be nice if the pure hpy
> types do NOT have to pay penalties in case they are never cast to
> PyObject*.
>
>
>
> With this in mind, how do we implement HPy_AsPyObject on PyPy? One easy
> way is:
>
> 1. we allocate sizeof(PointObject)+8
>
> 2. we tweak cpyext to find ob_pypy_link at p-8
>
> 3. we teach cpyext how to convert W_HPyObject into PyObject* and vice
> versa.
>
>
> However, this means that we need to always allocate 24 extra bytes for
> each object (the 16 bytes of HPyObject_HEAD plus 8 for ob_pypy_link), even
> if nobody ever calls HPy_AsPyObject on it, which looks bad. Moreover,
> without changes in the API, the pin/unpin implementation of HPy_CAST
> becomes de facto impossible.
>
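The size arithmetic above can be sketched as follows, assuming a target where a word is sizeof(void *) (8 bytes on the 64-bit platforms the numbers in this mail refer to). The helper names and the head0/head1 fields are hypothetical and only model the layout; nothing here is taken from the PyPy sources.

```c
#include <assert.h>
#include <stddef.h>

#define WORD sizeof(void *)   /* 8 bytes on a 64-bit target */

typedef struct {
    void *head0;              /* stand-in for the two words of HPyObject_HEAD */
    void *head1;
    long x;
    long y;
} PointObject;

/* Pure layout (point 2): the two header words are never accessed,
   so they are not allocated at all. */
static size_t pure_alloc_size(void)
{
    return sizeof(PointObject) - 2 * WORD;
}

/* "Easy way" layout: the full struct plus one word in front of it
   to hold ob_pypy_link. */
static size_t easy_alloc_size(void)
{
    return sizeof(PointObject) + WORD;
}

/* Offset of ob_pypy_link relative to the PointObject pointer:
   one word before it, i.e. p-8 on a 64-bit target. */
static ptrdiff_t pypy_link_offset(void)
{
    return -(ptrdiff_t)WORD;
}

/* Extra bytes the easy way costs per object compared to the pure layout. */
static size_t easy_overhead(void)
{
    return easy_alloc_size() - pure_alloc_size();
}
```

The overhead is three words per object, which is where the 24 bytes come from on 64-bit, and it is paid even by objects that are never converted to PyObject*.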
>
> So, my proposal is to distinguish between "legacy hpy types" and "pure hpy
> types". An HPyType_Spec is legacy if:
>
> 1. it uses .legacy_slots = ... OR
>
> 2. it sets .legacy = true (i.e., you can explicitly mark a type as legacy
> even if you no longer have any legacy method/slot. This is useful if you
> pass it to ANOTHER type which expects to be able to cast the PyObject* into
> the struct).
>
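The two conditions can be modelled with a small predicate. This is only a sketch: DemoTypeSpec is a hypothetical, simplified struct and demo_spec_is_legacy a hypothetical helper; neither is the real HPyType_Spec nor any existing HPy function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a type spec (NOT the real HPyType_Spec). */
typedef struct {
    const char *name;
    const void *legacy_slots;  /* non-NULL when the type uses legacy slots */
    int legacy;                /* explicit opt-in, even without legacy slots */
} DemoTypeSpec;

/* A spec is legacy if it has legacy slots OR is explicitly marked legacy. */
static int demo_spec_is_legacy(const DemoTypeSpec *spec)
{
    return spec->legacy_slots != NULL || spec->legacy != 0;
}
```

An implementation could evaluate such a predicate once, at type creation time, and choose the object layout accordingly, so that pure types never pay for the extra header words.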
>
> If a type is "legacy", the snippet shown above works as expected; if it's
> not legacy, it is still possible to call HPy_AsPyObject on it, but then you
> are no longer allowed to C-cast it to PointObject* (on PyPy, this will mean
> that you will get a "standard" PyObject* which is a proxy to W_HPyObject).
>
> Ideally, it would be nice to catch the invalid cast in debug mode in that
> case, but I don't think this is possible... too bad.
>
>
> What do you think?
>
> ciao,
>
> Anto
>