Hello,
we are planning to have a monthly HPy dev call on every first Thursday
at 10a (CET/CEST). The first call will be on April 2nd, 2020 (10a CEST).
The meeting will be held on Zoom:
Single click URL to join Zoom: https://oracle.zoom.us/j/339127822
Meeting ID: 339 127 822
If anyone wants to join but cannot make it due to the scheduled time,
please let us know (by replying to this mailing list or on the #hpy IRC
channel).
- Florian
Hello,
this is a long and dense email which contains some design ideas which are
needed to progress towards supporting custom types in HPy.
So far, I have opened github issues to discuss API design questions, but
usually I don't get much feedback, so I'm trying to write an email to see
whether I have better luck 😅.
I think that it's very important to make the right decisions in this phase,
so I'd like to hear opinions.
The following text is written using markdown syntax and contains
snippets of code, so if you prefer you can read with syntax highlighting at
this link: https://gist.github.com/antocuni/fd44ee897cc4cf14e874e2e686c60649
To keep compatibility with CPython, we need to generate a trampoline for
each
method which uses one of HPy calling conventions, e.g.:
```c
HPy_DEF_METH_VARARGS(add_ints)
static HPy add_ints_impl(HPyContext ctx, HPy self, HPy *args, HPy_ssize_t
nargs);
```
After this, `add_ints` is a function of type `HPyMeth` (which is different
in
the CPython and the Universal cases), and can be put inside an
`HPyMethodDef[]`. Morever, we can put old CPython-style methods by using an
appropriate flag. Note the `HPy_METH_*` vs `METH_*` flags, and also note
that
if you specify the wrong variant, things will happily explode:
```c
static PyObject *legacy_meth(PyObject *self, PyObject *args);
static HPyMethodDef FooMethods[] = {
...
{"add_ints", add_ints, HPy_METH_VARARGS, ""},
{"legacy_meth", legacy_meth, METH_VARARGS, ""},
...
}
```
Right now we have a different `HPy_DEF_METH_*` macro for the few calling
conventions we support, but to fully support custom types, we need many
more,
for example `newfunc`, `allocfunc`, `getattrfunc`, etc. which are C typedefs
stored in the various `tp_new`, `tp_alloc`, etc.
Before you ask: we need trampolines for all these functions to:
1. "Insert" the HPyContext (CPython will call the trampoline without it,
but
the `_impl` function the user is going to write will need it)
2. Convert `PyObject *` into `HPy`
3. Convert the result
The easiest option is to add more macros and create `HPy_DEF_METH_NEWFUNC` &
co., but I wonder whether we can do better.
To start with, I would like to have only ONE macro. This can be done with a
preprocessor trick:
```c
#define _HPy_DEF_METH_O(name) ...
#define _HPy_DEF_METH_VARARGS(name) ...
#define HPy_DEF_METH(kind, name) _HPy_DEF_##kind(name)
HPy_DEF_METH(METH_O, foo)
HPy_DEF_METH(METH_VARARGS, bar)
```
I think this is already an improvement over the existing situation. The
quesion is, what is the best syntax to use? Some random proposals:
* `HPy_DEF_METH(METH_O, ...`: I don't like the repetion of METH; moreover,
`METH_O` is the *CPython* macro to indicate a legacy function, so you
define your meth using `METH_O` but then you need to use `HPy_METH_O` in
the method def table.
* `HPy_DEF_METH(HPy_METH_O, ...)`: this is a bit verbose, but it's more
explicit.
* `HPy_DEF(HPy_METH_O, ...)`: is it better?
* `HPy_DEFINE(...)`
* `HPy_DEF_METH(O, ...)`
* `HPyMeth_DEF(...)`: I think I like this better than the current
`HPy_DEF_METH`.
Keep in mind few facts:
* We need to offer also the corresponding `HPy_DECL_METH`, which is needed
if you want to put your methods in multiple *.c files. So any syntax we
choose, must have a "nice" corresponding syntax for the declaration.
* We need to use this syntax also for the various `newfunc`, `allocfunc`,
etc. This is alredy a bit weird because `METH_{NOARGS,O,VARARGS}` are
flags, while `newfunc` &co. are typedefs; so, for example:
- `HPyMeth_DEF(METH_O, ...)` and `HPyMeth_DEF(newfunc, ...)` ?
The following proposal tries to solve/improve these issues, and goes a bit
deeper than just renaming the macros. To start with, we should have a clear
separation between the `PyMethodDef[]` (for legacy functions) and
`HPyMethodDef[]`, instead of relying on a flag which can be easily
mistaken. For example:
```c
static PyMethodDef LegacyMethods[] = {
{"legacy1", legacy1, METH_NOARGS, ""},
{"legacy2", legacy2, METH_O, ""},
{NULL, NULL, 0, NULL}
};
static HPyMethodDef CoolMethods[] = {
{"foo", foo, METH_NOARGS, ""},
{"bar", bar, METH_O, ""},
{NULL, NULL, 0, NULL}
};
static HPyModuleDef moduledef = {
HPyModuleDef_HEAD_INIT,
...
.m_legacy_methods = LegacyMethods,
.m_methods = CoolMethods,
};
```
Among the other things, this has advantage of removing the distintion
between
`METH_O` and `HPy_METH_O`. The original idea of keeping `.m_methods = ...`
was
to make it as similar as possible to the existing syntax, to make porting
easier without having to modify tons of place. However, my experience with
ultrajson tells me that what you do in practice is to port methods one by
one,
so while you are at it moving the definition from one table to the other is
not a big deal (plus, if we design our types correctly you would get a nice
compilation error if you forget to move the method to the right table).
The main disadvantage of this is that `HPyModule_Create` can no longer be an
alias for `PyModule_Create` in the CPython ABI mode, but it would need to
be a
custom function which creates the "real" `PyModuleDef`. But note that we
need
to implement it anyway for the cpython-universal case, and it's exactly the
same sitation we already have for `HPyType_FromSpec`.
Once we decide to use our own `HPyModuleDef`, we can become creative, which
brings more API design questions. For example, we could use unions to have a
more type-safe way of specifying methods:
```c
typedef HPy (*HPyMeth_NOARGS)(HPyContext, HPy);
typedef HPy (*HPyMeth_O)(HPyContext, HPy, HPy);
typedef struct {
const char *name;
union {
void *pfunc;
HPyMeth_NOARGS meth_noargs;
HPyMeth_O meth_o;
};
int flags;
} HPyMethodDef;
...
static HPyMethodDef Methods[] = {
{"foo", .meth_noargs = foo, METH_NOARGS},
{"bar", .meth_o = bar, METH_O},
{NULL, NULL, 0}
};
```
And the same for `HPyType_Spec`/`HPyType_Slot`:
```c
typedef HPy (*HPy_newfunc)(HPyContext, HPy, HPy, HPy);
typedef HPy (*HPy_allocfunc)(HPyContext, HPy, HPy_ssize_t);
typedef struct {
int slot;
union {
void *p;
HPy_newfunc tp_new;
HPy_allocfunc tp_alloc;
// ...
};
} HPyType_Slot;
static HPyType_Slot slots[] = {
{Py_tp_new, .tp_new = Foo_New},
{Py_tp_alloc, .tp_alloc = Foo_Alloc},
{0, NULL},
};
```
* The biggest advantage is that if you use a function with the wrong
signature, you get a nice warning.
* The biggest disadvantage is that setting the correct field of the union
is
obviously not enough: you still need to manually specify the correct
flags, but at least it should be "visually evident" if something is
wrong
(e.g. if you specify `{... .meth_o = foo, METH_NOARGS}`). I don't think
there is any way to get a nice compiler error/warning in this case: if
you
have any idea, please shout :)
If we go down this route, I suggest the following syntax for
`HPyMeth_DEFINE`. The nice property is that there is a direct correspondence
between the first argument of the macro and the `typedef` of the function
pointer:
```c
typedef HPy (*HPyMeth_NOARGS)(HPyContext, HPy);
typedef HPy (*HPyMeth_O)(HPyContext, HPy, HPy);
typedef HPy (*HPy_newfunc)(HPyContext, HPy, HPy, HPy);
typedef HPy (*HPy_allocfunc)(HPyContext, HPy, HPy_ssize_t);
// ...
HPyMeth_DEFINE(O, foo) // foo_impl is of type HPyMeth_O
HPyMeth_DEFINE(newfunc, Foo_New) // Foo_New_impl is of type HPyMeth_newfunc
```
Another nice property of this scheme is that we can extend it nicely if in
the
future we want to go towards an "argument-clinic-like" setup. E.g.:
```c
// L_LDO means "returns a long and takes a long, a double and an object"
typedef long (*HPyMeth_L_LDO)(long, double, HPy);
HPyMeth_DEFINE(L_LDO, foo)
...
```
But I think that this email is already too long and too dense, so let's not
discuss about "argument-clinic-like" now :)
Please let me know what you think!
ciao,
Anto