Hey all,
For my research, IBM wrote a tracing GC for CPython and I was trying out
some ideas on how we would support the C API.
I know about the handles used in HPy, but I feel they can incur
allocation overhead and use more memory.
Instead, I thought of changing the semantics of the union type (PyObject)
so that it no longer points to internal structures, and using a stack to
share data between Python and C. There can be one push function for each
Python type with a direct representation in C: Py_pushInteger for ints, etc.
When a C function returns, all values on the stack are returned to Python as
the results of the C function, assuming we have a way of returning multiple
values in Python.
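
To make the stack idea concrete, here is a rough sketch of what I have in
mind. Only the name Py_pushInteger comes from the description above; its
signature, the PyStack and Value types, PY_STACK_MAX, and divmod_example are
all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: none of these types exist in CPython today. */
#define PY_STACK_MAX 64

typedef enum { VAL_INT, VAL_FLOAT } ValKind;

typedef struct {
    ValKind kind;
    union { long i; double f; } as;
} Value;

typedef struct {
    Value slots[PY_STACK_MAX];
    size_t top;              /* number of values currently on the stack */
} PyStack;

/* Push a C long as a Python int; returns 0 on success, -1 if the stack is full. */
int Py_pushInteger(PyStack *s, long v) {
    if (s->top >= PY_STACK_MAX) return -1;
    s->slots[s->top].kind = VAL_INT;
    s->slots[s->top].as.i = v;
    s->top++;
    return 0;
}

/* Example C function: pushes quotient and remainder as its two results. */
int divmod_example(PyStack *s, long a, long b) {
    if (b == 0) return -1;
    if (Py_pushInteger(s, a / b) < 0) return -1;
    return Py_pushInteger(s, a % b);
}
```

After divmod_example returns, the runtime would read everything left on the
stack (here two ints) and hand it back to Python as the results of the call.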
Specifically, change
typedef struct Object *PyObject;
To:
typedef unsigned int PyObject;
Where now PyObject becomes an index into an internal array that stores all
values that had to be given to C. This means that while a value is in that
array, it will not be collected by Python. When the C function returns, its
whole array is erased, and the values used by the function can be collected.
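
A minimal sketch of that array, assuming one table per C call. Only the
typedef is from the proposal; CallTable, TABLE_MAX, and the table_* helpers
are hypothetical names, and table_relocate just stands in for whatever a
moving collector would do when it rewrites a root.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int PyObject;   /* index into the per-call table */

#define TABLE_MAX 64             /* hypothetical fixed capacity */

typedef struct {
    void *slots[TABLE_MAX];      /* roots visible to the collector */
    unsigned int count;
} CallTable;

/* Register an object for the current C call and return its index in *out. */
int table_add(CallTable *t, void *obj, PyObject *out) {
    if (t->count >= TABLE_MAX) return -1;
    t->slots[t->count] = obj;
    *out = t->count++;
    return 0;
}

/* Resolve an index to the object's current address (NULL if invalid). */
void *table_get(const CallTable *t, PyObject h) {
    return h < t->count ? t->slots[h] : NULL;
}

/* Stand-in for a moving GC: update the slot; the index held by C stays valid. */
void table_relocate(CallTable *t, PyObject h, void *new_addr) {
    if (h < t->count) t->slots[h] = new_addr;
}

/* Called when the C function returns: drop every root at once. */
void table_clear(CallTable *t) {
    for (unsigned int i = 0; i < t->count; i++) t->slots[i] = NULL;
    t->count = 0;
}
```

Because C code only ever holds the index, the collector is free to change
the address stored in a slot, and clearing the table on return is what makes
the values collectable again.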
This setup gets us a reliable union type (PyObject), and the garbage
collector can also move objects. I think that backward compatibility can
easily be implemented using macros.
What feedback do you have on this approach, and am I overconfident about
having reasonable backward compatibility? Also, can this experiment uncover
any insights that CPython would find useful?
--
Best,
Joannah Nanjekye
*"You think you know when you learn, are more sure when you can write, even
more when you can teach, but certain when you can program." Alan J. Perlis*