
Hey all,
For my research, IBM wrote a tracing GC for CPython, and I have been trying out some ideas on how we would support the C API.
I know about the handles used in HPy, but I feel they can incur allocation overhead and use more memory.
Instead, I thought of changing the semantics of the union type (PyObject) so that it no longer points to internal structures, and of using a stack to share data between Python and C. There can be one push function for each Python type with a direct representation in C: Py_pushInteger for ints, and so on. When a C function returns, all values on the stack are returned to Python as the results of the C function, assuming we can have a way of returning multiple values in Python.
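To make the stack idea concrete, here is a minimal toy sketch in plain C. It is not CPython code: the CallStack and Value types, the STACK_MAX limit, the Py_pushFloat function, and the ext_divmod example are all hypothetical names I made up for illustration, and the signature given to Py_pushInteger is only an assumption.

```c
#include <stddef.h>

/* Toy model only: none of these types exist in CPython. */
#define STACK_MAX 16

typedef enum { VAL_INT, VAL_FLOAT } ValueTag;

typedef struct {
    ValueTag tag;
    union { long i; double f; } as;
} Value;

typedef struct {
    Value slots[STACK_MAX];
    size_t top;                 /* number of values currently on the stack */
} CallStack;

/* Push a C long as a Python int. Returns 0 on success, -1 if the stack is full. */
int Py_pushInteger(CallStack *s, long v) {
    if (s->top >= STACK_MAX) return -1;
    s->slots[s->top].tag = VAL_INT;
    s->slots[s->top].as.i = v;
    s->top++;
    return 0;
}

/* Hypothetical sibling for floats, same contract as Py_pushInteger. */
int Py_pushFloat(CallStack *s, double v) {
    if (s->top >= STACK_MAX) return -1;
    s->slots[s->top].tag = VAL_FLOAT;
    s->slots[s->top].as.f = v;
    s->top++;
    return 0;
}

/* A hypothetical extension function: it never sees an object pointer.
   Whatever is on the stack when it returns is its result (here, two values). */
void ext_divmod(CallStack *s, long a, long b) {
    Py_pushInteger(s, a / b);
    Py_pushInteger(s, a % b);
}
```

In this sketch the interpreter would read slots 0..top-1 after ext_divmod returns and hand them back to Python as the (multiple) return values.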
Specifically, change
typedef struct Object *PyObject;
To:
typedef unsigned int PyObject;
where PyObject now becomes an index into an internal array that stores all values that had to be given to C. This means that while a value is in that array, it will not be collected by Python. When the C function returns, its whole array is erased, and the values used by the function can be collected.
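A rough sketch of that array, again as a toy model rather than real CPython code. The RootArray type, the ROOTS_MAX limit, the layout of struct Object, and the root_* helper names are all hypothetical; only the PyObject typedef comes from the proposal above.

```c
#include <stddef.h>

typedef unsigned int PyObject;   /* the proposed definition: an index, not a pointer */

typedef struct Object { long value; } Object;   /* stand-in for an internal object */

#define ROOTS_MAX 16

typedef struct {
    Object *slots[ROOTS_MAX];    /* objects handed to the running C function */
    unsigned int count;
} RootArray;

/* Register an object for the current C call and hand back its index.
   Returns 0 on success, -1 if the array is full. */
int root_add(RootArray *r, Object *obj, PyObject *out) {
    if (r->count >= ROOTS_MAX) return -1;
    r->slots[r->count] = obj;
    *out = r->count++;
    return 0;
}

/* Resolve an index to the object's current address (NULL if invalid). */
Object *root_get(const RootArray *r, PyObject h) {
    return h < r->count ? r->slots[h] : NULL;
}

/* What a moving collector could do: copy the object to a new address and
   patch the slot. The index held by C code stays valid. */
void root_relocate(RootArray *r, PyObject h, Object *new_location) {
    *new_location = *r->slots[h];
    r->slots[h] = new_location;
}

/* Called when the C function returns: erase the whole array so the
   objects are no longer kept alive by it. */
void root_clear(RootArray *r) {
    for (unsigned int i = 0; i < r->count; i++) r->slots[i] = NULL;
    r->count = 0;
}
```

The point of the sketch is that C code only ever holds the index, so the collector is free to change the address stored in the slot, and clearing the array on return drops every root the call created.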
This setup gives us a reliable union type (PyObject), and the garbage collector can also move objects. I think that backward compatibility can easily be implemented using macros.
What feedback do you have on this approach, and am I overconfident about achieving reasonable backward compatibility? Also, could this experiment uncover any insights that CPython would find useful?
-- Best, Joannah Nanjekye
*"You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program." Alan J. Perlis*