
On 08. 06. 21 13:08, Joannah Nanjekye wrote:
Hey all,
For my research, IBM wrote a tracing GC for CPython, and I was trying out some ideas on how we would support the C API.
I know about handles used in HPy but I felt they can actually incur allocation overhead and use more memory.
Instead, I thought of changing the semantics of the union type (PyObject) so that it does not point to internal structures, and of using a stack for sharing data between Python and C. There can be one push function for each Python type with a direct representation in C: Py_pushInteger for ints, etc. When a C function returns, all values in the stack are returned to Python as the results of the C function, assuming we can have a way of returning multiple values in Python.
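A minimal sketch of the stack idea described above, to make it concrete. It assumes a fixed-size value stack and only integer values; every name here (demo_stack, demo_push_integer, demo_divmod, demo_call_divmod) is hypothetical and not part of CPython or of the proposed implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: none of these names exist in CPython. */
typedef enum { DEMO_INT } demo_tag;

typedef struct {
    demo_tag tag;   /* which member of the union is valid */
    union { long i; } as;
} demo_value;

#define DEMO_STACK_MAX 16

typedef struct {
    demo_value slots[DEMO_STACK_MAX];
    size_t top;     /* number of values currently on the stack */
} demo_stack;

/* One push function per type that has a direct C representation. */
void demo_push_integer(demo_stack *s, long v)
{
    assert(s->top < DEMO_STACK_MAX);
    s->slots[s->top].tag = DEMO_INT;
    s->slots[s->top].as.i = v;
    s->top++;
}

/* A stand-in "C extension function": it pushes two results
 * (quotient and remainder) instead of returning a pointer. */
void demo_divmod(demo_stack *s, long a, long b)
{
    demo_push_integer(s, a / b);
    demo_push_integer(s, a % b);
}

/* Interpreter side: call the C function, then treat everything it
 * left above the old stack top as its (possibly multiple) results. */
size_t demo_call_divmod(demo_stack *s, long a, long b)
{
    size_t base = s->top;
    demo_divmod(s, a, b);
    return s->top - base;   /* number of returned values */
}
```

In this sketch the caller sees only the number of pushed values and reads them from the slots above the previous stack top; the C function never holds a pointer into interpreter memory.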
Sorry, but I don't understand this proposal. What do you mean by "returned to Python"?
Calling a C function is not a special case. It's the opposite: calling a Python function is done by calling a C function (one of https://docs.python.org/3/c-api/call.html#object-calling-api or their private variants).
Specifically, change
typedef struct Object *PyObject;
To:
typedef unsigned int PyObject;
Where now PyObject becomes an index into an internal array that stored all values that had to be given to.
Given to what?
This means that when a value is in that array, it would not be collected by Python. When the C function returns its whole array is erased, and the values used by the function are collected.
This setup gets us a reliable Union type (PyObject), and the garbage collector can also move objects. I think that backward compatibility can easily be implemented using macros.
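A rough sketch of the index-based scheme, assuming a fixed-size per-call table whose slots act as GC roots. All names (demo_handle, demo_table, demo_register, demo_deref, demo_relocate, demo_end_call) are hypothetical, and demo_relocate only imitates what a moving collector might do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: plays the role of the proposed
 * "typedef unsigned int PyObject;". */
typedef unsigned int demo_handle;

/* Stand-in for an object living on the collector's heap. */
typedef struct { long value; } demo_object;

#define DEMO_TABLE_MAX 16

typedef struct {
    demo_object *slots[DEMO_TABLE_MAX]; /* treated as GC roots */
    demo_handle used;                   /* number of live entries */
} demo_table;

/* Give an object to C code: store it in the table, hand out its index. */
demo_handle demo_register(demo_table *t, demo_object *obj)
{
    assert(t->used < DEMO_TABLE_MAX);
    t->slots[t->used] = obj;
    return t->used++;
}

/* Resolve an index back to the object's current address. */
demo_object *demo_deref(const demo_table *t, demo_handle h)
{
    assert(h < t->used);
    return t->slots[h];
}

/* Imitate a moving collector: copy the object to a new location and
 * update the slot. The index held by C code stays valid. */
void demo_relocate(demo_table *t, demo_handle h, demo_object *new_home)
{
    assert(h < t->used);
    *new_home = *t->slots[h];
    t->slots[h] = new_home;
}

/* When the C function returns, its whole table is erased, so the
 * objects it referenced are no longer kept alive by it. */
void demo_end_call(demo_table *t)
{
    t->used = 0;
}
```

The point of the indirection is that C code only ever stores the index, so relocating the object changes one table slot rather than every reference held on the C side.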
What feedback do you have on this approach, and am I overconfident about achieving reasonable backward compatibility? Also, can this experiment uncover any insights that CPython would find useful?