When you say "an order of magnitude less overhead than the current CPython implementation", do you mean compared with the main branch? We recently implemented almost everything listed in this paragraph:

https://github.com/python/cpython/pull/27077

We also packed some similar extra optimizations into this other PR, including stealing the frame arguments in Python-to-Python calls:

https://github.com/python/cpython/pull/28488

This could explain why the performance is closer to that of the current master branch, as you indicate:


Does this mean that if we remove the GIL and add the 3.11 improvements, we should get some more speed?

(Or are those already integrated in the POC?)



Kind Regards,

Abdur-Rahmaan Janhangeer
github
Mauritius