As far as I understand, we should get a smaller improvement on single-threaded code, because some of the optimizations listed in this work are already partially or totally implemented.

This excludes any non-linear behaviour between the different optimizations, of course, and assumes that both versions yield the same numbers.

On Mon, 11 Oct 2021, 20:28 Abdur-Rahmaan Janhangeer wrote:
When you say "an order of magnitude less overhead than the current CPython implementation", do you mean compared with the main branch? We have recently implemented almost everything that is listed in this paragraph:

We also pack some additional, similar optimizations in this other PR, including stealing the frame arguments in Python-to-Python calls:

This could explain why the performance is closer to that of the current master branch, as you indicate:

Does this mean that if we remove the GIL and add the 3.11 improvements, we should get some more speed?

(Or are those already integrated in the POC?)

Kind Regards,

Abdur-Rahmaan Janhangeer