On Wed, 15 Dec 2021 10:42:17 +0100 Christian Heimes <christian@python.org> wrote:
> On 14/12/2021 19.19, Eric Snow wrote:
>> A while back I concluded that neither approach would work for us. The approach I had taken would have significant cache performance penalties in a per-interpreter GIL world. The approach that modifies Py_INCREF() has a significant performance penalty due to the extra branch on such a frequent operation.
>
> Would it be possible to write the Py_INCREF() and Py_DECREF() macros in a way that does not depend on branching? For example we could use the highest bit of the ref count as an immutable indicator and do something like
>
>     ob_refcnt += !(ob_refcnt >> 63)
>
> instead of
>
>     ob_refcnt++
Probably, but that would also issue spurious writes to immortal refcounts from different threads at once, so might end up worse performance-wise.

Regards

Antoine.
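
For anyone who wants to play with the idea, here is a minimal standalone sketch of the branchless update discussed above. It is not CPython code: demo_object, DEMO_IMMORTAL_BIT, demo_incref() and demo_decref() are hypothetical names made up for illustration, and the refcount is assumed to be an unsigned 64-bit field so that the shift by 63 is well defined (right-shifting a negative signed value is implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical stand-in for PyObject; only the refcount field matters here. */
typedef struct {
    uint64_t ob_refcnt;
} demo_object;

/* Assumption for this sketch: bit 63 set marks the object as immortal. */
#define DEMO_IMMORTAL_BIT (UINT64_C(1) << 63)

/* Branchless increment: !(x >> 63) is 1 when bit 63 is clear and 0 when
 * it is set, so immortal objects keep their refcount unchanged. */
static inline void demo_incref(demo_object *op)
{
    op->ob_refcnt += !(op->ob_refcnt >> 63);
}

/* Branchless decrement, by the same reasoning.
 * Note: the field is still written even when the value added is 0, which
 * is the "spurious write" concern: the cache line may be dirtied anyway. */
static inline void demo_decref(demo_object *op)
{
    op->ob_refcnt -= !(op->ob_refcnt >> 63);
}
```

With these definitions, an object starting at a refcount of 1 goes to 2 after demo_incref() and back to 1 after demo_decref(), while an object whose refcount is DEMO_IMMORTAL_BIT stays at that value through both calls. Whether avoiding the branch actually wins would need measuring, since the unconditional store is likely still emitted for immortal objects.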