[Python-Dev] Re: "immortal" objects and how they would help per-interpreter GIL

15 Dec 2021


On Wed, Dec 15, 2021 at 2:21 AM Antoine Pitrou <antoine@python.org> wrote:
> On Wed, 15 Dec 2021 10:42:17 +0100
> Christian Heimes <christian@python.org> wrote:
>> On 14/12/2021 19.19, Eric Snow wrote:
>>> A while back I concluded that neither approach would work for us.  The
>>> approach I had taken would have significant cache performance
>>> penalties in a per-interpreter GIL world.  The approach that modifies
>>> Py_INCREF() has a significant performance penalty due to the extra
>>> branch on such a frequent operation.
>>
>> Would it be possible to write the Py_INCREF() and Py_DECREF() macros in
>> a way that does not depend on branching? For example we could use the
>> highest bit of the ref count as an immutable indicator and do something like
>>
>>     ob_refcnt += !(ob_refcnt >> 63)
>>
>> instead of
>>
>>     ob_refcnt++
>
> Probably, but that would also issue spurious writes to immortal
> refcounts from different threads at once, so might end up worse
> performance-wise.

Unless the CPU is clever enough to skip claiming the cacheline in
exclusive-mode for a "+= 0". Which I guess is something you'd have to
check empirically on every microarch and instruction pattern you care
about, because there's no way it's documented. But maybe? CPUs are
very smart, except when they aren't.
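
For concreteness, here is a minimal sketch of the two increment styles being compared. It is not CPython's actual code: obj_t, IMMORTAL_BIT, incref_branching() and incref_branchless() are hypothetical names, and the refcount is assumed to be an unsigned 64-bit field whose top bit marks an immortal object.

```c
#include <stdint.h>

/* Hypothetical stand-in for PyObject; only the refcount matters here. */
typedef struct {
    uint64_t ob_refcnt;
} obj_t;

/* Assumption: the highest bit of the refcount flags an immortal object. */
#define IMMORTAL_BIT (UINT64_C(1) << 63)

/* Branching variant: no store at all for immortal objects, at the cost
 * of a test on every increment. */
static inline void incref_branching(obj_t *op)
{
    if (!(op->ob_refcnt & IMMORTAL_BIT)) {
        op->ob_refcnt++;
    }
}

/* Branchless variant from the thread: adds 1 for ordinary objects and 0
 * for immortal ones. As written, it still stores to ob_refcnt in both
 * cases, which is the "+= 0" write discussed above. */
static inline void incref_branchless(obj_t *op)
{
    op->ob_refcnt += !(op->ob_refcnt >> 63);
}
```

Both functions leave an immortal object's count unchanged; they differ only in whether a store is requested. What the compiler actually emits for either one (a branch, a conditional move, an unconditional store) can vary, so the generated assembly would need checking alongside any benchmark.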

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

Nathaniel Smith