
On Wed, Dec 15, 2021 at 02:57:46PM -0800, Guido van Rossum wrote:
On Wed, Dec 15, 2021 at 6:04 AM Antoine Pitrou <antoine@python.org> wrote:
On Wed, 15 Dec 2021 14:13:03 +0100 Antoine Pitrou <antoine@python.org> wrote:
Did you try to take into account the envisioned project for adding a "complete" GC and removing the GIL?
Sorry, I was misremembering the details. Sam Gross' proposal (posted here on 07/10/2021) doesn't switch to a "complete GC", but it changes reference counting to a more sophisticated scheme (which includes immortalization of objects):
https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd...
A note about this: Sam's immortalization covers exactly the objects that Eric is planning to move into the interpreter state struct: "such as interned strings, small integers, statically allocated PyTypeObjects, and the True, False, and None objects". (Well, he says "such as" but I think so does Eric. :-)
Sam's approach is to use the lower bit of the ob_refcnt field to indicate immortal objects. This would not work given the stable ABI (which has macros that directly increment and decrement the ob_refcnt field). In fact, I think that Sam's work doesn't preserve the stable ABI at all. However, setting a very high bit (the bit just below the sign bit) would probably work. Say we're using 32 bits. We use the value 0x_6000_0000 as the initial refcount for immortal objects. The stable ABI will sometimes increment this, sometimes decrement it. But as long as the imbalance is less than 0x_2000_0000, the refcount will remain in the inclusive range [ 0x_4000_0000 , 0x_7FFF_FFFF ] and we can test for immortality by testing a single bit:
if (o->ob_refcnt & 0x_4000_0000)
I don't know how long that would take, but I suspect that a program that just increments the refcount relentlessly would have to run for hours before hitting this range. On a 64-bit machine the same approach would require years to run before a refcount would exceed the maximum allowable imbalance. (These estimates are from Mark Shannon.)
I did some research on this a few years back. I was curious what sort of "max reference counts" were encountered in the wild, in long-running real life programs. For the same reason: I wanted to get some insight into how many unused bits could possibly be repurposed for future shenanigans (I had PyParallel* in the mind at the time). I added some logic to capture* the max reference counts of the None, True, and Zero objects (in a trace callback), then ran a really long simulation program of a client's (it ran for about 5-6 hours). The results were as follows: MaxNoneRefCount 9,364,132 MaxTrueRefCount 204,215 MaxZeroRefCount 36,784 I thought that was pretty interesting. Potentially many, many upper bits for the taking. The code also had some logic that would int 3 as soon as a 32-bit refcnt overflowed, and that never hit either (obviously, based on the numbers above). I also failed to come up with real-life code that would result in a Python object having a reference count higher than None's refcnt, but that may have just been from lack of creativity. Just thought I'd share. Regards, Trent. [*] 1: https://github.com/pyparallel/pyparallel [*] 2: https://github.com/tpn/tracer/blob/master/PythonTracer/PythonTracer.h#L690