[Python-ideas] Multi-core reference count garbage collection

Barry Scott barry at barrys-emacs.org
Sat Jul 21 05:03:32 EDT 2018


> On 21 Jul 2018, at 08:54, Jonathan Fine <jfine2358 at gmail.com> wrote:
> 
> Hi Steve
> 
> Thank you for your message. I think my response below allows us to move forward.
> 
> WHAT'S THE PROBLEM
> You asked:
> > What problem are you trying to solve?
> > It's okay if there is no immediate problem, that you're just exploring
> > alternative garbage collection strategies. Or if you're trying to remove
> > the GIL.
> 
> I'm exploring. I hope we'll find something that helps CPython run faster on multi-core machines, at least for some users. I hope this helps you understand where I'm coming from.
> 

I think that the problem that needs a solution is providing each thread with the knowledge
that it is safe to update state.

The questions that any solution needs to answer are then:
1) How does the solution provide each thread with the knowledge that it is safe to mutate state?
2) How does the solution manage the lifetime of the objects?

For 1 today:

In Python all state is shared between all threads. One lock, the GIL, provides the code with
the knowledge that it is safe to mutate state.
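
As a rough analogy only (this is not how the GIL is implemented), the "one lock
guards all mutation" idea can be sketched with an ordinary threading.Lock. The
names Counter and run are hypothetical and exist only for this illustration.

```python
import threading


class Counter:
    """Shared state guarded by one lock (a loose analogy to the GIL)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Holding the lock is what tells this thread it is safe to mutate.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10_000):
    """Have several threads update the same Counter and return the total."""
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run()
```

Because every update happens while the lock is held, no increments are lost, but
only one thread can be mutating at any moment.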

For 2 today:

The reference counts track the number of uses of an object.
In the simple case, when the ref count reaches 0, the __del__ protocol is run.
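
A minimal sketch of the simple case, assuming CPython (other implementations
may finalize later). Resource and events are hypothetical names used only here;
sys.getrefcount reports one extra reference for its own argument, so only the
difference between the two readings is meaningful.

```python
import sys

events = []  # records when the finalizer runs


class Resource:
    def __del__(self):
        events.append("finalized")


r = Resource()
alias = r                                 # a second reference to the same object
count_with_alias = sys.getrefcount(r)
del alias                                 # drop one reference
count_without_alias = sys.getrefcount(r)

del r  # last reference gone: CPython runs __del__ at this point
```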

In the cases where circular reference patterns prevent detecting that an object
is no longer needed via reference counts, the existing Python garbage collector
figures out that such objects can be deleted.
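
The circular case can be sketched as follows, again assuming CPython. Node is a
hypothetical class, and automatic collection is switched off for the duration
so that the effect of the explicit gc.collect() call is visible.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


gc.disable()  # keep the automatic collector out of the way for this demo

a = Node()
b = Node()
a.other = b
b.other = a              # a and b now reference each other

probe = weakref.ref(a)   # lets us observe the object without keeping it alive
del a, b                 # no outside references remain, but the counts are not 0

alive_after_del = probe() is not None      # the cycle keeps both objects alive
gc.collect()                               # the cycle detector finds and frees them
alive_after_collect = probe() is not None

gc.enable()
```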

I'd note that the reference count is just one of many pieces of state in an object.

One example of an idea that looks directly at these questions is the work on using a sub-interpreter for
each thread. It aims to solve (1) by using a per-interpreter lock, which will allow
each thread to run at full speed. I'm following with interest how the lifetime management will work.

With multiprocessing, (1) is solved by no longer sharing object state.

> Please, in this thread, can we confine ourselves to getting a better understanding of multi-core reference count garbage collection. That is for me already hard enough, in isolation, without mixing in other issues.

I'd argue that the ref counts are not interesting at all; they are only a side effect of one possible solution to the object lifetime problem.

Barry




