[Python-ideas] Multi-core reference count garbage collection

Steven D'Aprano steve at pearwood.info
Fri Jul 20 21:23:26 EDT 2018


On Fri, Jul 20, 2018 at 09:44:39PM +0100, Jonathan Fine wrote:
> Hi Steve
> 
> You wrote:
> > My understanding is that reference counting is both deterministic and
> > immediate. Shifting the reference counting into another thread so that
> > it becomes non-deterministic and potentially delayed doesn't sound like
> > an advantage to my naive understanding.
> 
> The choice is not as simple as that. Here's an example.
> 
> Suppose you and your friend want to sort a deck of cards. Here's an algorithm.

Jonathan, you keep avoiding my direct questions. What problem are you 
trying to solve?

Its okay if there is no immediate problem, that you're just exploring 
alternative garbage collection strategies. Or if you're trying to remove 
the GIL. Or you think that this will make print faster. (Probably not 
the last one *wink*)

And please don't patronise me -- I might not be an expert on garbage 
collection but I know that parallelising certain tasks can make them run 
faster. Your sorting example is irrelevant to understanding this 
reference counting proposal.

Help us understand where you are coming from in this discussion.


> * Divide the deck into two
> * Each person sorts their half deck
> * One of you does a merge of the two half decks
> 
> This algorithm is non-deterministic.

No in any important way it isn't. Although the sub-sorting steps may run 
in any order, the final merge doesn't occur until they are both 
completed. There are no race conditions where the merge happens before 
the sort, for example (unless your implementation is *really* bad). 
Which means from the perspective of the outsider the overall sort is 
deterministic: you start with an unsorted list, call sort, and when the 
sort() function returns, you have a sorted list. The non-determinism is 
no more important than the random partition in Quicksort.

But in the case of reference counting, I understand that multithreaded 
reference counting can suffer from race conditions when incrementing or 
decrementing:

http://flyingfrogblog.blogspot.com/2013/09/how-do-reference-counting-and-tracing.html

That would be a bad thing.

But as I said, my view of multi-threaded reference counting is very 
naive, and I'm no expert. I'm less interested in the technical details 
of how and whether it works as to *why* you are interested in trying it.

Hence my repeated requests for you to explain what problem(s) (if any) 
you are hoping to solve.


-- 
Steve


More information about the Python-ideas mailing list