Hi

Based on other people's work (including in particular talks by Larry Hastings) and my own thinking, I've come up with a scheme for multi-core reference count garbage collection. I think it works, and much or all of it comes from others. It might even be in the literature. Of course, if it's wrong then the debit goes to me.

I'll explain it in instalments. Please let me know what you think, as we go along.

The basic ideas of reference counting garbage collection are:

1. Each allocated piece of memory is given an ID (which is often its address in memory).
2. For each ID, the system keeps a count of how many references (to that piece of memory).
3. The count starts at ONE.
4. Processes increment and decrement the count, to keep the count correct.
5. When the count reaches zero, the piece of memory is garbage collected.
6. The previous step might result in further reference count decrements.

The simplest possible garbage collection scheme is to do nothing (and let the garbage build up). In other words, allocate on a stack, and never free memory. This has very good performance, until memory is exhausted. It is also conceptually useful. I'll call it the do-nothing garbage collection scheme.

Suppose for definiteness we have a 6-core CPU, with 5 worker processes and one garbage collection (GC) process. The worker processes will do as little as possible, consistent with not running out of memory. To achieve this:

1. Each worker process with have two GC buffers, one for increment and the other for decrement.
2. For increment, the process will append the ID to the increment buffer (for the process). And similarly for decrement.
3. If the buffer to be appended to is full, the worker process will
(a) set a buffer-full flag
(b) pause until the buffer-full flag has been cleared
(c) and then do what it was previously blocked from doing

I call any such scheme BUFFERED multi-core reference count garbage collection. The worker processes know nothing about how garbage collection is managed. Instead, they pass over to the GC process sufficient information to allow it to manage the garbage.

This is now a good place to pause. We've specified what it is that the worker processes should do, regarding garbage collection. And we've passed the problem on to the garbage collection process. If that process does nothing, we have refactored the do-nothing garbage collection scheme.

Expect another instalment tomorrow.

with best regards

Jonathan