[Python-ideas] Ideas towards GIL removal
Brett Cannon
brett at python.org
Sun Apr 15 21:52:33 CEST 2007
On 4/15/07, Talin <talin at acm.org> wrote:
>
> Greg Ewing wrote:
> > 2) Objects owned by a thread
> >
> > Python code creates and destroys temporary objects
> > at a high rate -- stack frames, argument tuples,
> > intermediate results, etc. If the code is executed
> > by a thread, those objects are rarely if ever seen
> > outside of that thread. It would be beneficial if
> > refcount operations on such objects could be carried
> > out by the thread that created them without locking.
>
> While we are on the topic of reference counting, I'd like to direct your
> attention to Recycler, an IBM research project:
>
> "The Recycler is a concurrent multiprocessor garbage collector with
> extremely low pause times (maximum of 6 milliseconds over eight
> benchmarks) while remaining competitive with the best throughput-
> oriented collectors in end-to-end execution times. This paper describes
> the overall architecture of the Recycler, including its use of reference
> counting and concurrent cycle collection, and presents extensive
> measurements of the system comparing it to a parallel, stop-the-world
> mark-and-sweep collector."
>
> There are a bunch of research papers describing Recycler which can be
> found at the following link:
>
> http://www.research.ibm.com/people/d/dfb/publications.html
>
> I'd start with the papers entitled "Java without the Coffee Breaks: A
> Non-intrusive Multiprocessor Garbage Collector" and "Concurrent Cycle
> Collection in Reference Counted Systems".
>
> Let me describe a bit about how the Recycler works and how it relates to
> what you've proposed.
>
> The basic idea is that for each thread, there is a set of thread local
> data (TLD) that contains a pair of "refcount buffers", one buffer for
> increfs and one buffer for decrefs. Each refcount buffer is a flat array
> of pointers which starts empty and gradually fills up.
>
> The "incref" operation does not actually touch the reference count field
> of the object. Instead, an "incref" appends a pointer to the object to
> the end of the incref buffer for that thread. Similarly, a decref
> operation appends a pointer to the object to the decref buffer. Since
> the refcount buffers are thread-local, there is no need for locking or
> synchronization.
>
> When one of the buffers gets full, both buffers are swapped out for new
> ones, and the old buffers are placed on a queue which is processed by
> the collector thread. The collector thread is the only thread which is
> allowed to actually touch the reference counts of the individual
> objects, and its the only thread which is allowed to delete objects.
>
> Processing the buffers is relatively simple: First, the incref buffer is
> processed. The collector thread scans through each pointer in the
> buffer, and increments the refcount of each object. Then the decref
> buffer is processed in a similar way, decrementing the refcount.
>
> However, it also needs to process the buffers for the other threads
> before the memory can be reclaimed. Recycler defines a series of
> "epochs" (i.e. intervals between collections). Within a refcount buffer,
> each epoch is represented as a contiguous range of values within the
> array -- all of the increfs and decrefs which occurred during that
> epoch. An auxiliary array records the high water mark for each epoch.
Huh, interesting idea. I downloaded the papers and plan to give them a
read. The one isssue I can see with this is that because of these epochs
and using a buffer instead of actually manipulating the refcount means a
delay. And I know some people love the (mostly) instantaneous garbage
collection when the refcount should be at 0.
Anyway, I will give the paper a read before I make any more ignorant
statements about the design. =)
-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20070415/0899c563/attachment.html>
More information about the Python-ideas
mailing list