[This should be on python-ideas, so I'm replying to there instead of python-dev] On 10/1/07, Justin Tulloss <tulloss2@uiuc.edu> wrote:
Hello,
I've been doing some tests on removing the GIL, and it's becoming clear that some basic changes to the garbage collector may be needed in order for this to happen efficiently. Reference counting as it stands today is not very scalable.
I've been looking into a few options, and I'm leaning towards implementing IBM's Recycler GC ( http://www.research.ibm.com/people/d/dfb/recycler-publications.html ) since it is very similar to what is in place now from the users' perspective. However, I haven't been around the list long enough to really understand the feeling in the community on GC in the future of the interpreter. It seems that a full GC might have a lot of benefits in terms of performance and scalability, and I think that the current gc module is of the mark-and-sweep variety. Is the trend going to be to move away from reference counting and towards the mark-and-sweep implementation that currently exists, or is reference counting a firmly ingrained tradition?
Refcounting is fairly firmly ingrained in CPython, but there are conservative GCs for C that mostly work, and other implementations aren't so restricted.

The problem with Python is that it produces a *lot* of garbage. Pystone on my box allocates around a million objects per second and fills up available RAM in about 10 seconds. Not only do you need to collect often enough to not fill up the RAM, but for *good* performance you need to collect often enough to keep your L1 cache hot. That would seem to demand a generational GC at least.

You might as well assume it'll be more expensive than refcounting[1]. The real advantage would be in scalability. Concurrent, parallel GCs are an active field of research, though. If you're really interested you should research conservative GCs aimed at C in general, and only minimally interact with CPython (such as disabling the custom allocators).

A good stepping-off point is The Memory Management Reference (although it looks like it hasn't been updated in the last few years). If some of my terms are unfamiliar to you, go start reading. ;) http://www.memorymanagement.org/

[1] This statement is only in the context of CPython, of course. There are certainly many situations where a tracing GC performs better.

-- Adam Olsen, aka Rhamphoryncus
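The refcounting scheme the thread keeps referring to can be sketched as a toy. This is an illustrative simulation only, not CPython's actual Py_INCREF/Py_DECREF machinery; the class and attribute names are invented:

```python
class Obj:
    """Toy refcounted object: freed the instant the last reference drops."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1   # the creator holds the first reference
        self.freed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True   # stand-in for returning memory to the allocator

o = Obj("x")
o.incref()            # a second owner takes a reference
o.decref()            # ... and drops it again
assert not o.freed    # still one live reference
o.decref()            # last reference gone: reclaimed immediately
assert o.freed
```

The point of the sketch is the property the thread debates: reclamation is immediate and deterministic, but every reference copy costs a counter update, which is what makes the scheme hard to scale across threads without the GIL.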
On 10/1/07, Adam Olsen <rhamph@gmail.com> wrote:
[This should be on python-ideas, so I'm replying to there instead of python-dev]
On 10/1/07, Justin Tulloss <tulloss2@uiuc.edu> wrote:
Hello,
I've been doing some tests on removing the GIL, and it's becoming clear that some basic changes to the garbage collector may be needed in order for this to happen efficiently. Reference counting as it stands today is not very scalable.
I've been looking into a few options, and I'm leaning towards implementing IBM's Recycler GC ( http://www.research.ibm.com/people/d/dfb/recycler-publications.html ) since it is very similar to what is in place now from the users' perspective. However, I haven't been around the list long enough to really understand the feeling in the community on GC in the future of the interpreter. It seems that a full GC might have a lot of benefits in terms of performance and scalability, and I think that the current gc module is of the mark-and-sweep variety. Is the trend going to be to move away from reference counting and towards the mark-and-sweep implementation that currently exists, or is reference counting a firmly ingrained tradition?
Refcounting is fairly firmly ingrained in CPython, but there are conservative GCs for C that mostly work, and other implementations aren't so restricted.
The problem with Python is that it produces a *lot* of garbage. Pystone on my box allocates around a million objects per second and fills up available RAM in about 10 seconds. Not only do you need to collect often enough to not fill up the RAM, but for *good* performance you need to collect often enough to keep your L1 cache hot. That would seem to demand a generational GC at least.
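The generational idea above can be sketched as a toy two-generation collector. This is a deliberately simplified simulation with an explicit root set; a real generational GC would also need a write barrier and a remembered set so that minor collections never have to traverse the old generation, which this omits:

```python
class Node:
    """Toy heap object: newly created objects always start in the nursery."""

    def __init__(self, gc):
        self.refs = []          # outgoing references to other Nodes
        gc.nursery.append(self)

class GenGC:
    """Toy two-generation collector: minor collections scan only the nursery,
    so frequently-collected young objects stay cache-hot."""

    def __init__(self):
        self.nursery = []   # young objects, collected often
        self.old = []       # survivors, collected rarely
        self.roots = []     # explicit root set (a real GC finds roots itself)

    def _reachable(self):
        # Plain graph traversal from the roots.  (A remembered set would let
        # a real collector skip the old generation here.)
        seen, stack = set(), list(self.roots)
        while stack:
            o = stack.pop()
            if id(o) not in seen:
                seen.add(id(o))
                stack.extend(o.refs)
        return seen

    def minor(self):
        # Dead young objects are simply dropped; survivors are promoted.
        live = self._reachable()
        self.old += [o for o in self.nursery if id(o) in live]
        self.nursery = []

gc = GenGC()
a, b, garbage = Node(gc), Node(gc), Node(gc)
gc.roots.append(a)
a.refs.append(b)        # b is reachable through a; garbage is not
gc.minor()
assert len(gc.old) == 2 and gc.nursery == []
```

The design choice this illustrates is the one argued for above: if most objects die young, a frequent scan of a small nursery reclaims most garbage while touching only memory that is likely still in cache.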
You might as well assume it'll be more expensive than refcounting[1]. The real advantage would be in scalability. Concurrent, parallel GCs are an active field of research, though. If you're really interested you should research conservative GCs aimed at C in general, and only minimally interact with CPython (such as disabling the custom allocators).
Ahh, I forgot a major alternative. You could instead work on an exact GC for PyPy. I personally think it's more interesting to get cooperation from the compiler. ;)
A good stepping-off point is The Memory Management Reference (although it looks like it hasn't been updated in the last few years). If some of my terms are unfamiliar to you, go start reading. ;) http://www.memorymanagement.org/
[1] This statement is only in the context of CPython, of course. There are certainly many situations where a tracing GC performs better.
-- Adam Olsen, aka Rhamphoryncus
On Oct 1, 2007, at 12:59 PM, Adam Olsen wrote:
Ahh, I forgot a major alternative. You could instead work on an exact GC for PyPy. I personally think it's more interesting to get cooperation from the compiler. ;)
That would be great; they are looking for people who want to do that. It is hard work, but if you think you can do it for CPython, you can do it on PyPy. Please contact the people using http://codespeak.net/pypy/dist/pypy/doc/contact.html. Usually the IRC channel (#pypy on freenode.net) is the fastest way to get started. -- Leonardo Santagada
Adam Olsen wrote:
Ahh, I forgot a major alternative. You could instead work on an exact GC for PyPy. I personally think it's more interesting to get cooperation from the compiler. ;)
The open source world sorely needs an exact garbage collector, which is why I am in the process of writing one. Right now it's going very slowly, as I've only been able to commit a few hours a week to the work.

One difficulty in making a FOSS garbage collector is that exact collectors tend to be tightly coupled to the specific object model of the runtime environment. The design of the collector puts a lot of constraints on the design of the object model, and vice versa. This makes it difficult to create a collector that works with a lot of different languages.

As you already know, PyPy is currently targeting LLVM as a backend. LLVM provides an abstract interface for garbage collection that solves some of these problems. Because the collector interface is defined at the virtual instruction level, before optimization takes place, some of the overhead that would be incurred by a completely generic callback interface can be avoided; that is, the abstraction layers needed for a loosely-coupled collector can be optimized away by LLVM's optimizer.

Based on a study of the LLVM code and docs, I have come to believe that it would be possible to create a generic collector implementation that would work with a fairly broad class of both object models and collection algorithms. In other words, while it might not support every possible language out there, it could support a fairly wide subset.

So far, what I've got is a general-purpose heap implementation, loosely inspired by (although not copied from) the popular dlmalloc implementation and a few others, which also incorporates a number of features that would be needed by a collector algorithm (such as reserved space for collector state bits and type tags in the object allocation header). I've also got some utility classes that would be useful to a collector, such as a lock-free implementation of work queues.
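The "collector state bits and type tags in the object allocation header" idea can be illustrated with a toy header word. The bit layout here (one mark bit, one pin bit, tag in the high bits) is invented purely for illustration and is not Talin's actual design:

```python
# Hypothetical header-word layout:
#   bit 0      mark bit (set during tracing)
#   bit 1      pin bit (object may not be moved)
#   bits 8+    type tag
MARK_BIT = 1 << 0
PIN_BIT = 1 << 1
TAG_SHIFT = 8

def make_header(tag, marked=False):
    """Pack a type tag and GC state bits into one integer word."""
    return (tag << TAG_SHIFT) | (MARK_BIT if marked else 0)

def type_tag(header):
    return header >> TAG_SHIFT

def is_marked(header):
    return bool(header & MARK_BIT)

h = make_header(tag=42)
assert type_tag(h) == 42 and not is_marked(h)
h |= MARK_BIT           # collector marks the object during a trace
assert is_marked(h)
```

Packing this state into the allocation header is what makes a collector "exact": given any object address, the runtime can recover its type (and thus its pointer layout) without guessing, which is precisely what conservative collectors cannot do.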
I don't have an actual collector working yet, and I'm not going to post a link to the code here because it's not ready for public viewing yet. In any case, if anyone is interested in discussing this further, feel free to email me privately. Since this isn't directly related to CPython or PyPy, I don't want to clutter up the discussion lists here with issues related to garbage collection. -- Talin
Adam Olsen wrote:
Refcounting is fairly firmly ingrained in CPython, but there are conservative GCs for C that mostly work,
Unfortunately, "mostly works" isn't good enough for something to be incorporated into CPython. It needs to work completely reliably and very portably.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing@canterbury.ac.nz        +--------------------------------------+
On 10/1/07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Adam Olsen wrote:
Refcounting is fairly firmly ingrained in CPython, but there are conservative GCs for C that mostly work,
Unfortunately, "mostly works" isn't good enough for something to be incorporated into CPython. It needs to work completely reliably and very portably.
Indeed. An example of how they can fail: Python links in the Boehm GC (which overrides malloc and free), and is then loaded dynamically from another app. By the time Boehm overrides malloc and free, they've already been used significantly.

Another example is any sort of custom allocator. At best, the conservative GC will see its heap as a single giant object (and be unable to release objects within it). At worst (if it allocates the heap via mmap rather than the overridden malloc), the GC won't even know the heap exists, won't include it in the root set, and will prematurely free objects it references.

-- Adam Olsen, aka Rhamphoryncus
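The premature-free failure mode described above can be simulated abstractly. This is a toy model of a conservative scan: the "collector" treats any word inside the regions it knows about that matches a live object's address as a reference, and the addresses and regions are entirely made up:

```python
# Live objects keyed by their (made-up) addresses.
objects = {0x1000: "A", 0x2000: "B"}

# A region the conservative GC knows about: contains a real reference to A
# and an integer (0xDEAD) that merely looks address-like.
scanned_heap = [0x1000, 0xDEAD]

# A region obtained via mmap behind the GC's back: holds the ONLY reference
# to B, but the collector never scans it.
hidden_mmap = [0x2000]

def live(regions):
    """Conservative scan: any word matching a known object address keeps
    that object alive."""
    return {objects[w] for region in regions for w in region if w in objects}

# The collector only sees the registered region, so B looks unreachable
# and would be prematurely freed:
assert live([scanned_heap]) == {"A"}

# Registering the mmap'd region as a root (what Boehm's GC_add_roots is for)
# makes B visible again:
assert live([scanned_heap, hidden_mmap]) == {"A", "B"}
```

Note the toy also shows the opposite conservative failure in passing: 0xDEAD is just an integer, but if it had happened to equal a live address, the object would have been retained spuriously.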