Adam Olsen wrote:
Ahh, I forgot a major alternative. You could instead work on an exact GC for PyPy. I personally think it's more interesting to get cooperation from the compiler. ;)
The open source world sorely needs an exact garbage collector, which is why I am in the process of writing one. Right now its going very slowly, as I've been able to only commit a few hours a week to the work.
One difficulty in making a FOSS garbage collector is that exact collectors tend to be tightly coupled to the specific object model of the runtime environment. The design of the collector puts a lot of constraints on the design of the object model, and vice versa. This makes it difficult to create a collector that works with a lot of different languages.
As you already know, PyPy is currently targeting LLVM as a backend. LLVM provides an abstract interface for garbage collection that solves some of these problems. Because the collector interface is defined at the virtual instruction level, before optimization takes place, it means that some of the overhead that would be incurred by making a completely generic callback interface can be avoided - that is, the abstraction levels needed for a loosely-coupled collector can be optimized away by LLVM's optimizer.
Based on a study of the LLVM code and docs, I have come to the belief that it would be possible to create a generic collector implementation which would work with a fairly broad class of both object models and collection algorithms. In other words, while it might not support every possible language out there, it would be possible to support a fairly wide subset.
So far, what I've got is a general purpose heap implementation, which is loosely inspired by (although not copied from) the popular dlmalloc implementation and a few others, and also incorporates a number of features which would be needed by a collector algorithm (such as reserved space for collector state bits and type tags in the object allocation header.) I've also got some utility classes that would be useful to a collector, such as a lock-free implementation of work queues. I don't have an actual collector working yet, and I'm not going to post a link to the code here because it's not ready for public viewing yet.
In any case, if anyone is interested in discussing this further, feel free to email me privately. Since this isn't directly related to CPython or PyPy, I don't want to clutter up the discussion lists here with issues related to garbage collection.