[Python-ideas] [Python-Dev] GC Changes

Adam Olsen rhamph at gmail.com
Mon Oct 1 17:59:54 CEST 2007


On 10/1/07, Adam Olsen <rhamph at gmail.com> wrote:
> [This should be on python-ideas, so I'm replying there instead of python-dev]
>
> On 10/1/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
> > Hello,
> >
> > I've been doing some tests on removing the GIL, and it's becoming clear that
> > some basic changes to the garbage collector may be needed in order for this
> > to happen efficiently. Reference counting as it stands today is not very
> > scalable.
> >
> > I've been looking into a few options, and I'm leaning towards
> > implementing IBM's Recycler GC (
> > http://www.research.ibm.com/people/d/dfb/recycler-publications.html
> > ) since it is very similar to what is in place now from the users'
> > perspective. However, I haven't been around the list long enough to really
> > understand the feeling in the community on GC in the future of the
> > interpreter. It seems that a full GC might have a lot of benefits in terms
> > of performance and scalability, and I think that the current gc module is of
> > the mark-and-sweep variety. Is the trend going to be to move away from
> > reference counting and towards the mark-and-sweep implementation that
> > currently exists, or is reference counting a firmly ingrained tradition?
>
> Refcounting is fairly firmly ingrained in CPython, but there are
> conservative GCs for C that mostly work, and other implementations
> aren't so restricted.
>
> The problem with Python is that it produces a *lot* of garbage.
> Pystone on my box allocates around a million objects per second, which
> would fill up available RAM in about 10 seconds if nothing were freed.
> Not only do you need to collect often enough to avoid filling up RAM,
> but for *good* performance you need to collect often enough to keep
> your L1 cache hot.  That would
> seem to demand a generational GC at least.
>
> You might as well assume it'll be more expensive than refcounting[1].
> The real advantage would be in scalability.  Concurrent, parallel GCs
> are an active field of research though.  If you're really interested
> you should research conservative GCs aimed at C in general, and only
> minimally interact with CPython (for example, to disable the custom
> allocators).
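
To make the generational point a bit more concrete, here is a rough sketch
of how allocation pressure drives CPython's existing cycle collector.  It
uses the gc module as it exists in current CPython (gc.get_stats() was
added in 3.4, well after this thread), and Node, make_cyclic_garbage and
collections_during are hypothetical names made up for the illustration.
It says nothing about how a replacement collector would behave.

```python
import gc


class Node:
    """Hypothetical container object; only used to build garbage."""
    def __init__(self):
        self.ref = None


def make_cyclic_garbage(n):
    """Create n self-referencing objects that refcounting alone cannot free."""
    for _ in range(n):
        node = Node()
        node.ref = node  # self-cycle; unreachable once 'node' is rebound


def collections_during(n):
    """Count automatic collector runs (all generations) while making n cycles."""
    was_enabled = gc.isenabled()
    gc.enable()  # make sure automatic collection is on for the measurement
    try:
        before = sum(s["collections"] for s in gc.get_stats())
        make_cyclic_garbage(n)
        return sum(s["collections"] for s in gc.get_stats()) - before
    finally:
        if not was_enabled:
            gc.disable()


if __name__ == "__main__":
    # Three thresholds, one per generation; exact values vary by version.
    print("thresholds:", gc.get_threshold())
    print("collector runs:", collections_during(50_000))
```

The collector runs as tracked allocations pile up, not on a timer, so a
workload that churns out objects quickly triggers it correspondingly often.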

Ahh, I forgot a major alternative.  You could instead work on an exact
GC for PyPy.  I personally think it's more interesting to get
cooperation from the compiler. ;)


> A good jumping-off point is The Memory Management Reference (although
> it looks like it hasn't been updated in the last few years).  If some
> of my terms are unfamiliar to you, go start reading. ;)
> http://www.memorymanagement.org/
>
>
>
> [1] This statement is only in the context of CPython, of course.
> There are certainly many situations where a tracing GC performs
> better.
>
> --
> Adam Olsen, aka Rhamphoryncus
>
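
For anyone unsure what "the context of CPython" in the footnote amounts to,
here is a minimal sketch, assuming a stock CPython build: acyclic objects
are reclaimed by refcounting the moment the last reference goes away, and
the gc module only exists to pick up reference cycles.  Node,
freed_by_refcount and cycle_needs_collector are hypothetical names used
just for this illustration; other implementations (PyPy, for instance)
may behave differently.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""
    def __init__(self):
        self.other = None


def freed_by_refcount():
    """Return True if an acyclic object dies as soon as its last reference goes."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out the cycle collector
    try:
        obj = Node()
        probe = weakref.ref(obj)  # weak reference does not keep obj alive
        del obj                   # refcount drops to zero here
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()


def cycle_needs_collector():
    """Return (alive_after_del, alive_after_collect) for a two-object cycle."""
    was_enabled = gc.isenabled()
    gc.disable()  # no automatic collection between the steps below
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # reference cycle
        probe = weakref.ref(a)
        del a, b                  # the cycle keeps both refcounts above zero
        alive_after_del = probe() is not None
        gc.collect()              # an explicit collection still works while disabled
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(freed_by_refcount())
    print(cycle_needs_collector())
```

On CPython I'd expect the first call to report that the object was freed
immediately, and the second to show the cycle surviving the del and only
disappearing after gc.collect().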


-- 
Adam Olsen, aka Rhamphoryncus


