[Python-ideas] Ideas towards GIL removal

Talin talin at acm.org
Fri Apr 13 08:51:11 CEST 2007


Greg Ewing wrote:
> I've been thinking about some ideas for reducing the
> amount of refcount adjustment that needs to be done,
> with a view to making GIL removal easier.

(omitted)

I'm thinking along similar lines, but my approach is to eliminate 
refcounting entirely. (Note that the incref and decref macros could 
still exist for backwards compatibility, but would do nothing.)

Garbage collection for concurrent systems is an active area of research, 
however it appears that many of the research systems out there have 
settled on a few basic design parameters. Most of them use a copying 
collector for the "young" generation (with a separate heap per thread, 
for exactly the reasons you suggest), and a shared mark-and-sweep heap 
store for the older generation that uses a traditional free list.

For my own amusement and curiosity, I've been playing around with 
implementing such a collector, using a heap allocator design that's 
loosely based on the one from dlmalloc, which is an open source malloc 
implementation with very good overall performance. The idea is to build 
a stand-alone garbage collection library, similar to the popular Boehm 
collector, but parallel by design and intended for "cooperative" 
language interpreters rather than uncooperative languages such as C and C++.

Of course, this is purely a hobby-level effort, and I admit that I 
really have no clue as to what I am doing here - the real point of the 
exercise is to educate myself about the problem space, not to actually 
produce a working library, although that might be a possible side effect 
(equally likely is that I'll never finish it.) I doubt that an untutored 
amateur such as myself can actually create a robust, efficient parallel 
implementation, given how hard such programming actually is, and how 
inexperienced I am. But it's interesting and enjoyable to work on, and 
that's the only reason I need. (And also produces some interesting side 
effects - I wasn't happy with the various graphical front-ends to gdb, 
so I took a couple of days off on wrote one using wxPython :)

Now, all that being said, even if such a GC library were to exist, that 
is a long way from removal of the GIL, although it is a necessary step. 
For example, take the case of a dictionary in which more than one thread 
is inserting values. Clearly, that will require a lock or some other 
mechanism to prevent corruption of the hash table as it is updated. I 
think we want to avoid the Java situation where every object has its own 
lock. Instead, we'd have to require the user to provide a lock around 
that insertion operation. But what about dictionaries that the user 
isn't aware of, such as class methods and module contents? In a world 
without a GIL, what kind of steps need to be taken to insure that shared 
data structures can be updated without creating chaos?

-- Talin




More information about the Python-ideas mailing list