[Python-Dev] Removing the GIL (Me, not you!)
Adam Olsen
rhamph at gmail.com
Fri Sep 14 18:33:09 CEST 2007
On 9/14/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:
>
> On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote:
> > > Could be worth a try. A first step might be to just implement
> > > the atomic refcounting, and run that single-threaded to see
> > > if it has terribly bad effects on performance.
> >
> > I've done this experiment. It cost about a 12% slowdown on my box.
> > Later, once I had everything else set up so I could run two threads
> > simultaneously, I found much worse costs. All those literals become
> > shared objects that create contention.
>
> It's hard to argue with cold hard facts when all we have is raw speculation.
> What do you think of a model where there is a global "thread count" that
> keeps track of how many threads reference an object? Then there are
> thread-specific reference counters for each object. When a thread's local
> refcount for an object drops to 0, it decrements the object's thread
> count. If you did this right, hopefully the only cross-thread cache
> traffic would be on thread-count updates, which happen only when a thread
> first references an object and when it drops its last reference.
>
> I mentioned this idea earlier and it's growing on me. Since you've actually
> messed around with the code, do you think this would alleviate some of the
> contention issues?
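If I follow, a minimal sketch of that two-level model would look
something like this (C11 atomics, fixed-size tables, and every name
invented for illustration -- nothing here is CPython code):

    #include <stdatomic.h>
    #include <stdio.h>

    #define MAX_OBJECTS 1024

    /* Each object carries a shared count of how many *threads*
     * currently reference it. */
    typedef struct {
        atomic_int thread_count;
        /* ... object payload ... */
    } obj_t;

    static obj_t objs[MAX_OBJECTS];

    /* Per-thread local refcounts, indexed by object id.  A real
     * implementation would need a thread-local hash table instead. */
    static _Thread_local int local_ref[MAX_OBJECTS];

    static void local_incref(int id)
    {
        /* Only this thread's 0 -> 1 transition touches the shared line. */
        if (local_ref[id]++ == 0)
            atomic_fetch_add(&objs[id].thread_count, 1);
    }

    static void local_decref(int id)
    {
        /* Only the matching 1 -> 0 transition touches the shared line. */
        if (--local_ref[id] == 0 &&
            atomic_fetch_sub(&objs[id].thread_count, 1) == 1) {
            /* No thread references the object any more: reclaim here. */
        }
    }

    int main(void)
    {
        local_incref(0);    /* shared thread_count: 0 -> 1 */
        local_incref(0);    /* purely thread-local, no shared write */
        local_decref(0);    /* purely thread-local, no shared write */
        local_decref(0);    /* shared thread_count: 1 -> 0 */
        printf("done\n");
        return 0;
    }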
There would be some poor worst-case behaviour. In the case of
literals, you'd start referencing them when you call a function and
stop when it returns, so the first- and last-reference transitions
(and their shared-counter updates) happen on every single call. The
same goes for any shared data structure.
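Concretely, building on the sketch above (LITERAL_ID is another
invented name, standing in for a shared literal such as a constant in
a function's co_consts):

    /* Every call performs both the 0 -> 1 and the 1 -> 0 transition,
     * so the shared thread_count is still written twice per call --
     * exactly the contention the scheme avoids in the steady state. */
    #define LITERAL_ID 42

    void call_some_function(void)
    {
        local_incref(LITERAL_ID);   /* frame setup references the literal */
        /* ... execute the function body ... */
        local_decref(LITERAL_ID);   /* frame teardown drops the reference */
    }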
I think caching/buffering refcounts in general holds promise, though.
My current approach uses a crude hash table as a cache and only
flushes an entry when there's a collision, or everything when the
tracing GC starts up. So far I've only got about 50% of the normal
single-threaded performance, but with 90% or more scalability, and I'm
hoping to keep improving it.
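In outline, the buffering looks something like this -- a simplified
sketch with invented names, assuming C11 atomics, not the real code:

    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        atomic_long refcnt;
        /* ... object payload ... */
    } obj_t;

    #define CACHE_SLOTS 256

    /* One pending refcount delta per slot; the slot is picked by
     * hashing the object's address. */
    typedef struct {
        obj_t *obj;
        long   delta;
    } slot_t;

    static _Thread_local slot_t cache[CACHE_SLOTS];

    static void flush_slot(slot_t *s)
    {
        /* Apply the buffered delta to the shared count in one atomic
         * op.  Deciding whether the object is actually dead is
         * deferred; the tracing GC is responsible for reclamation. */
        if (s->obj != NULL && s->delta != 0)
            atomic_fetch_add(&s->obj->refcnt, s->delta);
        s->obj = NULL;
        s->delta = 0;
    }

    static slot_t *cache_slot(obj_t *o)
    {
        slot_t *s = &cache[((uintptr_t)o >> 4) % CACHE_SLOTS];
        if (s->obj != o) {      /* collision: flush previous occupant */
            flush_slot(s);
            s->obj = o;
        }
        return s;
    }

    static void buffered_incref(obj_t *o) { cache_slot(o)->delta++; }
    static void buffered_decref(obj_t *o) { cache_slot(o)->delta--; }

    /* When the tracing GC starts up, every thread flushes its whole
     * cache so the shared refcounts are exact again. */
    static void flush_all(void)
    {
        for (size_t i = 0; i < CACHE_SLOTS; i++)
            flush_slot(&cache[i]);
    }

Most incref/decref pairs cancel out inside the thread-local cache and
never touch shared memory at all; the shared count only ever sees the
net change, on a collision or when the GC runs.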
--
Adam Olsen, aka Rhamphoryncus