<!doctype html public "-//W3C//DTD W3 HTML//EN">

<html><head><style type="text/css"><!--

blockquote, dl, ul, ol, li { margin-top: 0 ; margin-bottom: 0 }

 --></style><title>Re: [Python-Dev] Removing the GIL (Me, not

you!)</title></head><body>

<div>At 1:51 AM -0500 9/14/07, Justin Tulloss wrote:</div>

<blockquote type="cite" cite>On 9/14/07,<b> Adam Olsen</b> &lt;<a

href="mailto:rhamph@gmail.com">rhamph@gmail.com</a>&gt; wrote:<br>

<blockquote>&gt; Could be worth a try. A first step might be to just

implement<br>

&gt; the atomic refcounting, and run that single-threaded to see<br>

&gt; if it has terribly bad effects on performance.<br>

<br>

I've done this experiment.&nbsp;&nbsp;It was about 12% on my

box.&nbsp;&nbsp;Later, once I<br>

had everything else set up so I could run two threads simultaneously,

I<br>

found much worse costs.&nbsp;&nbsp;All those literals become shared

objects that<br>

create contention.<br>

</blockquote>

</blockquote>

<blockquote type="cite" cite><br></blockquote>

<blockquote type="cite" cite>It's hard to argue with cold hard facts

when all we have is raw speculation. What do you think of a model

where there is a global &quot;thread count&quot; that keeps track of

how many threads reference an object? Then there are thread-specific

reference counters for each object. When a thread's refcount goes to

0, it decrefs the object's thread count. If you did this right,

hopefully there would only be cache updates when you update the

thread count, which will only be when a thread first references an

object and when it last references an object.</blockquote>

<div><br></div>

<div>Cache line contention is the likely culprit, so don't glom all
the different threads' refcounts for an object into one
vector.&nbsp; Either keep each thread's refcounts in a per-thread
vector indexed by object, so that only that thread caches that vector,
or pad each refcount so that it sits in its own cache line (usually 64
bytes, not too horrible for testing purposes).&nbsp; I don't know
everything that separate vectors of refcounts would require, but each
object could contain its index into the vectors, which would all be
the same size (Go Virtual Memory!).</div>
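<div><br></div>

<div>A rough Python model of the bookkeeping described above, offered
as a sketch and not as CPython code.&nbsp; The names Obj, ThreadRefs
and shared_writes are made up for illustration, the shared thread
count would need an atomic update in a real C implementation, and
Python lists say nothing about actual cache behaviour; the point is
only to count how often the shared counter gets written.</div>

<pre>
```python
class Obj:
    """Toy object: a slot index plus a shared count of threads that hold it."""

    def __init__(self, index):
        self.index = index       # same position in every per-thread vector
        self.thread_count = 0    # shared; would need an atomic update in C
        self.shared_writes = 0   # instrumentation: writes to thread_count
        self.freed = False


class ThreadRefs:
    """One thread's private refcount vector; only its owner writes to it."""

    def __init__(self, size):
        self.counts = [0] * size

    def incref(self, obj):
        if self.counts[obj.index] == 0:
            # First reference from this thread: the only shared write.
            obj.thread_count += 1
            obj.shared_writes += 1
        self.counts[obj.index] += 1

    def decref(self, obj):
        self.counts[obj.index] -= 1
        if self.counts[obj.index] == 0:
            # Last reference from this thread: the other shared write.
            obj.thread_count -= 1
            obj.shared_writes += 1
            if obj.thread_count == 0:
                obj.freed = True   # no thread holds the object any more


# Two simulated threads; "a" takes many references, "b" takes one.
obj = Obj(index=0)
a, b = ThreadRefs(4), ThreadRefs(4)
for _ in range(1000):
    a.incref(obj)
b.incref(obj)
```
</pre>

<div>In this model the thousand increfs from the first thread touch
the shared counter once; everything else stays in that thread's own
vector.</div>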

<div><br></div>

<div><br></div>

<blockquote type="cite" cite>I mentioned this idea earlier and it's

growing on me. Since you've actually messed around with the code, do

you think this would alleviate some of the contention issues?<br>

</blockquote>

<blockquote type="cite" cite>Justin</blockquote>

<div><br></div>

<div>Your idea can be combined with the maxint/2 initial refcount for
non-disposable objects, which should all but eliminate thread-count
updates for them.</div>
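<div><br></div>

<div>One possible reading of that combination, sketched in Python
under stated assumptions: each thread's local count for a
non-disposable object starts at maxint/2 (sys.maxsize // 2 here), so
it never crosses zero and never triggers a thread-count update.&nbsp;
The Counts class, its preload method and the object names are
hypothetical, and shared_updates merely stands in for writes to the
shared thread count.</div>

<pre>
```python
import sys

IMMORTAL = sys.maxsize // 2   # "maxint/2": far from zero and from overflow


class Counts:
    """One thread's private refcounts plus a tally of shared-count updates."""

    def __init__(self):
        self.local = {}           # object name mapped to this thread's refcount
        self.shared_updates = 0   # stand-in for writes to the thread count

    def preload(self, name):
        # Non-disposable object: start the local count at IMMORTAL.
        self.local[name] = IMMORTAL

    def incref(self, name):
        if self.local.get(name, 0) == 0:
            self.shared_updates += 1   # first reference from this thread
        self.local[name] = self.local.get(name, 0) + 1

    def decref(self, name):
        self.local[name] -= 1
        if self.local[name] == 0:
            self.shared_updates += 1   # last reference from this thread


# A preloaded object versus an ordinary one that is picked up and dropped.
preloaded = Counts()
preloaded.preload("None")
ordinary = Counts()
for _ in range(100):
    preloaded.incref("None")
    preloaded.decref("None")
    ordinary.incref("tmp")
    ordinary.decref("tmp")
```
</pre>

<div>The ordinary object crosses zero on every incref/decref pair and
so pays a shared update each time, while the preloaded one never
does.</div>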


<div>-- <br>
____________________________________________________________________<br>
TonyN.:'&nbsp; &lt;mailto:tonynelson@georgeanelson.com&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; '&nbsp; &lt;http://www.georgeanelson.com/&gt;</div>

</body>

</html>