[Python-Dev] Variant of removing GIL.
falcon at intercable.ru
Thu Sep 15 09:19:13 CEST 2005
Excuse my English.
I think I know how to remove GIL!!!! Obviously I am an idiot.
First, about Py_INCREF and Py_DECREF.
We should not remove the GIL at all. We should change it.
It must be a "one writer - many readers" lock with the following semantics:
The lock has a "read-counter" and a "write-counter". Initially both are 0.
When a "reader" tries to acquire the lock for "read", it sleeps until
"write-counter" is 0.
When the "reader" acquires the lock, it increases "read-counter".
When the "reader" releases the lock, it decreases "read-counter".
One reader will not block another, since it does not increase "write-counter".
A reader will sleep if there are any waiting writers, since they have
already increased "write-counter".
When a "writer" tries to acquire the lock for "write", it increases
"write-counter" and sleeps until "read-counter" becomes 0. For "writers"
lock for "write" -
When a "writer" releases the lock, it decreases "write-counter".
When there are no waiting writers, the readers wake up.
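
The counters described above can be sketched in Python roughly like this.
It is only a minimal illustration built on threading.Condition, not a
patch for the interpreter. The class and method names are made up for the
example, and the `_writer_active` flag (writers exclude each other) is an
assumption, since the post does not spell out how writers treat one
another.

```python
import threading


class ReadWriteLock:
    """One writer - many readers; waiting writers hold back new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self.read_counter = 0        # readers currently holding the lock
        self.write_counter = 0       # writers waiting for or holding the lock
        self._writer_active = False  # assumption: only one writer at a time

    def acquire_read(self):
        with self._cond:
            # A reader sleeps until "write-counter" is 0.
            while self.write_counter > 0:
                self._cond.wait()
            self.read_counter += 1

    def release_read(self):
        with self._cond:
            self.read_counter -= 1
            if self.read_counter == 0:
                self._cond.notify_all()  # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            # Announce the writer first, so that new readers start to sleep.
            self.write_counter += 1
            while self.read_counter > 0 or self._writer_active:
                self._cond.wait()
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self.write_counter -= 1
            self._cond.notify_all()  # wake readers and other writers


lock = ReadWriteLock()
lock.acquire_read()
lock.acquire_read()          # a second reader is not blocked by the first
lock.release_read()
lock.release_read()
lock.acquire_write()         # no readers left, so the writer gets in
lock.release_write()
```

Because the writer increases "write-counter" before it starts waiting, new
readers queue up behind it, which is the writer-preference behaviour the
description asks for.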
Excuse me for stating obvious things. I am really reinventing the wheel,
since I was a bad student.
I think this kind of lock is native to Linux (I saw it in the kernel
source, but I do not know whether a waiting writer blocks new readers
or not).
Now, every thread keeps a queue of objects to decref. It can be
implemented as an array, because
it will be freed all at once.
Initially, every thread acquires the GIL for "read".
Py_INCREF works as usual,
Py_DECREF places a reference into the queue.
When the queue becomes full or "100" instructions have passed ( :-) , it is
useful), the thread releases the GIL for "read" and acquires it for "write".
When it has acquired it, it decrefs all objects stored in the queue and
clears the queue.
After that, it acquires the GIL for "read" again.
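
The bookkeeping of the decref queue might look like the toy model below.
This is a sketch under loud assumptions: `Obj`, `DecrefQueue`, `QUEUE_SIZE`
and `write_gil` are hypothetical names, reference counts are plain Python
integers rather than real CPython refcounts, and a plain threading.Lock
stands in for "acquire the GIL for write".

```python
import threading

QUEUE_SIZE = 4                 # hypothetical size of the fixed array
write_gil = threading.Lock()   # stand-in for acquiring the GIL for "write"


class Obj:
    """Toy object with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1
        self.freed = False


class DecrefQueue:
    """Queue of pending decrefs; the idea is one instance per thread."""

    def __init__(self):
        self.pending = []

    def incref(self, obj):
        # Per the post, incref works "as usual" (immediately).
        obj.refcount += 1

    def decref(self, obj):
        # The decref is only recorded; nothing is freed yet.
        self.pending.append(obj)
        if len(self.pending) >= QUEUE_SIZE:
            self.flush()

    def flush(self):
        # Switch to "write" mode, apply all pending decrefs, clear the queue.
        with write_gil:
            for obj in self.pending:
                obj.refcount -= 1
                if obj.refcount == 0:
                    obj.freed = True
            self.pending.clear()


q = DecrefQueue()
a = Obj("a")
q.decref(a)
assert a.refcount == 1 and not a.freed   # still deferred
q.flush()
assert a.refcount == 0 and a.freed       # applied in one batch
```

The point of the batching is that the expensive "write" acquisition
happens once per flush instead of once per decref.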
But what can we do about objects that change (dicts, lists and others)?
There should be a secondary "one-writer-many-reader" "public-write" GIL
(PWGIL).
SGIL ought to be more complicated, since it should work with RLock
semantics for the "write" lock.
Let's call this lock ROWMR (reentrant one writer - many readers).
So the semantics for ROWMR can be:
When a thread acquires the ROWMR lock, it acquires it at the "read" level.
Let's call this "write-level"=0.
While a thread's "write-level"=0, it is a "reader".
A thread can increase its "write-level".
When it turns "write-level" from 0 to 1, it becomes a "writer".
While "write-level">0, the thread is a writer.
A thread can decrease its "write-level".
When "write-level" turns from 1 to 0, the thread becomes a "reader".
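
A rough Python sketch of the "write-level" idea follows. The class
`ROWMRLock` and its method names are invented for illustration. One
assumption is mine and not stated in the post: when a thread goes from
level 0 to 1 it first gives up its reader status and then waits to become
the only writer, so another writer may run in between.

```python
import threading


class ROWMRLock:
    """Reentrant one writer - many readers lock with a per-thread level."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0                # threads at write-level 0
        self._writers = 0                # threads waiting to write or writing
        self._writer = None              # ident of the active writer, if any
        self._local = threading.local()  # holds the per-thread write-level

    def acquire(self):
        """Enter at the "read" level (write-level = 0)."""
        with self._cond:
            while self._writers > 0:
                self._cond.wait()
            self._readers += 1
        self._local.level = 0

    def release(self):
        """Leave the lock; expected to be called at write-level 0."""
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def write_level(self):
        return getattr(self._local, "level", 0)

    def is_writer(self):
        return self.write_level() > 0

    def increase(self):
        if self._local.level == 0:
            with self._cond:
                # Assumption: stop being a reader, then wait to be sole writer.
                self._readers -= 1
                self._writers += 1
                self._cond.notify_all()
                while self._readers > 0 or self._writer is not None:
                    self._cond.wait()
                self._writer = threading.get_ident()
        self._local.level += 1           # nested increases only count up

    def decrease(self):
        self._local.level -= 1
        if self._local.level == 0:
            with self._cond:
                self._writer = None
                self._writers -= 1
                self._readers += 1       # the thread becomes a reader again
                self._cond.notify_all()


lock = ROWMRLock()
lock.acquire()                 # write-level 0: reader
lock.increase()                # 0 -> 1: becomes writer
lock.increase()                # 1 -> 2: still the same writer, no waiting
lock.decrease()                # 2 -> 1: still writer
still_writer = lock.is_writer()
lock.decrease()                # 1 -> 0: reader again
lock.release()
```

Only the 0 -> 1 and 1 -> 0 transitions touch the shared state. Nested
increases just bump a thread-local counter, which is what makes the
"write" side reentrant.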
With PWGIL:
We can mark every _mutable_ object with the number of its creator thread.
If the mark matches the current thread number, the object is "private" to
the thread.
If the mark is 0 (or another impossible thread number), the object is "public".
If the mark is !=0 and != the current thread number, the object is "alien".
When we access a _mutable_ object, we check whether it is "private".
If it is, we can do anything without locking.
If it is not and we access it for read, we check whether it is "public".
If yes ("read" of "public"), then we can read it without locking.
If no, we increase "write-level",
if the object is "alien", make it "public",
if we need to change the object, change it,
and then decrease "write-level".
Of course, when we append an object to a "public" collection, we should make
it "public" as well: "write-level" is already increased, so we do not make
many separate lock acquisitions, and when we later access those objects for
read, we will not need to lock to make them "public".
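
The private/public/alien check could be sketched as below. Everything here
is illustrative: `Marked`, `classify`, `read` and `write` are hypothetical
helpers, and a threading.RLock stands in for increasing and decreasing
"write-level". The sketch relies on threading.get_ident() never returning
0, so 0 can serve as the "public" mark.

```python
import threading

PUBLIC = 0                        # the "impossible thread number"
write_level = threading.RLock()   # stand-in for raising "write-level"


class Marked:
    """Toy mutable object carrying the mark of its creator thread."""

    def __init__(self, value):
        self.value = value
        self.owner = threading.get_ident()


def classify(obj):
    if obj.owner == threading.get_ident():
        return "private"
    if obj.owner == PUBLIC:
        return "public"
    return "alien"


def read(obj):
    if classify(obj) in ("private", "public"):
        return obj.value          # no locking needed
    with write_level:             # alien: become writer and publish it
        obj.owner = PUBLIC
        return obj.value


def write(obj, value):
    if classify(obj) == "private":
        obj.value = value         # owner thread, no locking needed
        return
    with write_level:             # public or alien: lock for the change
        obj.owner = PUBLIC
        obj.value = value


o = Marked([1, 2])
kind_in_creator = classify(o)     # "private" in the creating thread
seen = []
t = threading.Thread(target=lambda: seen.append((classify(o), read(o))))
t.start()
t.join()
kind_after = classify(o)          # "public" once another thread touched it
```

In a single-threaded program every mutable object stays "private", so the
only cost on each access is the comparison in `classify`.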
I don't know how nested scopes are implemented, but I think they should
be considered mutable objects.
So there is a small overhead for a single-threaded application
(only for comparing 2 numbers)
and for a big part of multithreaded ones, since we lock only on writing to
_mutable_ _public_ objects. Most "public" objects are not
written to
often: they are numbers, classes and mostly-read collections.
And one can optimize a program by accumulating results in a "private"
collection and then flushing them to a "public" one.
Also, there may be a statement for explicitly increasing "write-level"
around a big update
of a "public" object and decreasing it afterwards.
PWGIL also must be released and reacquired every "100" instructions,
but only if "write-level=0";
this conforms to the current GIL semantics.
I think it must not be released when flushing the decref queue, since that
can happen while we are in C code.
And blocking IO needs careful thought.
The most awful situation (from my point of view):
object O is "private" to thread A.
thread B accesses O and tries to mark it "public", so it blocks in the attempt
to increase "write-level".
thread A starts to change O (it is at "write-level 0"), and in C code
it releases PWGIL
(around blocking IO, for example).
thread B becomes a "writer", changes the object to "public", becomes a "reader"
and starts to read O,
thread A returns and continues to change O, remaining at "write-level=0".
But I think well-written C code should not attempt to do blocking IO
in the middle of changing non-local objects
(and it does not attempt that at the moment, I guess. Am I mistaken?).
Or/and, when it returns and continues
to change O, it must check whether O is still "private".
I think a big part of the checks and manipulations with GIL&PWGIL could be
hidden inside the current C API,
so we would not have to change tons of libraries written in C. Only
libraries which create mutable objects which
use non-standard containers for storage.
Maybe there should be only one united SGIL for incref-decref and "write
each Py_DECREF places a reference in a thread-local queue (it could be
small enough - about 1000, and not
dynamic - just an array with a counter).
every object (mutable?) stores a thread mark (only 4 bytes, I think)
every access to an object would check: is it mutable? and only if yes,
is it 'private'?
and only for 'mutable public/alien' objects do we lock.
There would be no more than 20% performance overhead, I think.
And a +50% advantage in an ordinary multithreaded program on a dual processor
(maybe +90% on 3 processors, +110% on 4 processors, since a write block
will lock all readers).