[Python-Dev] GIL removal question

Thu Aug 11 21:11:11 CEST 2011

On 09.08.2011 11:33, Марк Коренберг wrote:
> Probably I want to re-invent the bicycle. I want developers to tell me
> why we cannot remove the GIL in this way:
>
> 1. Remove the GIL completely, with all its current logic.
> 2. Add its own RW-lock to every mutable object (like list or dict)
> 3. Add RW-locks to every context instance
> 4. Use RW-locks when accessing members of object instances
>
> The only reason I see not to do that is the performance of
> single-threaded applications. Why not replace the locking functions for
> these 4 cases with stubs when only one thread is present?

This has been discussed to death before, and is probably OT to this list.

There is another reason besides the speed of single-threaded applications,
but it is rather technical: As CPython uses reference counting for garbage
collection, we would get "false sharing" of reference counts -- which
would work as an "invisible GIL" (synchronization bottleneck) anyway.
That is, if one processor writes to memory in a cache line shared by
another processor, the cache-coherence hardware must synchronize the
dirty cache line between them before either can continue. As nearly
every access to a Python object updates its reference count, this would
flood the memory bus with traffic and be much worse than the GIL.
Instead of doing useful work, the processors would be stuck
synchronizing dirty cache lines. You can think of it as a severe traffic
jam.
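
To see why reference counts generate so many writes, here is a minimal
sketch, assuming CPython and an ordinary heap object. The names obj, alias
and container are only for illustration. It shows that merely binding a
name or storing an object in a list modifies the count kept in the
object's header, so even "read-only" use of a shared object writes to
shared memory:

```python
import sys

# A fresh, ordinary object (assumption: CPython, where every such object
# carries its reference count in its own header).
obj = object()

before = sys.getrefcount(obj)   # the call itself holds one temporary reference
alias = obj                     # binding another name increments the count
after = sys.getrefcount(obj)

container = [obj] * 1000        # the list now holds 1000 more references
in_list = sys.getrefcount(obj)

print("extra references after aliasing:", after - before)
print("extra references after building the list:", in_list - after)
```

With several threads doing this to the same objects (None, small integers,
shared modules and classes), each of those increments would have to be
atomic and would touch a cache line that other processors also hold.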

To get rid of the GIL, CPython would either need

(a) another GC method (e.g. similar to .NET or Java)

or

(b) another threading model (e.g. one interpreter per thread, as in Tcl, 
Erlang, or .NET app domains).

As CPython has neither, we are better off with the GIL.

Nobody likes the GIL. Fork the project and write a GIL-free CPython if
you can. But note that:

1. With Cython, you have full manual control over the GIL. IronPython 
and Jython do not have a GIL at all.

2. Much of the FUD against the GIL is plain ignorance: The GIL slows 
down parallel computational code, but any serious number crunching 
should use numerical performance libraries (i.e. C extensions) anyway. 
Libraries are free to release the GIL or spawn threads internally. Also, 
the GIL does not matter for (a) I/O bound code such as network servers 
or clients and (b) background threads in GUI programs -- which are the 
two common use-cases for threads in Python programs. If the GIL bites 
you, it's most likely a warning that your program is badly written, 
independent of the GIL issue.
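
As a rough illustration of a library releasing the GIL, here is a sketch
using hashlib from the standard library, which is documented to release
the GIL while hashing data larger than 2047 bytes. The buffer sizes and
the helper name digest are made up for the example, and any speedup you
observe will depend on your machine:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical workload: four 4 MiB buffers, far above the 2047-byte
# threshold at which hashlib releases the GIL.
buffers = [bytes([i]) * (4 * 1024 * 1024) for i in range(4)]

def digest(data):
    # The hashing runs in C; for inputs this large the GIL is released,
    # so other Python threads can run (or hash) at the same time.
    return hashlib.sha256(data).hexdigest()

# Reference result computed in a single thread.
sequential = [digest(b) for b in buffers]

# Same work spread over native threads; pool.map preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, buffers))

print("same digests from threads:", threaded == sequential)
```

The Python code in each thread is trivial; the heavy lifting happens
inside the C library, which is where number crunching belongs anyway.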

There seems to be a common misunderstanding that Python threads work 
like fibers due to the GIL. They do not! Python threads are native OS 
threads and can do anything a thread can do, including executing library 
code in parallel. If one thread is blocking on I/O, the other threads 
can continue with their business.
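
A small sketch of that last point, using time.sleep as a stand-in for a
blocking I/O call (the function name blocking_call and the 0.5 second
delay are arbitrary choices for the example). If threads blocked each
other, four of them would need about two seconds in total:

```python
import threading
import time

def blocking_call(seconds, done, name):
    # time.sleep stands in for a blocking read or recv; like real
    # blocking I/O, it releases the GIL while it waits.
    time.sleep(seconds)
    done.append(name)

done = []
threads = [
    threading.Thread(target=blocking_call, args=(0.5, done, i))
    for i in range(4)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# All four waits overlap, so the total is close to one wait, not four.
print(f"{elapsed:.2f} s for 4 threads that each blocked for 0.5 s")
```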

The only thing Python threads cannot do is access the Python interpreter 
concurrently. And the reason CPython needs that restriction is reference 
counting.

Sturla