[Python-Dev] GIL removal question
Sturla Molden
sturla at molden.no
Thu Aug 11 21:11:11 CEST 2011
On 09.08.2011 11:33, Марк Коренберг wrote:
> I am probably reinventing the wheel here. I would like the developers
> to tell me why we cannot remove the GIL in this way:
>
> 1. Remove the GIL completely, along with all of its current logic.
> 2. Add its own RW lock to every mutable object (such as list or dict)
> 3. Add RW locks to every context instance
> 4. Use RW locks when accessing members of object instances
>
> The only reason I see not to do that is the performance of
> single-threaded applications. Why not turn the locking functions for
> these 4 cases into stubs when only one thread is present?
This has been discussed to death before, and is probably OT to this list.
There is another reason besides the speed of single-threaded
applications, but it is rather technical: because CPython uses reference
counting for garbage collection, threads on different processors would
constantly write to the same reference counts, and to unrelated counts
that happen to sit in the same cache line ("false sharing"). This would
work as an "invisible GIL" (a synchronization bottleneck) anyway.
That is, if one processor writes to memory in a cache line that another
processor also holds, the other copy becomes stale and the processors
must synchronize the dirty cache line before they can continue. Thus,
updating reference counts from several threads would flood the memory
bus with traffic and be much worse than the GIL. Instead of doing useful
work, the processors would be stuck synchronizing dirty cache lines. You
can think of it as a severe traffic jam.
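To make the "updating reference counts" part concrete, here is a
minimal sketch, assuming a standard CPython build. It only shows that
merely holding another reference to an object writes to the counter
stored in that object; it does not measure any cache effects. The helper
name refcount_delta is made up for the illustration.

```python
import sys


def refcount_delta(extra_refs=1):
    """Return how much an object's reference count grows when
    `extra_refs` additional references to it are created."""
    obj = object()                 # fresh object that nothing else refers to
    before = sys.getrefcount(obj)
    holders = []
    for _ in range(extra_refs):
        holders.append(obj)        # each list slot holds one more reference
    after = sys.getrefcount(obj)
    return after - before


if __name__ == "__main__":
    print(refcount_delta(1))
    print(refcount_delta(5))
```

Every one of those increments is a write to the object's memory, which
is why even "read-only" use of a shared object from several threads
would dirty its cache line.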
To get rid of the GIL, CPython would need either
(a) another GC method (e.g. similar to .NET or Java)
or
(b) another threading model (e.g. one interpreter per thread, as in Tcl,
Erlang, or .NET app domains).
As CPython has neither, we are better off with the GIL.
Nobody likes the GIL. If you can, fork the project and write a GIL-free
CPython. But note that:
1. With Cython, you have full manual control over the GIL. IronPython
and Jython do not have a GIL at all.
2. Much of the FUD against the GIL is plain ignorance: The GIL slows
down parallel computational code written in pure Python, but any serious
number crunching should use numerical performance libraries (i.e. C
extensions) anyway.
Libraries are free to release the GIL or spawn threads internally. Also,
the GIL does not matter for (a) I/O bound code such as network servers
or clients and (b) background threads in GUI programs -- which are the
two common use-cases for threads in Python programs. If the GIL bites
you, it's most likely a warning that your program is badly written,
independent of the GIL issue.
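As a small illustration of a library releasing the GIL, here is a
sketch using hashlib from the standard library, whose documentation
states that the GIL is released while hashing data larger than 2047
bytes. The function names digest and digest_all and the block sizes are
only assumptions for the example, and how much speedup you see depends
on the machine.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block):
    # For large inputs hashlib does the hashing in C with the GIL
    # released, so other Python threads can run in the meantime.
    return hashlib.sha256(block).hexdigest()


def digest_all(blocks, workers=4):
    """Hash every block on a pool of ordinary Python threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Eight blocks of 4 MiB each, well above the documented threshold.
    blocks = [bytes([i]) * (4 * 1024 * 1024) for i in range(8)]
    for line in digest_all(blocks):
        print(line)
```

The calling code is plain threaded Python; the parallelism comes from
the library, not from the interpreter.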
There seems to be a common misunderstanding that Python threads work
like fibers because of the GIL. They do not! Python threads are native OS
threads and can do anything a thread can do, including executing library
code in parallel. If one thread is blocking on I/O, the other threads
can continue with their business.
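A quick way to see both points is the sketch below. It assumes Python
3.8 or later on a platform where threading.get_native_id() is available,
and it uses time.sleep() as a stand-in for blocking I/O. The function
name run_blocking_workers and the timings are only illustrative.

```python
import threading
import time


def run_blocking_workers(n_threads=4, delay=0.2):
    """Start n_threads threads that each block for `delay` seconds.

    Returns the OS-level thread ids that were seen and the total
    wall-clock time taken.
    """
    native_ids = []
    lock = threading.Lock()

    def worker():
        with lock:
            # Thread id as assigned by the operating system.
            native_ids.append(threading.get_native_id())
        # Blocking call; CPython releases the GIL while it waits.
        time.sleep(delay)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return native_ids, elapsed


if __name__ == "__main__":
    ids, elapsed = run_blocking_workers()
    print(len(set(ids)), "distinct OS threads")
    print("elapsed: %.2f s" % elapsed)
```

If the threads were taking turns like fibers, the total time would be
close to n_threads * delay; with native threads that block
independently it should stay close to a single delay.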
The only thing Python threads cannot do is access the Python interpreter
concurrently. And the reason CPython needs that restriction is reference
counting.
Sturla