another thread on Python threading
josiah.carlson at sbcglobal.net
Mon Jun 4 22:11:09 CEST 2007
> On Jun 4, 3:10 am, Josiah Carlson <josiah.carl... at sbcglobal.net>
>> From what I understand, the Java runtime uses fine-grained locking on
>> all objects. You just don't notice it because you don't need to write
>> the acquire()/release() calls. It is done for you. (in a similar
>> fashion to Python's GIL acquisition/release when switching threads)
> The problem is CPython's reference counting. Access to reference
> counts must be synchronized.
> Java, IronPython and Jython use another scheme for garbage
> collection and do not need a GIL.
There was a discussion regarding this in the python-ideas list recently.
You *can* attach a lock to every object, and use fine-grained locking
to handle refcounts. Alternatively, you can use platform-specific
atomic increments and decrements, or even a secondary 'owner thread'
refcount that is only ever updated by one thread and so needs no lock.
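To make the fine-grained scheme concrete, here is an illustrative sketch
in pure Python (this is not CPython's actual implementation, and the
class and method names are made up): each object carries its own lock,
and every refcount update takes that lock.

```python
# Illustrative only: what "synchronizing access to reference counts"
# with per-object (fine-grained) locks would look like.
import threading

class RefCounted:
    def __init__(self):
        self._lock = threading.Lock()   # one lock per object
        self._refcount = 1

    def incref(self):
        with self._lock:
            self._refcount += 1

    def decref(self):
        with self._lock:
            self._refcount -= 1
            if self._refcount == 0:
                self._dealloc()

    def _dealloc(self):
        pass  # the object's storage would be freed here

obj = RefCounted()
threads = [threading.Thread(target=obj.incref) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(obj._refcount)  # 9: the initial 1 plus 8 locked increments
```

The point of the sketch is the cost model: every incref/decref now pays
for a lock acquisition, which is exactly the overhead being discussed.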
It turns out that atomic updates are slow, and I wasn't able to get any
sort of productive results using 'owner threads' (the outcome seemed
generally negative, and it was certainly more work to make happen). I
don't believe anyone bothered to test fine-grained locking on objects.
However, locking isn't just for refcounts, it's to make sure that thread
A isn't mangling your object while thread B is traversing it. With
object locking (coarse via the GIL, or fine via object-specific locks),
you get the same guarantees, with the problem that verifying the absence
of deadlocks under fine-grained locking is about a bazillion times
harder than under a GIL-based approach.
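To make the deadlock hazard concrete, here is a sketch (the names are
illustrative) of the classic problem with per-object locks, and the
standard discipline for avoiding it: always acquire locks in one
globally consistent order, here by id().

```python
# Two threads touching the same two objects in opposite order can each
# grab one lock and wait forever on the other. Sorting by id() forces
# every thread into the same acquisition order, which prevents that.
import threading

a_lock, b_lock = threading.Lock(), threading.Lock()
done = []

def update_both(first, second):
    for lock in sorted((first, second), key=id):
        lock.acquire()
    try:
        done.append(threading.current_thread().name)
    finally:
        for lock in (first, second):
            lock.release()

# Opposite argument orders: without the sorted() discipline above,
# this is exactly the pattern that can deadlock.
t1 = threading.Thread(target=update_both, args=(a_lock, b_lock))
t2 = threading.Thread(target=update_both, args=(b_lock, a_lock))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(done))  # 2: both threads completed without deadlocking
```

The hard part is not writing this once, it is proving that *every* code
path in a large runtime honors the ordering, which is why the GIL is so
much easier to verify.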
> Changing CPython's garbage collection from reference counting to a
> generational GC will be a major undertaking. There are also pros and
> cons to using reference counts instead of 'modern' garbage collectors.
> For example, unless there are cyclic references, one can always know
> when an object is garbage collected. One also avoids periodic delays
> while garbage is collected, and memory use can be more modest when a
> lot of small temporary objects are being used.
It was done a while ago. The results? On a single-processor machine,
Python code ran at roughly 1/4 to 1/3 the speed of the original
runtime. When using 4+ processors, there were some gains in threaded
code, but nothing substantial at that point.
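The determinism point in the quote above is easy to demonstrate in
CPython specifically (a tracing collector would reclaim the object at
some later, unspecified time instead):

```python
# Under CPython's reference counting, an object with no cyclic
# references is reclaimed the instant its last reference disappears,
# so finalization is deterministic.
import weakref

class Resource:
    pass

events = []
r = Resource()
# The weakref callback fires when the referent is collected.
wr = weakref.ref(r, lambda ref: events.append("collected"))
del r  # last reference dropped ...
print(events)  # ['collected'] -- reclaimed immediately on CPython
```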
> There are a number of different options for exploiting multiple CPUs
> from CPython, including:
My current favorite is the processing package (available from the Python
cheeseshop). You get much the same API as threading, only you are
using processes instead. It works on Windows, OS X, and *nix.
> def synchronized(fun):
>     from threading import RLock
>     rl = RLock()
>     def decorator(*args,**kwargs):
>         with rl:
>             retv = fun(*args,**kwargs)
>         return retv
>     return decorator
> It is not possible to define a 'synchronized' block though, as Python
> does not have Lisp macros :(
Except that you just used the precise mechanism necessary to get a
synchronized block in Python:

lock = threading.Lock()
with lock:
    ...  # this suite is the synchronized block
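A runnable sketch of that mechanism (the counter and function names are
just for illustration): the with-suite over a lock *is* the
synchronized block, no macros required.

```python
import threading

lock = threading.Lock()
counter = 0

def bump():
    global counter
    with lock:          # everything in this suite is synchronized
        counter += 1

threads = [threading.Thread(target=bump) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 100: no increments were lost
```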