Lockless algorithms in python (Nothing to do with GIL)
Ryan Kelly
ryan at rfk.id.au
Mon Jun 28 20:30:29 EDT 2010
On Mon, 2010-06-28 at 16:46 -0700, Zac Burns wrote:
> In my experience it is far more expensive to allocate a lock in python
> then it is the types that use them. Here are some examples:
>
> >>> timeit.timeit('Lock()', 'from threading import Lock')
> 1.4449114807669048
>
> >>> timeit.timeit('dict()')
> 0.2821554294221187
>
> >>> timeit.timeit('list()')
> 0.17358153222312467
>
>
> Granted, I know these types do not allocate a lock - but a similarly
> defined user type would need to allocate a lock to guarantee
> thread-safety.
Sure, but I think you're timing the wrong thing here. You would only
allocate the lock relatively rarely - it's the overhead of *acquiring*
the lock that's the real problem.
rfk at durian:~$ python -m timeit -s "from threading import Lock; l = Lock()" "l.acquire(); l.release()"
1000000 loops, best of 3: 0.378 usec per loop
Compare to one of my favourite dict methods, setdefault:
rfk at durian:~$ python -m timeit -s "d = dict()" "d.setdefault('hello','world')"
10000000 loops, best of 3: 0.189 usec per loop
In the same ballpark really.
> I have not done a great deal of research into lockless algorithms
> thusfar, but would it be worthwhile to investigate implementing some
> of them in python to optimize performance critical sections of code?
>
> Many lockless algorithms that I have looked at thusfar require a
> "Compare and Swap" operation. Does python have an equivalent to this?
> Is python high level enough that it has something better than this or
> it simply doesn't need it?
I've managed to avoid locking in some cases by using dict.setdefault()
as a kind of atomic test-and-set operation. It's not a full
compare-and-swap but you could implement a simple locking scheme using
it like this:
class PretendLock:
def __init__(self):
self.d = dict()
def acquire(self):
myid = threading.currentThread().ident
while self.d.setdefault("owner",myid) != myid:
pass
def release(self):
del self.d["owner"]
This depends on some peculiarities of the CPython implementation that
aren't guaranteed by the language spec - namely that thread switches
can't happen while executing C code, that dict.setdefault is implemented
in C, and that is can calculate the hash of a string key without calling
back out to python code.
Usually, it's not worth it. The one time I use this regularly is for
lazy object initialisation, e.g. something like this:
def get_my_object(key):
try:
return objects[key]
except KeyError:
return objects.setdefault(key,create_my_object(key))
If two threads happen to try initialising the object at the same time,
only one will wind up in the dict and being used; the other will be
discarded and garbage-collected.
Cheers,
Ryan
--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ryan at rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/python-list/attachments/20100629/423143bf/attachment-0001.sig>
More information about the Python-list
mailing list