[pypy-dev] STM on CPython

Armin Rigo arigo at tunes.org
Fri Aug 26 09:09:09 CEST 2011


Hi,

A follow-up to the blog post about Software Transactional Memory (STM)
at http://morepypy.blogspot.com/ .

First, there is a growing number of papers out there that seem to
give solid evidence that STM is "better" than lock-based systems, for
various definitions of "better".  In fact Greg Wilson pointed to
http://www.neverworkintheory.org/?p=122 ; it's a study of students,
showing that it takes them less time to implement the exercise with
STM than with fine-grained locks --- but most importantly, only 10%
of the STM programs produced contain errors, as opposed to 70% of
those using fine-grained locks.  There are also more abstract
definitions of "better"; from my point of view I'd say that the
limiting ingredient of locks is that they don't compose:
http://en.wikipedia.org/wiki/Software_transactional_memory#Composable_operations .
But more generally, this seems to have been emerging as a consensus
over the last 5 years.
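
To make the composition point concrete, here is a tiny sketch (the
Account class and transfer() below are just a hypothetical
illustration, not taken from any of the papers above): each account
is thread-safe on its own, but composing two operations into a
transfer is not atomic unless you take both locks in some globally
agreed order, whereas an atomic section lets you simply wrap the two
calls.

    import threading

    class Account(object):
        # Each account is individually thread-safe.
        def __init__(self, balance=0):
            self.balance = balance
            self.lock = threading.Lock()

        def deposit(self, amount):
            with self.lock:
                self.balance += amount

        def withdraw(self, amount):
            with self.lock:
                self.balance -= amount

    def transfer(src, dst, amount):
        # Composing the two thread-safe calls is NOT atomic: another
        # thread can observe the money "in flight" between them, and
        # taking both locks here instead would require a global lock
        # ordering to avoid deadlocks.
        src.withdraw(amount)
        dst.deposit(amount)

    # With an atomic section the composed operation is simply:
    #
    #     with atomic:
    #         src.withdraw(amount)
    #         dst.deposit(amount)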

In 2003, Brett C. made a post on python-dev that was rejected (see
http://mail.python.org/pipermail/python-dev/2003-February/033259.html);
it proposed a hack extending CPython's GIL with "atomic sections"
--- which would be implemented very simply by not releasing the GIL
for a while.
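
Just to give a flavour of what "not releasing the GIL for a while"
means at the Python level, here is a rough approximation you can
write today on stock CPython 2.x with sys.setcheckinterval().  It is
only an approximation: C calls can still release the GIL, and thread
switches still happen eventually, so it gives none of the guarantees
of a real atomic section.

    import sys
    from contextlib import contextmanager

    @contextmanager
    def pseudo_atomic():
        # Approximation only: a huge check interval makes the bytecode
        # interpreter switch threads (and thus release the GIL) much
        # less often, but it guarantees nothing.
        old = sys.getcheckinterval()
        sys.setcheckinterval(10 ** 9)
        try:
            yield
        finally:
            sys.setcheckinterval(old)

    # with pseudo_atomic():
    #     ...    # far fewer thread switches here, but no guarantee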

This was rejected as a hack that only worked because we had a GIL to
start with, and that would no longer work as soon as somebody came up
with a great idea to remove it.  At the time, of course, I too was
sure that this was the correct decision.

In retrospect, I would argue that this was possibly one of the most
dreadful decisions ever made in Python.  It would have been exactly
STM (without actually running on more than one core, but that's an
implementation detail).  If we had had it 8 years ago, we would have
had all this time to gain experience with STM in a nice language and
with how you don't need locks at all to do multithreading.  And that
would be quite precisely in line with the original goal of Python,
i.e. to make programming easier at the expense of raw performance (in
this case it is even worse, because we would not have lost any
performance, only a hypothetical future path in which to gain more
performance).

So, all this to say: 8 years later, I implemented that on CPython:
https://bitbucket.org/arigo/cpython-withatomic/overview (on the 2.7
branch).  It is sadly a fork of CPython instead of a C extension
module, because it needs to access and change a few things in
ceval.c.  The total patch is 190 lines, mostly because I need to
define a new type (which is very verbose in C).  It implements
"thread.atomic", which you use in a "with" statement.  So you write
"with atomic:" and you get precisely the simple case described here:
http://en.wikipedia.org/wiki/Software_transactional_memory#Proposed_language_support .
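
Here is a usage sketch, assuming the patched 2.7 branch above (where
the thread module grows the "atomic" object; on stock CPython the
attribute does not exist):

    import thread, threading

    counter = [0]

    def worker():
        for _ in xrange(100000):
            with thread.atomic:
                # The GIL is not released inside this block, so no
                # other thread can interleave with the non-atomic
                # read-modify-write below.
                counter[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print counter[0]    # 400000: the increments never race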

Fwiw it's probably also a minimal patch to PyPy.  Sorry for being
mostly off-topic for once...


A bientôt,

Armin.

