Minimalistic Software Transactional Memory

Michael Sparks ms at cerenity.org
Sat Dec 8 17:53:38 EST 2007


Hi,


I'm interested in writing a simple, minimalistic, non persistent (at this
stage) software transactional memory (STM) module. The idea being it should
be possible to write such a beast in a way that can be made threadsafe fair
easily.

For those who don't know, STM is a really fancy way of saying variables
with version control (as far as I can tell :-) designed to enable threadsafe
shared data.

I'm starting with the caveat here that the following code is almost
certainly not threadsafe (not put any real thought into that as yet),
and I'm interested in any feedback on the following:

   * Does the API look simple enough?
   * Are there any glaring mistakes in the code ? (It's always harder to see
     your own bugs)
   * What key areas appear least threadsafe, and any general suggestions
     around that.

If I get no feedback I hope this is of interest. Since these things get
archived, if you're reading this a month, 6 months, a year or more from
now, I'll still be interested in feedback...

OK, API.

First of all we need to initialise the store:

    S = Store()

We then want to get a value from the store such that we can use the value,
and do stuff with it:

    greeting = S.using("hello")

Access the value:

    print repr(greeting.value)

Update the value:

    greeting.set("Hello World")

Commit the value back to the store:

    greeting.commit()

If you have concurrent updates of the same value, the following exception
gets thrown:
    ConcurrentUpdate

cf:
    >>> S = Store()
    >>> greeting = S.using("hello")
    >>> par = S.using("hello")
    >>> greeting.set("Hello World")
    >>> par.set("Woo")
    >>> greeting.commit()
    >>> par.commit()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 12, in commit
      File "<stdin>", line 11, in set
    __main__.ConcurrentUpdate

That's pretty much the simplest API I can come up with. (I've tried a few
others but didn't really like them)

The way this works is we have a Store that manages Values. (perhaps a better
name may be variables to avoid clashing with pythons parlance of labels and
values?) 

Anyhow, you can ask the store for Value, which it will give you. The Value
knows what it's called and where it's from, so when you tell it to commit,
it can try to do so. The little detail here is you get a /copy/ of the
Value not the stored Value. (This is to avoid accidental concurrent update
of the same actual object)

As a result, Store looks like this:

class Store(object):
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store[key].clone()

    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1, value.value, self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

    def using(self, key):
        try:
            return self.get(key)
        except KeyError:
            self.store[key] = Value(0, None,self,key)
            return self.get(key)

    def dump(self):
        for k in self.store:
            print k, ":", self.store[k]

You'll note that the set method is the greatest candidate for any possible
race hazard here - though I can see a possible boundary issue in "using".
(I think :-)

Otherwise I think the above code is relatively straightforward. You'll note
that this API allows this:

    greeting.set("Hello")
    greeting.commit()
    greeting.set("Hello World")
    greeting.commit()
    greeting.set("Hello World. Game")
    greeting.commit()
    greeting.set("Hello World. Game Over")
    greeting.commit()

The other class is value that looks like this:

class Value(object):
    def __init__(self, version, value,store,key):
        self.version = version
        self.value = value
        self.store = store
        self.key = key

    def __repr__(self):
        return "Value"+repr((self.version,self.value,self.store,self.key))

    def set(self, value):
        self.value = value

    def commit(self):
        self.store.set(self.key, self)

    def clone(self):
        return Value(self.version, self.value,self.store,self.key)

To me this looks like a pretty complete minimalistic thing, which does seem
to work OK, but I'm interested in the three points I mention above - if
anyone is willing to comment - specifcally:

   * Does the API look simple enough?
   * Are there any glaring mistakes in the code ? (It's always harder to see
     your own bugs)
   * What key areas appear least threadsafe, and any general suggestions
     around that.

Full code below.

Many thanks for any comments in advance,


Michael
--
Michael Sparks, Kamaelia Project Lead
http://kamaelia.sourceforge.net/Developers/
http://yeoldeclue.com/blog

#!/usr/bin/python

class ConcurrentUpdate(Exception): pass

class Value(object):
    def __init__(self, version, value,store,key):
        self.version = version
        self.value = value
        self.store = store
        self.key = key

    def __repr__(self):
        return "Value"+repr((self.version,self.value))

    def set(self, value):
        self.value = value

    def commit(self):
        self.store.set(self.key, self)

    def clone(self):
        return Value(self.version, self.value,self.store,self.key)

class Store(object):
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store[key].clone()

    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1, value.value, self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

    def using(self, key):
        try:
            return self.get(key)
        except KeyError:
            self.store[key] = Value(0, None,self,key)
            return self.get(key)

    def dump(self):
        for k in self.store:
            print k, ":", self.store[k]

S = Store()
greeting = S.using("hello")
print repr(greeting.value)
greeting.set("Hello World")
greeting.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
par = S.using("hello")
par.set("Woo")
par.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
greeting.set("Woo")
greeting.commit()

print repr(greeting), repr(greeting.value)

S.dump()




More information about the Python-list mailing list