[python-uk] [pyconuk] Minimalistic software transactional memory

Wed Dec 12 09:43:14 CET 2007

Mike

On Tuesday 11 December 2007, Michael Sparks wrote:
> Hi Richard,
> 
> 
> On Tuesday 11 December 2007 13:36, Richard Taylor wrote:
> > I don't think that you can rely on the threadsafety of these functions.
> > Even if they are threadsafe in C Python (which I doubt that 'set' is), the
> > locking in Jython is more fine grained and would likely catch you out.
> 
> It's perhaps worth noting in the version of the code I posted, in this
> thread, where it said...
> 
>    """What key areas appear least threadsafe, and any general suggestions
>       around that."""
> 
> ...I knew that set and using were not threadsafe, but wondered about other
> parts. I perhaps should've been more explicit on that point. (I wanted to
> simply post some ideas which showed the core logic without locking. Perhaps
> a mistake :)
> 

I did miss the subtlety :-) but at least it prompted me to reply!

> Anyhow, the current version is here:
> 
> 
> https://kamaelia.svn.sourceforge.net/svnroot/kamaelia/branches/private_MPS_Scratch/Bindings/STM/Axon/STM.py
> 
> In that version, "set" now looks like this:
>     def set(self, key, value):
>         success = False
>         if self.lock.acquire(0):
>             try:
>                 if not (self.store[key].version > value.version):
>                     self.store[key] = Value(value.version+1,
>                                             copy.deepcopy(value.value),
>                                             self, key)
>                     value.version= value.version+1
>                     success = True
>             finally:
>                 self.lock.release()
>         else:
>             raise BusyRetry
> 
>         if not success:
>             raise ConcurrentUpdate
> 
> and "using" has changed to "usevar" (using now relates to a collection):
> 
>     def usevar(self, key):
>         try:
>             return self.get(key)
>         except KeyError:
>             if self.lock.acquire(0):
>                 try:
>                     self.store[key] = Value(0, None,self,key)
>                 finally:
>                     self.lock.release()
>             else:
>                 raise BusyRetry
> 
>             return self.get(key)
> 
> Since mutations of the store rely on acquiring the lock on the store, that 
> should be safe(r). User code doesn't have to worry about locks however - 
> which is of course the point of the code :-)
> 

I like to use explicit read locks for shared data structures, mainly because 
it makes it much safer when someone comes along and adds functionality to the 
methods later on. The next developer may not realise that the method needs to 
be threadsafe so the explicit locking code will help. I like to use
decorators for the same reason: it is very easy to get the try/finally wrong
and end up with deadlocks.

I think that I would split the usevar method in two and use decorators to 
acquire the read and write locks e.g.

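# read_lock / write_lock are assumed decorators (untested, not defined here)
@read_lock()
def usevar(self, key):
    try:
        return self.get(key)
    except KeyError:
        # make() takes the write lock while the read lock is still held
        return self.make(key)

@write_lock()
def make(self, key):
    self.store[key] = Value(0, None, self, key)
    return self.get(key)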
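
Neither decorator is defined anywhere in this thread. A minimal sketch of how
they might look, assuming the store keeps a single re-entrant lock in
self.lock. The standard library has no reader/writer lock, so in this sketch
both decorators take the same threading.RLock, which also lets usevar call
make without blocking on itself. The Value and Store classes below are
simplified stand-ins, not the Axon.STM ones.

```python
import functools
import threading


def _locked(method):
    """Run `method` while holding self.lock (assumed to be a threading.RLock)."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self.lock:  # released even if `method` raises
            return method(self, *args, **kwargs)
    return wrapper


def read_lock():
    # Assumption: no separate read lock exists, so reads share the one RLock.
    return _locked


def write_lock():
    return _locked


class Value(object):
    """Simplified stand-in for the STM Value class."""
    def __init__(self, version, value, store, key):
        self.version = version
        self.value = value
        self.store = store
        self.key = key


class Store(object):
    """Simplified stand-in for the STM store."""
    def __init__(self):
        self.store = {}
        self.lock = threading.RLock()

    def get(self, key):
        return self.store[key]

    @read_lock()
    def usevar(self, key):
        try:
            return self.get(key)
        except KeyError:
            return self.make(key)

    @write_lock()
    def make(self, key):
        self.store[key] = Value(0, None, self, key)
        return self.get(key)
```

With a genuine reader/writer lock the nested call from usevar into make would
be a lock upgrade and would need separate handling.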

> The reason for specifically using the acquire(0) call rather than acquire() 
> call is because I want it to fail hard if the lock can't be acquired. I know 
> it'd be nicer to have a finer grained lock here, but I'm personally
> primarily going to be using this for rare operations rather than common
> operations.
>

Personally I am not sure that I would bother with the non-blocking acquire 
here. It will complicate the client code and the length of time that the lock 
will be held will be so small that almost all client code will simply retry 
the set. So I would go for a blocking acquire or maybe add a timeout based 
exception to catch a deadlock.
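
A rough sketch of the timeout variant, assuming a modern Python 3 where
Lock.acquire() accepts a timeout argument (the Python 2 of this thread does
not have it). LockTimeout and acquire_or_raise are hypothetical names used
only for illustration.

```python
import threading


class LockTimeout(Exception):
    """Raised when a lock could not be acquired in time (hypothetical name)."""


def acquire_or_raise(lock, timeout=5.0):
    """Block for up to `timeout` seconds, then raise instead of hanging."""
    if not lock.acquire(timeout=timeout):
        raise LockTimeout("lock not acquired within %.2f seconds" % timeout)


lock = threading.Lock()
acquire_or_raise(lock)
try:
    pass  # mutate the shared store here
finally:
    lock.release()
```

Client code then only sees an exception when something is badly stuck, rather
than having to handle a retry on every brief contention.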

> These locks above are of course in relation to write locking. I'll think
> about the read locking you've suggested.
> 
> Your locking looks incorrect on using since it both allows reading and
> writing of the store. (retrieve value & if not present create & initialise)

Yep, I did say that it was not tested :-) It is wrong on 'set' as well, 
as 'set' should acquire both the read and the write lock.

> I also think the independent locks are a misnomer, but they're useful for 
> thinking about it.
> 
> > I would suggest that you should routinely wrap shared datamodels like
> > these in thread locks to be certain about things.
> 
> Indeed. It makes the code look worse, so for this example I was really after 
> suggestions (like yours :-) of "OK, where does this break badly" as well as 
> "does the logic look sane?".

Decorators help to hide the details from the main code path. I think that 
decorators are overused sometimes, but in this case I think that they are very
useful.

> > I would also suggest that a small change to the Value class would make it
> > possible for client code to subclass it, which might make it more
> > flexible.
> 
> I'm not convinced by the changes to Value - it's there for storing arbitrary 
> values, rather than extending Value itself. It's probably worth noting 
> that .clone has changed in my version to this:
> 
>     def clone(self):
>         return Value(self.version,
>                      copy.deepcopy(self.value),self.store,self.key)
> 
> Which includes deepcopy on the value stored by Value. I'm beginning to think 
> that Value should be called "Variable" to make this clearer...
> 

I just have a tendency to like to enable flexibility from client code where it 
is possible. I think I have fought with library code that was inflexible one 
too many times :-)

> It's interesting though, after having developed large amounts of code
> based on no-shared-data & read-only/write-only pipes with data handoff and 
> not having had any major concurrency issues (despite mixing threads and non 
> threads) switching to a shared data model instantly causes problems. 
> 
> The difference is really stark. One is simple, natural and easy and the
> other is subtle & problematic. I'm not shocked, but find it amusing :-)

I agree. Where possible we use 'pipe and filter' or 'message queue' patterns 
for inter-thread communication. We have a library based around the MASCOT 
design method that uses concepts of 'queues' and 'pools', which has proved 
very powerful for large multi-threaded applications. We have found that the 
best way to avoid thread related problems is to indulge in a little design 
work to explicitly limit all thread communication to a few well thought 
through mechanisms and then police their use rigidly.
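
As a small illustration of the 'pipe and filter' idea (using the standard
queue module of a modern Python 3, not our MASCOT-based library), each thread
below owns its own data and only ever talks to its neighbour through a queue.
The function names and the _DONE sentinel are made up for this sketch.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def producer(out_q, items):
    """Source stage: push every item downstream, then signal completion."""
    for item in items:
        out_q.put(item)
    out_q.put(_DONE)


def doubler(in_q, out_q):
    """Filter stage: read from one queue, write doubled values to the next."""
    while True:
        item = in_q.get()
        if item is _DONE:
            out_q.put(_DONE)
            break
        out_q.put(item * 2)


def run_pipeline(items):
    """Wire producer -> doubler together and collect the results."""
    q1, q2 = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(q1, items)),
        threading.Thread(target=doubler, args=(q1, q2)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = q2.get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

No stage touches another stage's state, so the only locking involved is the
locking already inside queue.Queue.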

Regards

Richard

-- 
QinetiQ                                  
B009 Woodward Building
St. Andrews Road
Malvern
Worcs WR14 3PS
Jabber: RichardTaylor at jabber.org
PGPKey: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xA7DA9FD9
Key fingerprint = D051 A121 E7C3 485F 3C0E  1593 ED9E D868 A7DA 9FD9