Static vs. dynamic checking for support of concurrent programming

Mike Meyer mwm at mired.org
Fri Aug 26 05:59:31 CEST 2005


The recent thread on threads caused me to reread the formal definition
of SCOOP, and I noticed something I hadn't really impressed me the
first time around: it's using staticly checkable rules to help ensure
correct behavior in a concurrent environment.

That's impressive. That's *really* impressive. I know of no other
language that does that - though they probably exist. I'd be
interested in references to them.

Normally, I think of static checking as something that's not critical,
as a good test suite will cause exceptions on the things that static
checking catches. But concurrency errors are a completely different
class of critter entirely, and generally result in really nasty
runtime errors.

Which leads to the question - what would it take to get Python to
throw exceptions in the cases where a compiler implementing SCOOP will
emit an error message? There's two reasons for looking into this. One,
wanting to continue to believe that static checking doesn't really buy
anything. Two, wanting Python to have more powerful and robust tools
for dealing with threads. If these can be made to work, then they're a
candidate for addition. If not, then we need to look elsewhere. I'm
not going to worry about how the semantics would be implemented at
this point - I'm more interested in whether the static rules can be
enforced in a dynamic environment.

First, the rules. These are greatly simplified and mangled to describe
a hypothetical Python-like language. If you want to see the formald
definitions yourself, you can find them at <URL:
http://archive.eiffel.com/doc/manuals/technology/concurrency/CONCURRENCY.html
>.

The new feature: a variable can be marked as "separate". This means
the object it refers to is running on a different thread, and changes
the semantics of accessing it in various ways:

First, invoking a method and ignoring it's return value is run
asynchronously. Likewise, an assigment to an attribute is run
asynchronously. I'm going to call such actions a "command".  Commands
are queued by the object in question, and are guaranteed to run in the
order they appear in the code. The execution order of the commands
with respect to purely thread-local statements or commands on other
separate objects is not guaranteed.

Reading the value of an attribute is a synchronization point. All
commands on the object will be execute before you are allowed to read
the value of the attribute, and no following commands on the object
will execute before the value is read.

Note that this makes the Law of Demeter something your really want to
pay attention to. Doing "so.addChild(newChild)" runs asynchronously,
whereas doing "so.children.append(newChild)" causes a read of
so.children, which will synchronize the object, then an update of
children.

Now, the constraints:

1) Any object assigned to a separate variable must really be
   separate. This applies to all the ways you can do an assignment.

2) The invocation of a command on a separate objects is only legal if
   the separate object is a formal argument of the enclosing
   function/method.

The first one seems trivial, but avoids some nasty bugs. For instance,
if you have a list of separate objects and you want to divide a
workload up among them, you'd do something like:

def divide(self, work, workers):
    for worker in workers:
        self.startWork(worker, work.getNextChunk)

def startWork(self, separate worker, chunk):
    worker.process(chunk)


Adding a deceleration seems to the the only way to handle this in
Python. It's basically an assertion isinstance(object,
Separate). startWork should throw an exception if it's called with an
object that isn't separate.

What happens if you call a routine with a separate object and the
routine doesn't expect one? That's almost certainly bad - most
functions expect all their statements to be executed in order. Does
solving this requrie a check on every argument passed to a function to
verify that it's not separate?

The second rule is how locking is handled. If you invoke a function
with one or more separate objects, the language processor waits until
it has the locks on all the separate objects before running the body
of the routine. This means that locking order when you need multiple
locks is determined by the language processor, not the programmer, so
you've eliminated locking order errors as a source of deadlocks.

If you don't declare separate objects, then this would seem to require
that every command be checked to see if it's on a separate object, and
if so throw an exception if the object isn't a formal argument to the
enclosing routine. If you have declarations and enforce them, you can
limit the checks to objects declared separate.

These solutions aren't very dynamic. Can anyone see how to do this
without having to declare separate variables as such?

        Thanks,
        <mike
-- 
MIKE Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.



More information about the Python-list mailing list