
This is a blue-sky idea. It can't happen in Python 3.x, and possibly not ever in CPython. I'm mostly hoping to get smarter people than me considering the issue.

Synopsis

Python, as a general rule, tries to be "safe" about things. If something isn't obviously correct, it tends to throw runtime errors to let you know that you need to be explicit about what you want. When there's not an obvious choice for type conversions, it raises an exception. You generally don't have to worry about resource allocation if you stay in Python. And so on.

The one glaring exception is in concurrent programs. While the tools Python has for dealing with concurrency are OK, there isn't anything to warn you when you fail to use those tools but should have. The goal of this proposal is to fix that, and get the Python interpreter to help locate code that isn't safe to use in concurrent programs.

Existence Proof

This is possible. Clojure is a dynamic language in the LISP family that will throw exceptions if you try mutating variables without properly protecting them against concurrent access. This is not to say that the Clojure solution is the solution, or even the right solution for Python. It's just to demonstrate that this can be done.

Object Changes

Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is. The mutable types - well, we can no longer go changing them willy-nilly. But any language needs mutable types, and there's nothing wrong with the ones we have.

Since immutable types don't require access protection *at all*, it might be worthwhile to add a new child of object, "immutable". Instances of this type would be immutable after creation. Presumably, the __new__ method of Python classes inheriting from immutable would be used to set the initial attributes, but the __init__ method might also be able to handle that role. However, this is a performance tweak, allowing user-written classes to skip any runtime checks for being mutated.

Binding Changes

One of the ways objects are mutated is by changing their bindings. As such, some of the bindings might need to be protected.

Local variables are fine. We normally can't export those bindings to other functions, just the values bound to them, so changing the binding can stay the same. The bound object can be exported to other threads of execution, but changing it will fall under the rules for changing objects. Ditto for nonlocals.

On the other hand, rebindings of module, class and instance variables can be visible in other threads of execution, so they require protection, just like changing mutable objects.

New Syntax

The protection mechanism is the change to the language. I propose a single new keyword, "locking", that acts similar to the "try" keyword. The syntax is:

    'locking' value [',' value]* ':' suite

The list of values are the objects that can be mutated in this lock. An immutable object showing up in the list of values is a TypeError. It's not clear that function calls should be allowed in the list of values. On the other hand, indexing and attribute access clearly should be, and those can turn into function calls, so it's not clear they shouldn't be allowed, either.

The locked values can be mutated during the body of the locking suite. For the builtin mutable types, this means invoking their mutating methods. For modules, classes and object instances, it means rebinding their attributes. Locked objects stay locked during function invocations in the suite. This means you can write utility functions that expect to be passed locked objects to mutate.

A locking statement can be used inside another locking statement. See the Implementation section for possible restrictions on this.

Any attempt to mutate an object that isn't currently locked will raise an exception - possibly ValueError, possibly a new exception class just for this purpose. This includes rebinding attributes of objects that aren't locked.

Implementation

There are at least two ways this can be implemented, both with different restrictions on the suite. While both of them can probably be optimized if it's known that there are no other threads of execution, checking for attempts to mutate unlocked objects should still happen.

1) Conventional locking

All the objects being locked have locks attached to them, which are acquired before entering the suite. The implementation must order the locked objects in some repeatable way, so that two locking statements that have more than one locked object in common will obtain the locks on those objects in the same order. This will prevent deadlocks.

This method requires that the initial locking statement lock all objects that may be locked during the execution of its suite. This may be a reason for allowing functions as locking values, as a way to get locks on objects that code called in the suite is going to need. Another downside is that the programmer needs to handle exceptions raised during the suite to ensure that a set of related changes leaves the relevant objects in a consistent state. In this case, an optional 'except' clause should be added to the locking statement to hold such code.

2) Software Transactional Memory

In an STM implementation, copies of the locked objects are created by the locking statement, and the originals are "fingerprinted" in some way. The locking suite then runs. When the suite completes, the fingerprints of the originals are checked to see if some other thread of execution has changed them. If they haven't changed, they are replaced by the copies, and execution continues. If the originals have changed, the entire process starts over. In this implementation, the only actual locking is during the original fingerprinting (to ensure that a consistent state is captured) and at the end of the suite. FWIW, this is one of the models provided by Clojure.

The restriction on the suite in this case is that running it twice - except for changes to the locked objects - needs to be acceptable. In exchange, exceptions don't need to be handled by the programmer to ensure consistency: if an exception happens during the execution of the suite, the original values are never replaced.

<mike

--
Mike Meyer <mwm@mired.org>  http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
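[Editorial note: as a concreteness check on implementation option 1, here is a minimal sketch of the ordered-lock-acquisition idea expressed as a plain context manager rather than new syntax (close to the simulation Jim Jewett suggests later in the thread). Everything here is hypothetical: the per-object lock registry, the locking() name, and keying by id() are illustration choices, not part of the proposal, and the sketch only shows the deadlock-avoidance half, not the "error on unlocked mutation" checks.]

    import threading
    from contextlib import contextmanager

    _locks = {}                          # id(obj) -> Lock; a real implementation
    _registry_guard = threading.Lock()   # would attach the lock to the object itself

    def _lock_for(obj):
        with _registry_guard:
            return _locks.setdefault(id(obj), threading.Lock())

    @contextmanager
    def locking(*objects):
        # Acquire locks in a single repeatable order (here: by id()) so that
        # two overlapping locking statements can't deadlock on each other.
        locks = [_lock_for(obj) for obj in sorted(objects, key=id)]
        for lock in locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(locks):
                lock.release()

    # Usage sketch: both suites lock the same two objects, always in id()
    # order, so they serialize instead of deadlocking.
    #
    #     with locking(account_a, account_b):
    #         account_a["balance"] -= 10
    #         account_b["balance"] += 10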

On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer <mwm@mired.org> wrote:
This will basically run into the same problem that free-threading-in-CPython concepts do - the fine grained checks you need to implement it will kill your single-threaded performance. Since Python is a scripting language that sees heavy single-threaded use, that's not an acceptable trade-off. Software transactional memory does offer some hope for a more reasonable alternative, but that has its own problems (mainly I/O related). It will be interesting to see how PyPy's experiments in this space pan out. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Oct 30, 2011 at 8:21 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This argument seems familiar. Oh, right, it's "lack of performance will kill you." That was given as the reason that all of the following were unacceptable:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All of those are believed (at least by their proponents) to make programming easier and/or faster at the cost of performance. The performance cost was "too high" for all of them when they were introduced, but they all became broadly accepted as the combination of increasing computing power (especially CPU support for them) and increasingly efficient implementation techniques drove that cost down to the point where it wasn't a problem except in very special cases.
Since Python is a scripting language that sees heavy single-threaded use, that's not an acceptable trade-off.
Right - few languages manage to grow one of those features without a name change of some sort, much less two (with the obvious exception of LISP). Getting them usually requires moving to a new language. That's one reason I said it might never make it into CPython. But the goal is to get people to think about fixing the problems, not dismiss the suggestion because of problems that will go away if we just wait long enough. For instance, the issue of single-threaded performance can be fixed by taking threading out of a library, and giving control of it to the interpreter. This possibility is why I said "thread of execution" instead of just "thread." If the interpreter knows when an application has concurrent execution, it also knows when there aren't any, so it can support an option not to make those checks until the performance issues go away.
Right - you can't do I/O inside a transaction. For writes, this isn't a problem. For reads, it is, since they imply binding and/or rebinding. So an STM solution may require a second mechanism designed for single statements to allow reads to happen. <mike

On Mon, 31 Oct 2011 10:59:56 -0700 Mike Meyer <mwm@mired.org> wrote:
Agreed, but killing performance by double digits in a new release is generally considered quite ugly by users. Also, I'm not convinced that such approaches really bring anything. My opinion is that good multi-threaded programming is achieved through careful abstraction and separation of concerns, rather than advanced language idioms. Regards Antoine.

On Mon, Oct 31, 2011 at 11:25 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
That may be why languages rarely adopt such features. The users won't put up with the cost hit for the development versions. Except for LISP, of course, whose users know the value of everything but the cost of nothing :-).
Doesn't that cover all kinds of good programming? But advanced language features are there because they are supposed to help with either abstraction or separation of concerns. Look at the list I presented again:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All help with either abstraction or separation of concerns in some way (OK, for byte-compilation the concerns are external, in that it's code portability). So do the features I'd like to see. In particular, they let you separate code in which concurrency is a concern from code where it isn't.

Another aspect of this issue (and yet another possible reason that these features show up in new languages rather than being added to old ones) is that such changes usually require changing the way you think about programming. It takes a different mindset to program with while loops than with if & goto, or OO than procedural, or .... Similarly, it takes a different mindset to program in a language where changing an object requires special consideration. This may be too much of a straitjacket for a multi-paradigm language like Python (though Oz manages it), but making the warnings ignorable defeats the purpose. <mike

On Mon, Oct 31, 2011 at 12:14 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Did you not read the next paragraph, the one where I explained how it helped separate issues? But you're right. Manually listing them isn't all that desirable, it's just better than what we have now. I initially saw a locking keyword with no list as implying that all such things should be locked automatically. I can't see how to do that with the locking implementation, so I dropped it. However, it can be done with an STM implementation. Hmm. That could be the distinguishing feature to deal with IO: If you don't have the list, you get an STM, and it does the copy/fingerprint dance when you mutate something. If you list a value, you get real locks so it won't retry and you can safely do IO. <mike
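[Editorial note: for what it's worth, the copy/fingerprint/retry dance of the STM option can be sketched in a few lines of present-day Python. This is only a toy to make the control flow concrete: the Cell class, the version counter standing in for a "fingerprint", and the atomically() helper are all invented for illustration, and the suite must tolerate running more than once, exactly as the proposal says.]

    import threading

    class Cell:
        """Hypothetical container for one piece of shared, transactionally updated state."""
        def __init__(self, value):
            self.value = value
            self.version = 0                # stands in for the "fingerprint"
            self.commit_lock = threading.Lock()

    def atomically(cell, suite):
        """Run suite(old_value) -> new_value until it commits without interference."""
        while True:
            with cell.commit_lock:          # capture a consistent snapshot
                seen, working = cell.version, cell.value
            result = suite(working)         # note: may execute more than once
            with cell.commit_lock:
                if cell.version == seen:    # nobody else committed meanwhile
                    cell.value = result
                    cell.version += 1
                    return result
            # Someone else committed first: retry against the fresh state.

    # Usage sketch:
    #     balance = Cell(0)
    #     atomically(balance, lambda b: b + deposit)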

Hi, 2011/10/31 Mike Meyer <mwm@mired.org>
PyPy offers a nice platform to play with this kind of concept. For example, even if it's not the best implementation, it's easy to add a __setattr__ to the base W_Object class which will check whether the object is allowed to mutate. But you will certainly open a can of worms here: even immutable objects are modified (e.g. str.__hash__ is cached), and many functions that you call will need to add their own locks; is it possible to avoid deadlocks in this case? -- Amaury Forgeot d'Arc

On Mon, Oct 31, 2011 at 7:33 AM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
Just what I need - another project. I'll go take a look at PyPy. In theory, things that don't change the externally visible behavior of an object don't need to be checked. The goal isn't to make the language perfectly safe, it's to make you be explicit about when you want to do something that might not be safe in a concurrent environment. This provides a perfect example of where an immutable subclass would be useful. It would add a __setattr__ that throws an exception. Then the string class (which would inherit from the immutable subclass) could cache its hash by doing something like W_Object.__setattr__(self, "_cached_hash", hash(self)) (possibly locked). <mike
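[Editorial note: to make the "immutable subclass" idea a bit more concrete, here is a rough pure-Python sketch. The Immutable name, the Point example, and the _cached_hash attribute are all hypothetical; a real version would presumably use a flag in the object header rather than a Python-level __setattr__.]

    class Immutable(object):
        """Hypothetical base class: instances reject attribute rebinding after creation."""
        def __setattr__(self, name, value):
            raise TypeError("%s instances are immutable" % type(self).__name__)

        def __delattr__(self, name):
            raise TypeError("%s instances are immutable" % type(self).__name__)

    class Point(Immutable):
        def __new__(cls, x, y):
            self = object.__new__(cls)
            # __new__ sets the initial state by bypassing the guard above.
            object.__setattr__(self, "x", x)
            object.__setattr__(self, "y", y)
            return self

        def __hash__(self):
            # Caching the hash doesn't change externally visible behavior,
            # so it bypasses __setattr__ the same way str's C-level cache would.
            try:
                return object.__getattribute__(self, "_cached_hash")
            except AttributeError:
                h = hash((self.x, self.y))
                object.__setattr__(self, "_cached_hash", h)
                return h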

On Sun, Oct 30, 2011 at 11:11 PM, Mike Meyer <mwm@mired.org> wrote:
The one glaring exception is in concurrent programs. ...
Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is.
I think a state bit in the object header would be more than justified, if we could define immutability. Are strings immutable? Do they become immutable after the hash is cached and the string is marked Ready? Can a subtype of strings with mutable attributes (that are not involved in comparison?) still be considered immutable?
'locking' value [',' value]*':' suite
... The list of values are the objects that can be mutated in this lock.
You could already simulate this with a context manager ... it won't give you all the benefits (though with a lint-style tool, it might), but it will showcase the costs. (In terms of code beauty, not performance.) Personally, I think those costs would be too high, given the current memory model, but I suppose that is at least partly an empirical question. If there isn't a way to generate this list automatically, it is the equivalent of manual memory management. -jJ

On Mon, Oct 31, 2011 at 1:47 PM, Jim Jewett <jimjjewett@gmail.com> wrote:
No, it isn't. The difference is in how mistakes are handled. With manual memory management, references through unassigned or freed pointers are mistakes, but may not generate an error immediately. In fact, it's possible the program will run fine and pass all your unit tests. This is the situation we have now in concurrent programming: mutating a shared object without an appropriate lock is an error that probably passes silently, and it may well pass all your tests without a problem (constructing a test to reliably trigger such a bug is an interesting problem in and of itself). While you can automatically manage memory, there are other resources that still have to be managed by hand (open files spring to mind). In some cases you might be able to handle them completely automatically, in others not. In either case, Python manages things so that reading from a file that hasn't been opened is impossible, and reading from one that has been closed generates an immediate error. The goal here is to move from where we are to a place similar to where file handling is, so that failing to properly deal with the possibility of concurrent access causes an error when it happens, not at a point distant in both time and space. BTW, regarding the performance issue: I figured out how to implement this so that the run-time cost is zero aside from the lock & unlock steps. <mike

Mike Meyer wrote:
I don't think what you're suggesting would achieve this, though. The locking required for correctness often involves more than one object or more than one operation on an object. Consider

    new_balance = balance + deposit
    lock(balance)
    balance = new_balance
    unlock(balance)

This wouldn't trigger any of your alarms, but it would still be wrong. -- Greg
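[Editorial note: Greg's point can be restated in runnable form with an ordinary threading.Lock standing in for the proposed statement; the deposit_* names and module-level balance are invented for illustration. The read has to sit inside the same critical section as the write, or two concurrent deposits can overwrite each other.]

    import threading

    balance = 0
    balance_lock = threading.Lock()

    def deposit_wrong(amount):
        global balance
        new_balance = balance + amount   # read happens outside the lock...
        with balance_lock:
            balance = new_balance        # ...so this can silently drop a concurrent deposit

    def deposit_right(amount):
        global balance
        with balance_lock:               # read-modify-write as one critical section
            balance = balance + amount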

On Mon, Oct 31, 2011 at 11:00 PM, Greg Ewing <greg.ewing@canterbury.ac.nz>wrote:
You're right - I chose my words poorly. As stated, solving it would involve solving the halting problem. Replace the word "properly" with "at all". I.e. - if you don't think about a concurrent access and should have, it'll cause an error. If you think about it and get it wrong - well, nothing will prevent all bugs. Partially automated resource allocation doesn't prevent the programmer from writing bad code, and this is in that category. <mike

On Sun, Oct 30, 2011 at 8:11 PM, Mike Meyer <mwm@mired.org> wrote:
Do you mean that at any time attempting to mutate an unlocked object throws an exception? That would mean that all of my current code is broken. Do you mean that inside the control of 'locking', you can't mutate an unlocked object? That still breaks lots of code that is safe. You can't use itertools.cycle anymore until it's updated in a completely unnecessary way:

    def cycle(iterable):
        # cycle('ABCD') --> A B C D A B C D A B C D ...
        saved = []
        for element in iterable:
            yield element
            saved.append(element)  # throws an exception when called on a locked iterable
        while saved:
            for element in saved:
                yield element

I think the semantics of this need to be tightened up. Furthermore, merely *reading* an object that isn't locked can cause problems. This code is not thread-safe:

    if element in dictionary:
        return dictionary[element]

so you have to decide how much safety you want and what cost you're willing to pay for it. --- Bruce

On Mon, Oct 31, 2011 at 3:58 PM, Bruce Leban <bruce@leapyear.org> wrote:
Yes, that's the idea. There are some exceptions, but you have to explicitly work around that restriction.
That would mean that all of my current code is broken.
Pretty much, yes. It's like adding garbage collection and removing alloc*/free. It's going to break a *lot* of code. That's why I said "not in 3.x, and possibly never in CPython."
According to what I wrote, yes, it does. Since the list being mutated is only visible inside the function, it doesn't actually need to be locked. It might be possible to figure out that this is the case at compile time and thus allow the code to run unmodified, but that's 1) hard, 2) will miss some cases, and 3) seems like a corner case. This proposal would break enough code that not breaking this case doesn't seem to be worth the effort. That's a question that needs to be answered.
I think the semantics of this need to be tightened up.
That's why I brought it up. I'm trying to get more eyes on the issue.
You're right - it's not thread safe. However, it also doesn't suffer from the problem I'm trying to deal with, where you mutate an object in a way that leaves things broken, but won't be detected at that point. If it breaks because someone mutates the object underneath it, it'll throw an exception at that point. I know you can construct cases where that isn't so. Maybe we need two types of locking - one that allows readers, and one that doesn't. I could live with that, as you'd still have to consider the issue where you mutate the object. <mike

On Mon, Oct 31, 2011 at 4:19 PM, Mike Meyer <mwm@mired.org> wrote:
In order to make concurrent code slightly safer, you're going to break all existing programs that don't use concurrency. That seems to me like a new language, not Python. You've been on this list long enough to see the attention that's paid to backward compatibility.
I think the cases where non-thread-safe code won't throw an exception are numerous. For example, the equally trivial:

    if element not in dictionary:
        dictionary[element] = 0

Heck, even this is not safe:

    dictionary[element] += 1

If you're going to tackle thread safety, it should address more of the problem. These bugs are in many ways worse than mutating "an object in a way that leaves things broken, but won't be detected at that point." The above bugs may *never* be detected. I've come across bugs like that that were in code for many years before I found them (and I'm sure that's happened to others on this list as well). The first thing to do is identify the problems you want to solve and make sure that the problems are well understood. Then design some solutions. Starting with a bad solution to a fraction of the problem isn't a good start. --- Bruce
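[Editorial note: Bruce's two examples are both read-modify-write sequences in disguise. A runnable sketch (the counts dict, its lock, and the tally_* names are invented here) shows the interleaving hazard and one conventional fix:]

    import threading

    counts = {}
    counts_lock = threading.Lock()

    def tally_unsafe(element):
        if element not in counts:    # another thread may insert between this test...
            counts[element] = 0
        counts[element] += 1         # ...and this load/add/store, losing updates

    def tally_safe(element):
        with counts_lock:            # the whole check-then-act is one critical section
            counts[element] = counts.get(element, 0) + 1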

I wouldn't say it necessarily has to be a new language. All too many new languages have arisen when the problem was actually with some specific part of an implementation, or could have been fixed with a careful iteration. It might be slightly off-topic, but one of Python's strengths in my eyes has been very careful incremental improvement. Prior to 2.5, Python was of no interest to me due to various concerns that have now been fixed. Indeed, the language is significantly more consistent and usable than it was in the past. The last thing I (and many others) would want to see is yet another language created because an existing excellent language's implementation couldn't keep up with the times, or surmount an obstacle. Innovations like PyPy, multiprocessing, and the currently debated coroutines are what give Python a chance in the future. I might be beating a dead horse, but the last thing Python should do is roll over and die because "it's too hard". Just wanted to point out that language != implementation.

participants (8): Amaury Forgeot d'Arc, Antoine Pitrou, Bruce Leban, Greg Ewing, Jim Jewett, Matt Joiner, Mike Meyer, Nick Coghlan