[Python-Dev] Adding Python-Native Threads

Adam Olsen rhamph at gmail.com
Sun Jun 26 11:34:07 CEST 2005


There are some very serious problems with threads today.  Events are
often proposed as a solution to these problems, but they have their own
issues.  A detailed analysis of the two options and some solutions
(most of which are used in my proposal) can be found in Why Events Are
A Bad Idea [0].  I'm going to skip the event aspects and sum up the
thread problems here:

* Expensive (resident memory, address space, time to create and to
             switch between)
* Finely-grained atomicity (Python is coarser than C, but still finer
                            than we need)
* Unpredictable (switching between them is not deterministic)
* Uninterruptible (no way to kill them, instead you must set a flag and
                   make sure they check it regularly)
* Fail silently (if they die with an exception it gets printed to the
                 console but the rest of the program is left uninformed)
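
To make two of these points concrete, here is a minimal sketch using the
standard threading module.  The names worker, failing_worker and stop are
made up for illustration.  It shows the flag-checking idiom needed to stop a
thread, and a thread that dies with an exception while the main thread
carries on without being told.

```python
import threading
import time

stop = threading.Event()

def worker():
    # The thread has to poll the flag itself; nothing outside can kill it.
    while not stop.is_set():
        time.sleep(0.01)

def failing_worker():
    # The traceback goes to stderr; the main thread is never informed.
    raise ValueError("nobody in the main thread will see this")

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
stop.set()          # ask politely, then wait for the next flag check
t.join()

f = threading.Thread(target=failing_worker)
f.start()
f.join()            # returns normally even though the thread failed
main_survived = True
```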

To resolve these problems I propose adding lightweight cooperative
threads to Python.  They can be built around a Frame object, suspending
and resuming like generators do.
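
As a rough illustration of the idea (not the proposed implementation), plain
generators can already be driven as cooperative threads by a small
round-robin loop.  The names worker and run_all are assumptions made for this
sketch.

```python
from collections import deque

def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # explicit switch point; nothing can preempt us elsewhere

def run_all(threads):
    # Round-robin scheduler: resume each suspended frame in turn.
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
        except StopIteration:
            continue        # this thread has returned
        ready.append(thread)

log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
# log is now [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)]
```

Switching is deterministic here because it only ever happens at a yield.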

That much is quite easy.  Avoiding the C stack is a bit harder.
However, it can be done if we introduce a new operator for threaded
calls, which I refer to as non-atomic calls.  I propose the syntax
"func@()".  As a bonus this prevents any operation which doesn't use
the non-atomic call syntax from triggering a thread switch accidentally.
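
The "func@()" syntax does not exist, so it cannot be shown running.  As a
loose analogy only, the "yield from" expression of later Python versions
marks a call that may suspend the caller, while an ordinary call cannot.  In
this sketch read_line and handler are hypothetical names.

```python
def read_line(source):
    # Stands in for a blocking operation: suspend until data is available.
    while not source:
        yield "waiting"
    return source.pop(0)

def handler(source, out):
    line = yield from read_line(source)  # analogous to read_line@(source)
    out.append(line.upper())             # plain call: no switch possible

source, out = [], []
h = handler(source, out)
first = next(h)           # handler is suspended inside read_line
source.append("hello")
try:
    next(h)               # resumes, finishes, and returns
except StopIteration:
    pass
```

Every point at which a switch can occur is visible at the call site.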

After that you need a way of creating new threads.  I propose we use a
"sibling func@()" statement (note that the "@()" would be parsed as
part of the statement itself, and not as an expression).  It would
create a new thread and run func in it, but more importantly it would
also "prejoin" the new thread to the current function.  This means that
for the current function to return to its parent, all of the threads it
created must first return themselves.  It also means that an exception
in any of them will be propagated to the parent of the creator, after
interrupting all of its siblings (with an Interrupted exception) so
that it is permitted to return.
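
The prejoin behaviour can be approximated with generators.  This is a sketch
under my own assumptions, not the proposed implementation: run_siblings plays
the role of the creating function, and Interrupted is a stand-in class for the
proposed exception.

```python
class Interrupted(Exception):
    """Hypothetical stand-in for the proposal's Interrupted exception."""

def run_siblings(siblings):
    # Returns only once every sibling has returned.  If one fails, the
    # rest are interrupted and the failure propagates to our caller.
    live = list(siblings)
    while live:
        for thread in list(live):
            try:
                next(thread)
            except StopIteration:
                live.remove(thread)
            except Exception as failure:
                live.remove(thread)
                for other in live:
                    try:
                        other.throw(Interrupted())
                    except (Interrupted, StopIteration):
                        pass
                raise failure

events = []

def steady(name):
    try:
        while True:
            events.append(name + " runs")
            yield
    finally:
        events.append(name + " cleaned up")

def faulty():
    yield
    raise ValueError("boom")

try:
    run_siblings([steady("a"), faulty(), steady("b")])
except ValueError as exc:
    events.append("parent saw " + str(exc))
```

The finally blocks run before the parent sees the error, so nothing is left
running behind its back.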

Now for an example of how this would work in practice, using an echo
server.  Note that I am assuming a "con" module (short for concurrent)
that would contain standard functions to support these Python-native
threads.


from con import bootstrap, tcplisten, ipv6addr, ipv4addr

def main():
    main_s = tcplisten@(ipv6addr('::1', 6839), ipv4addr('127.0.0.1', 6839))
    while 1:
        sibling echo@(main_s.accept@())

def echo(s):
    try:
        try:
            while 1:
                s.write@(s.read@())
        except (EOFError, IOError):
            pass
    finally:
        s.close()

if __name__ == '__main__':
    try:
        bootstrap(main)
    except KeyboardInterrupt:
        pass


And here is a diagram of the resulting cactus stack, assuming three
clients, two of which are reading while the third is writing.


bootstrap - main - accept
             |
            echo - read
             |
            echo - read
             |
            echo - write


Some notes:
* An idle@() function would be used for all thread switches.  I/O
  functions such as read@() and write@() would use it internally, and
  idle would internally call a function to do poll() on all file
  descriptors.
* idle is the function that will raise Interrupted.
* It should be illegal to attempt a non-atomic call once your thread is
  in an interrupted state.  This ensures you cannot get "interrupted
  twice", by having something further up the call stack fail and
  interrupt you while you are already handling an interruption.
* Being a cooperative system, it does not allow you to use multiple
  CPUs simultaneously.  I don't regard this as a significant problem,
  and in any case there are better ways to use multiple CPUs if you need to.
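
To illustrate the first note, here is a sketch of a scheduler whose idle step
polls file descriptors.  It uses the standard selectors module in place of a
raw poll() call.  The functions read, echo_once and run are hypothetical and
only handle the read side.

```python
import selectors
import socket

def read(sock):
    # Hypothetical read@(): park this thread until sock is readable.
    yield sock
    return sock.recv(4096)

def echo_once(sock, log):
    data = yield from read(sock)
    log.append(data)

def run(threads):
    sel = selectors.DefaultSelector()
    ready = list(threads)
    waiting = 0
    while ready or waiting:
        for thread in ready:
            try:
                sock = next(thread)   # runs until the thread needs I/O
            except StopIteration:
                continue
            sel.register(sock, selectors.EVENT_READ, thread)
            waiting += 1
        ready = []
        if waiting:
            # The idle step: block until some descriptor is ready.
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
    sel.close()

a, b = socket.socketpair()
log = []
b.sendall(b"ping")
run([echo_once(a, log)])
a.close()
b.close()
```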


References:
[0] http://www.usenix.org/events/hotos03/tech/full_papers/vonbehren/vonbehren_html/index.html

-- 
Adam Olsen, aka Rhamphoryncus

