[Python-Dev] coroutines and microthreads
Guido van Rossum
Thu, 16 Nov 2000 22:15:06 -0500
> > This is about as simple as it gets. It supports both generators (like
> > the example here) and coroutines (like Tim's Demo/threads/squasher.py
> > example).
> This looks very nice and clean. It looks sufficient for the type
> of thing I'm doing (and planning to do), but is it really a full
> coroutine API? That is, doesn't the fact that you always
> suspend to the guy who just activated you make this a
> generator API? (OTOH, if I want A and B to talk to each other
> as "coroutines", is it sufficient to make them both "generators"
> and then glue them together with another routine that just
> swaps results?)
Yes, it is a full coroutine API -- you don't have to use suspend() at
all. Appended is my version of squasher.py, an almost mechanical
translation from Tim's version (co.tran(cofoobar, arg) => cofoobar(arg)).
That's built out of a bunch of real coroutines with only explicit
> > Besides the APIs shown here (coroutine(), suspend(), and EarlyExit) I
> > propose a function current() which returns the current coroutine
> > object. There should also be a way to kill a coroutine (or at least
> > to send an exception). When a coroutine falls through at its end,
> > *some* other coroutine needs to be resumed.
> Random thoughts:
> - I've found it handy that Christian's stuff lets you grab a
> coroutine as well as return one (that is, either side can
> instigate it). Not sure if that's necessary.
Sorry, not sure what this means.
> - What do you mean by "kill a coroutine"? You can't interrupt
> one, so isn't it sufficient that when it goes out of scope it gets
> GC'd somehow?
That's one solution. You can't interrupt threads either, and most
people are happy with that (although there's a request for thread
interrupts in PEP 42).
> - It appears from your example that falling off the end
> automatically raises an EarlyExit. I think I see more
> arguments for that than against it :-).
Yes, this follows Tim's lead. What would be an argument against it?
> > Microthreads?
> > -------------
> > I'm much less clear about microthreads (uthreads). Last time, I
> > thought this was a fancy name for coroutines. It isn't! Microthreads
> > are "almost real" threads, with round-robin scheduling. What makes
> > them attractive to some is the fact that they don't use operating
> > system resources: there's no OS-level stack or kernel process table
> > entry associated with them. This also accounts for their biggest
> > weakness: when a microthread is suspended for I/O, *all* microthreads
> > are suspended. In limited contexts, such as a network server, this
> > can be solved by an approach similar to that in Gordon's
> > SelectDispatcher. (It would be a titanic effort to do this for every
> > potentially blocking I/O operation, and it probably wouldn't work well
> > with C extensions.)
> Using the raw Win32 API, I think you could come pretty close.
You can do it on Unix too, by trapping open() to always set files in
nonblocking mode. It's just a lot of work because there are so many
calls to trap (cf. your dispatcher, which just traps read, write,
accept, connect). Long ago, there was a user-mode thread package on
Solaris that did this; it's still immortalized in Python/thread_lwp.h
although I doubt that it still works. GNU Pth uses the same approach
for user-mode threads. Hm... This makes me wonder... Maybe instead
of uthreads we could promote Python linked with Pth?
> I've been wondering if it's possible to do something that would
> get Cameron to quit raving about Tcl's event loop ;-).
Cameron is now lobbying for a separation between Tcl and Tk. I guess
that will mean that the event loop will have to be moved back to Tk,
where it started many years ago. :-)
> > I'm not sure what applications require the round-robin scheduling
> > property of uthreads -- certainly Gordon's application would seem to
> > be doing just fine without it (I'm not sure if it is being scheduled
> > round-robin or not).
> Each coroutine (or whatever they are) runs until it calls one of
> the SelectDispatcher methods that suspends it. The socket
> methods suspend it until select says the socket is ready;
> yield suspends it till the next time round the select loop
> (which has a timeout). So piggish routines are encouraged to
> yield once in a while.
That's what I figured.
> (While I'm doing things the other way 'round, I believe I could
> use your API without changing SelectDispatcher's API at all.)
I think so too. Maybe that'll be my next exercise. I've already done
8 queens with coroutines (slower than recursion, alas -- should be
faster when coroutines are implemented in C though because you save on
number of function calls). I've also made tho working implementations of
my proposed coro API: one using threads, one using Stackless and the
continuation module. (Found a few quirks in the examples in your
tutorial. Interested in details?)
> > Proper round-robin scheduling for uthreads requires explicit switching
> > code in the Python interpreter. Stackless provides this, at the same
> > place where in regular Python the global interpreter lock is released
> > and re-acquired to give other threads a chance to run. Is this
> > needed?
> While I'm not really a uthread user, I think they would give you
> an unqualified "yes". The advantage to explicitly yeilding is
> that (with proper thought) you don't need nasty things like
> locks; the disadvantage (as demonstrated by a particular OS
> with a rabidly fanatical following) is that one jerk can ruin it for
Let me guess -- you don't like the Mac much, do you? :-)
And yes, I noticed that the first thing the uthread API does is define
a bunch of synchronization primitives. :-( I think it wouldn't be
hard to implement the round-robin scheduling, but it's yet more added
BTW, how's PEP 219 coming along? :-)
--Guido van Rossum (home page: http://www.python.org/~guido/)
# Coroutine example: general coroutine transfers
# (After Tim Peters)
# The program is a variation of a Simula 67 program due to Dahl & Hoare,
# (Dahl/Dijkstra/Hoare, Structured Programming; Academic Press, 1972)
# who in turn credit the original example to Conway.
# We have a number of input lines, terminated by a 0 byte. The problem
# is to squash them together into output lines containing 72 characters
# each. A semicolon must be added between input lines. Runs of blanks
# and tabs in input lines must be squashed into single blanks.
# Occurrences of "**" in input lines must be replaced by "^".
# Here's a test case:
test = """\
d = sqrt(b**2 - 4*a*c)
twoa = 2*a
L = -b/twoa
R = d/twoa
A1 = L + R
A2 = L - R\0
# The program should print:
# d = sqrt(b^2 - 4*a*c);twoa = 2*a; L = -b/twoa; R = d/twoa; A1 = L + R;
#A2 = L - R
# getline: delivers the next input line to its invoker
# disassembler: grabs input lines from getline, and delivers them one
# character at a time to squasher, also inserting a semicolon into
# the stream between lines
# squasher: grabs characters from disassembler and passes them on to
# assembler, first replacing "**" with "^" and squashing runs of
# assembler: grabs characters from squasher and packs them into lines
# with 72 character each, delivering each such line to putline;
# when it sees a null byte, passes the last line to putline and
# then kills all the coroutines
# putline: grabs lines from assembler, and just prints them
from coro import coroutine, suspend, main, EarlyExit
for line in string.splitfields(text, '\n'):
card = cogetline()
for i in range(len(card)):
ch = codisassembler()
if ch == '*':
ch2 = codisassembler()
if ch2 == '*':
ch = '^'
ch = ch2
if ch in ' \t':
ch2 = codisassembler()
if ch2 not in ' \t':
ch = ch2
line = ''
ch = cosquasher()
if ch == '\0':
if len(line) == 72:
line = ''
line = line + ch
line = line + ' ' * (72 - len(line))
line = coassembler()
main() # Hack -- resume main coroutine to exit
cogetline = coroutine(getline, test)
coputline = coroutine(putline)
coassembler = coroutine(assembler)
codisassembler = coroutine(disassembler)
cosquasher = coroutine(squasher)
# end of example