[Python-Dev] uthread strawman

Gordon McMillan gmcm@hypernet.com
Wed, 8 Nov 2000 12:09:48 -0500


[Guido]
> Without continuations, but with microthreads (uthreads) or
> coroutines, each (Python) stack frame can simply be "paused" at a
> specific point and continued later.  The semantics here are
> completely clear (except perhaps for end cases such as unhandled
> exceptions and intervening C stack frames).

Exceptions require some thought, particularly because the "for" 
protocol uses an IndexError as a signal. In my own stuff I've found I 
need to catch all exceptions in the coroutine, primarily because I've 
always got resources to clean up, but clearly the implementation has 
to do the right thing when an exception crosses the boundary.
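
To make the "for" protocol concern concrete, here is a minimal sketch.
The loop calls __getitem__ with 0, 1, 2, ... and treats IndexError as the
end-of-sequence signal. The names Countdown, Buggy and collect are
hypothetical. The point is that a stray IndexError raised by a bug inside
__getitem__ cannot be told apart from normal termination. (Current Python
still honours this protocol for classes that define __getitem__ but not
__iter__.)

```python
class Countdown:
    """Sequence driven by the __getitem__ / IndexError protocol."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if i >= self.n:
            raise IndexError(i)      # the normal "I'm done" signal
        return self.n - i


class Buggy:
    """An accidental IndexError looks exactly like end of sequence."""

    def __init__(self):
        self.data = [10, 20, 30]
        self.lookup = [0, 1, 7, 2]   # 7 is a deliberate bug: out of range

    def __getitem__(self, i):
        return self.data[self.lookup[i]]


def collect(seq):
    """Run an ordinary for loop over seq and return what it produced."""
    out = []
    for item in seq:
        out.append(item)
    return out
```

collect(Countdown(3)) gives [3, 2, 1], as intended. collect(Buggy()) gives
[10, 20] with no error reported: the loop swallows the bug and the 30 is
silently lost.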
 
> A strawman proposal:
> 
> The uthread module provides the new functionality at the lowest
> level. Uthread objects represent microthreads.  An uthread has a
> chain of stack frames linked by back pointers just like a regular
> thread. Pause/resume operations are methods on uthread objects. 
> Pause/resume operations do not address specific frames but
> specific uthreads; within an uthread the normal call/return
> mechanisms can be used, and only the top frame in the uthread's
> stack of call frames can be paused/resumed (the ones below it are
> paused implicitly by the call to the next frame, and resumed when
> that call returns).

I'm not convinced (though I'm not asking you to convince me - I need to 
ponder some more) that this is the right approach. My worry is that to 
do coroutines, we end up with a bunch of machinery on top of 
uthreads, just like Tim's old coroutine stuff, or implementations of 
coroutines in Java.

My mental test case is using coroutines to solve the impedance 
mismatch problem. SelectDispatcher is a simple example (write 
"client" code that looks like it's using blocking sockets, but multiplex 
them behind the curtain). Using a "pull" parser as a "push" parser is 
another case (that is, letting it think it's doing its own reads).

But what about using a "pull" lexer and a "pull" parser, but tricking 
them with coroutines so you can "push" text into them? Tim's 
implementation of the Dahl & Hoare example (which I rewrote in 
mcmillan-inc.com/tutorial4.html) shows you *can* do this kind of thing 
on top of a thread primitive, but might it not be much better done on 
a different primitive?

Again, I'm not really asking for an answer, but I think this type of 
problem is not uncommon, and a wonderful use of coroutines, so I'm 
wondering if this is a good trade-off.
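
The lexer/parser case can be sketched with generator-based coroutines
from present-day Python (generators with send() did not exist when this
was posted, so this is neither the uthread API nor Tim's implementation).
Both routines are written "pull" style: each bare yield reads like a
blocking "give me the next thing" call. The caller nevertheless pushes
text in from outside. The helper coroutine(), the toy "name = value"
grammar and the space-delimited tokens are all assumptions made for the
example.

```python
def coroutine(func):
    """Create the generator and advance it to its first yield."""
    def start(*args):
        gen = func(*args)
        next(gen)
        return gen
    return start


@coroutine
def parser(results):
    """Pull style: asks for the next token whenever it wants one."""
    while True:
        name = yield             # reads like: name = get_token()
        eq = yield
        value = yield
        if eq != "=":
            raise SyntaxError("expected '=', got %r" % eq)
        results.append((name, value))


@coroutine
def lexer(target):
    """Pull style: asks for more text whenever its buffer runs dry."""
    buf = ""
    while True:
        buf += yield             # reads like: buf += read()
        # Everything before the last space is complete; keep the tail.
        *complete, buf = buf.split(" ")
        for tok in complete:
            if tok:
                target.send(tok)


results = []
lex = lexer(parser(results))
for chunk in ["x =", " 1 y", " = 2 "]:   # chunk boundaries are arbitrary
    lex.send(chunk)                      # the caller "pushes" text in
```

After the loop, results holds [("x", "1"), ("y", "2")], even though the
chunk boundaries fall in the middle of statements.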

> - u.yield() pauses the current uthread and resumes the uthread
> u where it was paused.  The current uthread is resumed when
> some other uthread calls its yield() method.  Calling
> uthread.current().yield() is a no-op. 

This doesn't seem like enough: sort of as though you designed a 
language in which "call" and "return" were spelled the same way. 
Certainly for coroutines and generators, people gravitate towards 
paired operations (e.g. "suspend" and "resume"). Again, Tim's 
demonstrated you can do that on top of threads, but it sure seems to 
me like they should be primitives.
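
As an illustration of what paired operations look like, here is a small
sketch using present-day Python generators, which are one possible
spelling and not the proposed uthread API. The "suspend" side (yield)
does not name a target; control goes back to whoever resumed it. The
"resume" side (next) names the coroutine to run. The names worker, main
and log are hypothetical.

```python
log = []


def worker():
    log.append("worker: started")
    yield                        # suspend: back to whoever resumed us
    log.append("worker: resumed")
    yield                        # suspend again


def main():
    w = worker()
    log.append("main: resume worker")
    next(w)                      # resume: names the coroutine to run
    log.append("main: back in main")
    next(w)
    log.append("main: back in main again")


main()
```

The log alternates between the two sides in strict call/return fashion,
which is the asymmetry a single symmetric yield() has to be wrapped in
extra machinery to get.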

> I think this API should be enough to implement Gordon's
> SelectDispatcher code.  In general, it's easy to create a
> scheduler uthread that schedules other uthreads.

Thank you :-).

> Open issues:
> 
> - I'm not sure that I got the start conditions right.  Should func() be
>   allowed to run until its first yield() when uthread.new(func) is
>   called?

For coroutine stuff, that doesn't bother me. For uthreads, I'd think (like 
real threads) that creation and starting are different things.
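
For comparison, a sketch of the "creation and starting are different
things" behaviour, using a present-day Python generator as a stand-in for
a uthread (an assumption; uthread.new() itself is only a strawman).
Creating the generator runs none of func(); the first next() runs it up
to its first yield. The names events, created_only and after_start are
hypothetical.

```python
events = []


def func():
    events.append("func running")
    yield


u = func()                       # creation: nothing in func() has run
created_only = list(events)      # snapshot taken before starting

next(u)                          # start: run up to the first yield
after_start = list(events)
```

Here created_only is empty and after_start contains the single entry,
which is the same split that threading.Thread makes between constructing
a thread and calling its start() method.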
 
> - I'm not sure that the rules for returning and raising exceptions
>   from func() are the right ones.
> 
> - Should it be possible to pass a value to another uthread by passing
>   an argument to u.yield(), which then gets returned by the
>   resumed yield() call in that uthread?

Argument (in and/or out) passing is a necessity for generators and 
coroutines. But, as mentioned above, I don't think a symmetrical 
"yield" is the right answer.
 
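A sketch of argument passing in both directions with an asymmetric pair,
again using present-day Python generators rather than the proposed
u.yield(). The value handed to send() comes out of the paused yield
expression (in), and the value given to yield comes back as the result of
send() (out). The running-average example and its names are hypothetical.

```python
def averager():
    """Receive numbers one at a time and report the running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average    # out: average so far; in: next value
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)                        # advance to the first yield
results = [avg.send(v) for v in (10, 20, 60)]
```

results ends up as [10.0, 15.0, 30.0]: each send() passes one value in
and gets the updated average back.
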
> - How do uthreads interact with real threads?  Uthreads are explicitly
>   scheduled through yield() calls; real threads use preemptive
>   scheduling.  I suppose we could create a new "main" uthread for
>   each real thread.  But what if we yield() to an uthread that's
>   already executing in another thread?  How is that error
>   detected?

I think I would (perhaps naively) expect that I could create a uthread in 
one (real) thread, and then pass it off to another (real) thread to 
execute.
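
A sketch of that expectation, with a present-day Python generator
standing in for the uthread (an assumption, since no uthread module
exists to test against). The generator is created in the main thread and
then run to completion by a worker thread; its body executes in whichever
thread resumes it. The second half shows how CPython detects one kind of
misuse today: resuming a generator that is already executing raises
ValueError. The names counter, drain, reentrant and me are hypothetical.

```python
import threading


def counter(n):
    for i in range(n):
        # Record which real thread is running this frame right now.
        yield (i, threading.current_thread().name)


gen = counter(3)                 # created in the main thread

results = []


def drain():
    results.extend(gen)          # resumed, repeatedly, in the worker


t = threading.Thread(target=drain, name="worker")
t.start()
t.join()


def reentrant():
    try:
        next(me)                 # try to resume ourselves while running
    except ValueError as exc:
        yield str(exc)


me = reentrant()
message = next(me)
```

Every entry in results is tagged "worker", not the main thread's name,
and message holds the text of the ValueError raised by the re-entrant
resume.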

Another post brings up GUIs and uthreads. I already expect that I'm 
going to have to dedicate a (real) thread to the GUI and ponder very 
carefully how that thread interacts with others. Of course, that's from 
lessons learned the hard way; but personally I'm not expecting 
uthreads / coroutines to make that any easier.

[About continuations: while I love the fact that Christian has made 
these available for playing, I have so far not found them productive. I 
wrote a simple-minded backtracking parser using them, but found it no 
better than a coroutine-based one. But I am interested in how a *real* 
pervert (e.g., Tim) feels about it - and no "Gee, that sounds like a *good* 
idea, boss", please.]

- Gordon