[Python-Dev] uthread strawman

Guido van Rossum guido@python.org
Tue, 07 Nov 2000 20:34:49 -0500

[Note: the stackless@starship.python.net list seems to have trouble
again.  BeOpen may have crashed the machine again. :-( ]

[Note: response to Christian at end.]

> > 
> > A strawman proposal:
> > 
> > The uthread module provides the new functionality at the lowest level.

Eric (Tiedemann, right?  There are two Erics here -- it would help if
you signed your full name :-) writes:

> I really like this as a primitive appropriate for Python's evolution.

Cool.  I think I spoke too soon when I called it uthreads -- these are
really more like coroutines.

I also forgot to mention that I am assuming with this strawman that we
can use an essentially stackless implementation of the PVM.  I hope
that it will be possible to make it a lot simpler than current
stackless though, by not doing continuations.  Freezing a frame in
place is a lot simpler than freezing for multiple uses, which requires
one to decide what to copy and what to share!

> > - When func() returns, the uthread that was executing it ceases to be
> >   runnable.  The uthread that most recently yielded to it is resumed,
> >   unless that is no longer runnable, in which case the uthread that
> >   most recently yielded to *it* is resumed, and so on until a runnable
> >   uthread is found or until no runnable uthreads are left, in which
> >   case the program terminates.  (XXX I need a proof here that this
> >   works.)
> I'd like it added that when a uthread chains to its yielder it drops
> (i.e., nulls and decrefs) the reference to that yielder.  I want
> uthreads in some of the same applications in which i disable gc for
> real-time purposes, and I don't want circular structures of unrunnable
> uthreads leaking my memory.

Good point.

> > - Calling u.yield() or u.kill() for a non-runnable uthread is an error
> >   and raises an exception.
> A runnable() predicate might be nice.


> > - I'm not sure that I got the start conditions right.  Should func() be
> >   be allowed to run until its first yield() when uthread.new(func) is
> >   called?
> +1 for no on this.

You're being unnecessarily cryptic.  "Yes for no"?  So you're for the
original proposal (which doesn't start func() at all until it is
yielded to for the first time).

> > - I'm not sure that the rules for returning and raising exceptions
> >   from func() are the right ones.
> I'm particularly unsure about the exception propagation.  It could
> always be disabled by a universal exception handler in the uthread,
> but I'm not sure it's even worth the implementation effort.

Agreed.  We may have to experiment.

> > - Should it be possible to pass a value to another uthread by passing
> >   an argument to u.yield(), which then gets returned by the resumed
> >   yield() call in that uthread?
> Certainly! :)

This affects the initial condition.  If u hasn't called func() yet,
and I call u.yield(42), where does the 42 go?  Does it call func(42)?
That may make it make it unnecessarily hard to get the end conditions
right for func().  Again, we'll have to write some sample code to see
how this turns out in practice.

> As written, the strawman seems to require that the dread intervening C
> stack frames are handled transparently (since it doesn't say anything
> about them).  This seems pretty important to me.  An instance method
> may well know that it should yield, yet not know that it's being
> called as a callback from a class/type that's just been moved to C.

Not sure what you meant by intervening.  I certainly intended Python
to Python calls to be handled without creating extra C stack frames.
When Python calls C which calls back into Python, this is considered
all part of the same uthread.

Where it gets tricky is when this spawns a new uthread, which also
calls C which calls Python.  Now the second uthread has a C stack
frame above the C stack frame that's part of the first uthread.  This
means that the second uthread must return from its C code before the
first uthread can return to its C code!  A little extra bookkeeping
will be necessary to check for this -- so that when it is attempted an
exception is raised, rather than a return attempted from the wrong
uthread back into C.  This is the same as for current stackless.  The
solution is simply that the application "shouldn't do that."

> OTOH..not handling this transparently would increase my market value
> as a Python programmer.  Handling it right might get me some unpaid
> work implementing some of the low-level details for Linux.  Hmm! :D

If you want do hack continuations in C, be my guest -- as long as you
stay 10,000 kilometers away from core Python. :-)

[Now replying to Christian:]

> Just answering/clarifying a few bits,
> since I can't change your opinion about
> continuations, anyway.


> Guido van Rossum wrote:
> > 
> > I've thought about it a little more, and drawn some pictures in my
> > head.
> I have to agree with Guido when he says:
> > I still have to disagree with Christian when he says:
> > 
> > > Making Python completely coroutine aware, without
> > > tricking the C stack, is 90 percent of the problem.
> > > But after walking that far, there is no reason
> > > to leave the other 10 percent alone.
> ... since I meant implementability. Of course there are
> other reasons gainst continuations. I just did it since they
> were in reach.

Hm.  Having seen a few fragments of your implementation today (just a
very little bit, since we were having an all-day meeting) I feel that
there are a lot of extra hacks needed to make the reuse of
continuations necessary.  This shouldn't be needed in my version.

> > Without continuations, but with microthreads (uthreads) or coroutines,
> > each (Python) stack frame can simply be "paused" at a specific point
> > and continued later.  The semantics here are completely clear (except
> > perhaps for end cases such as unhandled exceptions and intervening C
> > stack frames).
> I agree. But also with continuations, the situation is identical,
> as long as you don't try anything else where continuations
> would be needed.

But the complexity in the code still exists because a continuation
*could* be reused and you don't know if it will ever happen so you
must be prepared.

> Note that they will not need to be created when the mentioned
> structures are implemented well. We don't have to implement them,
> but providing support for them in the interpreter framework
> is simple. (that's the 10% issue).

Having seen your code (frankly, a mess!) I don't believe that it's
only 10% at all.

> > With continuations, you have to decide how much state to save for a
> > future continuation.  It would seem easy enough: save all state kept
> > in the frame except for the local variables.  But now consider this:
> > the "evaluation stack" contained in each frame could easily be
> > replaced by a bunch of temporary variables, if we had a slightly
> > different instruction set (3-address opcodes instead of stack-based
> > opcodes).  Then would we save those temporary variables or not?  it
> > can make a difference!  Since the "save continuation" operation is a
> > function call, you can easily save a continuation while there are some
> > items on the value stack.  I believe the current implementation saves
> > these so they are restored when you jump to the continuation.  But if
> > the compiler were to use temporary variables instead of the evaluation
> > stack, they might not have been restored!
> I would consider these temporary variables registers which must
> be preserved. They are immutable objects as part of the immutable
> continuation, treated as values. Stack or registers, this is part
> of an expression evaluation. Temporary results must conceptually
> be read-only, whatever way I implement this.

I heard from Tim that he helped you get this right.  The fact that it
is so hard to know the requirements for a practical implementation
makes me very worried that continuations may have hidden bugs.

> > Here's another example.  Suppose you set up a for loop.  After three
> > iterations through the loop you save a continuation.  Then you finish
> > hree more iterations.  Then you return to the saved continuation.
> > Where does the loop continue: at 3 or at 6 iterations?  Try to answer
> > this without trying it.  My guess: it gets restarted at 3 iterations,
> > because the loop index is saved on the value stack.  If you rewrite
> > this using a while loop, however, it would get restarted at 6
> > iterations, because then your loop index is an unsaved local variable.
> > Ditto if you changed the bytecode compiler so for loops use an
> > anonymous local variable instead of an entry on the evaluation
> > stack.
> Wrong guess!
> Exactly for that reason I changed the loop code to put a mutable
> loopcounter object on the stack.
> The loop works perfectly.

Wow.  i'm impressed.  You must have borrowed my time machine. :-)

Still, I believe there was a time when the loop *didn't* work
perfectly yet. It is really hard to know what is needed.  Are you
*sure* that it now *always* does the right thing?  What if I save a
continuation in the middle of a shortcut Boolean expression (and/or
stuff)?  Or in cases like a<b<c?  (Here b gets saved on the stack to
avoid loading it again.)

> > This semantic imprecision is one of the reasons why I don't like the
> > concept of continuations.  (I've been told that the exact semantics of
> > continuations in Scheme differ greatly between Scheme implementations.)
> In a sense, you have continuations already, also with the restriction
> to gen/co/uthread structures. The only difference is to treat a
> frame as exactly one continuation and to disallow to have more
> than one at any time.
> This saves the decision about the ambiguities you mentioned.

Yeah, right.  You can call pausable/resumable frames use-once
continuations if you want to.  And if that gives you the happy feeling
that I "support" continuations, fine.

> I agree that going to this point and not further for the
> Python core is a good idea.
> A PEP doesn't need to name continuations at all.


> On the other hand, I don't see a reason why this hsould mean
> that Python *must not* support them. What I'd like to keep
> is the possibility to still write such an extension module.
> Enabling this for educational purposes is a great value
> that comes at a cheap price and no impact for the core.

I doubt it.  I'm not going to allow any compromises just to make it
easier to reuse continuations.  (Such as using a mutable counter in
the for-loop code.)

> > Now let's look at Jython.  In Jython, we can simulate "paused frames"
> > well enough by using real Java threads.  However full continuations
> > would require modifications to the JVM -- which is unacceptable to a
> > language boasting "100% Pure Java".  Another reason against allowing
> > continuations.
> Not quite true, after I heard of a paper that shows how
> to implement continuations in Java, using threads.
> But I'll come back to that when I have the paper.

I've heard that his margin was too small to contain the proof.  I
expect that it will be a disappointment from a practical point of
view: perhaps he emulates the JVM in Java.

> > So, all in all, I don't think of continuations as "the last 10% that
> > we might as well add to finish the job."  I see it as an ill-specified
> > hypergeneralization.
> Can we agree to not support them without forbidding them?

I won't forbid them, but I won't make compromises to the core PVM that
would make them easier to implement.  Your patch set would still be a
heck of a lot smaller of course.

> ...
> > A strawman proposal:
> Ok, this looks all very well to me. More on that later.
> One question: Why do you want an explicit u.yield() ?
> Uthreads are scheduled automatically now, like real
> threads. Do you see a major drawback in supporting
> this, maybe as an option? Or do you see automatic
> scheduling as an extra construct on top with a special
> "scheduler" uthread?

See my response near the top to Eric about this.  I was thinking of a
lower-level concept, like coroutines.  I might consider automatic
scheduling of uthreads too.  But I've noticed that there are some ugly
hacks in that code, too. :-)

I've lived for years with a system (early Amoeba) that had threads
with only explicit scheduling: other threads would only run when you
were blocked for I/O or waiting for a semaphore.  It made for very
easy coding in some cases, since you didn't need to protect critical
sections with mutexes.  Unles, that is, you invoke stuff that might do
I/O (maybe for debugging :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)