Re: [Python-Dev] How stackless can Python be?

23 May 1999

      After a good sleep, I can answer this one by myself.

I wrote:
...
to make the core interpreter stackless is one thing.
...
Internals like apply are rather uncomplicated to convert.
CallObjectWithKeywords is done.
What I have *no* good solution for is map.
Map does an iteration over evaluations and keeps
state while it is running. The same applies to reduce,
but it seems to be not used so much. Map is.
...
About stackless map,
and this applies to every extension module
which *wants* to be stackless. We don't have to enforce
everybody to be stackless, but there is a couple of
modules which would benefit from it.

The problem with map is, that it needs to keep state,
while repeatedly calling objects which might call
the interpreter. Even if we kept local variables
in the caller's frame, this would still be not
stateless. The info that a map is running is sitting
on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't
want to replace map by a slower Python function. But
that gave me the key idea to solve this:

C functions which cannot tail-recursively unwound to
return an executable frame object must instead return
themselves as a frame object. That's it! Frames need
again to be a little extended. They have to spell their
interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation:
A new frame is created, parameters are inserted,
the frame is returned to the frame dispatcher,
which runs the inner eval_code loop until it bails out.
On return, special cases of control flow are handled,
as there are exception, returning, and now also calling.
This is an eval_code frame, since eval_code is its
execution handler.

Anatomy of a map frame invocation:
Map has several phases. The first phases to
argument checking and basic setup.
The last phase is iteration over function calls
and building the result. This phase must be split
off as a second function, eval_map.
A new frame is created, with all temporary variables
placed there. eval_map is inserted as the execution
handler.

Now, I think the analogy is obvious.
By building proper frames, it should be possible
to turn any extension function into a stackless function.

The overall protocol is:
A C function which does a simple computation which cannot
cause an interpreter invocation, may simply evaluate
and return a value.
A C function which might cause an interpreter invocation,
should return a freshly created frame as return value.
- This can be done either in a tail-recursive fashion,
  if the last action of the C function would basically 
  be calling the frame.
- If no tail-recursion is possible, the function must
  return a new frame for itself, with an executor
  for its purpose.

A good stackless candidate is Fredrik's xmlop, which
calls back into the interpreter. If that worked
without the hardware stack, then we could build
ultra-fast XML processors with co-routines!

As a side note: 
The frame structure which I sketched
so far is still made for eval_code in the first place,
but it has all necessary flexibilty for pluggable
interpreters. An extension module can now create
its own frame, with its own execution handler, and
throw it back to the frame dispatcher.
In other words: People can create extensions and
test their own VMs if they want.
This was not my primary intent, but comes for free
as a consequence of having a stackless map.

ciao - chris

-- 
Christian Tismer             :^)   mailto:tismer@appliedbiometrics.com
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home