After a good sleep, I can answer this one by myself.
to make the core interpreter stackless is one thing.
Internals like apply are rather uncomplicated to convert. CallObjectWithKeywords is done.
What I have *no* good solution for is map. Map does an iteration over evaluations and keeps state while it is running. The same applies to reduce, but it seems to be not used so much. Map is.
About stackless map, and this applies to every extension module which *wants* to be stackless. We don't have to enforce everybody to be stackless, but there is a couple of modules which would benefit from it.
The problem with map is, that it needs to keep state, while repeatedly calling objects which might call the interpreter. Even if we kept local variables in the caller's frame, this would still be not stateless. The info that a map is running is sitting on the hardware stack, and that's wrong.
Now a solution. In my last post, I argued that I don't want to replace map by a slower Python function. But that gave me the key idea to solve this:
C functions which cannot tail-recursively unwound to return an executable frame object must instead return themselves as a frame object. That's it! Frames need again to be a little extended. They have to spell their interpreter, which normally is the old eval_code loop.
Anatomy of a standard frame invocation: A new frame is created, parameters are inserted, the frame is returned to the frame dispatcher, which runs the inner eval_code loop until it bails out. On return, special cases of control flow are handled, as there are exception, returning, and now also calling. This is an eval_code frame, since eval_code is its execution handler.
Anatomy of a map frame invocation: Map has several phases. The first phases to argument checking and basic setup. The last phase is iteration over function calls and building the result. This phase must be split off as a second function, eval_map. A new frame is created, with all temporary variables placed there. eval_map is inserted as the execution handler.
Now, I think the analogy is obvious. By building proper frames, it should be possible to turn any extension function into a stackless function.
The overall protocol is: A C function which does a simple computation which cannot cause an interpreter invocation, may simply evaluate and return a value. A C function which might cause an interpreter invocation, should return a freshly created frame as return value. - This can be done either in a tail-recursive fashion, if the last action of the C function would basically be calling the frame. - If no tail-recursion is possible, the function must return a new frame for itself, with an executor for its purpose.
A good stackless candidate is Fredrik's xmlop, which calls back into the interpreter. If that worked without the hardware stack, then we could build ultra-fast XML processors with co-routines!
As a side note: The frame structure which I sketched so far is still made for eval_code in the first place, but it has all necessary flexibilty for pluggable interpreters. An extension module can now create its own frame, with its own execution handler, and throw it back to the frame dispatcher. In other words: People can create extensions and test their own VMs if they want. This was not my primary intent, but comes for free as a consequence of having a stackless map.
ciao - chris