Is Stackless Python DEAD?

Michael Hudson mwh at python.net
Tue Jan 1 14:13:15 EST 2002


Christian Tismer <tismer at tismer.com> writes:

> Well, it is a little late to answer this, but...

Hey, I don't care, I'm just glad to see you're still paying attention
:)

> Michael Hudson wrote:
[...] 
>  >>  * Just to implement map, 150 lines of builtin_map had to be
>  >>    rewritten into 350 lines (builtin_map, make_stub_code,
>  >>    make_map_frame, builtin_map_nr, builtin_map_loop). The author
>  >>    indicates that the same procedure still needs to be done for
>  >>    apply and filter. Just what is the "same procedure"? Isn't there
>  >>    some better way?
>  >>
>  >
>  > This is where implementing stackless in C really, really hurts.
> 
> 
> [great explanation of stackless techniques skipped.]
> 
> I agree this is not easy to understand and to implement.
> I always thought about a framework which would make this
> easier, but I didn't come up with something suitable.

I've had similar thoughts, but likewise fell short.  I think you could
probably do things with m4 that took something readable and spat out
stack-neutral C, but it would be a Major Project.

>  >>- The code adds PREPARE macros into each branch of ceval. Why?
>  >>- It adds a long list of explicitly not-supported opcodes into
>  >>  the ceval switch, instead of using 'default:'. No explanation
>  >>  for that change is given, other than 'unused opcodes go here'.
>  >>  Is it necessary to separately maintain them? Why?
>  >>
>  >
>  > This was an optimization Chris used to try and get back some of the
>  > performance lost during the stackless changes.  IIRC, he handles
>  > exceptions and return values as "pseudo-opcodes" rather than using the
>  > WHY_foo constants the current ceval.c uses.  I never really understood
>  > this part.
> 
> 
> This is really just an optimization.
> The PREPARE macros were used to limit code growth, and to
> give me some more options to play with.
> Finally, the PREPARE macros do an optimized opcode prefetch
> which turns out to be a drastic speedup for the interpreter loop.
> Standard Python does an increment for every byte code and then
> one for the optional argument, and the argument is picked up bytewise.
> What I do is a single add to the program counter, dependent on the
> opcode/argument size, which is computed in the PREPARE macro.
> Then, on Intel machines, I use a short-word access to the argument,
> which gives considerable savings. (Although this wouldn't be
> necessary if the compilers weren't so dumb.)

This could/should be split off from stackless, right?

>  >>It may be that some of these questions can be answered giving a good
>  >>reason for the change, but I doubt that this code can be incorporated
>  >>as-is, just saying "you need all of this for Stackless Python". I
>  >>don't believe you do, but I cannot work it out myself, either.
>  >>
>  >
>  > I think integrating stackless into the core is a fairly huge amount of
>  > work.  I'd like to think I could do it, given several months of
>  > full-time effort (which isn't going to happen).  About the only likely
>  > way I see for it to get in is for it to become important to Zope
>  > Corp. for some reason, and them paying Tim or Guido (or Chris) to do
>  > it.
> 
> 
> I'm at a redesign for Stackless 2.2. I hope to make it simpler,
> split apart Stackless and optimization, 

Ah :)

> and continuations are no longer my primary target, but built-in
> microthreads.

Fair enough.  Glad to hear you've found some time for your baby!

Cheers,
M.
