[Python-Dev] Stackless 3.0 alpha 1 at blinding speed

Christian Tismer tismer@tismer.com
Thu, 17 Apr 2003 06:40:32 +0200

Dear community, dear Stackless addicts, dear friends,

Ich habe Euch wirklich was zu erzählen, liebe Freunde,

I really have to tell you a story!

During the last four months, I have been struggling with
Stackless Python, and especially with myself and how to
get re-focused on my major project which you know very well.
Some of you might know quite well too how hard this was for me,
especially in the context of my parent's endangered health.
This particular problem seems to be solved,
for the moment, so let's celebrate the moment, celebrate the moment!

Without going into details, I would like to tell you about the
current status of Stackless Python.
In short, like an abstract, Stackless 3.0 is something like an
or-merge of Stackless 1.0 and 2.0 technology.

Guido, Tim, you both will probably remember my lengthy attempts
to introduce those continuations, years ago. You both convinced
me to drop them, and I did what I was supposed to do. I'm hopefully
a proper citizen, right now. Anyway, you know I'll never really be...

After a long period of depression, I re-invented Stackless in early
2002, with a version number of 2.0, denoting that I had dropped all the
1.0 paradigms (namely: (1) try to stay compatible, (2) make minimal
changes only, (3) avoid assembly code entirely).

At the same time, I dismissed all of my Stackless 1.0 code, which was
continuation-based, an absolute no-no in Guido's eyes. I still do think
that TimP wasn't that conformant to this "nono"-statement, after I read
a lot of his comments, especially side-notes on the thread-sig,
but this time Guido's veto was clearly stronger than Tim's arguing,
a thing that doesn't happen so often, but I'm proactively respecting
this, positively.

Now, after all that rubbish, let's go into facts, which are quite


Today, I finished Stackless Python 3.0, alpha 3.0.1!

First of all, I would like to talk about the new principles.
Yes, no, there are no longer continuations in that sense.
I'm meanwhile convinced that we don't want to support them,
any longer, although I'm happy that Stackless allowed me to
learn *all* and much more about them than is available
on the wor(th|ld) w(h)i(d|l)e net!!


Q: What is it about that Stackless 3.0, will this guy never shut up???

A: No, he most probably never will, unless he's dead, and this is
another 40 or more years in advance, for heaven's sake.

Q: So, what is this Stackless 3.0 hype that has been around for months?

A: Simple! Stackless 3.0 has all the hardware switching stuff in it
that Stackless 2.0 had. Stackless 3.0 also incorporates 80% of the
soft switching protocol that Stackless 1.0 had.
But there are a lot of new features:
Stackless has again shown how to marry the impossible with the
unbelievable, and this is the new concept of Stackless 3.0:
There is a merge between (1.0) soft context switching and (2.0)
hard context switching, which always does the most reasonable thing.
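
As a loose analogy only (this is plain Python, not the Stackless API,
and it says nothing about how Stackless itself picks between the two
modes): soft switching means handing control over at a cooperative
point without touching the C stack. A minimal sketch of that idea with
generators, where the names worker and run_round_robin are
hypothetical:

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative task: record one step, then yield control."""
    for i in range(steps):
        log.append((name, i))
        yield  # cooperative switch point: control returns to the scheduler


def run_round_robin(tasks):
    """Resume each task in turn until all are finished.

    Returns the number of times a task yielded and was re-queued.
    """
    queue = deque(tasks)
    switches = 0
    while queue:
        task = queue.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; do not re-queue it
        queue.append(task)
        switches += 1
    return switches


log = []
switches = run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
print(switches)
```

A hard switch, by contrast, has to save and restore part of the C
stack, which cannot be shown in pure Python.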

There are a lot of benefits which stem from this hybrid solution,
which will appear in one of my most recent papers, pretty soon.



Let me simply end this pamphlet with some simple sentences:
Stackless Python is more capable of tasklet switching than any
other light-weight threading software package.
If anyone disagrees, please give me a runnable counter-example.

Here are some impressive site-specific time measurements, which
show in particular that 20,000,000 cframe tasklet switches per
second are really, really hard to beat.

Python on Win32:

D:\slpdev\src\2.2\src\Stackless\test>..\..\pcbuild\python taskspeed.py
10000000 frame switches      took 3.83061 seconds, rate =    2610551/s
10000000 frame softswitches  took 2.40112 seconds, rate =    4164718/s
10000000 cfunction calls     took 2.13033 seconds, rate =    4694098/s
10000000 cframe softswitches took 0.49296 seconds, rate =   20285627/s
10000000 cframe switches     took 1.98907 seconds, rate =    5027486/s
10000000 cframe 100 words    took 3.93737 seconds, rate =    2539768/s
The penalty per stack word is about 0.980 percent of raw switching.
Stack size of initial stub   = 14
Stack size of frame tasklet  = 58
Stack size of cframe tasklet = 35
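
For readers who want to check the arithmetic, here is a small sketch
that recomputes the figures from the Win32 timings above. It is not
taskspeed.py itself. The penalty formula is an assumption: the extra
time of the "100 words" run relative to the plain cframe switch run,
divided by 100 words. It does reproduce the printed 0.980 percent. The
rates differ from the printed ones in the last digits because the
printed timings are rounded to five decimals.

```python
# Timings copied from the Win32 run (seconds per 10,000,000 switches).
SWITCHES = 10_000_000
timings = {
    "frame switches": 3.83061,
    "frame softswitches": 2.40112,
    "cfunction calls": 2.13033,
    "cframe softswitches": 0.49296,
    "cframe switches": 1.98907,
    "cframe 100 words": 3.93737,
}


def rate(seconds, count=SWITCHES):
    """Switches per second for one benchmark line."""
    return count / seconds


def penalty_per_word_percent(base, padded, extra_words=100):
    """Assumed formula: relative slowdown per extra stack word, in percent."""
    return (padded - base) / base / extra_words * 100.0


rates = {name: rate(t) for name, t in timings.items()}
penalty = penalty_per_word_percent(
    timings["cframe switches"], timings["cframe 100 words"]
)
# How much faster a cframe soft switch is than a cframe hard switch.
soft_vs_hard = timings["cframe switches"] / timings["cframe softswitches"]

for name, r in rates.items():
    print(f"{name:<20} {r:>12.0f}/s")
print(f"penalty per stack word: {penalty:.3f} percent")
print(f"cframe soft vs hard:    {soft_vs_hard:.2f}x")
```

On these numbers a cframe soft switch comes out roughly four times
faster than a cframe hard switch.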


Python on Debian

Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/