From tim_one@email.msn.com  Sat May  1 09:32:30 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 1 May 1999 04:32:30 -0400
Subject: [Python-Dev] Speed (was RE: [Python-Dev] More flexible namespaces.)
In-Reply-To: <14121.55659.754846.708467@amarok.cnri.reston.va.us>
Message-ID: <000801be93ad$27772ea0$7a9e2299@tim>

[Andrew M. Kuchling]
> ...
> A performance improvement project would definitely be a good idea
> for 1.6, and a good sub-topic for python-dev.

To the extent that optimization requires uglification, optimization got
pushed beyond Guido's comfort zone back around 1.4 -- little has made it in
since then.

Not griping; I'm just trying to avoid enduring the same discussions for the
third to twelfth times <wink>.

Anywho, on the theory that a sweeping speedup patch has no chance of making
it in regardless, how about focusing on one subsystem?  In my experience,
the speed issue Python gets beat up the most for is the relative slowness of
function calls.  It would be very good if eval_code2 somehow or other could
manage to invoke a Python function without all the hair of a recursive C
call, and I believe Guido intends to move in that direction for Python2
anyway.  This would be a good time to start exploring that seriously.
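
A rough way to see the overhead described above is to time the same
arithmetic inline and behind a function call.  This is only a sketch for
a modern Python using the standard timeit module; the names add_one and
measure and the loop count are illustrative choices, and the absolute
numbers depend entirely on the interpreter version and the machine.

```python
import timeit


def add_one(x):
    # Trivial body, so most of the measured cost is the call itself.
    return x + 1


def measure(number=200_000):
    """Return (inline_seconds, call_seconds) for the same arithmetic."""
    inline = timeit.timeit("y = x + 1", setup="x = 1", number=number)
    called = timeit.timeit(
        "y = add_one(x)",
        setup="x = 1",
        number=number,
        globals={"add_one": add_one},  # make the helper visible to the timer
    )
    return inline, called
```

Calling measure() and comparing the two figures gives a crude estimate of
the per-call cost; it says nothing about where inside the interpreter
that time is spent.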

inspirationally y'rs  - tim




From da@ski.org  Sat May  1 23:15:32 1999
From: da@ski.org (David Ascher)
Date: Sat, 1 May 1999 15:15:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <37296856.5875AAAF@lemburg.com>
Message-ID: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>

> Since you put out the objectives, I'd like to propose a little
> different approach...
> 
> 1. Have eval/exec accept any mapping object as input
> 
> 2. Make those two copy the content of the mapping object into real
>    dictionaries
> 
> 3. Provide a hook into the dictionary implementation that can be
>    used to redirect KeyErrors and use that redirection to forward
>    the request to the original mapping objects
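
The three steps quoted above can be sketched in modern Python, where
dict subclasses have a __missing__ hook that runs instead of raising
KeyError.  That hook did not exist when this thread was written, and the
class name ForwardingDict is made up for illustration.

```python
class ForwardingDict(dict):
    """A real dict that forwards failed lookups to the original mapping."""

    def __init__(self, source):
        # Step 2: copy the mapping's current content into a real dict.
        super().__init__(source.items())
        self._source = source

    def __missing__(self, key):
        # Step 3: called by dict.__getitem__ in place of raising KeyError;
        # the source raises KeyError itself if it has no such key either.
        return self._source[key]


source = {"x": 2}
ns = ForwardingDict(source)
source["y"] = 40                   # added after the copy was taken
result = eval("x + y", {}, ns)     # step 1: eval accepts ns as its locals
```

Here "y" is never copied into ns; the lookup falls through to source
only when the dictionary itself misses.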

Interesting counterproposal.  I'm not sure whether any of the proposals on
the table really do what's needed for e.g. case-insensitive namespace
handling.  I can see how all of the proposals so far allow
case-insensitive reference name handling in the global namespace, but
don't we also need to hook into the local-namespace creation process to
allow case-insensitivity to work throughout? 

--david





From da@ski.org  Sun May  2 16:15:57 1999
From: da@ski.org (David Ascher)
Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9905020810270.152-100000@david.ski.org>

On Sun, 2 May 1999, Mark Hammond wrote:

> > I'm not sure whether any of the
> > proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation
> > process to
> > allow case-insensitivity to work throughout?
> 
> Why not?  I pictured case insensitive namespaces working so that they
> retain the case of the first assignment, but all lookups would be
> case-insensitive.
> 
> Ohh - right!  Python itself would need changing to support this.  I suppose
> that faced with code such as:
> 
> def func():
>   if spam:
>     Spam=1
> 
> Python would generate code that refers to "Spam" as a local, and "spam" as
> a global.
> 
> Is this why you feel it won't work?

I hadn't thought of that, to be truthful, but I think it's more generic.
[FWIW, I never much cared for the tag-variables-at-compile-time
optimization in CPython, and wouldn't miss it if it were lost.]

The point is that if I eval or exec code which calls a function specifying
some strange mapping as the namespaces (global and current-local) I
presumably want to also specify how local namespaces work for the
function calls within that code snippet.  That means that somehow Python
has to know what kind of namespace to use for local environments, and not
use the standard dictionary.  Maybe we can simply have it use a
'.clear()'ed .__copy__ of the specified environment.

  exec 'foo()' in globals(), mylocals

would then call foo and within foo, the local env't would be
mylocals.__copy__.clear().  
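
In today's spelling the statement above is exec("foo()", globals(),
mylocals).  The "cleared copy" idea can be sketched as a helper that
returns an empty namespace of the same concrete type.  This is only an
illustration: the names LoggingNamespace and fresh_namespace_like are
hypothetical, and CPython does not create function locals this way.

```python
import copy


class LoggingNamespace(dict):
    """Hypothetical custom namespace type, used only for illustration."""


def fresh_namespace_like(env):
    """Return an empty namespace of the same type as env; env is untouched."""
    ns = copy.copy(env)   # shallow copy keeps the concrete mapping type
    ns.clear()            # empties only the copy
    return ns


mylocals = LoggingNamespace(a=1)
callee_locals = fresh_namespace_like(mylocals)
```

Copy-then-clear, rather than calling type(env)(), lets the namespace
type carry configuration in its instance state.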

Anyway, something for those-with-the-patches to keep in mind.  

--david




From tismer@appliedbiometrics.com  Sun May  2 14:00:37 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 02 May 1999 15:00:37 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com>


David Ascher wrote:
[Marc:> 
> > Since you put out the objectives, I'd like to propose a little
> > different approach...
> >
> > 1. Have eval/exec accept any mapping object as input
> >
> > 2. Make those two copy the content of the mapping object into real
> >    dictionaries
> >
> > 3. Provide a hook into the dictionary implementation that can be
> >    used to redirect KeyErrors and use that redirection to forward
> >    the request to the original mapping objects

I don't think that this proposal would give so much new
value. Since a mapping can also be implemented in arbitrary
ways, say by functions, a mapping is not necessarily finite
and might not be changeable into a dict.

[David:>
> Interesting counterproposal.  I'm not sure whether any of the proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation process to
> allow case-insensitivity to work throughout?

Case-independent namespaces seem to be a minor point,
nice to have for interfacing to other products, but then,
in a function, I see no benefit in changing the semantics
of function locals? The lookup of foreign symbols would 
always be through a mapping object. If you take COM for 
instance, your access to a COM wrapper for an arbitrary
object would be through properties of this object. After
assignment to a local function variable, why should we
support case-insensitivity at all?

I would think mapping objects would be a great 
simplification of lazy imports in COM, where
we would like to avoid importing really huge
namespaces in one big slurp. Also the wrapper code
could be made quite a lot easier and faster without
so much getattr/setattr trapping.

Does btw. anybody really want to see case-insensitivity
in Python programs? I'm quite happy with it as it is,
and I would even force the user to always use the same
case style after he has touched an external property
once. Example for Excel: You may write "xl.workbooks"
in lowercase, but then you have to stay with it.
This would keep Python source clean for, say, PyLint.

my 0.02 Euro - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From MHammond@skippinet.com.au  Sun May  2 00:28:11 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Sun, 2 May 1999 09:28:11 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat>

> I'm not sure whether any of the
> proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation
> process to
> allow case-insensitivity to work throughout?

Why not?  I pictured case insensitive namespaces working so that they
retain the case of the first assignment, but all lookups would be
case-insensitive.
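
A minimal sketch of that behaviour, assuming a modern Python in which
exec and eval accept any mapping as the locals argument.  The class name
CaseInsensitiveNamespace is invented for this example.

```python
from collections.abc import MutableMapping


class CaseInsensitiveNamespace(MutableMapping):
    """Lookups ignore case; the spelling of the first assignment is kept."""

    def __init__(self):
        self._data = {}   # folded name -> (original spelling, value)

    def __getitem__(self, name):
        return self._data[name.lower()][1]

    def __setitem__(self, name, value):
        folded = name.lower()
        # Keep the spelling used the first time the name was bound.
        original = self._data[folded][0] if folded in self._data else name
        self._data[folded] = (original, value)

    def __delitem__(self, name):
        del self._data[name.lower()]

    def __iter__(self):
        return (original for original, _ in self._data.values())

    def __len__(self):
        return len(self._data)


ns = CaseInsensitiveNamespace()
exec("Spam = 1", {}, ns)
exec("SPAM = 2", {}, ns)            # rebinds the same entry
total = eval("spam + sPaM", {}, ns)
```

This only covers names resolved through the mapping passed in; locals
inside a called function never consult it.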

Ohh - right!  Python itself would need changing to support this.  I suppose
that faced with code such as:

def func():
  if spam:
    Spam=1

Python would generate code that refers to "Spam" as a local, and "spam" as
a global.
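
The compiler's decision can be inspected on the function's code object.
The sketch uses the modern attribute name __code__ (in Python 1.5 it was
func_code); the function is never called, so the undefined global does
not matter.

```python
def func():
    if spam:        # only read here, so compiled as a global lookup
        Spam = 1    # assigned here, so compiled as a function local


local_names = func.__code__.co_varnames    # names held in fast local slots
global_names = func.__code__.co_names      # names looked up at run time
```

local_names contains only "Spam" and global_names only "spam", so the
two spellings are bound to different scopes before any namespace object
is consulted.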

Is this why you feel it won't work?

Mark.



From mal@lemburg.com  Sun May  2 20:24:54 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 02 May 1999 21:24:54 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org> <372C4C75.5B7CCAC8@appliedbiometrics.com>
Message-ID: <372CA686.215D71DF@lemburg.com>

Christian Tismer wrote:
> 
> David Ascher wrote:
> [Marc:>
> > > Since you put out the objectives, I'd like to propose a little
> > > different approach...
> > >
> > > 1. Have eval/exec accept any mapping object as input
> > >
> > > 2. Make those two copy the content of the mapping object into real
> > >    dictionaries
> > >
> > > 3. Provide a hook into the dictionary implementation that can be
> > >    used to redirect KeyErrors and use that redirection to forward
> > >    the request to the original mapping objects
> 
> I don't think that this proposal would give so much new
> value. Since a mapping can also be implemented in arbitrary
> ways, say by functions, a mapping is not necessarily finite
> and might not be changeable into a dict.

[Disclaimer: I'm not really keen on having the possibility of
 letting code execute in arbitrary namespace objects... it would
 make code optimizations even less manageable.]

You can easily support infinite mappings by wrapping the
function into an object which returns an empty list
for .items() and then use the hook mentioned in 3 to
redirect the lookup to that function.

The proposal allows one to use such a proxy to simulate any
kind of mapping -- it works much like the __getattr__ hook
provided for instances.
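
As a sketch of that proxy idea in modern Python: the dictionary itself
stays empty, so items() has nothing to report, while every lookup is
redirected to a function.  The __missing__ hook stands in for the hook
proposed in point 3, and FunctionNamespace is a hypothetical name.

```python
class FunctionNamespace(dict):
    """Empty as far as items() is concerned, yet answers every lookup."""

    def __init__(self, func):
        super().__init__()
        self._func = func

    def __missing__(self, key):
        # Redirect the would-be KeyError to the wrapped function.
        return self._func(key)


name_lengths = FunctionNamespace(len)      # maps any name to its length
visible = list(name_lengths.items())
total = eval("alpha + be", {}, name_lengths)
```

Because every name resolves, such a namespace also shadows the builtins
when used as locals, which is one reason to apply it sparingly.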
 
> [David:>
> > Interesting counterproposal.  I'm not sure whether any of the proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation process to
> > allow case-insensitivity to work throughout?
> 
> Case-independent namespaces seem to be a minor point,
> nice to have for interfacing to other products, but then,
> in a function, I see no benefit in changing the semantics
> of function locals? The lookup of foreign symbols would
> always be through a mapping object. If you take COM for
> instance, your access to a COM wrapper for an arbitrary
> object would be through properties of this object. After
> assignment to a local function variable, why should we
> support case-insensitivity at all?
>
> I would think mapping objects would be a great
> simplification of lazy imports in COM, where
> we would like to avoid importing really huge
> namespaces in one big slurp. Also the wrapper code
> could be made quite a lot easier and faster without
> so much getattr/setattr trapping.

What do lazy imports have to do with case [in]sensitive
namespaces ? Anyway, how about a simple lazy import
mechanism in the standard distribution, i.e. why not make
all imports lazy ? Since modules are first class objects
this should be easy to implement...
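
For reference, later Python versions (3.5 and up) ship
importlib.util.LazyLoader, which defers executing a module until one of
its attributes is first touched.  The sketch below follows the recipe in
the importlib documentation; the choice of colorsys is arbitrary.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # sets up the deferral, does not run the body
    return module


colorsys = lazy_import("colorsys")
hsv = colorsys.rgb_to_hsv(1, 0, 0)   # the real import happens here
```

It is opt-in per module rather than the default, partly because import
errors then surface at first use instead of at the import statement.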
 
> Does btw. anybody really want to see case-insensitivity
> in Python programs? I'm quite happy with it as it is,
> and I would even force the user to always use the same
> case style after he has touched an external property
> once. Example for Excel: You may write "xl.workbooks"
> in lowercase, but then you have to stay with it.
> This would keep Python source clean for, say, PyLint.

"No" and "me too" ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 243 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/




From MHammond@skippinet.com.au  Mon May  3 01:52:41 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Mon, 3 May 1999 10:52:41 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <372CA686.215D71DF@lemburg.com>
Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat>

[Marc]
> [Disclaimer: I'm not really keen on having the possibility of
>  letting code execute in arbitrary namespace objects... it would
>  make code optimizations even less manageable.]

Good point - although surely that would simply mean (certain) optimisations
can't be performed for code executing in that environment?  How to detect
this at "optimization time" may be a little difficult :-)

However, this is the primary purpose of this thread - to work out _if_ it is
a good idea, as much as working out _how_ to do it :-)

> The proposal allows one to use such a proxy to simulate any
> kind of mapping -- it works much like the __getattr__ hook
> provided for instances.

My only problem with Marc's proposal is that there already _is_ an
established mapping protocol, and this doesn't use it; instead it invents a
new one with the benefit being potentially less code breakage.

And without attempting to sound flippant, I wonder how many extension
modules will be affected?  Module init code certainly assumes the module
__dict__ is a dictionary, but none of my code assumes anything about other
namespaces.  Marc's extensions may be a special case, as AFAIK they inject
objects into other dictionaries (ie, new builtins?).  Again, not trying to
downplay this too much, but if it is only a problem for Marc's more
esoteric extensions, I don't feel that should hold up an otherwise solid
proposal.

[Chris, I think?]
> > Case-independent namespaces seem to be a minor point,
> > nice to have for interfacing to other products, but then,
> > in a function, I see no benefit in changing the semantics
> > of function locals? The lookup of foreign symbols would

I disagree here.  Consider Alice, and similar projects, where an (arguably
misplaced, but nonetheless) requirement is that the embedded language be
case-insensitive.  Period.  The Alice people are somewhat special in that
they had the resources to change the interpreter's guts.  Most people won't,
and will look for a different language to embed.

Of course, I agree with you for the specific cases you are talking about - COM,
Active Scripting etc.  Indeed, everything I would use this for would prefer
to keep the local function semantics identical.

> > Does btw. anybody really want to see case-insensitivity
> > in Python programs? I'm quite happy with it as it is,
> > and I would even force the user to always use the same
> > case style after he has touched an external property
> > once. Example for Excel: You may write "xl.workbooks"
> > in lowercase, but then you have to stay with it.
> > This would keep Python source clean for, say, PyLint.
>
> "No" and "me too" ;-)

I think we are missing the point a little.  If we focus on COM, we may come
up with a different answer.  Indeed, if we are to focus on COM integration
with Python, there are other areas I would prefer to start with :-)

IMO, we should attempt to come up with a more flexible namespace mechanism
that is in the style of Python, and will not noticeably slow down Python.
Then COM etc can take advantage of it - much in the same way that Python's
existing namespace model existed pre-COM, and COM had to take advantage of
what it could!

Of course, a key indicator of the likely success is how well COM _can_ take
advantage of it, and how much Alice could have taken advantage of it - I
can't think of any other yardsticks?

Mark.



From mal@lemburg.com  Mon May  3 08:56:53 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 03 May 1999 09:56:53 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <372D56C5.4738DE3D@lemburg.com>

Mark Hammond wrote:
> 
> [Marc]
> > [Disclaimer: I'm not really keen on having the possibility of
> >  letting code execute in arbitrary namespace objects... it would
> >  make code optimizations even less manageable.]
> 
> Good point - although surely that would simply mean (certain) optimisations
> can't be performed for code executing in that environment?  How to detect
> this at "optimization time" may be a little difficult :-)
> 
> However, this is the primary purpose of this thread - to work out _if_ it is
> a good idea, as much as working out _how_ to do it :-)
> 
> > The proposal allows one to use such a proxy to simulate any
> > kind of mapping -- it works much like the __getattr__ hook
> > provided for instances.
> 
> My only problem with Marc's proposal is that there already _is_ an
> established mapping protocol, and this doesn't use it; instead it invents a
> new one with the benefit being potentially less code breakage.

...and that's the key point: you get the intended features and
the core code will not have to be changed in significant ways.
Basically, I think these kinds of core extensions should be done
in generic ways, e.g. by letting the eval/exec machinery accept
subclasses of dictionaries, rather than trying to raise the
abstraction level used and slowing things down in general
just to be able to use the feature on very few occasions.

> And without attempting to sound flippant, I wonder how many extension
> modules will be affected?  Module init code certainly assumes the module
> __dict__ is a dictionary, but none of my code assumes anything about other
> namespaces.  Marc's extensions may be a special case, as AFAIK they inject
> objects into other dictionaries (ie, new builtins?).  Again, not trying to
> downplay this too much, but if it is only a problem for Marc's more
> esoteric extensions, I don't feel that should hold up an otherwise solid
> proposal.

My mxTools extension does the assignment in Python, so it wouldn't
be affected. The others only do the usual modinit() stuff.

Before going any further on this thread we may have to ponder a little
more on the objectives that we have. If it's only case-insensitive
lookups then I guess a simple compile time switch exchanging the
implementations of string hash and compare functions would do the
trick. If we're after doing wild things like lookups across
networks, then a more specific approach is needed.
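
The "exchange hash and compare" idea can be imitated at the Python level
with a key type whose hash and equality fold case.  This is only an
analogy for the C-level switch suggested above, not how such a switch
would be implemented; the class name FoldedName is invented, and only
keys wrapped in it are affected.

```python
class FoldedName(str):
    """A str whose hash and equality ignore case (illustrative only)."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other):
        # str defines its own __ne__, so it has to be overridden as well.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


table = {FoldedName("Spam"): 1}
found = table[FoldedName("SPAM")]   # same hash, equal under folded compare
```

Since the hash and the comparison are swapped together, the dictionary
implementation itself does not need to change.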

So what is it that we want in 1.6 ?

> [Chris, I think?]
> > > Case-independent namespaces seem to be a minor point,
> > > nice to have for interfacing to other products, but then,
> > > in a function, I see no benefit in changing the semantics
> > > of function locals? The lookup of foreign symbols would
> 
> I disagree here.  Consider Alice, and similar projects, where an (arguably
> misplaced, but nonetheless) requirement is that the embedded language be
> case-insensitive.  Period.  The Alice people are somewhat special in that
> they had the resources to change the interpreter's guts.  Most people won't,
> and will look for a different language to embed.
> 
> Of course, I agree with you for the specific cases you are talking about - COM,
> Active Scripting etc.  Indeed, everything I would use this for would prefer
> to keep the local function semantics identical.

As I understand the needs in COM and AS you are talking about
object attributes, right ? Making these case-insensitive is
a job for a proxy or a __getattr__ hack.
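
A minimal sketch of such a proxy, assuming plain attribute reads are all
that is needed.  The Workbook class is a stand-in for a COM object and
CaseInsensitiveProxy is a hypothetical name; neither is a real win32com
API.

```python
class CaseInsensitiveProxy:
    """Forward attribute reads to a wrapped object, ignoring case."""

    def __init__(self, target):
        self._target = target
        # Map folded attribute names to their real spelling once, up front.
        self._names = {name.lower(): name for name in dir(target)}

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for forwarded names.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            real = self._names[name.lower()]
        except KeyError:
            raise AttributeError(name) from None
        return getattr(self._target, real)


class Workbook:
    """Stand-in for an external object with mixed-case properties."""

    def __init__(self):
        self.ActiveSheet = "Sheet1"


xl = CaseInsensitiveProxy(Workbook())
sheet = xl.activesheet
```

The name table is built once from dir(), so attributes added to the
target later are not seen; a real wrapper would have to decide how to
refresh it.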
 
> > > Does btw. anybody really want to see case-insensitivity
> > > in Python programs? I'm quite happy with it as it is,
> > > and I would even force the user to always use the same
> > > case style after he has touched an external property
> > > once. Example for Excel: You may write "xl.workbooks"
> > > in lowercase, but then you have to stay with it.
> > > This would keep Python source clean for, say, PyLint.
> >
> > "No" and "me too" ;-)
> 
> I think we are missing the point a little.  If we focus on COM, we may come
> up with a different answer.  Indeed, if we are to focus on COM integration
> with Python, there are other areas I would prefer to start with :-)
> 
> IMO, we should attempt to come up with a more flexible namespace mechanism
> that is in the style of Python, and will not noticeably slow down Python.
> Then COM etc can take advantage of it - much in the same way that Python's
> existing namespace model existed pre-COM, and COM had to take advantage of
> what it could!
> 
> Of course, a key indicator of the likely success is how well COM _can_ take
> advantage of it, and how much Alice could have taken advantage of it - I
> can't think of any other yardsticks?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 242 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/




From fredrik@pythonware.com  Mon May  3 15:01:10 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 16:01:10 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com>

scriptics is positioning tcl as a perl killer:

    http://www.scriptics.com/scripting/perl.html

afaict, unicode and event handling are the two
main thingies missing from python 1.5.

-- unicode: is on its way.

-- event handling: asynclib/asynchat provides an
awesome framework for event-driven socket
programming.  however, Python still lacks good
cross-platform support for event-driven access to
files and pipes.  are threads good enough, or would
it be cool to have something similar to Tcl's
fileevent stuff in Python?
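
for comparison, a fileevent-style "call me when this is readable" loop
can be sketched with the selectors module found in later Python versions
(3.4 and up).  the callback and helper names are made up for this
example.  a socket pair is used because select() on Windows only works
for sockets, which is the same files-and-pipes gap raised above.

```python
import selectors
import socket

received = []


def on_readable(conn):
    # Callback attached to the socket, in the spirit of Tcl's fileevent.
    received.append(conn.recv(1024))


def run_once(sel, timeout=1.0):
    """Dispatch the registered callback for every object that is ready."""
    for key, _events in sel.select(timeout):
        callback = key.data
        callback(key.fileobj)


sel = selectors.DefaultSelector()
reader, writer = socket.socketpair()
sel.register(reader, selectors.EVENT_READ, on_readable)

writer.sendall(b"ping")
run_once(sel)

sel.unregister(reader)
reader.close()
writer.close()
sel.close()
```

the callback rides along as the data argument of register(), so the
loop needs no separate table of handlers.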

-- regexps: has anyone compared the new
unicode-aware regexp package in Tcl with pcre?

comments?

</F>

btw, the rebol folks have reached 2.0:
    http://www.rebol.com/

maybe 1.6 should be renamed to Python 6.0?



From akuchlin@cnri.reston.va.us  Mon May  3 16:14:15 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:14:15 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
 <005b01be956d$66d48450$f29b12c2@pythonware.com>
Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us>

Fredrik Lundh writes:
>-- regexps: has anyone compared the new
>unicode-aware regexp package in Tcl with pcre?

	I looked at it a bit when Tcl 8.1 was in beta; it derives from
Henry Spencer's 1998-vintage code, which seems to try to do a lot of
optimization and analysis.  It may even compile DFAs instead of NFAs
when possible, though it's hard for me to be sure.  This might give it
a substantial speed advantage over engines that do less analysis, but
I haven't benchmarked it.  The code is easy to read, but difficult to
understand because the theory underlying the analysis isn't explained
in the comments; one feels there should be an accompanying paper to
explain how everything works, and it's why I'm not sure if it really
is producing DFAs for some expressions.

	Tcl seems to represent everything as UTF-8 internally, so
there's only one regex engine; there's .  The code is scattered over
more files:

amarok generic>ls re*.[ch]
regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

	This would be an issue for using it with Python, since all
these files would wind up scattered around the Modules directory.  For
comparison, pypcre.c is around 4700 lines of code.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Things need not have happened to be true. Tales and dreams are the
shadow-truths that will endure when mere facts are dust and ashes, and forgot.
    -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_



From guido@CNRI.Reston.VA.US  Mon May  3 16:32:09 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 11:32:09 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT."
 <14125.47524.196878.583460@amarok.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com>
 <14125.47524.196878.583460@amarok.cnri.reston.va.us>
Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us>

> 	I looked at it a bit when Tcl 8.1 was in beta; it derives from
> Henry Spencer's 1998-vintage code, which seems to try to do a lot of
> optimization and analysis.  It may even compile DFAs instead of NFAs
> when possible, though it's hard for me to be sure.  This might give it
> a substantial speed advantage over engines that do less analysis, but
> I haven't benchmarked it.  The code is easy to read, but difficult to
> understand because the theory underlying the analysis isn't explained
> in the comments; one feels there should be an accompanying paper to
> explain how everything works, and it's why I'm not sure if it really
> is producing DFAs for some expressions.
> 
> 	Tcl seems to represent everything as UTF-8 internally, so
> there's only one regex engine; there's .

Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
point the regex engine was compiled twice, once for 8-bit chars and
once for 16-bit chars.  But this may have changed.

I've noticed that Perl is taking the same position (everything is
UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
from 8-bit bytes.  Python is currently in the Java camp.  This might
be a good time to make sure that we're still convinced that this is
the right thing to do!

> The code is scattered over
> more files:
> 
> amarok generic>ls re*.[ch]
> regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
> regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
> regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
> 
> 	This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory.  For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way.  Perhaps a more
interesting question is whether it is Perl5 compatible.  I contacted
Henry Spencer at the time and he was willing to let us use his code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin@cnri.reston.va.us  Mon May  3 16:56:46 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:56:46 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
 <005b01be956d$66d48450$f29b12c2@pythonware.com>
 <14125.47524.196878.583460@amarok.cnri.reston.va.us>
 <199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
>point the regex engine was compiled twice, once for 8-bit chars and
>once for 16-bit chars.  But this may have changed.

	It doesn't seem to currently; the code in tclRegexp.c looks
like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets.
     */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);
    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

	ISTR the Spencer engine does, however, define a small and
large representation for NFAs and have two versions of the engine, one
for each representation.  Perhaps that's what you're thinking of.

>I've noticed that Perl is taking the same position (everything is
>UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
>from 8-bit bytes.  Python is currently in the Java camp.  This might
>be a good time to make sure that we're still convinced that this is
>the right thing to do!

	I don't know.  There's certainly the fundamental dichotomy
that strings are sometimes used to represent characters, where
changing encodings on input and output is reasonable, and sometimes
used to hold chunks of binary data, where any changes are incorrect.
Perhaps Paul Prescod is right, and we should try to get some other
data type (array.array()) for holding binary data, as distinct from
strings.
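
	The dichotomy is easy to demonstrate in a modern Python, where
text (str) and binary data (bytes, or array.array of unsigned bytes) are
separate types.  This is only an illustration of the point above; the
sample values are arbitrary.

```python
import array

text = "na\u00efve caf\u00e9"

# Text: changing the encoding changes the bytes, not the characters.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")
same_text = as_utf8.decode("utf-8") == as_latin1.decode("latin-1") == text

# Binary: the bytes *are* the data, so "re-encoding" them corrupts it.
blob = array.array("B", [0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])
raw = blob.tobytes()
mangled = raw.decode("latin-1").encode("utf-8")
```

	The two encodings of the text differ byte for byte yet decode to
the same characters, while the "converted" binary chunk has grown by two
bytes and no longer matches the original.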

>I'm sure that if it's good code, we'll find a way.  Perhaps a more
>interesting question is whether it is Perl5 compatible.  I contacted
>Henry Spencer at the time and he was willing to let us use his code.

	Mostly Perl-compatible, though it doesn't look like the 5.005
features are there, and I haven't checked for every single 5.004
feature.  Adding missing features might be problematic, because I
don't really understand what the code is doing at a high level.  Also,
is there a user community for this code?  Do any other projects use
it?  Philip Hazel has been quite helpful with PCRE, an important thing
when making modifications to the code.
 
	Should I make a point of looking at what using the Spencer
engine would entail?  It might not be too difficult (an evening or
two, maybe?) to write a re.py that sat on top of the Spencer code;
that would at least let us do some benchmarking.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In Einstein's theory of relativity the observer is a man who sets out in quest
of truth armed with a measuring-rod. In quantum theory he sets out with a
sieve.
    -- Sir Arthur Eddington




From guido@CNRI.Reston.VA.US  Mon May  3 17:02:22 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 12:02:22 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
 <14125.49911.982236.754340@amarok.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>
 <14125.49911.982236.754340@amarok.cnri.reston.va.us>
Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us>

> 	Should I make a point of looking at what using the Spencer
> engine would entail?  It might not be too difficult (an evening or
> two, maybe?) to write a re.py that sat on top of the Spencer code;
> that would at least let us do some benchmarking.

Surely this would be more helpful than weeks of speculative emails --
go for it!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Mon May  3 18:10:55 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:10:55 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us>
Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com>

> Also, is there a user community for this code?

how about comp.lang.tcl ;-)

</F>



From fredrik@pythonware.com  Mon May  3 18:15:00 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:15:00 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>             <14125.49911.982236.754340@amarok.cnri.reston.va.us>  <199905031602.MAA05829@eric.cnri.reston.va.us>
Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com>

talking about regexps, here's another thing that
would be quite nice to have in 1.6 (available from
the Python level, that is).  or is it already in there
somewhere?

</F>

...

http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873

Tcl 8.1b3 Request:  Generated by Scriptics' bug entry form at

Submitted by:  Frederic BONNET
OperatingSystem:  Windows 98
CustomShell:  Applied patch to the regexp engine (the exec part)
Synopsis:  regexp improvements

DesiredBehavior:
    As previously requested by Don Libes:
    
    > I see no way for Tcl_RegExpExec to indicate "could match" meaning
    > "could match if more characters arrive that were suitable for a
    > match".  This is required for a class of applications involving
    > matching on a stream required by Expect's interact command.  Henry
    > assured me that this facility would be in the engine (I'm not the only
    > one that needs it).  Note that it is not sufficient to add one more
    > return value to Tcl_RegExpExec (i.e., 2) because one needs to know
    > both if something matches now and can match later.  I recommend
    > another argument (canMatch *int) be added to Tcl_RegExpExec.

/patch info follows/

...



From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Mon May  3 23:28:23 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Mon, 3 May 1999 18:28:23 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us>

I've been using Jitterbug for a couple of weeks now as my bug database
for Mailman and JPython.  So it was easy enough for me to set up a
database for Python bug reports.  Guido is in the process of tailoring 
the Jitterbug web interface to his liking and will announce it to the
appropriate forums when he's ready.

In the meantime, I've created YAML that you might be interested in.
All bug reports entered into Jitterbug will be forwarded to
python-bugs-list@python.org.  You are invited to subscribe to the list 
by visiting

    http://www.python.org/mailman/listinfo/python-bugs-list

Enjoy,
-Barry


From jeremy@cnri.reston.va.us  Mon May  3 23:30:10 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Mon,  3 May 1999 18:30:10 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
References: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>

Pretty low volume list, eh?


From MHammond@skippinet.com.au  Tue May  4 00:28:39 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Tue, 4 May 1999 09:28:39 +1000
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>
Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat>

ha - we wish.  More likely to be full of detailed bug reports about how 1/2
!= 0.5, or that "def foo(baz=[])" is buggy, etc :-)

Mark.

> Pretty low volume list, eh?



From tim_one@email.msn.com  Tue May  4 06:16:17 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:16:17 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <000701be95ed$3d594180$dca22299@tim>

[Guido & Andrew on Tcl's new regexp code]
> I'm sure that if it's good code, we'll find a way.  Perhaps a more
> interesting question is whether it is Perl5 compatible.  I contacted
> Henry Spencer at the time and he was willing to let us use his code.

Haven't looked at the code, but did read the manpage just now:

    http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm

WRT Perl5 compatibility, it sez:

    Incompatibilities of note include `\b', `\B', the lack of special
    treatment for a trailing newline, the addition of complemented
    bracket expressions to the things affected by newline-sensitive
    matching, the restrictions on parentheses and back references in
    lookahead constraints, and the longest/shortest-match (rather than
    first-match) matching semantics.

So some gratuitous differences, and maybe a killer:  Guido hasn't had much
kind to say about "longest" (aka POSIX) matching semantics.  An example from
the page:

    (week|wee)(night|knights)
    matches all ten characters of `weeknights'

which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
'night'.
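
A minimal sketch of the difference, assuming a present-day Python `re`
module (which keeps the same first-match alternation semantics described
here).  Swapping the order of the alternatives is only a way to show
what a longest-match engine would report; it is not how the Tcl package
works internally.

```python
import re

# First-match (Perl/Python) semantics: the leftmost alternative that lets
# the overall match succeed wins, even if a later one would match more text.
m = re.match(r"(week|wee)(night|knights)", "weeknights")
first_match_groups = m.groups()   # the alternatives chosen for each group
first_match_text = m.group(0)     # stops one character short of the input

# Reordering the alternatives by hand makes the same engine find the
# ten-character match that a longest-match engine reports regardless of order.
m2 = re.match(r"(wee|week)(knights|night)", "weeknights")
reordered_groups = m2.groups()
reordered_text = m2.group(0)

print(first_match_groups, first_match_text)
print(reordered_groups, reordered_text)
```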

It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
is correct; indeed, it's a pain to get that behavior any other way!

otoh-it's-potentially-very-much-faster-ly y'rs  - tim




From tim_one@email.msn.com  Tue May  4 06:51:01 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:51:01 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <000701be95ed$3d594180$dca22299@tim>
Message-ID: <000901be95f2$195556c0$dca22299@tim>

[Tim]
> ...
> It's the *natural* semantics if Andrew's suspicion that it's
> compiling a DFA is correct ...

More from the man page:

    AREs report the longest/shortest match for the RE, rather than
    the first found in a specified search order. This may affect some
    RREs which were written in the expectation that the first match
    would be reported. (The careful crafting of RREs to optimize the
    search order for fast matching is obsolete (AREs examine all possible
    matches in parallel, and their performance is largely insensitive to
    their complexity) but cases where the search order was exploited to
    deliberately find a match which was not the longest/shortest will
    need rewriting.)

Nails it, yes?  Now, in 10 seconds, try to remember a regexp where this
really matters <wink>.

Note in passing that IDLE's colorizer regexp *needs* to search for
triple-quoted strings before single-quoted ones, else the P/P semantics
would consider """ to be an empty single-quoted string followed by a double
quote.  This isn't a case where it matters in a bad way, though!  The
"longest" rule picks the correct alternative regardless of the order in
which they're written.
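
A small sketch of the ordering problem, assuming a present-day Python
`re` module.  The two patterns below are simplified stand-ins made up
for illustration; they are not IDLE's actual colorizer regexp.

```python
import re

# Simplified string-literal patterns (illustrative only, not IDLE's own).
SINGLE = r'"[^"\n]*"'          # "..." on one line
TRIPLE = r'"""[\s\S]*?"""'     # """...""" possibly spanning lines

SINGLE_FIRST = re.compile(SINGLE + "|" + TRIPLE)
TRIPLE_FIRST = re.compile(TRIPLE + "|" + SINGLE)

text = '"""doc"""'

# With first-match semantics the order of alternatives decides the result:
# the single-quote alternative sees an empty string "" and stops there.
single_first_hit = SINGLE_FIRST.match(text).group(0)

# Listing the triple-quoted alternative first gives the intended token.
triple_first_hit = TRIPLE_FIRST.match(text).group(0)

print(repr(single_first_hit), repr(triple_first_hit))
```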

at-least-in-that-specific-regex<0.1-wink>-ly y'rs  - tim




From guido@CNRI.Reston.VA.US  Tue May  4 13:26:04 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 08:26:04 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT."
 <000701be95ed$3d594180$dca22299@tim>
References: <000701be95ed$3d594180$dca22299@tim>
Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us>

[Tim]
> So some gratuitous differences, and maybe a killer:  Guido hasn't had much
> kind to say about "longest" (aka POSIX) matching semantics.
> 
> An example from the page:
> 
>     (week|wee)(night|knights)
>     matches all ten characters of `weeknights'
> 
> which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
> 'night'.
> 
> It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
> is correct; indeed, it's a pain to get that behavior any other way!

Possibly contradicting what I once said about DFAs (I have no idea
what I said any more :-): I think we shouldn't be hung up about the
subtleties of DFA vs. NFA; for most people, the Perl-compatibility
simply means that they can use the same metacharacters.  My guess is
that people don't so much translate long Perl regexp's to Python but
simply transport their (always incomplete -- Larry Wall *wants* it
that way :-) knowledge of Perl regexps to Python.  My meta-guess is
that this is also Henry Spencer's and John Ousterhout's guess.  As for
Larry Wall, I guess he really doesn't care :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@cnri.reston.va.us  Tue May  4 17:14:41 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Tue,  4 May 1999 12:14:41 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
References: <000701be95ed$3d594180$dca22299@tim>
 <199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Possibly contradicting what I once said about DFAs (I have no idea
>what I said any more :-): I think we shouldn't be hung up about the
>subtleties of DFA vs. NFA; for most people, the Perl-compatibility
>simply means that they can use the same metacharacters.  My guess is

	I don't like slipping in such a change to the semantics with
no visible change to the module name or interface.  On the other hand,
if it's not NFA-based, then it can provide POSIX semantics without
danger of taking exponential time to determine the longest match.
BTW, there's an interesting reference, I assume to this code, in
_Mastering Regular Expressions_; Spencer is quoted on page 121 as
saying it's "at worst quadratic in text size.".

	Anyway, we can let it slide until a Python interface gets written.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In the black shadow of the Baba Yaga babies screamed and mothers miscarried;
milk soured and men went mad.
    -- In SANDMAN #38: "The Hunt"



From guido@CNRI.Reston.VA.US  Tue May  4 17:19:06 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 12:19:06 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT."
 <14127.6410.646122.342115@amarok.cnri.reston.va.us>
References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us>
 <14127.6410.646122.342115@amarok.cnri.reston.va.us>
Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us>

> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

Not sure if that was the same code -- this is *new* code, not
Spencer's old code.  I think Friedl's book is older than the current
code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Wed May  5 06:37:02 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 01:37:02 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <000701be96b9$4e434460$799e2299@tim>

I've consistently found that the best way to kill a thread is to rename it
accurately <wink>.

Agree w/ Guido that few people really care about the differing semantics.

Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage
anyway:  code will definitely break.  Like

    \b(?:
        (?P<keyword>and|if|else|...) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b

The (special)|(general) idiom relies on left-to-right match-and-out
searching of alternatives to do its job correctly.  Not to mention that \b
is not a word-boundary assertion in the new pkg (talk about pointlessly
irritating differences!  at least this one could be easily hidden via
brainless preprocessing).
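
The order dependence of that idiom can be sketched as follows, assuming
a present-day Python `re` module.  The keyword list is cut down to three
entries, and the `SWAPPED` pattern and `classify` helper are names made
up for this illustration.

```python
import re

# The (special)|(general) idiom: keywords are listed before identifiers.
TOKEN = re.compile(r"""
    \b(?:
        (?P<keyword>and|if|else) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b
""", re.VERBOSE)

# Same alternatives, general case first.
SWAPPED = re.compile(r"""
    \b(?:
        (?P<identifier>[a-zA-Z_]\w*) |
        (?P<keyword>and|if|else)
    )\b
""", re.VERBOSE)

def classify(pattern, word):
    """Return the name of the group that matched `word`."""
    return pattern.match(word).lastgroup

# Left-to-right searching reports "if" as a keyword only when the keyword
# alternative comes first; "iffy" falls through to the identifier branch
# because \b fails after the leading "if".
print(classify(TOKEN, "if"), classify(TOKEN, "iffy"), classify(SWAPPED, "if"))
```

Both alternatives match "if" with the same length, so an engine that
picks by length alone has no rule here that corresponds to "the first
one listed wins"; that is the kind of code that could change behavior.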

Over the long run, moving to a DFA locks Python out of the directions Perl
is *moving*, namely embedding all sorts of runtime gimmicks in regexps that
exploit knowing the "state of the match so far".  DFAs don't work that way.
I don't mind losing those possibilities, because I think the regexp
sublanguage is strained beyond its limits already.  But that's a decision
with Big Consequences, so deserves some thought.

I'd definitely like the (sometimes dramatically) increased speed a DFA can
offer (btw, this code appears to use a lazily-generated DFA, to avoid the
exponential *compile*-time a straightforward DFA implementation can
suffer -- the code is very complex and lacks any high-level internal docs,
so we better hope Henry stays in love with it <0.5 wink>).

> ...
> My guess is that people don't so much translate long Perl regexp's
> to Python but simply transport their (always incomplete -- Larry Wall
> *wants* it that way :-) knowledge of Perl regexps to Python.

This is directly proportional to the number of feeble CGI programmers Python
attracts <wink>.  The good news is that they wouldn't know an NFA from a DFA
if Larry bit Henry on the ass ...

> My meta-guess is that this is also Henry Spencer's and John
> Ousterhout's guess.

I think Spencer strongly favors DFA semantics regardless of fashion, and
Ousterhout is a pragmatist.  So I trust JO's judgment more <0.9 wink>.

> As for Larry Wall, I guess he really doesn't care :-)

I expect he cares a lot!  Because a DFA would prevent Perl from going even
more insane in its present direction.


About the age of the code, postings to comp.lang.tcl have Henry saying he
was working on the alpha version intensely as recently as December ('98).
A few complaints about the alpha release trickled in, about regexp compile
speed and regexp matching speed in specific cases.  Perhaps paradoxically,
the latter were about especially simple regexps with long fixed substrings
(where this mountain of sophisticated machinery is likely to get beat cold
by an NFA with some fixed-substring lookahead smarts -- which latter Henry
intended to graft into this pkg too).

[Andrew]
> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

[Guido]
> Not sure if that was the same code -- this is *new* code, not
> Spencer's old code.  I think Friedl's book is older than the current
> code.

I expect this is an invariant, though:  it's not natural for a DFA to know
where subexpression matches begin and end, and there's a pile of xxx_dissect
functions in regexec.c that use what strongly appear to be worst-case
quadratic-time algorithms for figuring that out after it's known that the
overall expression has *a* match.  Expect too, but don't know, that only
pathological cases are actually expensive.


Question:  has this package been released in any other context, or is it
unique to Tcl?  I searched in vain for an announcement (let alone code) from
Henry, or any discussion of this code outside the Tcl world.

whatever-happens-i-vote-we-let-them-debug-it<wink>-ly y'rs  - tim




From gstein@lyra.org  Wed May  5 07:22:20 1999
From: gstein@lyra.org (Greg Stein)
Date: Tue, 4 May 1999 23:22:20 -0700 (PDT)
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <000701be96b9$4e434460$799e2299@tim>
Message-ID: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>

On Wed, 5 May 1999, Tim Peters wrote:
>...
> Question:  has this package been released in any other context, or is it
> unique to Tcl?  I searched in vain for an announcement (let alone code) from
> Henry, or any discussion of this code outside the Tcl world.

Apache uses it.

However, the Apache guys have considered possibly updating the thing. I
gather that they have a pretty old snapshot. Another guy mentioned PCRE
and I pointed out that Python uses it for its regex support. In other
words, if Apache *does* update the code, then it may be that Apache will
drop the HS engine in favor of PCRE.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/




From Ivan.Porres@abo.fi  Wed May  5 09:29:21 1999
From: Ivan.Porres@abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 11:29:21 +0300
Subject: [Python-Dev] Python for Small Systems patch
Message-ID: <37300161.8DFD1D7F@abo.fi>

Python for Small Systems is a minimal version of the python interpreter,
intended to run on small embedded systems with a limited amount of
memory. 

Since there is some interest in the newsgroup, we have decided to release
an alpha version of the patch. You can download the patch from the
following page: 

http://www.abo.fi/~iporres/python

There is no documentation about the changes, but I guess that it is not
so difficult to figure out what Raul has been doing. 

There are some simple examples in the Demo/hitachi directory. The
configure scripts are broken. We plan to modify the configure scripts 
for cross-compilation. We are still testing, cleaning
and trying to reduce the memory requirements of the patched interpreter.
We also plan to write some documentation.

Please send comments to Raul (rparra@abo.fi) or to me (iporres@abo.fi),

Regards,
Ivan


-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033   
Lemminkäinengatan 14A                             
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres


From tismer@appliedbiometrics.com  Wed May  5 12:52:24 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 13:52:24 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi>
Message-ID: <373030F8.21B73451@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Python for Small Systems is a minimal version of the python interpreter,
> intended to run on small embedded systems with a limited amount of
> memory.
> 
> Since there is some interest in the newsgroup, we have decided to release
> an alpha version of the patch. You can download the patch from the
> following page:
> 
> http://www.abo.fi/~iporres/python
> 
> There is no documentation about the changes, but I guess that it is not
> so difficult to figure out what Raul has been doing.

Ivan,
small Python is a very interesting thing,
thanks for the preview.

But, aren't 12600 lines of diff a little too much
to call it "not difficult to figure out"? :-)

The very last line was indeed helpful:

+++ Pss/miniconfigure	Tue Mar 16 16:59:42 1999
@@ -0,0 +1 @@
+./configure --prefix="/home/rparra/python/Python-1.5.1"
--without-complex --without-float --without-long --without-file
--without-libm --without-libc --without-fpectl --without-threads
--without-dec-threads --with-libs=

But I'd be interested in a brief list
of which other features are out, and even more which
structures were changed. Would that be possible?

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From Ivan.Porres@abo.fi  Wed May  5 14:17:17 1999
From: Ivan.Porres@abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 16:17:17 +0300
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com>
Message-ID: <373044DD.FE4499E@abo.fi>

Christian Tismer wrote:
> Ivan,
> small Python is a very interesting thing,
> thanks for the preview.
> 
> But, aren't 12600 lines of diff a little too much
> to call it "not difficult to figure out"? :-)

Raul Parra (rpb), the author of the patch, got the "source scissors"
(#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
embedded system with some RAM, no keyboard, no screen and no OS. An
example application can be a printer where the print jobs are python
bytecompiled scripts (instead of postscript).

We plan to write some documentation about the patch. Meanwhile, here are
some of the changes:

WITHOUT_PARSER, WITHOUT_COMPILER
Defining WITHOUT_PARSER removes the parser. This has a lot of
implications (no eval() !) but saves a lot of memory. The interpreter
can only execute byte-compiled scripts, that is PyCodeObjects. 

Most embedded processors have poor floating point capabilities. (They
can not compete with DSP's):

WITHOUT-COMPLEX
Removes support for complex numbers

WITHOUT-LONG
Removes long numbers

WITHOUT-FLOAT
Removes floating point numbers

Dependencies on the OS:

WITHOUT-FILE
Removes file objects. No file, no print, no input, no interactive
prompt. This is not too bad in a device without hard disk, keyboard or
screen...

WITHOUT-GETPATH
Removes dependencies with os path. (Probably this change should be
integrated with WITHOUT-FILE)

These changes render most of the standard modules unusable.
There are no fundamental changes on the interpreter, just cut and cut....
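
As a rough illustration of what "only byte-compiled scripts" means for
WITHOUT_PARSER, here is a sketch in standard desktop Python, not the
patched interpreter.  The job source, the "<print-job>" filename and the
variable names are made up, and the marshal format is specific to the
Python version that wrote it.

```python
import marshal

# --- On the development host (parser and compiler available) ---
source = "result = sum(range(10))"
code = compile(source, "<print-job>", "exec")   # source text -> code object
blob = marshal.dumps(code)                      # bytes that could be shipped

# --- On the target (sketch: no source text is ever parsed here) ---
job = marshal.loads(blob)      # bytes -> code object, no parser involved
namespace = {}
exec(job, namespace)           # run the already-compiled code object

print(namespace["result"])
```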

Ivan
-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033   
Lemminkäinengatan 14A                             
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres


From tismer@appliedbiometrics.com  Wed May  5 14:31:05 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 15:31:05 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi>
Message-ID: <37304819.AD636B67@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Christian Tismer wrote:
> > Ivan,
> > small Python is a very interesting thing,
> > thanks for the preview.
> >
> > But, aren't 12600 lines of diff a little too much
> > to call it "not difficult to figure out"? :-)
> 
> Raul Parra (rpb), the author of the patch, got the "source scissors"
> (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
> embedded system with some RAM, no keyboard, no screen and no OS. An
> example application can be a printer where the print jobs are python
> bytecompiled scripts (instead of postscript).
> 
> We plan to write some documentation about the patch. Meanwhile, here are
> some of the changes:

Many thanks, this is really interesting

> These changes render most of the standard modules unusable.
> There are no fundamental changes on the interpreter, just cut and cut....

I see. A last thing which I'm curious about is the executable
size. If this can be compared to a Windows dll at all. Did you 
compile without the changes for your target as well? 
How is the ratio? The python15.dll file contains everything
of core Python and is about 560 KB large.
If your engine goes down to, say below 200 KB, this could
be a great thing for embedding Python into other apps.

ciao & thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Wed May  5 15:55:40 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw)
Date: Wed, 5 May 1999 10:55:40 -0400 (EDT)
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
References: <199905041226.IAA07627@eric.cnri.reston.va.us>
 <000701be96b9$4e434460$799e2299@tim>
Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters <tim_one@email.msn.com> writes:

    TP> Over the long run, moving to a DFA locks Python out of the
    TP> directions Perl is *moving*, namely embedding all sorts of
    TP> runtime gimmicks in regexps that exploit knowing the "state of
    TP> the match so far".  DFAs don't work that way.  I don't mind
    TP> losing those possibilities, because I think the regexp
    TP> sublanguage is strained beyond its limits already.  But that's
    TP> a decision with Big Consequences, so deserves some thought.

I know zip about the internals of the various regexp packages.  But as
far as the Python level interface, would it be feasible to support
both as underlying regexp engines underneath re.py?  The idea would be 
that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
Then all the rest of the magic happens behind the scenes, with
appropriate exceptions thrown if there are syntax mismatches in the
regexp that can't be worked around by preprocessors, etc.

Or would that be more confusing than yet another different regexp
module?
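
A minimal sketch of what such a front end could look like.  Everything
here is hypothetical: the `engine` argument, the `ENGINES` table and the
`compile_with` name are invented for illustration, and only the existing
`re` module is actually wired in.

```python
import re

def _compile_perl(pattern, flags):
    # Backed by the standard re module (first-match semantics).
    return re.compile(pattern, flags)

def _compile_tcl(pattern, flags):
    # Placeholder: no Tcl/Spencer binding exists in this sketch.
    raise NotImplementedError("no Tcl/Spencer backend is wired up in this sketch")

# Hypothetical registry mapping an engine name to its compile function.
ENGINES = {"perl": _compile_perl, "tcl": _compile_tcl}

def compile_with(pattern, flags=0, engine="perl"):
    """Compile `pattern` with the selected engine (hypothetical interface)."""
    try:
        backend = ENGINES[engine]
    except KeyError:
        raise ValueError(f"unknown regexp engine: {engine!r}") from None
    return backend(pattern, flags)

print(compile_with(r"a+").match("aaa").group(0))
```

A selector like this hides which engine runs, but it cannot hide the
differing match semantics, so callers would still need to know which
engine they asked for.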

-Barry


From tim_one@email.msn.com  Wed May  5 16:55:20 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 11:55:20 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>
Message-ID: <000601be970f$adef5740$a59e2299@tim>

[Tim]
> Question:  has this package [Tcl's 8.1 regexp support] been released in
> any other context, or is it unique to Tcl?  I searched in vain for an
> announcement (let alone code) from Henry, or any discussion of this code
> outside the Tcl world.

[Greg Stein]
> Apache uses it.
>
> However, the Apache guys have considered possibly updating the thing. I
> gather that they have a pretty old snapshot. Another guy mentioned PCRE
> and I pointed out that Python uses it for its regex support. In other
> words, if Apache *does* update the code, then it may be that Apache will
> drop the HS engine in favor of PCRE.

Hmm.  I just downloaded the Apache 1.3.4 source to check on this, and it
appears to be using a lightly massaged version of Spencer's old (circa
'92-'94) just-POSIX regexp package.  Henry has been distributing regexp pkgs
for a loooong time <wink>.

The Tcl 8.1 regexp pkg is much hairier.  If the Apache folk want to switch
in order to get the Perl regexp syntax extensions, this Tcl version is worth
looking at too.  If they want to switch for some other reason, it would be
good to know what that is!

The base pkg Apache uses is easily available all over the web; the pkg Tcl
8.1 is using I haven't found anywhere except in the Tcl download (which is
why I'm wondering about it -- so far, it doesn't appear to be distributed by
Spencer himself, in a non-Tcl-customized form).

looks-like-an-entirely-new-pkg-to-me-ly y'rs  - tim




From beazley@cs.uchicago.edu  Wed May  5 17:54:45 1999
From: beazley@cs.uchicago.edu (David Beazley)
Date: Wed, 5 May 1999 11:54:45 -0500 (CDT)
Subject: [Python-Dev] My (possibly delusional) book project
Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu>

Although this is a little off-topic for the developer list, I want to
fill people in on a new Python book project.  A few months ago, 
I was approached about doing a new Python reference book and I've
since decided to proceed with the project (after all, an increased
presence at the bookstore is probably a good thing :-).

In any event, my "vision" for this book is to take the material in the
Python tutorial, language reference, library reference, and extension
guide and squeeze it into a compact book no longer than 300 pages (and
hopefully without having to use a 4-point font).  Actually, what I'm
really trying to do is write something in a style similar to the K&R C
Programming book (very terse, straight to the point, and technically
accurate). The book's target audience is experienced/expert
programmers.

With this said, I would really like to get feedback from the developer
community about this project in a few areas.  First, I want to make
sure the language reference is in sync with the latest version of
Python, that it is as accurate as possible, and that it doesn't leave
out any important topics or recent developments.  Second, I would be
interested in knowing how to emphasize certain topics (for instance,
should I emphasize class-based exceptions over string-based exceptions
even though most books only cover the latter case?).  The other big
area is the library reference.  Given the size of the library, I'm
going to cut a number of modules out.  However, the choice of what to
cut is not entirely clear (for now, it's a judgment call on my part).

All of the work in progress for this project is online at:

   http://rustler.cs.uchicago.edu/~beazley/essential/reference.html

I would love to get constructive feedback about this from other
developers.  Of course, I'll keep people posted in any case.

Cheers,

Dave



From tim_one@email.msn.com  Thu May  6 06:43:16 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 6 May 1999 01:43:16 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us>
Message-ID: <000d01be9783$57543940$2ca22299@tim>

[Tim notes that moving to a DFA regexp engine would rule out some future
 aping of Perl mistakes <wink>]

[Barry "The Great Compromiser" Warsaw]
> I know zip about the internals of the various regexp packages.  But as
> far as the Python level interface, would it be feasible to support
> both as underlying regexp engines underneath re.py?  The idea would be
> that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
> re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
> Then all the rest of the magic happens behind the scenes, with
> appropriate exceptions thrown if there are syntax mismatches in the
> regexp that can't be worked around by preprocessors, etc.
>
> Or would that be more confusing than yet another different regexp
> module?

It depends some on what percentage of the Python distribution Guido wants to
devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of
code in Modules/, where regexp packages already consume more than anything
else.

It's a lot of delicate, difficult code.  Someone would need to step up and
champion each alternative package.  I haven't asked Andrew lately, but I'd
bet half a buck the thrill of supporting pcre has waned.

If there were competing packages, your suggested interface is fine.  I just
doubt the Python developers will support more than one (Andrew may still be
young, but he can't possibly still be naive enough to sign up for two of
these nightmares <wink>).

i'm-so-old-i-never-signed-up-for-one-ly y'rs  - tim




From rushing@nightmare.com  Thu May 13 07:34:19 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Wed, 12 May 1999 23:34:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905070507.BAA22545@python.org>
References: <199905070507.BAA22545@python.org>
Message-ID: <14138.28243.553816.166686@seattle.nightmare.com>

[list has been quiet, thought I'd liven things up a bit. 8^)]

I'm not sure if this has been brought up before in other forums, but
has there been discussion of separating the Python and C invocation
stacks, (i.e., removing recursive calls to the interpreter) to
facilitate coroutines or first-class continuations?

One of the biggest barriers to getting others to use asyncore/medusa
is the need to program in continuation-passing-style (callbacks,
callbacks to callbacks, state machines, etc...).  Usually there has to
be an overriding requirement for speed/scalability before someone will
even look into it.  And even when you do 'get' it, there are limits to
how inside-out your thinking can go. 8^)

If Python had coroutines/continuations, it would be possible to hide
asyncore-style select()/poll() machinery 'behind the scenes'.  I
believe that Concurrent ML does exactly this...

Other advantages might be restartable exceptions, different threading
models, etc...

-Sam
rushing@nightmare.com
rushing@eGroups.net



From mal@lemburg.com  Thu May 13 09:23:13 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 13 May 1999 10:23:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <373A8BF1.AE124BF@lemburg.com>

rushing@nightmare.com wrote:
> 
> [list has been quiet, thought I'd liven things up a bit. 8^)]

Well, there certainly is enough on the todo list... it's probably
the usual "ain't got no time" thing.

> I'm not sure if this has been brought up before in other forums, but
> has there been discussion of separating the Python and C invocation
> stacks, (i.e., removing recursive calls to the interpreter) to
> facilitate coroutines or first-class continuations?

Wouldn't it be possible to move all the C variables passed to
eval_code() via the execution frame ? AFAIK, the frame is
generated on every call to eval_code() and thus could also
be generated *before* calling it.

> One of the biggest barriers to getting others to use asyncore/medusa
> is the need to program in continuation-passing-style (callbacks,
> callbacks to callbacks, state machines, etc...).  Usually there has to
> be an overriding requirement for speed/scalability before someone will
> even look into it.  And even when you do 'get' it, there are limits to
> how inside-out your thinking can go. 8^)
> 
> If Python had coroutines/continuations, it would be possible to hide
> asyncore-style select()/poll() machinery 'behind the scenes'.  I
> believe that Concurrent ML does exactly this...
> 
> Other advantages might be restartable exceptions, different threading
> models, etc...

Don't know if moving the C stack stuff into the frame objects
will get you the desired effect: what about other things having
state (e.g. connections or files), that are not even touched
by this mechanism ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 232 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/




From rushing@nightmare.com  Thu May 13 10:40:19 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Thu, 13 May 1999 02:40:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373A8BF1.AE124BF@lemburg.com>
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <373A8BF1.AE124BF@lemburg.com>
Message-ID: <14138.38550.89759.752058@seattle.nightmare.com>

M.-A. Lemburg writes:

 > Wouldn't it be possible to move all the C variables passed to
 > eval_code() via the execution frame ? AFAIK, the frame is
 > generated on every call to eval_code() and thus could also
 > be generated *before* calling it.

I think this solves half of the problem.  The C stack is both a value
stack and an execution stack (i.e., it holds variables and return
addresses).  Getting rid of arguments (and a return value!) gets rid
of the need for the 'value stack' aspect.

In aiming for an enter-once, exit-once VM, the thorniest part is to
somehow allow python->c->python calls.  The second invocation could
never save a continuation because its execution context includes a C
frame.  This is a general problem, not specific to Python; I probably
should have thought about it a bit before posting...

 > Don't know if moving the C stack stuff into the frame objects
 > will get you the desired effect: what about other things having
 > state (e.g. connections or files), that are not even touched
 > by this mechanism ?

I don't think either of those cause 'real' problems (i.e., nothing
should crash that assumes an open file or socket), but there may be
other stateful things that might.  I don't think that refcounts would
be a problem - a saved continuation wouldn't be all that different
from an exception traceback.

-Sam

p.s. Here's a tiny VM experiment I wrote a while back, to explain
what I mean by 'stackless':

http://www.nightmare.com/stuff/machine.h
http://www.nightmare.com/stuff/machine.c

Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context
onto heap-allocated data structures rather than calling the VM
recursively.
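
To make the idea concrete, here is a minimal sketch in Python of a VM
loop that handles calls by pushing a frame onto a heap-allocated list
instead of recursing.  Everything in it (the Frame class, the opcode
names, the PROGRAM table) is invented for illustration and is not taken
from machine.c or from CPython's eval loop.

```python
# Hypothetical toy VM. The opcodes and names are invented for this
# sketch; they are not taken from machine.c or from CPython.

class Frame:
    def __init__(self, code, args):
        self.code = code          # list of (opcode, operand) pairs
        self.pc = 0               # index of the next instruction
        self.stack = list(args)   # per-frame value stack


def run(functions, entry, args):
    """Run functions[entry] in one loop; a call pushes a Frame."""
    frames = [Frame(functions[entry], args)]   # heap-allocated call stack
    while True:
        frame = frames[-1]
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "DUP":
            frame.stack.append(frame.stack[-1])
        elif op == "ADD":
            right = frame.stack.pop()
            frame.stack.append(frame.stack.pop() + right)
        elif op == "SUB":
            right = frame.stack.pop()
            frame.stack.append(frame.stack.pop() - right)
        elif op == "JUMP_IF_ZERO":
            if frame.stack.pop() == 0:
                frame.pc = arg
        elif op == "INVOKE":
            name, nargs = arg
            split = len(frame.stack) - nargs
            call_args = frame.stack[split:]
            del frame.stack[split:]
            frames.append(Frame(functions[name], call_args))  # no recursion
        elif op == "RETURN":
            result = frame.stack.pop()
            frames.pop()
            if not frames:
                return result
            frames[-1].stack.append(result)   # hand result to the caller
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# sum_to(n) = n + sum_to(n - 1), with sum_to(0) = 0
PROGRAM = {
    "sum_to": [
        ("DUP", None),
        ("JUMP_IF_ZERO", 8),
        ("DUP", None),
        ("PUSH", 1),
        ("SUB", None),
        ("INVOKE", ("sum_to", 1)),
        ("ADD", None),
        ("RETURN", None),
        ("RETURN", None),
    ],
}

deep = run(PROGRAM, "sum_to", [100000])
```

The 100000 nested sum_to calls never deepen the host stack, because
run() is entered once and exits once; only the frames list grows.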



From skip@mojam.com (Skip Montanaro)  Thu May 13 12:38:39 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 13 May 1999 07:38:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>

    Sam> I'm not sure if this has been brought up before in other forums,
    Sam> but has there been discussion of separating the Python and C
    Sam> invocation stacks, (i.e., removing recursive calls to the
    Sam> interpreter) to facilitate coroutines or first-class continuations?

I thought Guido was working on that for the mobile agent stuff he was
working on at CNRI.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Thu May 13 16:10:52 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 13 May 1999 11:10:52 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip@mojam.com> writes:

    SM> I thought Guido was working on that for the mobile agent stuff
    SM> he was working on at CNRI.

Nope, we decided that we could accomplish everything we needed without 
this.  We occasionally revisit this but Guido keeps insisting it's a
lot of work for not enough benefit :-)

-Barry


From guido@CNRI.Reston.VA.US  Thu May 13 16:19:10 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 13 May 1999 11:19:10 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT."
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us>

Interesting topic!  While I'm on the road, a few short notes.

> I thought Guido was working on that for the mobile agent stuff he was
> working on at CNRI.

Indeed.  At least I planned on working on it.  I ended up abandoning
the idea because I expected it would be a lot of work and I never had
the time (same old story indeed).

Sam also hit the nail on the head: the hardest problem is what to do about
all the places where C calls back into Python.

I've come up with two partial solutions: (1) allow for a way to
arrange for a call to be made immediately after you return to the VM
from C; this would take care of apply() at least and a few other
"tail-recursive" cases; (2) invoke a new VM when C code needs a Python
result, requiring it to return.  The latter clearly breaks certain
uses of coroutines but could probably be made to work most of the
time.  Typical use of the 80-20 rule.

And I've just come up with a third solution: a variation on (1) where
you arrange *two* calls: one to Python and then one to C, with the
result of the first.  (And a bit saying whether you want the C call to 
be made even when an exception happened.)
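
A rough sketch of solutions (1) and (3), written in Python rather than
C and under heavy assumptions: a primitive that wants a Python call
made returns a request object to the dispatch loop instead of calling
into the VM itself.  CallRequest, trampoline and the example functions
are hypothetical names, and the "call even when an exception happened"
bit is left out.

```python
class CallRequest:
    """Hypothetical: returned by a primitive instead of calling back.

    Asks the loop to call func(*args); if `then` is given, the result
    is passed on to then(result) afterwards (the two-call variant).
    """
    def __init__(self, func, args, then=None):
        self.func = func
        self.args = args
        self.then = then


def trampoline(func, *args):
    """Single dispatch loop: run requests until a plain value is left."""
    pending = []                       # 'then' callbacks awaiting a result
    result = CallRequest(func, args)
    while True:
        if isinstance(result, CallRequest):
            if result.then is not None:
                pending.append(result.then)
            result = result.func(*result.args)
        elif pending:
            result = pending.pop()(result)
        else:
            return result


def countdown(n):
    # Variant (1): a tail call expressed as a request, not a nested call.
    if n == 0:
        return "done"
    return CallRequest(countdown, (n - 1,))


def double(x):
    return x * 2


def negate(x):
    return -x


def call_and_negate(func, x):
    # Variant (3): call func(x) first, then hand its result to negate.
    return CallRequest(func, (x,), then=negate)


tail_result = trampoline(countdown, 100000)
chained_result = trampoline(call_and_negate, double, 21)
```

The cost is visible in call_and_negate: the code after the call has to
be split off into a separate function, which is the restructuring every
C function that calls back into Python would need.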

In general, I still think it's a cool idea, but I also still think
that continuations are too complicated for most programmers.  (This
comes from the realization that they are too complicated for me!)
Corollary: even if we had continuations, I'm not sure if this would
take away the resistance against asyncore/asynchat.  Of course I could 
be wrong.

Different suggestion: it would be cool to work on completely
separating out the VM from the rest of Python, through some kind of
C-level API specification.  Two things should be possible with this
new architecture: (1) small platform ports could cut out the
interactive interpreter, the parser and compiler, and certain data
types such as long, complex and files; (2) there could be alternative
pluggable VMs with certain desirable properties such as
platform-specific optimization (Christian, are you listening? :-).

I think the most challenging part might be defining an API for passing 
in the set of supported object types and operations.  E.g. the
EXEC_STMT opcode needs to be implemented in a way that allows
"exec" to be absent from the language.  Perhaps an __exec__ function
(analogous to __import__) is the way to go.  The set of built-in
functions should also be passed in, so that e.g. one can easily leave
out open(), eval() and compile(), complex(), long(), float(), etc.
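
As an illustration only: later Python versions let the caller of exec()
supply the builtin namespace through the "__builtins__" key of the
globals dict, which gives a Python-level feel for "passing in" the set
of built-in functions.  This is not the C-level API described above,
the helper names are made up, and it is not a security sandbox.

```python
import builtins


def make_restricted_builtins(excluded):
    """Copy the builtin namespace, leaving out the named entries."""
    return {
        name: getattr(builtins, name)
        for name in dir(builtins)
        if name not in excluded
    }


def run_with_builtins(source, allowed):
    """Execute source with `allowed` as its only builtins."""
    namespace = {"__builtins__": allowed}
    exec(source, namespace)
    return namespace


small = make_restricted_builtins(
    {"open", "eval", "compile", "complex", "float"})

# Code that sticks to the remaining builtins runs normally.
ns = run_with_builtins("total = sum(range(4))", small)

# Code that needs an omitted builtin fails with NameError.
try:
    run_with_builtins("handle = open('some-file')", small)
    open_was_available = True
except NameError:
    open_was_available = False
```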

I think it would be ideal if no #ifdefs were needed to remove features
(at least not in the VM code proper).  Fortunately, the VM doesn't
really know about many object types -- frames, functions, methods,
classes, ints, strings, dictionaries, tuples, tracebacks, that may be
all it knows.  (Lists?)

Gotta run,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Thu May 13 20:50:44 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 13 May 1999 21:50:44 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>  <199905131519.LAA01097@eric.cnri.reston.va.us>
Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com>

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers.  (This
> comes from the realization that they are too complicated for me!)

in an earlier life, I used non-preemptive threads (that is,
explicit yields) and co-routines to do some really cool
stuff with very little code.  looks like a stack-less inter-
preter would make it trivial to implement that.

might just be nostalgia, but I think I would give an arm
or two to get that (not necessarily my own, though ;-)

</F>



From rushing@nightmare.com  Fri May 14 03:00:09 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Thu, 13 May 1999 19:00:09 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
 <14138.60284.584739.711112@anthem.cnri.reston.va.us>
Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>

Guido van Rossum writes:
  > I've come up with two partial solutions: (1) allow for a way to
  > arrange for a call to be made immediately after you return to the
  > VM from C; this would take care of apply() at least and a few
  > other "tail-recursive" cases; (2) invoke a new VM when C code
  > needs a Python result, requiring it to return.  The latter clearly
  > breaks certain uses of coroutines but could probably be made to
  > work most of the time.  Typical use of the 80-20 rule.

I know this is disgusting, but could setjmp/longjmp 'automagically'
force a 'recursive call' to jump back into the top-level loop?  This
would put some serious restraint on what C called from Python could
do...

I think just about any Scheme implementation has to solve this same
problem... I'll dig through my collection of them for ideas.

  > In general, I still think it's a cool idea, but I also still think
  > that continuations are too complicated for most programmers.  (This
  > comes from the realization that they are too complicated for me!)
  > Corollary: even if we had continuations, I'm not sure if this would
  > take away the resistance against asyncore/asynchat.  Of course I could 
  > be wrong.

Theoretically, you could have a bit of code that looked just like
'normal' imperative code, that would actually be entering and exiting
the context for non-blocking i/o.  If it were done right, the same
exact code might even run under 'normal' threads.

Recently I've written an async server that needed to talk to several
other RPC servers, and a mysql server.  Pseudo-example, with
possibly-async calls in UPPERCASE:

  auth, archive = db.FETCH_USER_INFO (user)
  if verify_login(user,auth):
    rpc_server = self.archive_servers[archive]
    group_info = rpc_server.FETCH_GROUP_INFO (group)
    if valid (group_info):
      return rpc_server.FETCH_MESSAGE (message_number)
    else:
      ...
  else:
    ...

This code in CPS is a horrible, complicated mess, it takes something
like 8 callback methods, variables and exceptions have to be passed
around in 'continuation' objects.  It's hairy because there are three
levels of callback state.  Ugh.
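
For comparison, a stripped-down sketch of the same flow in callback
style.  FakeDB, FakeRPC and FetchRequest are hypothetical stand-ins
that call back immediately, and all error handling is omitted, so this
is much smaller than the real thing.

```python
# All names below are hypothetical stand-ins for the pseudo-example.

class FakeDB:
    def fetch_user_info(self, user, callback):
        # A real server would call back later, from the event loop.
        callback(("secret", "archive-1"))


class FakeRPC:
    def fetch_group_info(self, group, callback):
        callback({"name": group})

    def fetch_message(self, message_number, callback):
        callback("message #%d" % message_number)


class FetchRequest:
    """One logical operation spread over several callback methods."""

    def __init__(self, db, archive_servers, user, group,
                 message_number, done):
        self.db = db
        self.archive_servers = archive_servers
        self.user = user
        self.group = group
        self.message_number = message_number
        self.done = done            # final callback
        self.rpc_server = None      # state carried between callbacks

    def start(self):
        self.db.fetch_user_info(self.user, self.got_user_info)

    def got_user_info(self, result):
        auth, archive = result
        if auth != "secret":        # stand-in for verify_login()
            return self.done(None)
        self.rpc_server = self.archive_servers[archive]
        self.rpc_server.fetch_group_info(self.group, self.got_group_info)

    def got_group_info(self, group_info):
        if not group_info:          # stand-in for valid()
            return self.done(None)
        self.rpc_server.fetch_message(self.message_number, self.done)


results = []
FetchRequest(FakeDB(), {"archive-1": FakeRPC()},
             "sam", "python-dev", 7, results.append).start()
```

Even in this toy form the straight-line logic is cut at every
possibly-async call, and the locals have become attributes on a
request object.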

If Python had closures, then it would be a *little* easier, but would
still make the average Pythoneer swoon.  Closures would let you put
the above logic all in one method, but the code would still be
'inside-out'.

  > Different suggestion: it would be cool to work on completely
  > separating out the VM from the rest of Python, through some kind of
  > C-level API specification.

I think this is a great idea.  I've been staring at python bytecodes a
bit lately thinking about how to do something like this, for some
subset of Python.

[...]

Ok, we've all seen the 'stick'.  I guess I should give an example of
the 'carrot': I think that a web server built on such a Python could
have the performance/scalability of thttpd, with the
ease-of-programming of Roxen.  As far as I know, there's nothing like
it out there.  Medusa would be put out to pasture. 8^)

-Sam



From guido@CNRI.Reston.VA.US  Fri May 14 13:03:31 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 08:03:31 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT."
 <14139.30970.644343.612721@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>
 <14139.30970.644343.612721@seattle.nightmare.com>
Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us>

> I know this is disgusting, but could setjmp/longjmp 'automagically'
> force a 'recursive call' to jump back into the top-level loop?  This
> would put some serious restraint on what C called from Python could
> do...

Forget about it.  setjmp/longjmp are invitations to problems.  I also
assume that they would interfere badly with C++.

> I think just about any Scheme implementation has to solve this same
> problem... I'll dig through my collection of them for ideas.

Anything that assumes knowledge about how the C compiler and/or the
CPU and OS lay out the stack is a no-no, because it means that the
first thing one has to do for a port to a new architecture is figure
out how the stack is laid out.  Another thread in this list is porting 
Python to microplatforms like PalmOS.  Typically the Scheme hackers
are not afraid to delve deep into the machine, but I refuse to do that
-- I think it's too risky.

>   > In general, I still think it's a cool idea, but I also still think
>   > that continuations are too complicated for most programmers.  (This
>   > comes from the realization that they are too complicated for me!)
>   > Corollary: even if we had continuations, I'm not sure if this would
>   > take away the resistance against asyncore/asynchat.  Of course I could 
>   > be wrong.
> 
> Theoretically, you could have a bit of code that looked just like
> 'normal' imperative code, that would actually be entering and exiting
> the context for non-blocking i/o.  If it were done right, the same
> exact code might even run under 'normal' threads.

Yes -- I remember in 92 or 93 I worked out a way to emulate coroutines
with regular threads.  (I think in cooperation with Steve Majewski.)
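
One way such an emulation can look, as a sketch only and not a
reconstruction of the code from back then: the coroutine body runs in
its own thread, and a semaphore plus a one-slot queue make the two
sides take turns.  ThreadCoroutine and count_up are hypothetical names,
and error handling inside the body is omitted.

```python
import queue
import threading


class ThreadCoroutine:
    """Run body(yield_value) in a thread, taking turns with the caller."""

    _DONE = object()

    def __init__(self, body):
        self._values = queue.Queue(maxsize=1)    # body -> caller
        self._resume = threading.Semaphore(0)    # caller -> body
        self._thread = threading.Thread(
            target=self._run, args=(body,), daemon=True)
        self._started = False
        self._finished = False

    def _run(self, body):
        self._resume.acquire()        # wait for the first next_value()
        body(self._yield_value)
        self._values.put(self._DONE)  # tell the caller the body returned

    def _yield_value(self, value):
        self._values.put(value)       # hand a value to the caller
        self._resume.acquire()        # then block until resumed

    def next_value(self):
        if self._finished:
            raise StopIteration
        if not self._started:
            self._started = True
            self._thread.start()
        self._resume.release()        # let the body run to its next yield
        value = self._values.get()    # and block until it gets there
        if value is self._DONE:
            self._finished = True
            raise StopIteration
        return value


def count_up(yield_value):
    for i in range(3):
        yield_value(i)


co = ThreadCoroutine(count_up)
values = [co.next_value() for _ in range(3)]
```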

> Recently I've written an async server that needed to talk to several
> other RPC servers, and a mysql server.  Pseudo-example, with
> possibly-async calls in UPPERCASE:
> 
>   auth, archive = db.FETCH_USER_INFO (user)
>   if verify_login(user,auth):
>     rpc_server = self.archive_servers[archive]
>     group_info = rpc_server.FETCH_GROUP_INFO (group)
>     if valid (group_info):
>       return rpc_server.FETCH_MESSAGE (message_number)
>     else:
>       ...
>   else:
>     ...
> 
> This code in CPS is a horrible, complicated mess, it takes something
> like 8 callback methods, variables and exceptions have to be passed
> around in 'continuation' objects.  It's hairy because there are three
> levels of callback state.  Ugh.

Agreed.

> If Python had closures, then it would be a *little* easier, but would
> still make the average Pythoneer swoon.  Closures would let you put
> the above logic all in one method, but the code would still be
> 'inside-out'.

I forget how this worked :-(

>   > Different suggestion: it would be cool to work on completely
>   > separating out the VM from the rest of Python, through some kind of
>   > C-level API specification.
> 
> I think this is a great idea.  I've been staring at python bytecodes a
> bit lately thinking about how to do something like this, for some
> subset of Python.
> 
> [...]
> 
> Ok, we've all seen the 'stick'.  I guess I should give an example of
> the 'carrot': I think that a web server built on such a Python could
> have the performance/scalability of thttpd, with the
> ease-of-programming of Roxen.  As far as I know, there's nothing like
> it out there.  Medusa would be put out to pasture. 8^)

I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
they do that Apache doesn't?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Fri May 14 14:16:13 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 May 1999 15:16:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>             <14139.30970.644343.612721@seattle.nightmare.com>  <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>

> I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
> they do that Apache doesn't?

http://www.roxen.com/

a lean and mean secure web server written in Pike
(http://pike.idonex.se/), from a company here in
Linköping.

</F>



From tismer@appliedbiometrics.com  Fri May 14 16:15:20 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 14 May 1999 17:15:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>
 <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com>


Guido van Rossum wrote:

[setjmp/longjmp -no-no]

> Forget about it.  setjmp/longjmp are invitations to problems.  I also
> assume that they would interfere badly with C++.
> 
> > I think just about any Scheme implementation has to solve this same
> > problem... I'll dig through my collection of them for ideas.
> 
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the Scheme hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.
...

I agree that this is generally bad, although it's a cakewalk
to do a stack swap on the few (X86 based:) platforms that
I work with. This is much less than a thread change.

But on the general issues:
Can the Python-calls-C and C-calls-Python problem just be solved
by turning the whole VM state into a data structure, including
a Python call stack which is independent? Maybe this has been
mentioned already.

This might give a little slowdown, but opens possibilities
like continuation-passing style, and context switches
between different interpreter states would be under direct
control.

Just a little dreaming: Not using threads, but just tiny
interpreter incarnations with local state, and a special
C call or better a new opcode which activates the next
state in some list (of course a Python list).
This would automagically produce ICON iterators (duck)
and coroutines (cover).
If I guess right, continuation passing could be done
by just shifting tiny tuples around. Well, Tim, help me :-)
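
A sketch of the "activate the next state in some list" idea, using
generator functions (a feature of later Python versions) as stand-ins
for the tiny interpreter incarnations with local state.  The names
worker and run_round_robin are hypothetical.

```python
from collections import deque


def worker(name, steps, log):
    """A tiny task with local state; each yield gives up control."""
    for i in range(steps):
        log.append((name, i))
        yield


def run_round_robin(tasks):
    """Activate each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)            # resume the task until its next yield
        except StopIteration:
            continue              # finished: do not reschedule it
        ready.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

No threads are involved; the switch between tasks happens only at the
explicit yield points, under the scheduler's direct control.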

[closures]

> > I think this is a great idea.  I've been staring at python bytecodes a
> > bit lately thinking about how to do something like this, for some
> > subset of Python.

Lumberjack? How is it going? [to Sam]

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From bwarsaw@cnri.reston.va.us (Barry A. Warsaw)  Fri May 14 16:32:51 1999
From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 14 May 1999 11:32:51 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
 <14138.60284.584739.711112@anthem.cnri.reston.va.us>
 <14139.30970.644343.612721@seattle.nightmare.com>
 <199905141203.IAA01808@eric.cnri.reston.va.us>
 <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>
Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    FL> a lean and mean secure web server written in Pike
    FL> (http://pike.idonex.se/), from a company here in
    FL> Linköping.

Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
originally came on board to add Pike support, which has a syntax similar 
enough to C to be easily integrated.  I think I've had as much success 
convincing him to use Python as he's had convincing me to use Pike :-)

-Barry


From gstein@lyra.org  Fri May 14 22:54:02 1999
From: gstein@lyra.org (Greg Stein)
Date: Fri, 14 May 1999 14:54:02 -0700
Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?)
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
 <14138.60284.584739.711112@anthem.cnri.reston.va.us>
 <14139.30970.644343.612721@seattle.nightmare.com>
 <199905141203.IAA01808@eric.cnri.reston.va.us>
 <001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us>
Message-ID: <373C9B7A.3676A910@lyra.org>

Barry A. Warsaw wrote:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
> 
>     FL> a lean and mean secure web server written in Pike
>     FL> (http://pike.idonex.se/), from a company here in
>     FL> Linköping.
> 
> Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
> originally came on board to add Pike support, which has a syntax similar
> enough to C to be easily integrated.  I think I've had as much success
> convincing him to use Python as he's had convincing me to use Pike :-)

<HistoricalNote>

Heh. Pike is an outgrowth of the MUD world's LPC programming language. A
guy named "Profezzorn" started a project (in '94?) to redevelop an LPC
compiler/interpreter ("driver") from scratch to avoid some licensing
constraints. The project grew into a generalized network handler, since
MUDs' typical designs are excellent for these tasks. From there, you get
the Roxen web server.

</HistoricalNote>

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From rushing@nightmare.com  Sat May 15 00:36:11 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Fri, 14 May 1999 16:36:11 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
References: <199905070507.BAA22545@python.org>
 <14138.28243.553816.166686@seattle.nightmare.com>
 <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
 <14138.60284.584739.711112@anthem.cnri.reston.va.us>
 <14139.30970.644343.612721@seattle.nightmare.com>
 <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <14140.44469.848840.740112@seattle.nightmare.com>

Guido van Rossum writes:
 > > If Python had closures, then it would be a *little* easier, but would
 > > still make the average Pythoneer swoon.  Closures would let you put
 > > the above logic all in one method, but the code would still be
 > > 'inside-out'.
 > 
 > I forget how this worked :-(

[with a faked-up lambda-ish syntax]

def thing (a):
  return do_async_job_1 (a,
    lambda (b):
      if (a>1):
        do_async_job_2a (b,
          lambda (c):
            [...]
          )
      else:
        do_async_job_2b (a,b,
          lambda (d,e,f):
            [...]
          )
    )

The call to do_async_job_1 passes 'a', and a callback, which is
specified 'in-line'.  You can follow the logic of something like this
more easily than if each lambda is spun off into a different
function/method.
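
For readers without the faked-up syntax, roughly the same shape can be
written with nested function definitions in later Python versions,
which do have closures.  The do_async_job_* stubs below are
hypothetical and call back immediately; the `done` argument is an
addition so the result can be observed.

```python
# Hypothetical stubs: real versions would call back later.
def do_async_job_1(a, callback):
    callback(a + 1)


def do_async_job_2a(b, callback):
    callback(b * 10)


def do_async_job_2b(a, b, callback):
    callback(a, b, a + b)


def thing(a, done):
    def after_job_1(b):
        if a > 1:                       # 'a' is captured from thing()
            def after_job_2a(c):
                done(("2a", c))
            do_async_job_2a(b, after_job_2a)
        else:
            def after_job_2b(d, e, f):
                done(("2b", d, e, f))
            do_async_job_2b(a, b, after_job_2b)
    do_async_job_1(a, after_job_1)


results = []
thing(5, results.append)
thing(1, results.append)
```

All the logic sits in one function, but each step is still defined
before the call that triggers it, so it still reads 'inside-out'.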

 > > I think that a web server built on such a Python could have the
 > > performance/scalability of thttpd, with the ease-of-programming
 > > of Roxen.  As far as I know, there's nothing like it out there.
 > > Medusa would be put out to pasture. 8^)
 > 
 > I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
 > they do that Apache doesn't?

thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance
and scalability, but suffer from the same programmability problem as
Medusa (only worse, 'cause they're in C).

Roxen is written in Pike, a C-like language with gc, threads,
etc... Roxen is I think now the official 'GNU Web Server'.

Here's an interesting web-server comparison chart:

http://www.acme.com/software/thttpd/benchmarks.html

-Sam



From guido@CNRI.Reston.VA.US  Sat May 15 03:23:24 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 22:23:24 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT."
 <14140.44469.848840.740112@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
 <14140.44469.848840.740112@seattle.nightmare.com>
Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us>

> def thing (a):
>   return do_async_job_1 (a,
>     lambda (b):
>       if (a>1):
>         do_async_job_2a (b,
>           lambda (c):
>             [...]
>           )
>       else:
>         do_async_job_2b (a,b,
>           lambda (d,e,f):
>             [...]
>           )
>     )
> 
> The call to do_async_job_1 passes 'a', and a callback, which is
> specified 'in-line'.  You can follow the logic of something like this
> more easily than if each lambda is spun off into a different
> function/method.

I agree that it is still ugly.

> http://www.acme.com/software/thttpd/benchmarks.html

I see.  Any pointers to a graph of thttpd market share?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Sat May 15 08:51:00 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <000701be9ea7$acab7f40$159e2299@tim>

[GvR]
> ...
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the Scheme hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.

The Icon language needs a bit of platform-specific context-switching
assembly code to support its full coroutine features, although its
bread-and-butter generators ("semi coroutines") don't need anything special.

The result is that Icon ports sometimes limp for a year before they support
full coroutines, waiting for someone wizardly enough to write the necessary
code.  This can, in fact, be quite difficult; e.g., on machines with HW
register windows (where "the stack" can be a complicated beast half buried
in hidden machine state, sometimes needing kernel privilege to uncover).

Not attractive.  Generators are, though <wink>.

threads-too-ly y'rs  - tim




From tim_one@email.msn.com  Sat May 15 08:51:03 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com>
Message-ID: <000801be9ea7$ae45f560$159e2299@tim>

[Christian Tismer]
> ...
> But on the general issues:
> Can the Python-calls-C and C-calls-Python problem just be solved
> by turning the whole VM state into a data structure, including
> a Python call stack which is independent? Maybe this has been
> mentioned already.

The problem is that when C calls Python, any notion of continuation has to
include C's state too, else resuming the continuation won't return into C
correctly.  The C code that *implements* Python could be reworked to support
this, but in the general case you've got some external C extension module
calling into Python, and then Python hasn't a clue about its caller's state.

I'm not a fan of continuations myself; coroutines can be implemented
faithfully via threads (I posted a rather complete set of Python classes for
that in the pre-DejaNews days, a bit more flexible than Icon's coroutines);
and:

> This would automagically produce ICON iterators (duck)
> and coroutines (cover).

Icon iterators/generators could be implemented today if anyone bothered
(Majewski essentially implemented them back around '93 already, but seemed
to lose interest when he realized it couldn't be extended to full
continuations, because of C/Python stack intertwingling).

> If I guess right, continuation passing could be done
> by just shifting tiny tuples around. Well, Tim, help me :-)

Python-calling-Python continuations should be easily doable in a "stackless"
Python; the key ideas were already covered in this thread, I think.  The
thing that makes generators so much easier is that they always return
directly to their caller, at the point of call; so no C frame can get stuck
in the middle even under today's implementation; it just requires not
deleting the generator's frame object, and adding an opcode to *resume* the
frame's execution the next time the generator is called.  Unlike as in Icon,
it wouldn't even need to be tied to a funky notion of goal-directed
evaluation.
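
A minimal sketch of the resumable-frame idea described above, written with
the generator syntax of present-day Python 3 (an assumption; nothing like it
existed in the Python of this thread).  The `Node` class and the `inorder`
function are hypothetical names chosen for illustration.  Each `yield`
returns directly to the caller, and the suspended frame is kept alive until
the caller asks for the next value:

```python
class Node:
    """Hypothetical binary-tree node used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Yield the values of the tree rooted at `node` in sorted order."""
    if node is not None:
        # Each recursive generator is suspended and resumed in place;
        # no explicit stack of "where was I?" state is needed.
        yield from inorder(node.left)
        yield node.value
        yield from inorder(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
walk = inorder(tree)
first = next(walk)   # runs the traversal only as far as the first value
rest = list(walk)    # resumes the same frames to produce the remainder
```

The caller pulls values one at a time, so the traversal can be abandoned
early without visiting the rest of the tree.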

don't-try-to-traverse-a-tree-without-it-ly y'rs  - tim




From gstein@lyra.org  Sat May 15 09:17:15 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 15 May 1999 01:17:15 -0700
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
 <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us>
Message-ID: <373D2D8B.390C523C@lyra.org>

Guido van Rossum wrote:
> ...
> > http://www.acme.com/software/thttpd/benchmarks.html
> 
> I see.  Any pointers to a graph of thttpd market share?

thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That
puts it at #6. However, it is interesting to note that 60k of those
sites are in the .uk domain. I can't figure out who is running it, but I
would guess that a large UK-based ISP is hosting a bunch of domains on
thttpd.

It is somewhat difficult to navigate the various reports (and it never
fails that the one you want is not present), but the data is from
Netcraft's survey at: http://www.netcraft.com/survey/

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From rushing@nightmare.com  Sun May 16 12:10:18 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Sun, 16 May 1999 04:10:18 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv>
Message-ID: <14142.40867.103424.764346@seattle.nightmare.com>

Tim Peters writes:
 > I'm not a fan of continuations myself; coroutines can be
 > implemented faithfully via threads (I posted a rather complete set
 > of Python classes for that in the pre-DejaNews days, a bit more
 > flexible than Icon's coroutines); and:

Continuations are more powerful than coroutines, though I admit
they're a bit esoteric.  I programmed in Scheme for years without
seeing the need for them.  But when you need 'em, you *really* need
'em.  No way around it.

For my purposes (massively scalable single-process servers and
clients) threads don't cut it... for example I have a mailing-list
exploder that juggles up to 2048 simultaneous SMTP connections.  I
think it can go higher - I've tested select() on FreeBSD with 16,000
file descriptors.

[...]

BTW, I have actually made progress borrowing a bit of code from SCM.
It uses the stack-copying technique, along with setjmp/longjmp.  It's
too ugly and unportable to be a real candidate for inclusion in
Official Python.  [i.e., if it could be made to work it should be
considered a stopgap measure for the desperate].

I haven't tested it thoroughly, but I have successfully saved and
invoked (and reinvoked) a continuation.  Caveat: I have to turn off
Py_DECREF in order to keep it from crashing.

  | >>> import callcc
  | >>> saved = None
  | >>> def thing(n):
  | ...     if n == 2:
  | ...             global saved
  | ...             saved = callcc.new()
  | ...     print 'n==',n
  | ...     if n == 0:
  | ...             print 'Done!'
  | ...     else:
  | ...             thing (n-1)
  | ... 
  | >>> thing (5)
  | n== 5
  | n== 4
  | n== 3
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved
  | <Continuation object at 80d30d0>
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> 

I will probably not be able to work on this for a while (baby due any
day now), so anyone is welcome to dive right in.  I don't have much
experience wading through gdb tracking down reference bugs, I'm hoping
a brave soul will pick up where I left off. 8^)

http://www.nightmare.com/stuff/python-callcc.tar.gz
ftp://www.nightmare.com/stuff/python-callcc.tar.gz

-Sam



From tismer@appliedbiometrics.com  Sun May 16 16:31:01 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 17:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com>


rushing@nightmare.com wrote:

[...]

> BTW, I have actually made progress borrowing a bit of code from SCM.
> It uses the stack-copying technique, along with setjmp/longjmp.  It's
> too ugly and unportable to be a real candidate for inclusion in
> Official Python.  [i.e., if it could be made to work it should be
> considered a stopgap measure for the desperate].

I tried it and built it as a Win32 .pyd file, and it seems to
work, but...

> I haven't tested it thoroughly, but I have successfully saved and
> invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> Py_DECREF in order to keep it from crashing.

Indeed, and this seems to be a problem too hard to solve
without lots of work.
Since you keep a snapshot of the current machine stack,
it contains a number of object references which have been
valid when the snapshot was taken, but many are most
probably invalid when you restart the continuation.
I guess, incref-ing all current alive objects on
the interpreter stack would be the minimum, maybe more.

A tuple of necessary references could be used as an
attribute of a Continuation object. I will look
how difficult this is.

ciao - chris


-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sun May 16 19:31:01 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 20:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com>
Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com>


Christian Tismer wrote:
> 
> rushing@nightmare.com wrote:
[...]

> > I haven't tested it thoroughly, but I have successfully saved and
> > invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> > Py_DECREF in order to keep it from crashing.

It is possible, but a little hard.
To take a working snapshot of the current thread's
stack, one needs not only the stack snapshot which 
continue.c provides, but also a restorable copy of
all frame objects involved so far.
A copy of the current frame chain must be built, with
proper reference counting of all involved elements.
And this is the crux: The current stack pointer of the
VM is not present in the frame objects, but hangs
around somewhere on the machine stack.
Two solutions:

1) modify PyFrameObject by adding a field which holds
   the stack pointer, when a function is called. 
   I don't like to change the VM in any way for this.
2) use the lasti field which holds the last VM instruction
   offset. Then scan the opcodes of the code object
   and calculate the current stack level. This is possible
   since Guido's code generator creates code with the stack
   level lexically bound to the code offset.

Now we can incref all the referenced objects in the frame.
This must be done for the whole chain, which is copied and
relinked during that. This chain is then held as a
property of the continuation object.

To throw the continuation, the current frame chain must
be cleared, and the saved one is inserted, together with
the machine stack operation which Sam has already.

A little hefty, isn't it?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tim_one@email.msn.com  Mon May 17 06:42:59 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 17 May 1999 01:42:59 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <000f01bea028$1f75c360$fb9e2299@tim>

[Sam]
> Continuations are more powerful than coroutines, though I admit
> they're a bit esoteric.

"More powerful" is a tedious argument you should always avoid <wink>.

> I programmed in Scheme for years without seeing the need for them.
> But when you need 'em, you *really* need 'em.  No way around it.
>
> For my purposes (massively scalable single-process servers and
> clients) threads don't cut it... for example I have a mailing-list
> exploder that juggles up to 2048 simultaneous SMTP connections.  I
> think it can go higher - I've tested select() on FreeBSD with 16,000
> file descriptors.

The other point being that you want to avoid "inside out" logic, though,
right?  Earlier you posted a kind of ideal:

    Recently I've written an async server that needed to talk to several
    other RPC servers, and a mysql server.  Pseudo-example, with
    possibly-async calls in UPPERCASE:

      auth, archive = db.FETCH_USER_INFO (user)
      if verify_login(user,auth):
          rpc_server = self.archive_servers[archive]
          group_info = rpc_server.FETCH_GROUP_INFO (group)
          if valid (group_info):
              return rpc_server.FETCH_MESSAGE (message_number)
          else:
              ...
      else:
          ...

I assume you want to capture a continuation object in the UPPERCASE methods,
store it away somewhere, run off to your select/poll/whatever loop, and have
it invoke the stored continuation objects as the data they're waiting for
arrives.

If so, that's got to be the nicest use for continuations I've seen!  All
invisible to the end user.  I don't know how to fake it pleasantly without
threads, either, and understand that threads aren't appropriate for resource
reasons.  So I don't have a nice alternative.
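
One way to approximate the pattern described above without full
continuations is sketched here, assuming present-day Python 3 generators
stand in for the stored continuation objects.  The request tuples, the
`fake_backend` dictionary, and the function names are all hypothetical; a
real server would wait on select() or poll() rather than look replies up in
a dictionary:

```python
from collections import deque


def handle_login(user, log):
    # Each `yield` marks a possibly-async call: hand a request to the
    # loop and stay suspended until the loop sends the reply back in.
    auth = yield ("FETCH_USER_INFO", user)
    if auth != "ok":
        log.append((user, "denied"))
        return
    message = yield ("FETCH_MESSAGE", 42)
    log.append((user, message))


def run(tasks, fake_backend):
    """Toy stand-in for a select/poll loop: resume tasks with replies."""
    pending = deque()
    for task in tasks:
        pending.append((task, next(task)))  # run up to the first request
    while pending:
        task, request = pending.popleft()
        reply = fake_backend[request]       # pretend the data just arrived
        try:
            next_request = task.send(reply)
        except StopIteration:
            continue                        # this task has finished
        pending.append((task, next_request))


log = []
backend = {
    ("FETCH_USER_INFO", "alice"): "ok",
    ("FETCH_USER_INFO", "mallory"): "bad",
    ("FETCH_MESSAGE", 42): "hello",
}
run([handle_login("alice", log), handle_login("mallory", log)], backend)
```

The handler reads top to bottom like the pseudo-example, while the loop
interleaves the two tasks.  Unlike a true continuation, a suspended
generator can be resumed only once from any given point.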

> ...
>   | >>> import callcc
>   | >>> saved = None
>   | >>> def thing(n):
>   | ...     if n == 2:
>   | ...             global saved
>   | ...             saved = callcc.new()
>   | ...     print 'n==',n
>   | ...     if n == 0:
>   | ...             print 'Done!'
>   | ...     else:
>   | ...             thing (n-1)
>   | ...
>   | >>> thing (5)
>   | n== 5
>   | n== 4
>   | n== 3
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved
>   | <Continuation object at 80d30d0>
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>>

Suppose the driver were in a script instead:

thing(5)           # line 1
print repr(saved)  # line 2
saved.throw(0)     # line 3
saved.throw(0)     # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)"
and we'd get an infinite output tail of:

<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
...

and never reach line 4.  Right?  That's the part that Guido hates <wink>.

takes-one-to-know-one-ly y'rs  - tim




From tismer@appliedbiometrics.com  Mon May 17 08:07:22 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 09:07:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <373FC02A.69F2D912@appliedbiometrics.com>


Tim Peters wrote:

[to Sam]

> The other point being that you want to avoid "inside out" logic, though,
> right?  Earlier you posted a kind of ideal:
> 
>     Recently I've written an async server that needed to talk to several
>     other RPC servers, and a mysql server.  Pseudo-example, with
>     possibly-async calls in UPPERCASE:
> 
>       auth, archive = db.FETCH_USER_INFO (user)
>       if verify_login(user,auth):
>           rpc_server = self.archive_servers[archive]
>           group_info = rpc_server.FETCH_GROUP_INFO (group)
>           if valid (group_info):
>               return rpc_server.FETCH_MESSAGE (message_number)
>           else:
>               ...
>       else:
>           ...
> 
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
> 
> If so, that's got to be the nicest use for continuations I've seen!  All
> invisible to the end user.  I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons.  So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it
last night, with proper refcounting, and it wasn't too easy
since I had to duplicate the Python frame chain.

...

> Suppose the driver were in a script instead:
> 
> thing(5)           # line 1
> print repr(saved)  # line 2
> saved.throw(0)     # line 3
> saved.throw(0)     # line 4
> 
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
> 
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!

This is at the moment exactly what happens, with the difference that
after some repetitions we GPF due to dangling references
to too often decref'ed objects. My incref'ing prepares for
just one re-incarnation and should prevent a second call.
But this will be solved, soon.

> and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yup. With a little counting, it was easy to survive:

def main():
    global a
    a=2
    thing (5)
    a=a-1
    if a:
        saved.throw (0)

Weird enough and needs a much better interface.
But finally I'm quite happy that it worked so smoothly
after just a couple of hours (well, about six :)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From rushing@nightmare.com  Mon May 17 10:46:29 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Mon, 17 May 1999 02:46:29 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim>
References: <14142.40867.103424.764346@seattle.nightmare.com>
 <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <14143.56604.21827.891993@seattle.nightmare.com>

Tim Peters writes:
 > [Sam]
 > > Continuations are more powerful than coroutines, though I admit
 > > they're a bit esoteric.
 > 
 > "More powerful" is a tedious argument you should always avoid <wink>.

More powerful in the sense that you can use continuations to build
lots of different control structures (coroutines, backtracking,
exceptions), but not vice versa.

Kinda like a better tool for blowing one's own foot off. 8^)

 > Suppose the driver were in a script instead:
 > 
 > thing(5)           # line 1
 > print repr(saved)  # line 2
 > saved.throw(0)     # line 3
 > saved.throw(0)     # line 4
 > 
 > Then the continuation would (eventually) "return to" the "print repr(saved)"
 > and we'd get an infinite output tail [...]
 > 
 > and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yes... the continuation object so far isn't very usable.  It needs a
driver of some kind around it.  In the Scheme world, there are two
common ways of using continuations - let/cc and call/cc.  [call/cc is what
is in the standard, its official name is call-with-current-continuation]

let/cc stores the continuation in a variable binding, while
introducing a new scope.  It requires a change to the underlying
language:

(+ 1
  (let/cc escape
    (...)
    (escape 34)))
=> 35

'escape' is a function that when called will 'resume' with whatever
follows the let/cc clause.  In this case it would continue with the
addition...

call/cc is a little trickier, but doesn't require any change to the
language...  instead of making a new binding directly, you pass in
a function that will receive the binding:

(+ 1
   (call/cc
     (lambda (escape)
       (...)
       (escape 34))))
=> 35

In words, it's much more frightening: "call/cc is a function, that
when called with a function as an argument, will pass that function an
argument that is a new function, which when called with a value will
resume the computation with that value as the result of the entire
expression"  Phew.

In Python, an example might look like this:

SAVED = None
def save_continuation (k):
  global SAVED
  SAVED = k

def thing():
  [...]
  value = callcc (lambda k: save_continuation(k))

# or more succinctly:
def thing():
  [...]
  value = callcc (save_continuation)

In order to do useful work like passing values back and forth between
coroutines, we have to have some way of returning a value from the
continuation when it is reinvoked.

I should emphasize that most folks will never see call/cc 'in the
raw', it will usually have some nice wrapper around to implement
whatever construct is needed.
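
The `(+ 1 (call/cc ...))` example above can be approximated in present-day
Python 3 with an exception, as in the following sketch.  This is an
escape-only imitation, not a real continuation: it can only jump outward
while the receiving function is still running, and it cannot be re-entered
later the way the `saved.throw(0)` sessions in this thread do.  The names
`escape_only_callcc` and `_Throw` are hypothetical:

```python
class _Throw(Exception):
    """Carries a value out to the escape_only_callcc call that owns it."""

    def __init__(self, token, value):
        super().__init__()
        self.token = token
        self.value = value


def escape_only_callcc(receiver):
    """Call receiver(escape); escape(v) makes this call return v."""
    token = object()  # unique marker so nested calls do not get confused

    def escape(value):
        raise _Throw(token, value)

    try:
        return receiver(escape)
    except _Throw as thrown:
        if thrown.token is token:
            return thrown.value
        raise  # belongs to an enclosing escape_only_callcc


# Mirrors (+ 1 (call/cc (lambda (escape) ... (escape 34)))):
# the "+ 1000" is skipped because escape() never returns.
result = 1 + escape_only_callcc(lambda escape: escape(34) + 1000)

# If escape is never called, the receiver's own return value is used.
normal = 1 + escape_only_callcc(lambda escape: 34)
```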

-Sam



From arw@ifu.net  Mon May 17 19:06:18 1999
From: arw@ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 14:06:18 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <37405A99.1DBAF399@ifu.net>

The illustrious Sam Rushing avers:
>Continuations are more powerful than coroutines, though I admit
>they're a bit esoteric.  I programmed in Scheme for years without
>seeing the need for them.  But when you need 'em, you *really* need
>'em.  No way around it.

Frankly, I think I thought I understood this once but now I know I
don't.
How're continuations more powerful than coroutines?
And why can't they be implemented using threads (and semaphores etc)?

...I'm not promising I'll understand the answer...
    -- Aaron Watters

===
I taught I taw a putty-cat!




From gmcm@hypernet.com  Mon May 17 20:18:43 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 14:18:43 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
Message-ID: <1285153546-166193857@hypernet.com>

The estimable Aaron Watters queries:
> The illustrious Sam Rushing avers:
> >Continuations are more powerful than coroutines, though I admit
> >they're a bit esoteric.  I programmed in Scheme for years without
> >seeing the need for them.  But when you need 'em, you *really* need
> >'em.  No way around it.
> 
> Frankly, I think I thought I understood this once but now I know I
> don't. How're continuations more powerful than coroutines? And why
> can't they be implemented using threads (and semaphores etc)?

I think Sam's (immediate <wink>) problem is that he can't afford 
threads - he may have hundreds to thousands of these suckers.

As a fuddy-duddy old imperative programmer, I'm inclined to think 
"state machine". But I'd guess that functional-ophiles probably see 
that as inelegant. (Safe guess - they see _anything_ that isn't 
functional as inelegant!).

crude-but-not-rude-ly y'rs

- Gordon


From jeremy@cnri.reston.va.us  Mon May 17 19:43:34 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 17 May 1999 14:43:34 -0400 (EDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
References: <37405A99.1DBAF399@ifu.net>
Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us>

>>>>> "AW" == Aaron Watters <arw@ifu.net> writes:

  AW> The illustrious Sam Rushing avers:
  >> Continuations are more powerful than coroutines, though I admit
  >> they're a bit esoteric.  I programmed in Scheme for years without
  >> seeing the need for them.  But when you need 'em, you *really*
  >> need 'em.  No way around it.

  AW> Frankly, I think I thought I understood this once but now I know
  AW> I don't.  How're continuations more powerful than coroutines?
  AW> And why can't they be implemented using threads (and semaphores
  AW> etc)?

I think I understood, too.  I'm hoping that someone will debug my
answer and enlighten us both.

A continuation is a mechanism for making control flow explicit.  A
continuation is a means of naming and manipulating "the rest of the
program."   In Scheme terms, the continuation is the function that the 
value of the current expression should be passed to.  The call/cc
mechanism lets you capture the current continuation and explicitly
call on it.  The most typical use of call/cc is non-local exits, but
it gives you incredible flexibility for implementing your control
flow.

I'm fuzzy on coroutines, as I've only seen them in "Structured
Programming" (which is as old as I am :-) and never actually used
them.  The basic idea is that when a coroutine calls another
coroutine, control is transferred to the second coroutine at the point
at which it last left off (by itself calling another coroutine or by
detaching, which returns control to the lexically enclosing scope).

It seems to me that coroutines are an example of the kind of control
structure that you could build with continuations.  It's not clear
that the reverse is true.

I have to admit that I'm a bit unclear on the motivation for all
this.  As Gordon said, the state machine approach seems like it would
be a good approach.

Jeremy


From klm@digicool.com  Mon May 17 20:08:57 1999
From: klm@digicool.com (Ken Manheimer)
Date: Mon, 17 May 1999 15:08:57 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com>

Jeremy Hylton:

> I have to admit that I'm a bit unclear on the motivation for all
> this.  As Gordon said, the state machine approach seems like it would
> be a good approach.

If i understand what you mean by state machine programming, it's pretty
inherently uncompartmented, all the combinations of state variables need
to be accounted for, so the number of states grows factorially on the
number of state vars, in general it's awkward.  The advantage of going
with what functional folks come up with, like continuations, is that it
tends to be well compartmented - functional.  (Come to think of it, i
suppose that compartmentalization as opposed to state is their mania.)

As abstract as i can be (because i hardly know what i'm talking about)
(but i have done some specifically finite state machine programming, and
did not enjoy it),

Ken
klm@digicool.com


From arw@ifu.net  Mon May 17 20:20:13 1999
From: arw@ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 15:20:13 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com>
Message-ID: <37406BED.95AEB896@ifu.net>

The ineffable Gordon McMillan retorts:

> As a fuddy-duddy old imperative programmer, I'm inclined to think
> "state machine". But I'd guess that functional-ophiles probably see
> that as inelegant. (Safe guess - they see _anything_ that isn't
> functional as inelegant!).

As a fellow fuddy-duddy I'd agree except that if you write properly layered
software you have to unroll and reroll all those layers for every
transition of the multi-level state machine, and even though with proper
discipline it can be implemented without becoming hideous, it still adds
significant overhead compared to "stop right here and come back later"
which could be implemented using threads/coroutines(?)/continuations.
I think this is particularly true in Python with the relatively high
function call overhead.  Or maybe I'm out in left field doing cartwheels...

I guess the question of interest is why are threads insufficient?  I guess
they have system limitations on the number of threads or other limitations
that wouldn't be a problem with continuations?  If there aren't a *lot* of
situations where coroutines are vital, I'd be hesitant to do major
surgery.  But I'm a fuddy-duddy.

   -- Aaron Watters

===
I did! I did!




From tismer@appliedbiometrics.com  Mon May 17 21:03:01 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 22:03:01 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net>
Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com>


Aaron Watters wrote:
> 
> The ineffable Gordon McMillan retorts:
> 
> > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > "state machine". But I'd guess that functional-ophiles probably see
> > that as inelegant. (Safe guess - they see _anything_ that isn't
> > functional as inelegant!).
> 
> As a fellow fuddy-duddy I'd agree except that if you write properly layered
> software you have to unroll and reroll all those layers for every
> transition of the multi-level state machine, and even though with proper
> discipline it can be implemented without becoming hideous, it still adds
> significant overhead compared to "stop right here and come back later"
> which could be implemented using threads/coroutines(?)/continuations.

Coroutines are most elegant here, since (for a simple example)
they are a symmetric pair of functions which call each other.
There is neither the one-pulls, the other pushes asymmetry, nor
the need to maintain state and be controlled by a supervisor
function.
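
A rough sketch of such a symmetric pair, assuming present-day Python 3
generators plus a small dispatch loop as a substitute for true coroutine
transfer (generators by themselves only return to their caller).  The names
`ping`, `pong`, and `run` are hypothetical.  Each routine names the one it
wants to transfer control to, and neither is the master of the other:

```python
def ping(count, log):
    for i in range(count):
        log.append(f"ping {i}")
        yield "pong"   # transfer control to the routine named "pong"


def pong(count, log):
    for i in range(count):
        log.append(f"pong {i}")
        yield "ping"   # transfer control back to "ping"


def run(routines, start):
    """Resume whichever routine was named last, until one of them finishes."""
    current = start
    while True:
        try:
            current = next(routines[current])
        except StopIteration:
            return


log = []
run({"ping": ping(2, log), "pong": pong(2, log)}, start="ping")
```

The dispatch loop only forwards control; all of the per-routine state lives
in the two suspended frames.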

> I think this is particularly true in Python with the relatively high
> function call overhead.  Or maybe I'm out in left field doing cartwheels...
> I guess the question of interest is why are threads insufficient?  I guess
> they have system limitations on the number of threads or other limitations
> that wouldn't be a problem with continuations?  If there aren't a *lot* of
> situations where coroutines are vital, I'd be hesitant to do major
> surgery.

For me (as always) most interesting is the possible speed of
coroutines. They involve no threads overhead, no locking,
no nothing. Python supports it better than expected. If the
stack level of two code objects is the same at a switching point,
the whole switch is nothing more than swapping two frame objects,
and we're done. This might be even cheaper than general call/cc,
like a function call. Sam's prototype works already, with no change to
the interpreter (but knowledge of Python frames, and a .dll of course).

I think we'll continue a while.

continuously - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From gmcm@hypernet.com  Mon May 17 23:17:25 1999
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 17:17:25 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com>
Message-ID: <1285142823-166838954@hypernet.com>

Co-Christian-routines Tismer continues:

> Aaron Watters wrote:
> > 
> > The ineffable Gordon McMillan retorts:
> > 
> > > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > > "state machine". But I'd guess that functional-ophiles probably see
> > > that as inelegant. (Safe guess - they see _anything_ that isn't
> > > functional as inelegant!).
> > 
> > As a fellow fuddy-duddy I'd agree except that if you write properly layered
> > software you have to unroll and reroll all those layers for every
> > transition of the multi-level state machine, and even though with proper
> > discipline it can be implemented without becoming hideous, it still adds
> > significant overhead compared to "stop right here and come back later"
> > which could be implemented using threads/coroutines(?)/continuations.
> 
> Coroutines are most elegant here, since (for a simple example)
> they are a symmetric pair of functions which call each other.
> There is neither the one-pulls, the other pushes asymmetry, nor the
> need to maintain state and be controlled by a supervisor function.

Well, the state maintains you, instead of the other way 'round. (Any 
other ex-Big-Blue-ers out there that used to play these games with 
checkpoint and SyncSort?).

I won't argue elegance. Just a couple points:

- there's an art to writing state machines which is largely 
unrecognized (most of them are unnecessarily horrid).

- a multiplexed solution (vs a threaded solution) requires that 
something be inside out. In one case it's your code, in the other, 
your understanding of the problem. Neither is trivial.
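
To make the "inside out" trade-off concrete, here is a sketch of one toy
protocol written both ways in present-day Python 3.  The protocol (a
single-digit length followed by that many characters) and the names
`FramerStateMachine` and `framer_generator` are invented for illustration.
The first version keeps its position in explicit state variables; the
second lets a suspended frame remember it:

```python
class FramerStateMachine:
    """Explicit state machine: feed() must work out where it left off."""

    def __init__(self):
        self.state = "LENGTH"
        self.remaining = 0
        self.body = []
        self.messages = []

    def feed(self, ch):
        if self.state == "LENGTH":
            self.remaining = int(ch)   # assumes a length of 1 through 9
            self.body = []
            self.state = "BODY"
        else:
            self.body.append(ch)
            self.remaining -= 1
            if self.remaining == 0:
                self.messages.append("".join(self.body))
                self.state = "LENGTH"


def framer_generator(messages):
    """Same protocol; each bare `yield` waits for the next character."""
    while True:
        length = int((yield))
        body = []
        for _ in range(length):
            body.append((yield))
        messages.append("".join(body))


stream = "3abc2hi"

machine = FramerStateMachine()
for ch in stream:
    machine.feed(ch)

generated = []
framer = framer_generator(generated)
next(framer)             # advance to the first `yield`
for ch in stream:
    framer.send(ch)
```

Both versions are driven one character at a time by the caller; they differ
only in where the "what comes next" knowledge is kept.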

Not to be discouraging - as long as your solution doesn't involve 
using regexps on bytecode <wink>, I say go for it!

- Gordon


From guido@CNRI.Reston.VA.US  Tue May 18 05:03:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT."
 <14143.56604.21827.891993@seattle.nightmare.com>
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>
 <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of
what you can do with them so far don't clarify the matter at all.

Perhaps it would help to explain what a continuation actually does
with the run-time environment, instead of giving examples of how to
use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k
connection which makes my ordinary typing habits in Emacs very
painful).

1. All program state is somehow contained in a single execution stack.
This includes globals (which are simply name bindings in the bottom
stack frame).  It also includes a code pointer for each stack frame
indicating where the function corresponding to that stack frame is
executing (this is the return address if there is a newer stack frame, 
or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the
entire execution stack.  This can probably be done lazily.  There are
probably lots of details.  I also expect that Scheme's semantic model
is different than Python here -- e.g. does it matter whether deep or
shallow copies are made?  I.e. are there mutable *objects* in Scheme?
(I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the
execution stack the current execution state; I presume there's also a
way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping
between two (or more) continuations.

5. Other control constructs can be done by various manipulations of
continuations.  I presume that in many situations the saved
continuation becomes the main control locus permanently, and the
(previously) current stack is simply garbage-collected.  Of course the 
lazy copy makes this efficient.



If this all is close enough to the truth, I think that continuations
involving C stack frames are definitely out -- as Tim Peters
mentioned, you don't know what the stuff on the C stack of extensions
refers to.  (My guess would be that Scheme implementations assume that
any pointers on the C stack point to Scheme objects, so that C stack
frames can be copied and conservative GC can be used -- this will
never happen in Python.)

Continuations involving only Python stack frames might be supported,
if we can agree on the sharing / copying semantics.  This is where
I don't know enough (see questions at #2 above).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Tue May 18 05:46:12 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient?  I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?

Sam is mucking with thousands of simultaneous I/O-bound socket connections,
and makes a good case that threads simply don't fly here (each one consumes
a stack, kernel resources, etc).  It's unclear (to me) that thousands of
continuations would be *much* better, though, by the time Christian gets
done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery.  But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the
docs.  They're very well written and describe the problem space exquisitely.
I don't have any problems like that I need to solve, but it's interesting to
ponder!

alas-no-time-for-it-now-ly y'rs  - tim




From tim_one@email.msn.com  Tue May 18 05:45:52 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here?  I hope you see the same behavior
without the "global a"; e.g., this Scheme:

(define -cont- #f)

(define thing
  (lambda (n)
    (if (= n 2) (call/cc (lambda (k) (set! -cont- k))))
    (display "n == ") (display n) (newline)
    (if (= n 0)
	(begin (display "Done!") (newline))
	(thing (- n 1)))))

(define main
  (lambda ()
    (let ((a 2))
      (thing 5)
      (display "a is ") (display a) (newline)
      (set! a (- a 1))
      (if (> a 0)
	  (-cont- #f)))))

(main)

prints:

n == 5
n == 4
n == 3
n == 2
n == 1
n == 0
Done!
a is 2
n == 2
n == 1
n == 0
Done!
a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to
2 each time?

> Weird enough

Par for the continuation course!  They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads <wink>.

> But finally I'm quite happy that it worked so smoothly
> after just a couple of hours (well, about six :)

Yup!  Playing with Python internals is a treat.

to-be-continued-ly y'rs  - tim




From tim_one@email.msn.com  Tue May 18 05:45:57 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:57 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <000401bea0e9$51e467e0$829e2299@tim>

[Sam]
>>> Continuations are more powerful than coroutines, though I admit
>>> they're a bit esoteric.

[Tim]
>> "More powerful" is a tedious argument you should always avoid <wink>.

[Sam]
> More powerful in the sense that you can use continuations to build
> lots of different control structures (coroutines, backtracking,
> exceptions), but not vice versa.

"More powerful" is a tedious argument you should always avoid <frown -- I'm
not touching this, but you can fight it out now with Aaron et alia <wink>>.

>> Then the continuation would (eventually) "return to" the
>> "print repr(saved)" and we'd get an infinite output tail [...]
>> and never reach line 4.  Right?

> Yes... the continuation object so far isn't very usable.

But it's proper behavior for a continuation all the same!  So this aspect
shouldn't be "fixed".

> ...
> let/cc stores the continuation in a variable binding, while
> introducing a new scope.  It requires a change to the underlying
> language:

Isn't this often implemented via a macro, though, so that

   (let/cc name code)

"acts like"

    (call/cc (lambda (name) code))

?  I haven't used a Scheme with native let/cc, but poking around it appears
that the real intent is to support exception-style function exits with a
mechanism cheaper than 1st-class continuations:  twice saw the let/cc object
(the thingie bound to "name") defined as being invalid the instant after
"code" returns, so it's an "up the call stack" gimmick.  That doesn't sound
powerful enough for what you're after.
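
To make the escape-only reading concrete, here is a minimal sketch in
present-day Python rather than anything posted in this thread.  The
names call_ec and find_first_negative are hypothetical.  The escape
procedure is built on an exception, so it only works while the call that
created it is still on the stack, which is the "up the call stack"
restriction described above; it is not a first-class continuation.

```python
def call_ec(func):
    """Call func(escape); calling escape(value) aborts func and returns value."""

    class Escape(Exception):
        # A fresh class per call, so nested call_ec calls do not catch
        # each other's escapes.
        pass

    def escape(value):
        raise Escape(value)

    try:
        return func(escape)
    except Escape as exc:
        return exc.args[0]


def find_first_negative(numbers):
    def search(escape):
        for n in numbers:
            if n < 0:
                escape(n)  # jump straight out of the loop and out of search()
        return None

    return call_ec(search)
```

Once call_ec has returned, invoking a stored escape procedure just
raises an exception that nothing catches, so it cannot be used to
re-enter the computation.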

> [nice let/cc call/cc tutorialette]
> ...
> In order to do useful work like passing values back and forth between
> coroutines, we have to have some way of returning a value from the
> continuation when it is reinvoked.

Somehow, I suspect that's the least of our problems <0.5 wink>.  If
continuations are in Python's future, though, I agree with the need as
stated.

> I should emphasize that most folks will never see call/cc 'in the
> raw', it will usually have some nice wrapper around to implement
> whatever construct is needed.

Python already has well-developed exception and thread facilities, so it's
hard to make a case for continuations as a catch-all implementation
mechanism.  That may be the rub here:  while any number of things *can* be
implemented via continuations, I think very few *need* to be implemented
that way, and full-blown continuations aren't easy to implement efficiently
& portably.

The Icon language was particularly concerned with backtracking searches, and
came up with generators as another clearer/cheaper implementation technique.
When it went on to full-blown coroutines, it's hard to say whether
continuations would have been a better approach.  But the coroutine
implementation it has is sluggish and buggy and hard to port, so I doubt
they could have done noticeably worse.

Would full-blown coroutines be powerful enough for your needs?

assuming-the-practical-defn-of-"powerful-enough"-ly y'rs  - tim




From rushing@nightmare.com  Tue May 18 06:18:06 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Mon, 17 May 1999 22:18:06 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim>
References: <14143.56604.21827.891993@seattle.nightmare.com>
 <000401bea0e9$51e467e0$829e2299@tim>
Message-ID: <14144.61765.308962.101884@seattle.nightmare.com>

Tim Peters writes:
 > Isn't this often implemented via a macro, though, so that
 > 
 >    (let/cc name code)
 > 
 > "acts like"
 > 
 >     (call/cc (lambda (name) code))

Yup, they're equivalent, in the sense that given one you can make a
macro to do the other.  call/cc is preferred because it doesn't
require a new binding construct.

 > ?  I haven't used a Scheme with native let/cc, but poking around it
 > appears that the real intent is to support exception-style function
 > exits with a mechanism cheaper than 1st-class continuations: twice
 > saw the let/cc object (the thingie bound to "name") defined as
 > being invalid the instant after "code" returns, so it's an "up the
 > call stack" gimmick.  That doesn't sound powerful enough for what
 > you're after.

Except that since the escape procedure is 'first-class' it can be
stored away and invoked (and reinvoked) later.  [that's all that
'first-class' means: a thing that can be stored in a variable,
returned from a function, used as an argument, etc..]

I've never seen a let/cc that wasn't full-blown, but it wouldn't
surprise me.

 > The Icon language was particularly concerned with backtracking
 > searches, and came up with generators as another clearer/cheaper
 > implementation technique.  When it went on to full-blown
 > coroutines, it's hard to say whether continuations would have been
 > a better approach.  But the coroutine implementation it has is
 > sluggish and buggy and hard to port, so I doubt they could have
 > done noticeably worse.

Many Scheme implementors either skip it, or only support non-escaping
call/cc (i.e., exceptions in Python).

 > Would full-blown coroutines be powerful enough for your needs?

Yes, I think they would be.  But I think with Python it's going to
be just about as hard, either way.

-Sam



From rushing@nightmare.com  Tue May 18 06:48:29 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Mon, 17 May 1999 22:48:29 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <51325225@toto.iv>
Message-ID: <14144.63787.502454.111804@seattle.nightmare.com>

Aaron Watters writes:
 > Frankly, I think I thought I understood this once but now I know I
 > don't.

8^)  That's what I said when I backed into the idea via medusa a
couple of years ago.

 > How're continuations more powerful than coroutines?  And why can't
 > they be implemented using threads (and semaphores etc)?

My understanding of the original 'coroutine' (from Pascal?) was that
it allows two procedures to 'resume' each other.  The classic
coroutine example is the 'samefringe' problem: given two trees of
differing structure, are they equal in the sense that a traversal of
the leaves results in the same list?  Coroutines let you do this
efficiently, comparing leaf-by-leaf without storing the whole tree.
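
As a rough illustration of samefringe, here is a sketch that assumes
present-day Python generators stand in for the coroutines (generators
were not part of Python when this was written).  Trees are modelled as
nested lists, and the names fringe and samefringe are just for this
example.

```python
def fringe(tree):
    """Yield the leaves of a nested-list tree, left to right."""
    for node in tree:
        if isinstance(node, list):
            yield from fringe(node)
        else:
            yield node


def samefringe(tree1, tree2):
    """Compare two trees leaf by leaf without flattening either one."""
    done = object()  # sentinel marking an exhausted traversal
    leaves1, leaves2 = fringe(tree1), fringe(tree2)
    while True:
        a = next(leaves1, done)
        b = next(leaves2, done)
        if a is done or b is done:
            # Equal only if both traversals ran out at the same time.
            return a is b
        if a != b:
            return False
```

Each traversal is suspended between leaves, so the comparison can stop
at the first mismatch.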

continuations can do coroutines, but can also be used to implement
backtracking, exceptions, threads... probably other stuff I've never
heard of or needed.

The reason that Scheme and ML are such big fans of continuations is
because they can be used to implement all these other features.  Look
at how much try/except and threads complicate other language
implementations.  It's like a super-tool-widget - if you make sure
it's in your toolbox, you can use it to build your circular saw and
lathe from scratch.

Unfortunately there aren't many good sites on the web with good
explanatory material.  The best reference I have is "Essentials of
Programming Languages".  For those that want to play with some of
these ideas using little VM's written in Python:

  http://www.nightmare.com/software.html#EOPL

-Sam



From rushing@nightmare.com  Tue May 18 06:56:37 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Mon, 17 May 1999 22:56:37 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <13631823@toto.iv>
Message-ID: <14144.65355.400281.123856@seattle.nightmare.com>

Jeremy Hylton writes:
 > I have to admit that I'm a bit unclear on the motivation for all
 > this.  As Gordon said, the state machine approach seems like it would
 > be a good approach.

For simple problems, state machines are ideal.  Medusa uses state
machines that are built out of Python methods.  But past a certain
level of complexity, they get too hairy to understand.  A really good
example can be found in /usr/src/linux/net/ipv4.  8^)

-Sam



From rushing@nightmare.com  Tue May 18 08:05:20 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Tue, 18 May 1999 00:05:20 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <60057226@toto.iv>
Message-ID: <14145.927.588572.113256@seattle.nightmare.com>

Guido van Rossum writes:
 > Perhaps it would help to explain what a continuation actually does
 > with the run-time environment, instead of giving examples of how to
 > use them and what the result is?

This helped me a lot, and is the angle used in "Essentials of
Programming Languages":

Usually when folks refer to a 'stack', they're referring to an
*implementation* of the stack data type: really an optimization that
assumes an upper bound on stack size, and that things will only be
pushed and popped in order.

If you were to implement a language's variable and execution stacks
with actual data structures (linked lists), then it's easy to see
what's needed: the head of the list represents the current state.  As
functions exit, they pop things off the list.

The reason I brought this up (during a lull!) was that Python is
already paying all of the cost of heap-allocated frames, and it didn't
seem to me too much of a leap from there.
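
A small sketch of that linked-list view, under the assumption that a
frame is nothing more than a name plus a pointer to its caller.  Frame
and call_chain are made up for this example and are not CPython's frame
object.

```python
class Frame:
    def __init__(self, name, back):
        self.name = name  # which call this frame belongs to
        self.back = back  # the caller's frame, or None at the bottom


def call_chain(frame):
    """List frame names from the newest frame down to the oldest."""
    names = []
    while frame is not None:
        names.append(frame.name)
        frame = frame.back
    return names


current = Frame("main", None)
current = Frame("thing(5)", current)  # a call pushes a new head
saved = current                       # capturing is just keeping a reference
current = Frame("thing(4)", current)
current = current.back                # thing(4) returns
current = current.back                # thing(5) returns

after_returns = call_chain(current)   # only main is left
captured = call_chain(saved)          # the saved chain is still intact
```

Because the frames live on the heap, popping moves a pointer and
destroys nothing; anything still referenced by saved survives.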

 > 1. All program state is somehow contained in a single execution stack.
Yup.

 > 2. A continuation does something equivalent to making a copy of the
 > entire execution stack.
Yup.
 > I.e. are there mutable *objects* in Scheme?
 > (I know there are mutable and immutable *name bindings* -- I think.)

Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!,
all the things that make it 'impure'.

I think shallow copies are what's expected.  In the examples I have,
the continuation is kept in a 'register', and call/cc merely packages
it up with a little function wrapper.  You are allowed to stomp all
over lexical variables with "set!".

 > 3. Calling a continuation probably makes the saved copy of the
 > execution stack the current execution state; I presume there's also a
 > way to pass an extra argument.
Yup.
 > 4. Coroutines (which I *do* understand) are probably done by swapping
 > between two (or more) continuations.
Yup.  Here's an example in Scheme:

http://www.nightmare.com/stuff/samefringe.scm

Somewhere I have an example of coroutines being used for parsing, very
elegant.  Something like one coroutine does lexing, and passes tokens
one-by-one to the next level, which passes parsed expressions to a
compiler, or whatever.  Kinda like pipes.
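
The pipe-like shape can be sketched with present-day Python generators.
This is an assumption-laden toy, not the example referred to above:
generators are an asymmetric, pull-only form of coroutine, and the lex
and parse stages here are deliberately trivial.

```python
def lex(text):
    """First stage: hand out tokens one at a time."""
    for word in text.split():
        yield word


def parse(tokens):
    """Second stage: pull tokens on demand and turn digit runs into ints."""
    for token in tokens:
        yield int(token) if token.isdigit() else token


parsed = list(parse(lex("1 + 22")))
```

Neither stage ever builds the whole token list; each one runs only when
the next stage asks for another item.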

 > 5. Other control constructs can be done by various manipulations of
 > continuations.  I presume that in many situations the saved
 > continuation becomes the main control locus permanently, and the
 > (previously) current stack is simply garbage-collected.  Of course
 > the lazy copy makes this efficient.

Yes... I think backtracking would be an example of this.  You're doing
a search on a large space (say a chess game).  After a certain point
you want to try a previous fork, to see if it's promising, but you
don't want to throw away your current work.  Save it, then unwind back
to the previous fork, try that option out... if it turns out to be
better then toss the original.

 > If this all is close enough to the truth, I think that
 > continuations involving C stack frames are definitely out -- as Tim
 > Peters mentioned, you don't know what the stuff on the C stack of
 > extensions refers to.  (My guess would be that Scheme
 > implementations assume that any pointers on the C stack point to
 > Scheme objects, so that C stack frames can be copied and
 > conservative GC can be used -- this will never happen in Python.)

I think you're probably right here - usually there are heavy
restrictions on what kind of data can pass through the C interface.
But I know of at least one Scheme (mzscheme/PLT) that uses
conservative gc and has c/c++ interfaces. [... dig dig ...]

From this:

http://www.cs.rice.edu/CS/PLT/packages/doc/insidemz/node21.htm#exceptions

and looking at the code it looks like they enforce the restriction
exactly as you described in an earlier mail: call/cc is safe for
c->scheme calls only if they invoke a new top-level machine.

 > Continuations involving only Python stack frames might be
 > supported, if we can agree on the sharing / copying semantics.
 > This is where I don't know enough (see questions at #2 above).

Woo Hoo!  Where do I send the Shrubbery?

-Sam



From rushing@nightmare.com  Tue May 18 08:17:11 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Tue, 18 May 1999 00:17:11 -0700 (PDT)
Subject: [Python-Dev] another good motivation
Message-ID: <14145.4917.164756.300678@seattle.nightmare.com>

"Escaping the event loop: an alternative control structure for multi-threaded GUIs"

http://cs.nyu.edu/phd_students/fuchs/
http://cs.nyu.edu/phd_students/fuchs/gui.ps

-Sam



From tismer@appliedbiometrics.com  Tue May 18 14:46:53 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 15:46:53 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <000901bea0e9$5aa2dec0$829e2299@tim>
Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Aaron Watters]
> > ...
> > I guess the question of interest is why are threads insufficient?  I
> > guess they have system limitations on the number of threads or other
> > limitations that wouldn't be a problem with continuations?
> 
> Sam is mucking with thousands of simultaneous I/O-bound socket connections,
> and makes a good case that threads simply don't fly here (each one consumes
> a stack, kernel resources, etc).  It's unclear (to me) that thousands of
> continuations would be *much* better, though, by the time Christian gets
> done making thousands of copies of the Python stack chain.

Well, what he needs here are coroutines and just a single frame
object for every minithread (I think this is a "fiber"?).
If these fibers later do deep function calls before they switch,
there will of course be more frames then.

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Tue May 18 15:35:30 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 16:35:30 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>
 <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <37417AB2.80920595@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> Sam (& others),
> 
> I thought I understood what continuations were, but the examples of
> what you can do with them so far don't clarify the matter at all.
> 
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?
> 
> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
> 
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).  It also includes a code pointer for each stack frame
> indicating where the function corresponding to that stack frame is
> executing (this is the return address if there is a newer stack frame,
> or the current instruction for the newest frame).

Right. For now, this information is on the C stack for each called
function, although almost completely available in the frame chain.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.  I also expect that Scheme's semantic model
> is different than Python here -- e.g. does it matter whether deep or
> shallow copies are made?  I.e. are there mutable *objects* in Scheme?
> (I know there are mutable and immutable *name bindings* -- I think.)

To make it lazy, a gatekeeper must be put on top of the two
split frames, which catches the event that one of them
returns. It appears to me that this is the same callcc.new()
object which catches this, splitting frames when hit by a return.

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.
> 
> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.

Right, which is just two or three assignments.

> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

Yes, great. It looks like switching continuations
is no more expensive than a single Python function call.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

This would mean to avoid creating incompatible continuations.
A continuation may not switch to a frame chain which was created
by a different VM incarnation since this would later on
corrupt the machine stack. One way to assure that would be
a thread-safe function in sys, similar to sys.exc_info()
which gives an id for the current interpreter. continuations
living somewhere in globals would be marked by the interpreter
which created them, and refuse to be thrown if they don't match.

The necessary interpreter support appears to be small:

Extend the PyFrame structure by two fields:
  - interpreter ID  (addr of some local variable would do)
  - stack pointer at current instruction.

Change the CALL_FUNCTION opcode to avoid calling eval recursively
in the case of a Python function/method; instead, save the current frame,
build the new one and start over.
RETURN will pop a frame and reload its local variables instead
of returning, as long as there is a frame to pop.
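
The idea can be sketched with a toy interpreter loop.  Everything here
is hypothetical: the PUSH/ADD/CALL/RETURN opcodes and the Frame class
are invented for the illustration and are not CPython bytecode or
ceval.c.  The point is only that CALL swaps in a new frame and RETURN
swaps the caller back, all inside one loop, with no recursive call of
run().

```python
class Frame:
    def __init__(self, code, back=None):
        self.code = code   # list of (opcode, argument) pairs
        self.pc = 0        # index of the next instruction
        self.stack = []    # value stack private to this frame
        self.back = back   # calling frame, or None for the entry point


def run(functions, entry):
    frame = Frame(functions[entry])
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "ADD":
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif op == "CALL":
            # No recursion: build the callee's frame and start over.
            frame = Frame(functions[arg], back=frame)
        elif op == "RETURN":
            value = frame.stack.pop()
            if frame.back is None:
                return value           # no frame left to pop
            frame = frame.back         # pop the frame, resume the caller
            frame.stack.append(value)


functions = {
    "main": [("PUSH", 1), ("CALL", "two"), ("ADD", None), ("RETURN", None)],
    "two": [("PUSH", 2), ("RETURN", None)],
}
result = run(functions, "main")
```

Since the whole call chain lives in the frame objects, the depth of
calls in the interpreted program no longer shows up on the C (here:
Python) stack of the interpreter itself.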

I'm unclear how exceptions should be handled. Are they currently
propagated up across different C calls other than ceval2
recursions?

ciao - chris



From jeremy@cnri.reston.va.us  Tue May 18 16:05:39 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 18 May 1999 11:05:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com>
References: <60057226@toto.iv>
 <14145.927.588572.113256@seattle.nightmare.com>
Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us>

>>>>> "SR" == rushing  <rushing@nightmare.com> writes:

  SR> Somewhere I have an example of coroutines being used for
  SR> parsing, very elegant.  Something like one coroutine does
  SR> lexing, and passes tokens one-by-one to the next level, which
  SR> passes parsed expressions to a compiler, or whatever.  Kinda
  SR> like pipes.

This is the first example that's used in Structured Programming (Dahl,
Dijkstra, and Hoare).  I'd be happy to loan a copy to any of the
Python-dev people who sit nearby.

Jeremy


From tismer@appliedbiometrics.com  Tue May 18 16:31:11 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 17:31:11 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <374187BF.36CC65E7@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

(Cough) Actually, I inserted the "global" later. It worked as well
with a local variable, but I didn't understand it. Still don't :-)

> Or does brute-force frame-copying cause the continuation to set "a" back to
> 2 each time?

No, it doesn't. Behavior is exactly the same with or without
global. I'm not sure whether this is a bug or a feature.
I *think* 'a' as a local has a slot in the frame, so it's
actually a different 'a' living in both copies. But this
would not have worked.
Can it be that before a function call, the interpreter
turns its locals into a dict, using fast_to_locals?
That would explain it.
This is not what I think it should be! Locals need to be
copied.

> > and needs a much better interface.
> 
> Ya, like screw 'em and use threads <wink>.

Never liked threads. These fibers are so neat since
they don't need threads, no locking, and they are
available on systems without threads.

> > But finally I'm quite happy that it worked so smoothly
> > after just a couple of hours (well, about six :)
> 
> Yup!  Playing with Python internals is a treat.
> 
> to-be-continued-ly y'rs  - tim

throw(42) - chris



From skip@mojam.com (Skip Montanaro)  Tue May 18 16:49:42 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 18 May 1999 11:49:42 -0400
Subject: [Python-Dev] Is there another way to solve the continuation problem?
Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>

Okay, from my feeble understanding of the problem it appears that
coroutines/continuations and threads are going to be problematic at best for 
Sam's needs.  Are there other "solutions"?  We know about state machines.
They have the problem that the number of states grows exponentially (?) as
the number of state variables increases.

Can exceptions be coerced into providing the necessary structure without
botching up the application too badly?  Seems that at some point where you
need to do some I/O, you could raise an exception whose second expression
contains the necessary state to get back to where you need to be once the
I/O is ready to go.  The controller that catches the exceptions would use
select or poll to prepare for the I/O then dispatch back to the handlers
using the information from exceptions.

class IOSetup:
    pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(self, r, w, e):
        pass

    def remember(self, info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
        r, w, e = select([...], [...], [...])
        # using r,w,e, select a waiter to call
        func, place = waiters.choose_one(r, w, e)
        try:
            func(place)
        except IOSetup, info:
            waiters.remember(info)


def spam_func(place):
    if place == "spam":
        # whatever I/O we needed to do is ready to go
        bytes = read(some_fd)
        process(bytes)
        # need to read some more from some_fd. args are:
        #    function, target, fd category (r, w), selectable object
        raise IOSetup, (spam_func, "eggs", "r", some_fd)

    elif place == "eggs":
        # that next chunk is ready - get it and proceed...

    elif yadda, yadda, yadda...


One thread, some craftiness needed to construct things.  Seems like it might
isolate some of the statefulness to smaller functional units than a pure
state machine.  Clearly not as clean as continuations would be.  Totally
bogus?  Totally inadequate?  Maybe Sam already does things this way?
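
For anyone who wants to poke at the control flow, here is a
stripped-down runnable sketch of the same scheme.  It is written in
present-day Python syntax rather than the syntax used above, and it
makes two loud assumptions: a plain list named pending stands in for
the select()-driven WaveHands object, and the handlers do no real I/O,
they only record which place they were resumed at.

```python
class IOSetup(Exception):
    """Raised by a handler to say: call func(place) once I/O is ready."""

    def __init__(self, func, place):
        Exception.__init__(self, func, place)
        self.func = func
        self.place = place


log = []  # records the order in which places were visited


def spam_func(place):
    if place == "spam":
        log.append("spam")
        # More input is needed: ask to be resumed at "eggs".
        raise IOSetup(spam_func, "eggs")
    elif place == "eggs":
        log.append("eggs")
        # Returning normally means this handler is finished.


def controller(func, place):
    pending = [(func, place)]  # stand-in for select() plus WaveHands
    while pending:
        func, place = pending.pop(0)
        try:
            func(place)
        except IOSetup as info:
            pending.append((info.func, info.place))


controller(spam_func, "spam")
```

The handler still has to be cut into "places" by hand, so this only
moves the state-machine bookkeeping around rather than removing it.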


Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From tismer@appliedbiometrics.com  Tue May 18 18:23:08 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this 
all behave correctly. Since I didn't change the interpreter,
the ceval.c incarnations still had copies to the old frames.
The only effect which I achieved with frame copying was
that the refcounts were increased correctly.

I have to remove the hardware stack copying now.
Will try to create a non-recursive version of the interpreter.

ciao - chris



From MHammond@skippinet.com.au  Wed May 19 00:16:54 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Wed, 19 May 1999 09:16:54 +1000
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>
Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat>

> Sam's needs.  Are there other "solutions"?  We know about
> state machines.
> They have the problem that the number of states grows
> exponentially (?) as
> the number of state variables increases.

Well, I can give you my feeble understanding of "IO Completion Ports", the
technique Win32 provides to "solve" this problem.

My experience is limited to how we used these in a server product designed
to maintain thousands of long-term client connections each spooling large
chunks of data (MSOffice docs - yes, that large :-).  We too could
obviously not afford a thread per connection.  Searching through NT's
documentation, completion ports are the technique they recommend for
high-performance IO, and it appears to deliver.

NT has the concept of a completion port, which in many ways is like an
"inverted semaphore".  You create a completion port with a "max number of
threads" value.  Then, for every IO object you need to use (files, sockets,
pipes etc) you "attach" it to the completion port, along with an integer
key.  This key is (presumably) unique to the file, and usually a pointer to
some structure maintaining the state of the file (ie, connection).

The general programming model is that you have a small number of threads
(possibly 1), and a large number of io objects (eg files).  Each of these
threads is executing a state machine.  When IO is "ready" for a particular
file, one of the available threads is woken, and passed the "key"
associated with the file.  This key identifies the file, and more
importantly the state of that file.  The thread uses the state to perform
the next IO operation, then immediately go back to sleep.  When that IO
operation completes, some other thread is woken to handle that state
change.  What makes this work of course is that _all_ IO is asynch - not a
single IO call in this whole model can afford to block.  NT provides asynch
IO natively.
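
The shape of that model can be imitated in portable Python, as a rough sketch only.  Nothing below touches the real Win32 completion port API: a queue.Queue stands in for the port, and Connection, start_io, worker and run are hypothetical names invented for the illustration.

```python
import queue
import threading

class Connection:
    """Per-connection state; plays the role of the completion 'key'."""
    def __init__(self, name, chunks):
        self.name = name
        self.pending = list(chunks)  # data that has yet to "arrive"
        self.received = []
        self.state = "reading"

def start_io(port, conn):
    # Pretend to issue an asynchronous read.  Its completion is
    # posted to the port straight away in this simulation.
    port.put(conn)

def worker(port, done):
    while True:
        conn = port.get()            # sleep until some IO completes
        if conn is None:             # shutdown marker
            break
        if conn.pending:             # one step of the state machine
            conn.received.append(conn.pending.pop(0))
        if conn.pending:
            start_io(port, conn)     # issue the next IO, go back to sleep
        else:
            conn.state = "done"
            done.put(conn.name)

def run(connections, n_threads=2):
    port, done = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(port, done))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for conn in connections:
        start_io(port, conn)
    finished = sorted(done.get() for _ in connections)
    for _ in threads:
        port.put(None)
    for t in threads:
        t.join()
    return finished
```

Because a connection is only ever queued once at a time, whichever thread wakes up owns that connection's state for the duration of one step, so no per-connection locking is needed in this sketch.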

This sounds very similar to what Medusa does internally, although the NT
model provides a "thread pooling" scheme built-in.  Although our server
performed very well with a single thread and hundreds of high-volume
connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state
machine I described above implemented in C.  Most of the work is
responsible for spooling the client request data (possibly 100s of kbs)
before handing that data off to the real server.  When the C code
transitions the client through the state of "send/get from the real
server", we actually set a different completion port.  This other
completion port wakes a thread written in Python.  So our architecture
consists of a C implemented thread-pool managing client connections, and a
different Python implemented thread pool that does the real work for each
of these client connections. (The Python side of the world is bound by the
server we are talking to, so Python performance doesn't matter as much - C
wouldn't buy enough.)

This means that our state machines are not that complex.  Each "thread
pool" is managing its own, fairly simple state.  NT automatically allows
you to associate state with the IO object, and as we have multiple thread
pools, each one is simple - the one spooling client data is simple, the one
doing the actual server work is simple.  If we had to have a single,
monolithic state machine managing all aspects of the client spooling, _and_
the server work, it would be horrid.

This is all in a shrink-wrapped relatively cheap "Document Management"
product being targeted (successfully, it appears) at huge NT/Exchange
based sites.  Australia's largest Telco are implementing it, and indeed the
company has VC from Intel!  Lots of support from MS, as it helps compete
with Domino.  Not bad for a little startup - now they are wondering what to
do with this Python-thingy they now have in their product that no one else
has ever heard of; but they are planning on keeping it for now :-)
[Funnily, when they started, they didn't think they even _needed_ a server,
so I said "I'll just knock up a little one in Python", and we haven't looked
back :-]

Mark.



From tim_one@email.msn.com  Wed May 19 01:48:00 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme
and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.  Is Steven (Majewski) on this list?
He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links.  Lexical
closures with indefinite extent are common in Scheme, so much so that name
resolution is (at least conceptually) best viewed as distinct from execution
stacks.

Here's a key:  continuations are entirely about capturing control flow
state, and nothing about capturing binding or data state.  Indeed, mutating
bindings and/or non-local data are the ways distinct invocations of a
continuation communicate with each other, and for this reason true
functional languages generally don't support continuations of the call/cc
flavor.

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current
frame's continuation object -- continuations are used internally for
"regular calls" too.  When a function returns, it passes control thru its
continuation object.  That process restores-- from the continuation
object --what the caller needs to know (in concept:  a pointer to *its*
continuation object, its PC, its name-resolution chain pointer, and its
local eval stack).

Another key point:  a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.

The point of the above is to get across that for Scheme-calling-Scheme,
creating a continuation object copies just a small, fixed number of pointers
(the current continuation pointer, the current name-resolution chain
pointer, the PC), plus the local eval stack.  This is for a "stackless"
interpreter that heap-allocates name-mapping and execution-frame and
continuation objects.  Half the literature is devoted to optimizing one or
more of those away in special cases (e.g., for continuations provably
"up-level", using a stack + setjmp/longjmp instead).

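To make the "small, fixed number of pointers" concrete, here is a hedged sketch of such a record in Python.  The field names and the capture helper are assumptions made up for illustration; no Scheme implementation is being quoted.

```python
from typing import NamedTuple, Optional, Tuple

class Continuation(NamedTuple):
    """Immutable snapshot of control state only."""
    parent: Optional["Continuation"]  # the caller's continuation
    pc: int                           # where to resume
    env: dict                         # name-resolution chain: shared, NOT copied
    eval_stack: Tuple[object, ...]    # shallow copy of the local eval stack

def capture(parent, pc, env, eval_stack):
    # Three pointers plus a shallow copy of the (usually tiny) eval stack.
    return Continuation(parent, pc, env, tuple(eval_stack))
```

The env field is deliberately the very same object as the live bindings, so a later mutation of those bindings is visible to whoever invokes the continuation; only the control information is frozen.
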
> I also expect that Scheme's semantic model is different than Python
> here -- e.g. does it matter whether deep or shallow copies are made?
> I.e. are there mutable *objects* in Scheme? (I know there are mutable
> and immutable *name bindings* -- I think.)

Same as Python here; Scheme isn't a functional language; has mutable
bindings and mutable objects; any copies needed should be shallow, since
it's "a feature" that invoking a continuation doesn't restore bindings or
object values (see above re communication).

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.

Right, except "stack" is the wrong mental model in the presence of
continuations; it's a general rooted graph (A calls B, B saves a
continuation pointing back to A, B goes on to call A, A saves a continuation
pointing back to B, etc).  If the explicitly saved continuations are never
*invoked*, control will eventually pop back to the root of the graph, so in
that sense there's *a* stack implicit at any given moment.

> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.
>
> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think;
other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to.  (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with
varying semantics, and a quick tour of the web suggests that cross-language
Schemes generally put severe restrictions on continuations across language
boundaries.  Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for <wink> -- but
fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
and "lambdaness" and syntax-extension facilities, I'm unsure they're going
to work out reasonably in Python practice; it's not enough that they can be
very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim




From tismer@appliedbiometrics.com  Wed May 19 02:10:15 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>


Tim Peters wrote:
...

> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics.  This is where
> > I don't know enough (see questions at #2 above).
> 
> I'd like to go back to examples of what they'd be used for <wink> -- but
> fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
> 
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim

I've put quite many hours into a non-recursive ceval.c
already. Should I continue? At least this would be a little
improvement, also if the continuation thing will not be born. ?

- chris



From rushing@nightmare.com  Wed May 19 03:52:04 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:
 > Can exceptions be coerced into providing the necessary structure
 > without botching up the application too badly?  Seems that at some
 > point where you need to do some I/O, you could raise an exception
 > whose second expression contains the necessary state to get back to
 > where you need to be once the I/O is ready to go.  The controller
 > that catches the exceptions would use select or poll to prepare for
 > the I/O then dispatch back to the handlers using the information
 > from exceptions.

 > [... code ...]

Well, you just re-invented the 'Reactor' pattern! 8^)

http://www.cs.wustl.edu/~schmidt/patterns-ace.html

 > One thread, some craftiness needed to construct things.  Seems like
 > it might isolate some of the statefulness to smaller functional
 > units than a pure state machine.  Clearly not as clean as
 > continuations would be.  Totally bogus?  Totally inadequate?  Maybe
 > Sam already does things this way?

What you just described is what Medusa does (well, actually, 'Python'
does it now, because the two core libraries that implement this are
now in the library - asyncore.py and asynchat.py).  asyncore doesn't
really use exceptions exactly that way, and asynchat allows you to add 
another layer of processing (basically, dividing the input into
logical 'lines' or 'records' depending on a 'line terminator').

The same technique is at the heart of many well-known network servers,
including INND, BIND, X11, Squid, etc.  It's really just a state
machine underneath (with python functions or methods implementing the
'states'), which works as long as things don't get too complex.  Python
simplifies things enough to allow one to 'push the difficulty
envelope' a bit further than one could reasonably tolerate in C.  For
example, Squid implements async HTTP (server and client, because it's
a proxy) - but stops short of trying to implement async FTP.  Medusa
implements async FTP, but it's the largest file in the Medusa
distribution, weighing in at a hefty 32KB.
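
For readers who have not seen the pattern, here is a minimal sketch of a select-driven loop with a function implementing the 'state', plus asynchat-style splitting of input into lines.  It deliberately uses the standard selectors module rather than asyncore or asynchat, and run_reactor and make_line_reader are hypothetical names, not Medusa's API.

```python
import selectors
import socket

def run_reactor(sel, timeout=1.0):
    """Dispatch ready sockets to their handlers until none are registered."""
    while sel.get_map():
        events = sel.select(timeout)
        if not events:
            break                       # nothing happened in time; give up
        for key, _mask in events:
            key.data(sel, key.fileobj)  # key.data is the handler function

def make_line_reader(lines):
    """Build a handler that splits incoming bytes on a line terminator."""
    buf = bytearray()

    def on_readable(sel, sock):
        data = sock.recv(4096)
        if not data:                    # peer closed: leave the loop
            sel.unregister(sock)
            sock.close()
            return
        buf.extend(data)
        while b"\n" in buf:
            line, _, rest = bytes(buf).partition(b"\n")
            lines.append(line)
            buf[:] = rest

    return on_readable
```

A handler is registered with sel.register(sock, selectors.EVENT_READ, make_line_reader(lines)); all per-connection state lives in the closure, which is what keeps each 'state' small.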

The hard part comes when you want to plug different pieces and
protocols together.  For example, building a simple HTTP or FTP server
is relatively easy, but building an HTTP server *that proxied to an
FTP server* is much more difficult.  I've done these kinds of things,
viewing each as a challenge; but past a certain point it boggles.

The paper I posted about earlier by Matthew Fuchs has a really good
explanation of this, but in the context of GUI event loops... I think
it ties in neatly with this discussion because at the heart of any X11
app is a little guy manipulating a file descriptor.

-Sam



From tim_one@email.msn.com  Wed May 19 06:41:39 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later.  [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations
valid only during let/cc's dynamic extent, so that, sure, you could store
them away, but trying to invoke one later could be an error.  It's in that
sense I meant they weren't "first class".

Other flavors of Scheme appear to call this concept "weak continuation", and
use a different verb to invoke it (like call-with-escaping-continuation, or
call/ec).  Suspect the let/cc oddballs I found were simply confused
implementations (there are a lot of amateur Scheme implementations out
there!).

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be.  But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because
they already understand them -- Jeremy can even reach across the hall and
hand Guido a helpful book <wink>.  So pondering coroutines increases the
number of brain cells willing to think about the implementation.

continuation-examples-leave-people-still-going-"huh?"-after-an-
    hour-of-explanation-ly y'rs  - tim




From tim_one@email.msn.com  Wed May 19 06:41:45 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here?  I hope you see the same behavior
>> without the "global a";
[which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had copies to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right!  Now you're closer to the real solution <wink>; i.e., copying
wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
when we entered main originally a set of bindings was created for its
locals, and it is that very same set of bindings to which the continuation
returns.  So the continuation *should* reuse them -- making a copy of the
locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info
(bindings follow a chain of static links, independent of the current "call
stack"), so there's no temptation to run off copying bindings too.
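
The difference can be imitated without real continuations.  The sketch below rewrites Christian's main() by hand in continuation-passing style; what thing() and saved.throw() actually do is not shown in this thread, so the trampoline loop and the after_thing function are assumptions standing in for them.

```python
def run_main():
    env = {"a": 2}   # main's bindings, created once when main is entered
    trace = []

    def after_thing(value):
        # "The rest of main" after thing() returns: this is what the
        # saved continuation resumes.  It sees env as it is NOW.
        trace.append((value, env["a"]))
        env["a"] = env["a"] - 1
        if env["a"]:
            return after_thing   # stands in for saved.throw(0)
        return None

    k, value = after_thing, 5    # first pass: thing(5) returns normally
    while k is not None:
        k = k(value)
        value = 0                # later passes are entered via throw(0)
    return trace
```

The second entry sees a == 1 because the bindings are reused.  Had the continuation carried its own copy of the locals, every re-entry would start again from a == 2 and the loop would never end.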

elegant-and-baffling-for-the-price-of-one<wink>-ly y'rs  - tim




From tim_one@email.msn.com  Wed May 19 06:41:56 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 <wink>?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim




From arw@ifu.net  Wed May 19 14:04:53 1999
From: arw@ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (eg, list.sort calls back to
myclass.__cmp__ which decides to do a call/cc)?  One of the really
big advantages of Python in my book is the relative simplicity of
embedding and extensions, and this is generally one of the failings of
lisp implementations.  I understand lots of scheme implementations
purport to be extendible and embeddable, but in practice you can't do
it with *existing* code -- there is always a show stopper involving
having to change the way some Oracle library which you don't have the
source for does memory management or something... I've known several
grad students who have been bitten by this...  I think having to unroll
the C stack safely might be one problem area.

With, eg, a netscape nsapi embedding you can actually get into netscape
code calls my code calls netscape code calls my code... suspends in a
continuation?  How would that work?  [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least
it's better understood, I think.  Just kvetching.  Sorry.
   -- Aaron Watters

ps: Of course there are valid reasons and excellent advantages
  to having continuations, but it's also interesting to consider the
  possible cost.  There ain't no free lunch.




From tismer@appliedbiometrics.com  Wed May 19 20:30:18 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 21:30:18 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
Message-ID: <3743114A.220FFA0B@appliedbiometrics.com>


Tim Peters wrote:
...
> [Christian]
> > Actually, the frame-copying was not enough to make this
> > all behave correctly. Since I didn't change the interpreter,
> > the ceval.c incarnations still had copies to the old frames.
> > The only effect which I achieved with frame copying was
> > that the refcounts were increased correctly.
> 
> All right!  Now you're closer to the real solution <wink>; i.e., copying
> wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
> when we entered main originally a set of bindings was created for its
> locals, and it is that very same set of bindings to which the continuation
> returns.  So the continuation *should* reuse them -- making a copy of the
> locals is semantically hosed.

I tried the most simple thing, and this seemed to be duplicating
the current state of the machine. The frame holds the stack,
and references to all objects.
By chance, the locals are not in a dict, but unpacked into
the frame. (Sometimes I agree with Guido, that optimization
is considered harmful :-)

> This is clearer in Scheme because its "stack" holds *only* control-flow info
> (bindings follow a chain of static links, independent of the current "call
> stack"), so there's no temptation to run off copying bindings too.

The Python stack, besides its intermingledness with the machine
stack, is basically its chain of frames. The value stack pointer
still hides in the machine stack, but that's easy to change.
So the real Scheme-like part is this chain, methinks, with
the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase
the refcount of all objects in a frame. Would it be correct
to undo the effect of fast locals before splitting, and redoing
it on activation?

Or do I need to rethink the whole structure? What should
be natural for Python, if at all?

ciao - chris



From jeremy@cnri.reston.va.us  Wed May 19 20:46:49 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
 <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:

  [Tim Peters]
  >> This is clearer in Scheme because its "stack" holds *only*
  >> control-flow info (bindings follow a chain of static links,
  >> independent of the current "call stack"), so there's no
  >> temptation to run off copying bindings too.

  CT> The Python stack, besides its intermingledness with the machine
  CT> stack, is basically its chain of frames. The value stack pointer
  CT> still hides in the machine stack, but that's easy to change.  So
  CT> the real Scheme-like part is this chain, methinks, with the
  CT> current bytecode offset and value stack info.

  CT> Making a copy of this in a restartable way means to increase the
  CT> refcount of all objects in a frame. Would it be correct to undo
  CT> the effect of fast locals before splitting, and redoing it on
  CT> activation?

Wouldn't it be easier to increase the refcount on the frame object?
Then you wouldn't need to worry about the refcounts on all the objects
in the frame, because they would only be decrefed when the frame is
deallocated. 

It seems like the two other things you would need are some way to get
a copy of the current frame and a means to invoke eval_code2 with an
already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong.  I'm just not sure
where.  Is the problem that you really need a separate stack/graph to
hold the frames?  If we leave them on the Python stack, it could be
hard to dis-entangle value objects from control objects.)

Jeremy


From tismer@appliedbiometrics.com  Wed May 19 21:10:16 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
 <3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>


Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are
two incarnations of interpreters working on it: the original one,
and later, when it is thrown, another one (or the same one, in
principle).
The frame could have been in any state, with a couple
of objects on the stack. My splitting function can be invoked
in some nested context, so I have a current opcode position,
and a current stack position.
Running this once leaves the stack empty, since all the objects are
decrefed. Running this a second time gives a GPF, since the stack is
empty.
Therefore, I made a copy which means to create a duplicate frame
with an extra refcount for all the objects. This makes sure
that both can be restarted at any time.
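
A toy model of the problem, with a hypothetical Frame class rather than CPython's real frame object, might look like this.  Note one deliberate difference from what is described above: this split() duplicates only the value stack and shares the locals, following Tim's remark that copying the bindings is the wrong thing to do.

```python
class Frame:
    def __init__(self, pc, stack, f_locals):
        self.pc = pc              # current opcode position
        self.stack = stack        # local evaluation stack
        self.f_locals = f_locals  # bindings

def resume(frame):
    """Finish the frame: consume the two operands waiting on its stack."""
    right = frame.stack.pop()
    left = frame.stack.pop()
    return left + right

def split(frame):
    # A new list holding extra references to the same objects, so
    # either frame can be resumed without starving the other.
    return Frame(frame.pc, list(frame.stack), frame.f_locals)
```

Resuming the same Frame twice fails on the second attempt with an IndexError (the tame Python cousin of the GPF), while the original and its split() twin can each be resumed once.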

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong.  I'm just not sure
> where.  Is the problem that you really need a separate stack/graph to
> hold the frames?  If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit clearer?
What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment:
The stack is the linked list of frames. Every frame has a
local Python evaluation stack. Calls of Python functions produce
a new frame, and the old one is put beneath. This is the control
stack. The additional info on the hardware stack happens to be
a parallel friend of this chain, and currently holds extra info,
but this is an artifact. Adding the current Python stack level
to the frame makes the hardware stack totally unnecessary.
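
As a rough illustration of that idea, here is a toy interpreter in which a call never recurses in the host language: it just links in a new frame and carries on in the same loop.  The instruction set (PUSH, ADD, CALL, RETURN) and all the names are invented for this sketch and have nothing to do with the real bytecode in ceval.c.

```python
class Frame:
    def __init__(self, code, args, back=None):
        self.code = code          # list of (opcode, argument) pairs
        self.pc = 0               # saved in the frame, not on the C stack
        self.stack = list(args)   # local evaluation stack
        self.back = back          # the caller's frame

def run(functions, name, args):
    frame = Frame(functions[name], args)
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "ADD":
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif op == "CALL":
            fname, nargs = arg
            split_at = len(frame.stack) - nargs
            call_args = frame.stack[split_at:]
            del frame.stack[split_at:]
            # No recursive call here: switch frames and keep looping.
            frame = Frame(functions[fname], call_args, back=frame)
        elif op == "RETURN":
            result = frame.stack.pop()
            if frame.back is None:
                return result
            frame = frame.back
            frame.stack.append(result)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
```

Since everything needed to resume a function lives in its Frame, a suspended computation is just a reference to a frame chain, which is what makes saving and re-entering it conceivable at all.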

There is a possible speed loss, anyway.
Today, the recursive call of eval_code2 is optimized and quite
fast. The non-recursive version will have to copy variables
in and out from the frames, instead, so there is of course
a little speed penalty to pay.

ciao - chris



From tismer@appliedbiometrics.com  Wed May 19 22:38:07 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
> 
> Does that mean 6 or 600 <wink>?

6, or 10, or 20, if I count the time from the first
start with Sam's code, maybe.

> 
> > Should I continue? At least this would be a little improvement, also
> > if the continuation thing will not be born. ?
> 
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
> 
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim

Right. Whose faces? :-)

On the stackless thing, what should I do?
I started to insert minimum patches, but it turns out
that I have to change frames a little (extending).

I can make quite small changes to the interpreter to replace
the recursive calls, but this involves extra flags in some cases,
where the interpreter is called the first time and so on.

What has more probability to be included into a future Python:
Tweaking the current thing only minimally, to make it as similar
as possible to the former?
Or doing as much redesign as I think is needed to do it in
a clean way? This would mean to split eval_code2 into two functions,
where one is the interpreter kernel, and one is the frame manager.

There are also other places which do quite deep function calls
and finally call eval_code2. I think these should return a frame
object now. I could convince them to either call or return a frame,
depending on a flag, but it would be cleaner to rename the functions,
let them always deal with frames, and put the original function
on top of it.

In short, I can do larger changes which clean this all up a bit,
or I can make small changes which are more tricky to grasp,
but give just small diffs.

How to touch untouchable code the best? :-)



From jeremy@cnri.reston.va.us  Wed May 19 22:49:38 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
 <37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear just to
minimize the size of the patch or the diff.  Realistically, it's
unlikely that anything like your original patch is going to make it
into the CVS tree.  Its primary value is as proof of concept and as
code that the rest of us can try out.  If you make large changes, but
they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the
codebase later, after everyone has figured out what's going on and
agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy



From tismer@appliedbiometrics.com  Wed May 19 23:25:20 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
 <37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> I think it makes sense to avoid being obscure or unclear just to
> minimize the size of the patch or the diff.  Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree.  Its primary value is as proof of concept and as
> code that the rest of us can try out.  If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice.
I will make absolutely clear what's going on, keep
as many parts untouched as possible, cut out parts which must
change, and I will not look into speed too much.

Better to have one more function call and a bit less optimization,
but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase later, after everyone has figured out what's going on and
> agrees that it's worth doing.
> 
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the 
interpreter happens to have the name "continuation".
Maybe I'd better rename it to "activation record"?

Now, there is no longer a recursive call. Instead, a frame
object is returned, which is waiting to be activated
by a dispatcher.

Some more ideas are popping up. Right now, only the recursive
calls can vanish. Callbacks from C code which is called by
the interpreter which is called by... are still a problem.

But it might perhaps vanish completely. We have to see
how high the cost is. What if I can manage to let the interpreter
duck and cover on every call to a builtin, too? The interpreter
again returns to the dispatcher, which then calls the builtin.
Well, if that builtin happens to call the interpreter again,
it will be a dispatcher again. The machine stack grows a little,
but since everything is saved in the frames, these stacks are
no longer related. This means the principle works with existing
extension modules, since interpreter world and C-stack world
are decoupled.
To avoid stack growth, a number of builtins would of course
be better changed, but that is not a must at first.
execfile for instance is a candidate which needn't call the
interpreter. It could equally parse the file, generate the
code object, build a frame and just return it. This is what
the dispatcher likes: returned frames are put on the chain
and fired.
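
A minimal sketch of the dispatcher idea, written in present-day
Python 3 rather than in the C of the interpreter. It is not taken from
the patch under discussion: the names Frame, dispatch and countdown are
made up for illustration, and a generator stands in for the bytecode
loop of a frame. A function that wants to call another one does not
recurse; it hands a new frame back to the dispatcher, which links it
into the chain and runs it.

```python
class Frame:
    """Toy activation record: a suspended body plus a link to its caller."""

    def __init__(self, body, *args):
        self.gen = body(*args)   # generator standing in for the bytecode loop
        self.back = None         # caller's frame, in the spirit of f_back


def dispatch(frame):
    """Run a chain of frames in a flat loop, without nested host-level calls."""
    value = None
    while frame is not None:
        try:
            # Resume the frame until it asks for a call by yielding a Frame.
            callee = frame.gen.send(value)
        except StopIteration as stop:
            # The frame finished: pass its result back to the caller's frame.
            value = stop.value
            frame = frame.back
        else:
            # A frame was returned: put it on the chain and fire it.
            callee.back = frame
            frame, value = callee, None
    return value


def countdown(n):
    """A 'recursive' function whose calls are returned frames (illustrative)."""
    if n == 0:
        return 0
    result = yield Frame(countdown, n - 1)
    return result + 1


print(dispatch(Frame(countdown, 5)))
```

Because the nesting lives in the chain of Frame objects, the depth of
the "call" chain is no longer tied to the depth of the host stack.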

waah, my bus - running - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tim_one@email.msn.com  Thu May 20 00:56:33 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 19:56:33 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000701bea253$3a182a00$179e2299@tim>

I'm home sick today, so tortured myself <0.9 wink>.

Sam mentioned using coroutines to compare the fringes of two trees, and I
picked a simpler problem:  given a nested list structure, generate the leaf
elements one at a time, in left-to-right order.  A solution to Sam's problem
can be built on that, by getting a generator for each tree and comparing the
leaves a pair at a time until there's a difference.
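
For comparison, a sketch of that construction in a later Python
(Python 3, which has generator functions; the Python of this thread did
not). The names fringe, same_fringe and _MISSING are illustrative only.
Each tree gets its own generator, and the leaves are compared a pair at
a time until there is a difference or one side runs out.

```python
from itertools import zip_longest


def fringe(node):
    """Yield the leaves of a nested list, one at a time, left to right."""
    if isinstance(node, list):
        for child in node:
            yield from fringe(child)
    else:
        yield node


# Sentinel, so that one tree running out of leaves early counts as a difference.
_MISSING = object()


def same_fringe(a, b):
    """Compare two trees leaf by leaf, stopping at the first difference."""
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=_MISSING)
    return all(x == y for x, y in pairs)


print(list(fringe([[1, [[2, 3]]], [4], [], [[[5]], 6]])))
print(same_fringe([1, [2, 3]], [[1, 2], 3]))
```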

Attached are solutions in Icon, Python and Scheme.  I have the least
experience with Scheme, but browsing around didn't find a better Scheme
approach than this.

The Python solution is the least satisfactory, using an explicit stack to
simulate recursion by hand; if you didn't know the routine's purpose in
advance, you'd have a hard time guessing it.

The Icon solution is very short and simple, and I'd guess obvious to an
average Icon programmer.  It uses the subset of Icon ("generators") that
doesn't require any C-stack trickery.  However, alone of the three, it
doesn't create a function that could be explicitly called from several
locations to produce "the next" result; Icon's generators are tied into
Icon's unique control structures to work their magic, and breaking that
connection requires moving to full-blown Icon coroutines.  It doesn't need
to be that way, though.

The Scheme solution was the hardest to write, but is a largely mechanical
transformation of a recursive fringe-lister that constructs the entire
fringe in one shot.  Continuations are used twice:  to enable the recursive
routine to resume itself where it left off, and to get each leaf value back
to the caller.  Getting that to work required rebinding non-local
identifiers in delicate ways.  I doubt the intent would be clear to an
average Scheme programmer.

So what would this look like in Continuation Python?  Note that each place
the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and
up-level references are very common.  Two functions are defined at top
level, but seven more at various levels of nesting; the latter can't be
pulled up to the top because they refer to vrbls local to the top-level
functions.  Another (at least initially) discouraging thing to note is that
Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro
facilities.

may-not-be-as-fun-as-it-sounds<wink>-ly y'rs  - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()

            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break

        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]

for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f) ; set to return-to continuation
             (looper
              (lambda (x)
                (cond
                  ((null? x) 'nada) ; ignore null
                  ((list? x)
                   (looper (car x))
                   (looper (cdr x)))
                  (else
                   ; want to produce this non-list fringe elt,
                   ; and also resume here
                   (call/cc
                    (lambda (here)
                      (set! getnext
                            (lambda () (here 'keep-going)))
                      (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}

      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin
                      (display thiselt) (display " ")
                      (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)




From MHammond@skippinet.com.au  Thu May 20 01:14:24 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has
reminded me of one of my biggest gripes about Python.  When I say "biggest
gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic.  However, when I am
debugging Python, it seems to lose this.  There is no way for me to
effectively change a running program.  Now with VC6, I can do this with C.
Although it is slow and a little dumb, I can change the C side of my Python
world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running
under the debugger.  Presumably this would require a way of recompiling the
current block of code, patching this code back into the object, and somehow
tricking the stack frame to use this new block of code; even if a first-cut
had to restart the block or somesuch...

Any thoughts on this?

Mark.



From tim_one@email.msn.com  Thu May 20 03:41:03 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave
them alone <wink>.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack".  The latter
means the C stack?  Then I don't know which values you have in mind that are
hiding in it (the locals are, as you say, unpacked in the frame, and the
evaluation stack too).  By "evaluation stack" I mean specifically
f->f_valuestack; the current *top* of stack pointer (specifically
stack_pointer) lives in the C stack -- is that what we're talking about?
Whichever, when we're talking about the code, let's use the names the code
uses <wink>.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in
order to support tracing.  So if capturing a continuation is done via a
function call (hard to see any other way it could be done <wink>), a
bytecode offset is already getting saved in the frame object.

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think.
Whichever part the locals live in should not be copied at all, but merely
have its (single) refcount increased.  The other part hinges on details of
your approach I don't know.  The nastiest part seems to be f->f_valuestack,
which conceptually needs to be (shallow) copied in the current frame and in
all other frames reachable from the current frame's continuation (the chain
rooted at f->f_back today); that's the sum total (along with the same
frames' bytecode offsets) of capturing the control flow state.
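
A toy model of the capture just described, in Python 3 and not in the
interpreter's C. The Frame class and the capture helper are
hypothetical; only the attribute names echo the real frame fields.
Every frame on the f_back chain gets a fresh record with a shallow copy
of its evaluation stack and its bytecode offset, while the locals are
shared by reference and never copied.

```python
class Frame:
    """Toy stand-in for a frame object (not the real C structure)."""

    def __init__(self, f_locals, f_back=None):
        self.f_locals = f_locals   # shared between a frame and its snapshot
        self.f_valuestack = []     # evaluation stack
        self.f_lasti = 0           # bytecode offset of the last instruction
        self.f_back = f_back       # caller's frame


def capture(frame):
    """Copy the control state of the whole chain rooted at `frame`."""
    head = prev = None
    while frame is not None:
        snap = Frame(frame.f_locals)                   # same locals object
        snap.f_valuestack = list(frame.f_valuestack)   # shallow copy
        snap.f_lasti = frame.f_lasti
        if prev is None:
            head = snap
        else:
            prev.f_back = snap
        prev = snap
        frame = frame.f_back
    return head
```

Under these assumptions, later pushes and pops in the running frames do
not disturb the snapshot, but a rebinding of a local is seen by both.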

> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason
for doing anything to the locals.  Their values aren't *supposed* to get
restored upon continuation invocation, so there's no reason to do anything
with their values upon continuation creation either.  Right?  Or are we
talking about different things?

almost-as-good-as-pantomimem<wink>-ly y'rs  - tim




From rushing@nightmare.com  Thu May 20 05:04:20 1999
From: rushing@nightmare.com (rushing@nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
 > The Scheme solution was the hardest to write, but is a largely
 > mechanical transformation of a recursive fringe-lister that
 > constructs the entire fringe in one shot.  Continuations are used
 > twice: to enable the recursive routine to resume itself where it
 > left off, and to get each leaf value back to the caller.  Getting
 > that to work required rebinding non-local identifiers in delicate
 > ways.  I doubt the intent would be clear to an average Scheme
 > programmer.

It's the only way to do it - every example I've seen of using call/cc
looks just like it.

I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
people.  The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))

    (define (looper x)
      (cond ((null? x) 'nada)
            ((list? x)
             (looper (car x))
             (looper (cdr x)))
            (else
             (call/cc
              (lambda (here)
                (set! getnext (lambda () (here 'keep-going)))
                (produce-value x))))))

    (define (getnext)
      (looper x)
      (produce-value #f))

    (lambda ()
      (call/cc
       (lambda (k)
         (set! produce-value k)
         (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
          (begin
            (display elt)
            (display " ")
            (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))
(display-fringe test-case)

 > So what would this look like in Continuation Python?

Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
do this without having the feature to play with.  This presumes a
function "call_cc" that behaves like Scheme's.  I believe the extra
level of indirection is necessary. (i.e., call_cc takes a function as
an argument that takes a continuation function)

class list_generator:

    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables that hold continuations have a 'k_' prefix.  In real life it
might be possible to put the suspend/call/resume machinery in a base
class (Generator?), and override 'walk' as you please.

-Sam



From tim_one@email.msn.com  Thu May 20 08:21:45 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam!  I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.

Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
> people.  The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, and largely
because the std decreed that *so* many forms were optional.  Your rework is
certainly nicer, but internal defines and named let are two that R4RS
refused to require, so I always avoided them.  BTW, I *am* a compiler, so
that never bothered me <wink>.

>> So what would this look like in Continuation Python?

> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
> do this without having the feature to play with.

Fully understood.  It's also really hard to implement the feature without
knowing how someone who wants it would like it to behave.  But I don't think
anyone is getting graded on this, so let's have fun <wink>.

Ack!  I have to sleep.  Will study the code in detail later, but first
impression was it looked good!  Especially nice that it appears possible to
package up most of the funky call_cc magic in a base class, so that
non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-
    from-scratch-every-time-ly y'rs  - tim




From skip@mojam.com (Skip Montanaro)  Thu May 20 14:27:59 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv>
 <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

    Sam> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
    Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

    >> So what would this look like in Continuation Python?

    Sam> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
    Sam> do this without having the feature to play with.

The thought that it's unlikely one could arrive at a reasonable
approximation of a correct solution for such a small problem without the
ability to "play with" it is sort of scary.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From tismer@appliedbiometrics.com  Thu May 20 15:10:32 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 16:10:32 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <008b01bea255$b80cf790$0801a8c0@bobcat>
Message-ID: <374417D8.8DBCB617@appliedbiometrics.com>


Mark Hammond wrote:
> 
> All this talk about stack frames and manipulating them at runtime has
> reminded me of one of my biggest gripes about Python.  When I say "biggest
> gripe", I really mean "biggest surprise" or "biggest shame".
> 
> That is, Python is very interactive and dynamic.  However, when I am
> debugging Python, it seems to lose this.  There is no way for me to
> effectively change a running program.  Now with VC6, I can do this with C.
> Although it is slow and a little dumb, I can change the C side of my Python
> world while my program is running, but not the Python side of the world.
> 
> I'm wondering how feasible it would be to change Python code _while_ running
> under the debugger.  Presumably this would require a way of recompiling the
> current block of code, patching this code back into the object, and somehow
> tricking the stack frame to use this new block of code; even if a first-cut
> had to restart the block or somesuch...
> 
> Any thoughts on this?

I'm writing a prototype of a stackless Python, which means that
you will be able to access the current state of the interpreter
completely.
The inner interpreter loop will be isolated from the frame
dispatcher. It will break whenever the ticker goes to zero.
If you set the ticker to one, you will be able to single-step
on every opcode and have the value stack, the frame chain,
everything.
I think you can do quite a lot with this.
But tell me if you want a callback hook somewhere.
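
A toy illustration of the ticker idea, in Python 3. This is only a
sketch under made-up assumptions: a two-opcode stack machine (PUSH and
ADD), and the invented names Frame, run and single_step. The inner loop
keeps all of its state in the frame and breaks back to its caller when
the ticker reaches zero, so a dispatcher that sets the ticker to one
sees the value stack after every instruction.

```python
class Frame:
    """Toy frame: code, instruction index and value stack all live here."""

    def __init__(self, code):
        self.code = code    # list of (opname, argument) pairs
        self.lasti = 0      # index of the next instruction to execute
        self.stack = []     # value stack


def run(frame, ticker):
    """Execute at most `ticker` instructions; return True once the code is done."""
    while frame.lasti < len(frame.code):
        if ticker == 0:
            return False                 # break back to the dispatcher
        op, arg = frame.code[frame.lasti]
        frame.lasti += 1
        ticker -= 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "ADD":
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return True


def single_step(frame):
    """Dispatcher that runs one opcode at a time and records the stack."""
    trace = []
    done = False
    while not done:
        done = run(frame, 1)
        trace.append(list(frame.stack))
    return trace


print(single_step(Frame([("PUSH", 2), ("PUSH", 3), ("ADD", None)])))
```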

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Thu May 20 17:52:21 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
> 
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
> 
> I don't see that the locals are a problem here -- provided you simply leave
> them alone <wink>.

This depends on whether I have to duplicate frames
or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
> 
> Right.
> 
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
> 
> I'm not sure what "value stack" means here, or "machine stack".  The latter
> means the C stack?  Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).  By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses <wink>.

The evaluation stack pointer is a local variable on the
C stack and must be written to the frame to become independent
of the C stack. Sounds better now?

> 
> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
> 
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing.  So if capturing a continuation is done via a
> function call (hard to see any other way it could be done <wink>), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
> 
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the
current state, I make it two frames to have two current states.
One will be saved, the other will be run. This is what I call
"splitting".
Actually, splitting must occur whenever a frame can be reached twice,
in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased.  The other part hinges on details of
> your approach I don't know.  The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two
incarnations. Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
> 
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals.  Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either.  Right?  Or are we
> talking about different things?

Let me explain. What Python does right now is:
When a function is invoked, all local variables are copied
into fast_locals; of course, just the references are copied
and their counts increased. These fast locals give a lot of speed
today, so we must have them.
You are saying I have to share locals between frames. Apart from
the noticeable slowdown, since an extra structure
must be built and accessed indirectly (right now, it's all fast,
living in the one frame buffer), I cannot say that I'm convinced
that this is what we need.

Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot  # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my
snapshot. But with shared locals, my parameter x of the
snapshot would have changed to x+1, which I don't find useful.
I want to fix a state of the current frame and still think
it should "own" its locals. Globals are borrowed, anyway.
Class instances will anyway do what you want, since
the local "self" is a mutable object.

How do you want to keep computations independent
when locals are shared? For me it's just easier to
implement and also to think with the shallow copy.
Otherwise, where is my private place?
Open for becoming convinced, of course :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From jeremy@cnri.reston.va.us  Thu May 20 20:26:30 1999
From: jeremy@cnri.reston.va.us (Jeremy Hylton)
Date: Thu, 20 May 1999 15:26:30 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
References: <000901bea26a$34526240$179e2299@tim>
 <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:

  CT> What I want to achieve is that I can run this again, from my
  CT> snapshot. But with shared locals, my parameter x of the snapshot
  CT> would have changed to x+1, which I don't find useful.  I want to
  CT> fix a state of the current frame and still think it should "own"
  CT> its locals. Globals are borrowed, anyway.  Class instances will
  CT> anyway do what you want, since the local "self" is a mutable
  CT> object.

  CT> How do you want to keep computations independent when locals are
  CT> shared? For me it's just easier to implement and also to think
  CT> with the shallow copy.  Otherwise, where is my private place?
  CT> Open for becoming convinced, of course :-)

I think you're making things a lot more complicated by trying to
instantiate new variable bindings for locals every time you create a
continuation.  Can you give an example of why that would be helpful?
(Ok.  I'm not sure I can offer a good example of why it would be
helpful to share them, but it makes intuitive sense to me.)

The call_cc mechanism is going to let you capture the current
continuation, save it somewhere, and call on it again as often as you
like.  Would you get a fresh locals each time you used it?  or just
the first time?  If only the first time, it doesn't seem that you've
gained a whole lot.

Also, all the locals that are references to mutable objects are
already effectively shared.  So it's only a few oddballs like ints
that are an issue.
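
A small illustration of that last point, in Python 3. A plain dict
stands in for a frame's locals here (an assumption made for the example,
not how fast locals are stored). With a shallow copy, only rebindings
such as x = x+1 are kept apart; objects that are mutated in place stay
shared between the snapshot and the running computation.

```python
import copy

# Locals of a hypothetical frame at the moment a snapshot is taken.
live_locals = {"x": 1, "items": [1, 2]}

# Shallow copy: new bindings, but the very same objects behind them.
snapshot = copy.copy(live_locals)

# The running computation continues.
live_locals["x"] = live_locals["x"] + 1   # rebinds x to a new int
live_locals["items"].append(3)            # mutates the shared list

print(snapshot["x"])       # 1: the rebinding is not visible in the snapshot
print(snapshot["items"])   # [1, 2, 3]: the in-place mutation is
```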

Jeremy


From tim_one@email.msn.com  Thu May 20 23:04:04 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:04 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>
Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it.  Most likely wrong.  It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is.  But while the problem is small, it's not easy, and only the Icon
solution wrote itself (not a surprise -- Icon was designed for expressing
this kind of algorithm, and the entire language is actually warped towards
it).  My first stab at the Python stack-fiddling solution had bugs too, but
I conveniently didn't post that <wink>.

After studying Sam's code, I expect it *would* work as written, so it's a
decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using

    Demo/threads/Generator.py

from the source distribution.  To make that a fair comparison, I would have
to post the supporting machinery from Generator.py too -- and we can ask
Guido whether Generator.py worked right the first time he tried it <wink>.

The continuation solution is subtle, requiring real expertise; but the
threads solution doesn't fare any better on that count (building the support
machinery with threads is also a baffler if you don't have thread
expertise).  If we threw Python metaclasses into the pot too, they'd be a
third kind of nightmare for the non-expert.

So, if you're faced with this kind of task, there's simply no easy way to
get it done.  Thread- and (it appears) continuation- based machinery can be
crafted once by an expert, then packaged into an easy-to-use protocol for
non-experts.

All in all, I view continuations as a feature most people should actively
avoid!  I think it has that status in Scheme too (e.g., the famed Schemer's
SICP textbook doesn't even mention call/cc).  Its real value (if any <wink>)
is as a Big Invisible Hammer for certified wizards.  Where call_cc leaks
into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and
hides the call_cc business.  Do enough of this, and we'll rediscover why
Scheme demands that tail calls not push a new stack frame <0.9 wink>.
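
A minimal sketch of that packaging, with a thread and a one-slot queue
standing in for call_cc.  Everything here is an assumption for illustration:
it targets a modern Python with the standard threading and queue modules,
and the names Producer, ListWalker and _DONE are hypothetical (they come
from neither Sam's code nor Demo/threads/Generator.py).

```python
import queue
import threading


class Producer:
    """Hypothetical base class: hides the suspend/resume machinery behind put()."""

    _DONE = object()  # sentinel marking the end of the walk

    def __init__(self, *args):
        # One slot: the producer blocks in put() until the consumer takes the item.
        self._slot = queue.Queue(maxsize=1)
        worker = threading.Thread(target=self._run, args=args, daemon=True)
        worker.start()

    def _run(self, *args):
        self.walk(*args)
        self._slot.put(self._DONE)

    def put(self, value):
        # The only part of the protocol a subclass author has to know about.
        self._slot.put(value)

    def __iter__(self):
        while True:
            value = self._slot.get()
            if value is self._DONE:
                return
            yield value


class ListWalker(Producer):
    def walk(self, x):
        if isinstance(x, list):
            for item in x:
                self.walk(item)
        else:
            self.put(x)


leaves = list(ListWalker([1, [2, [3, 4]], 5]))
```

The subclass only ever sees walk() and put(); swapping the thread for a
continuation would, in principle, change Producer and nothing else.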

the-tradeoffs-are-murky-ly y'rs  - tim




From tim_one@email.msn.com  Thu May 20 23:04:09 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]
> ...
> If a frame is the current state, I make it two frames to have two
> current states.  One will be saved, the other will be run. This is
> what I call "splitting".  Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute:  if a frame can be reached by more than one path,
its refcount must be at least equal to the number of its immediate
predecessors, and its refcount won't fall to 0 before it becomes
unreachable.  So while you may need to split stuff for *some* reasons, I
can't see how keeping elements alive could be one of those reasons (unless
you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does.  Since they've been doing this forever, I
don't want to toss their semantics on a whim <wink>.  It's at least a
conceptual thing:  why *should* locals follow different rules than globals?
If Python2 grows lexical closures, the only thing special about today's
"locals" is that they happen to be the first guys found on the search path.
Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

(define k #f)

(define (printi i)
  (display "i is ") (display i) (newline))

(define (test n)
  (let ((i n))
    (printi i)
    (set! i (- i 1))
    (printi i)
    (display "saving continuation") (newline)
    (call/cc (lambda (here) (set! k here)))
    (set! i (- i 1))
    (printi i)
    (set! i (- i 1))
    (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops.
Here's some output:

> (test 5)
i is 5
i is 4
saving continuation
i is 3
i is 2
> (k #f)
i is 1
i is 0
> (k #f)
i is -1
i is -2
> (k #f)
i is -3
i is -4
>

So there's no question about what Scheme thinks is proper behavior here.

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset
indexing.

> You are saying I have to share locals between frames. Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; they don't
care where that points *to* <wink -- but, really, it could point into some
other frame and ceval2 wouldn't know the difference>.  Maybe a frame entered
due to continuation needs extra setup work?  Scheme saves itself by putting
name-resolution and continuation info into different structures; to mimic
the semantics, Python would need to get the same end effect.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
> def f(x):
>     # do something
>     ...
>     # in some context, wanna have a snapshot
>     global snapshot  # initialized to None
>     if not snapshot:
>         snapshot = callcc.new()
>     # continue computation
>     x = x+1
>     ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here:  the use of
call/cc is subtle, hinging on details, and fragments ignore too much.  If
you do want the same x,

    commonx = x
    if not snapshot:
         # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it.  But if you *do* want to see changes to the
locals (which is one way for those distinct continuation invocations to
*cooperate* in solving a task -- see below), but the implementation doesn't
allow for it, I don't know what you can do to worm around it short of making
x global too.  But then different *top* level invocations of f will stomp on
that shared global, so that's not a solution either.  Maybe forget functions
entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops:
communication among "iterations" is via function arguments or up-level
lexical vrbls.

So recall your uses of Icon generators instead:  like Python, Icon does have
loops, and two-level scoping, and I routinely build loopy Icon generators
that keep state in locals.  Here's a dirt-simple example I emailed to Sam
earlier this week:

procedure main()
    every result := fib(0, 1) \ 10 do
        write(result)
end

procedure fib(i, j)
    local temp
    repeat {
        suspend i
        temp := i + j
        i := j
        j := temp
    }
end

which prints

0
1
1
2
3
5
8
13
21
34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would
generate a zero followed by an infinite sequence of ones(!).
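
The same contrast can be sketched with generators, which Python grew in a
later release than the one under discussion.  fib() keeps its locals across
resumptions; fib_restoring() is a hypothetical variant that models putting
the locals back to their entry values each time it is resumed.

```python
from itertools import islice


def fib(i, j):
    # Locals i and j survive each suspension, as in the Icon procedure.
    while True:
        yield i
        i, j = j, i + j


def fib_restoring(i, j):
    # Models "restore the locals upon each resumption": progress is discarded.
    start_i, start_j = i, j
    while True:
        yield i
        i, j = start_i, start_j  # locals reset to their entry values
        i, j = j, i + j


kept = list(islice(fib(0, 1), 10))
restored = list(islice(fib_restoring(0, 1), 6))
```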

Think of a continuation as a *paused* computation (which it is) rather than
an *independent* one (which it isn't <wink>), and I think it gets darned
hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs  - tim




From MHammond@skippinet.com.au  Fri May 21 00:01:22 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already
references it.  I don't think the structure of the frames is as important as
the general concept.  But while we were talking frame-fiddling it seemed a
good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (eg, just the
current function or method) and patch it back in such a way that the
current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class
implementation for an existing instance.  I know there have been hacks
around this before but they aren't completely reliable and IMO it would be
nice if the core Python made it easier to change already running code -
whether that code is in an existing stack frame, or just in an already
created instance, it is very difficult to do.

This has come to try and deflect some conversation away from changing
Python as such towards an attempt at enhancing its _environment_.  To
paraphrase many people before me, even if we completely froze the language
now there would still be plenty of work ahead of us :-)

Mark.



From guido@CNRI.Reston.VA.US  Fri May 21 01:06:51 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 20:06:51 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000."
 <00c001bea314$aefc5b40$0801a8c0@bobcat>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it.  I don't think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

This topic sounds mostly unrelated to the stackless discussion -- in
either case you need to be able to fiddle the contents of the frame
and the bytecode pointer to reflect the changed function.

Some issues:

  - The slots containing local variables may be renumbered after
    recompilation; fortunately we know the name--number mapping so we can
    move them to their new location.  But it is still tricky.

  - Should you be able to edit functions that are present on the call
    stack below the top?  Suppose we have two functions:

	def f():
	    return 1 + g()

	def g():
	    return 0

    Suppose we set a break in g(), and then edit the source of f().  We can
    do all sorts of evil to f(): e.g. we could change it to

	    return g() + 2

    which affects the contents of the value stack when g() returns
    (originally, the value stack contained the value 1, now it is empty).
    Or we could even change f() to

	    return 3

    thereby eliminating the call to g() altogether!

What kind of limitations do other systems that support modifying a
"live" program being debugged impose?  Only allowing modification of
the function at the top of the stack might eliminate some problems,
although there are still ways to mess up.  The value stack is not 
always empty even when we only stop at statement boundaries -- e.g. it 
contains 'for' loop indices, and there's also the 'block' stack, which 
contains try-except information.  E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

???

(Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.  Function objects now have
mutable func_code attributes (and also func_defaults), I think we can
use this.
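
As a small illustration of what a mutable code attribute buys: in Python 3
the attribute is spelled __code__ rather than func_code, and the sketch
below assumes that spelling.  The names replacement and alias are
hypothetical.

```python
def f():
    return "some code"


def replacement():
    return "other code"


alias = f                # another reference to the same function object
before = alias()

# Stuff the new code object into the existing function object.
f.__code__ = replacement.__code__

after = alias()          # the alias sees the new code too
same_object = alias is f
```

Because the function object itself is unchanged, every existing reference
to it picks up the new code on its next call.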

The hard part is to do the analysis needed to decide which functions
to recompile!  Ideally, we would simply edit a file and tell the
programming environment "recompile this".  The programming environment
would compare the changed file with the old version that it had saved
for this purpose, and notice (for example) that we changed two methods
of class C.  It would then recompile those methods only and stuff the
new code objects in the corresponding function objects.

But what would it do when we changed a global variable?  Say a module
originally contains a statement "x = 0".  Now we change the source
code to say "x = 100".  Should we change the variable x?  Suppose that
x is modified by some of the computations in the module, and that,
after some computations, the actual value of x was 50.  Should the
"recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and
def statements so that they modify an existing class or function
rather than using assignment.  Effectively, this proposal would change
the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...
        
This is somewhat similar to the way the module or package commands in
some other dynamic languages work, I think; and I don't think this
would break too much existing code.
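
The effect can be approximated today without changing the class statement.
The decorator below is a hypothetical helper, not part of the proposal: it
copies the attributes of a second class body into the existing class and
keeps the name bound to the original class object.

```python
def reopen(existing):
    """Hypothetical decorator: merge a second class body into `existing`."""
    # Bookkeeping entries that belong to the throwaway class, not the merge.
    skip = {"__dict__", "__weakref__", "__module__", "__qualname__",
            "__doc__", "__firstlineno__", "__static_attributes__"}

    def merge(addition):
        for name, value in vars(addition).items():
            if name not in skip:
                setattr(existing, name, value)
        return existing  # the name stays bound to the original class

    return merge


class A:
    def first(self):
        return "some code"


a = A()  # instance created before the class is extended


@reopen(A)
class A:
    def second(self):
        return "some more code"
```

The instance created before the second block gains the new method, since
its class object was modified rather than replaced.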

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want
different semantics (I don't want the second f's code to be tacked
onto the end of the first f's code).

If we understand that def f(): ... really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...

    f.func_code = ...other code...

i.e. there is no assignment of a new function object for the second
def.

Of course if there is a variable f but it is not a function, it would
have to be assigned a new function object first.

But in the case of def, this *does* break existing code.  E.g.

# module A
from B import f
.
.
.
if ...some test...:
    def f(): ...some code...

This idiom conditionally redefines a function that was also imported
from some other module.  The proposed new semantics would change B.f
in place!
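
A sketch of the hazard, assuming a SimpleNamespace as a stand-in for module
B and a hypothetical redefine_in_place() helper as a model of the proposed
def semantics (it swaps the code object instead of rebinding the name):

```python
import types

# Stand-in for module B and its function f.
B = types.SimpleNamespace()


def _b_f():
    return "B's f"


B.f = _b_f

# Module A: the equivalent of "from B import f".
f = B.f


def _new_f():
    return "A's f"


def redefine_in_place(existing, new):
    """Hypothetical model of the proposed def: mutate, don't rebind."""
    existing.__code__ = new.__code__
    return existing


# Today's semantics: a second def just rebinds the local name.
f_today = _new_f
b_result_today = B.f()

# Proposed semantics: the shared function object is changed.
f = redefine_in_place(f, _new_f)
b_result_proposed = B.f()
```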

So perhaps these new semantics should only be invoked when a special
"reload-compile" is asked for...  Or perhaps the programming
environment could do this through source parsing as I proposed
before...

> This has come to try and deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_.  To
> paraphrase many people before me, even if we completely froze the language
> now there would still be plenty of work ahead of us :-)

Please, no more posts about Scheme.  Each new post mentioning call/cc
makes it *less* likely that something like that will ever be part of
Python.  "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Fri May 21 02:13:28 1999
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
 <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

    Guido> What kind of limitations do other systems that support modifying
    Guido> a "live" program being debugged impose?  Only allowing
    Guido> modification of the function at the top of the stack might
    Guido> eliminate some problems, although there are still ways to mess
    Guido> up.

Frame objects maintain pointers to the active code objects, locals and
globals, so modifying a function object's code or globals shouldn't have any
effect on currently executing frames, right?  I assume frame objects do the
usual INCREF/DECREF dance, so the old code object won't get deleted before
the frame object is tossed.
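
A quick way to check this in CPython, as a sketch (using the Python 3
spelling __code__; the function names are hypothetical): a function swaps
its own code object while it is running, and the frame already executing
carries on with the old code.

```python
def replacement():
    return "new code"


def target():
    # Swap the function's code object while this frame is executing.
    target.__code__ = replacement.__code__
    # The running frame still holds a reference to the old code object,
    # so execution continues here rather than in replacement().
    return "old code"


first_call = target()   # runs the old code to completion
second_call = target()  # a fresh frame picks up the new code
```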

    Guido> But what would it do when we changed a global variable?  Say a
    Guido> module originally contains a statement "x = 0".  Now we change
    Guido> the source code to say "x = 100".  Should we change the variable
    Guido> x?  Suppose that x is modified by some of the computations in the
    Guido> module, and that, after some computations, the actual value
    Guido> of x was 50.  Should the "recompile" reset x to 100 or leave it
    Guido> alone?

I think you should note the change for users and give them some way to
easily pick between old initial value, new initial value or current value.
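
One way such a report could be assembled, as a sketch only: the helper name
global_conflicts and the three dictionaries are hypothetical, and how the
environment obtains the old and new initial values is left open.

```python
def global_conflicts(old_initial, new_initial, current):
    """Return {name: choices} for globals whose initial value changed."""
    conflicts = {}
    for name, new_value in new_initial.items():
        if name in old_initial and old_initial[name] != new_value:
            conflicts[name] = {
                "old initial": old_initial[name],
                "new initial": new_value,
                "current": current.get(name),
            }
    return conflicts


choices = global_conflicts(
    old_initial={"x": 0, "y": 1},
    new_initial={"x": 100, "y": 1},
    current={"x": 50, "y": 7},
)
```

Only x is reported: y's initializer did not change in the source, so its
current value is left alone without asking.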

    Guido> Please, no more posts about Scheme.  Each new post mentioning
    Guido> call/cc makes it *less* likely that something like that will ever
    Guido> be part of Python.  "What if Guido's brain exploded?" :-)

I agree.  I see call/cc or set! and my eyes just glaze over...

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip@mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583


From MHammond@skippinet.com.au  Fri May 21 02:42:14 1999
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it <wink>

> Some issues:
>
>   - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a number
of limitations.  For example, I would find a first cut completely
acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the
code reflected while executing.  This function also must be restarted after
such an edit.  If the function uses global variables or makes calls that
restarting will screw-up, then either a) make the code changes _before_
doing this stuff, or b) live with it for now, and help us remove the
limitation :-)

That may make the locals being renumbered easier to deal with, and also
remove some of the problems you discussed about editing functions below the
top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?  Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted
to find documentation on it.

It accepts most changes while running.  The current line is fine.  If you
create or change the definition of globals (and possibly even the type of
locals?), the "incremental compilation" fails, and you are given the option
of continuing with the old code, or stopping the process and doing a full
build.

When the debug session terminates, some link process (and maybe even
compilation?) is done to bring the .exe on disk up to date with the
changes.

If you do weird stuff like delete the line being executed, it usually gives
you some warning message before either restarting the function or trying to
pick a line somewhere near the line you deleted.  Either way, it can screw
up, moving the "current" line somewhere else - it doesn't crash the
debugger, but may not do exactly what you expected.  It is still a _huge_
win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions.  Although
changing the C code is great, in 99% of the cases I also need to change
some .py code, and as existing instances are affected I need to restart the
app anyway - so I may as well do a normal build at that time.  ie, C now
lets me debug incrementally, but a far more dynamic language prevents this
feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up.  The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better?  Can we reliably reset the
stack to the start of the current function?

> I've been thinking a bit about this.  Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile!  Ideally, we would simply edit a file and tell the
> programming environment "recompile this".  The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C.  It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the
impact be of doing it for _every_ function (changed or not)?  Then the
analysis can drop to the module level which is much easier.  I don't think a
slight performance hit is a problem at all when doing this stuff.
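
A rough sketch of the module-level version, under stated assumptions: the
helper name patch_functions is hypothetical, it uses the Python 3 attribute
names __code__ and __defaults__, it handles only top-level functions, and
it runs the new source in a scratch namespace (so any module-level side
effects in that source happen again).

```python
import types


def patch_functions(live_namespace, new_source):
    """Copy each recompiled function's code into the same-named live function."""
    scratch = {}
    exec(compile(new_source, "<reload>", "exec"), scratch)
    patched = []
    for name, new_obj in scratch.items():
        old_obj = live_namespace.get(name)
        if (isinstance(new_obj, types.FunctionType)
                and isinstance(old_obj, types.FunctionType)):
            # Changed or not, every function gets the new code object.
            old_obj.__code__ = new_obj.__code__
            old_obj.__defaults__ = new_obj.__defaults__
            patched.append(name)
    return patched


# A "running module": f was defined, then x drifted from 0 to 50.
live = {}
exec("x = 0\ndef f():\n    return x + 1\n", live)
saved_reference = live["f"]
live["x"] = 50

patched = patch_functions(live, "x = 100\ndef f():\n    return x + 2\n")
result = saved_reference()
```

Globals are deliberately left untouched here, so the question of what to do
with x stays a separate decision.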

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment.  Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...some code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?)
# .\package\__init__.py
class BigMutha:
  pass

# .\package\something.py
class package.BigMutha:
  def some_category_of_methods():
    ...

# .\package\other.py
class package.BigMutha:
  def other_category_of_methods():
    ...
[Of course, this won't fly as it stands; just a conceptual possibility]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for...  Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...

From your interesting summary, I believe this would be the best approach to
get started with.  This way we limit any strange new semantics to what are
clearly debugging related features.  It also means the debug specific
features could attempt more hacks that the "real" environment would never
attempt.

Of course, this isn't to suggest these new semantics aren't worth exploring
(even if just for the possibilities of splitting class definitions as my
code attempts to show), but IMO should be separate from these debugging
features.

> Python.  "What if Guido's brain exploded?" :-)

At least on that particular topic I didn't even consider I was the only one
in fear of that!  But it is good to know that you specifically are too :-)

Mark.



From guido@CNRI.Reston.VA.US  Fri May 21 04:02:49 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000."
 <00c501bea32b$277ce3d0$0801a8c0@bobcat>
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat>
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations.  For example, I would find a first cut completely
> acceptable and a great improvement on today if:
> 
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing.  This function also must be restarted after
> such an edit.  If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would
seem relatively easy to implement.  Not *real* easy though: it turns
out that eval_code2() is called with a code object as argument, and
it's not entirely trivial to figure out the corresponding function
object from which to grab the new code object.  But it could be done
-- give it a try.  (Don't wait for me, I'm ducking for cover until at
least mid June.)

> Ironically, I turn this feature _off_ for Python extensions.  Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.  ie, C now
> lets me debug incrementally, but a far more dynamic language prevents this
> feature being useful ;-)

I hear you.

> If we forced a restart would this be better?  Can we reliably reset the
> stack to the start of the current function?

Yes, no problem.

> If this would work for the few changed functions/methods, what would the
> impact be of doing it for _every_ function (changed or not)?  Then the
> analysis can drop to the module level which is much easier.  I don't think a
> slight performance hit is a problem at all when doing this stuff.

Yes, this would be fine too.

> >"What if Guido's brain exploded?" :-)
> 
> At least on that particular topic I didn't even consider I was the only one
> in fear of that!  But it is good to know that you specifically are too :-)

Have no fear.  I've learned to say no. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Fri May 21 06:36:44 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 21 May 1999 01:36:44 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <000401bea34b$e93fcda0$d89e2299@tim>

[GvR]
> ...
> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?

As an ex-compiler guy, I should have something wise to say about that.
Alas, I've never used a system that allowed more than poking new values into
vrbls, and the thought of any more than that makes me vaguely ill!  Oh,
that's right -- I'm vaguely ill anyway today.  Still-- oooooh -- the
problems.

This later got reduced to restarting the topmost function from scratch.
That has some attraction, especially on the bang-for-buck-o-meter.

> ...
> Please, no more posts about Scheme.  Each new post mentioning call/cc
> makes it *less* likely that something like that will ever be part of
> Python.  "What if Guido's brain exploded?" :-)

What a pussy <wink>.  Really, overall continuations are much less trouble to
understand than threads -- there's only one function in the entire
interface!

OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs  - tim




From tismer@appliedbiometrics.com  Fri May 21 08:12:05 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:12:05 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <37450745.21D63A5@appliedbiometrics.com>


Mark Hammond wrote:
> 
> > I'm writing a prototype of a stackless Python, which means that
> > you will be able to access the current state of the interpreter
> > completely.
> > The inner interpreter loop will be isolated from the frame
> > dispatcher. It will break whenever the ticker goes zero.
> > If you set the ticker to one, you will be able to single
> > step on every opcode, have the value stack, the frame chain,
> > everything.
> 
> I think the main point is how to change code when a Python frame already
> references it.  I don't think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

Sure. Since the frame holds a pointer to the code, and the current
IP and SP, your code can easily change it (with care, or GPF:) .
It could even create a fresh code object and let it run only
for the running instance. By instance, I mean a frame which is
running a code object.

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I think this has been difficult, only since information was hiding
in the inner interpreter loop. Gonna change now.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Fri May 21 08:21:22 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:21:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
 <37443DC5.1330EAC6@appliedbiometrics.com> <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>
Message-ID: <37450972.D19E160@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:
> 
>   CT> What I want to achieve is that I can run this again, from my
>   CT> snapshot. But with shared locals, my parameter x of the snapshot
>   CT> would have changed to x+1, which I don't find useful.  I want to
>   CT> fix a state of the current frame and still think it should "own"
>   CT> its locals. Globals are borrowed, anyway.  Class instances will
>   CT> anyway do what you want, since the local "self" is a mutable
>   CT> object.
> 
>   CT> How do you want to keep computations independent when locals are
>   CT> shared? For me it's just easier to implement and also to think
>   CT> with the shallow copy.  Otherwise, where is my private place?
>   CT> Open for becoming convinced, of course :-)
> 
> I think you're making things a lot more complicated by trying to
> instantiate new variable bindings for locals every time you create a
> continuation.  Can you give an example of why that would be helpful?

I'm not sure whether you all understand me, and vice versa.
There is no copying at all, but for the frame.
I copy the frame, which means I also incref all the
objects which it holds. Done. This is the bare minimum
which I must do.

> (Ok.  I'm not sure I can offer a good example of why it would be
> helpful to share them, but it makes intuitive sense to me.)
> 
> The call_cc mechanism is going to let you capture the current
> continuation, save it somewhere, and call on it again as often as you
> like.  Would you get a fresh locals each time you used it?  or just
> the first time?  If only the first time, it doesn't seem that you've
> gained a whole lot.

call_cc does a copy of the state which is the frame. This is
stored away until it is revived. Nothing else happens.
As Guido pointed out, virtually the whole frame chain is
duplicated, but only on demand.

> Also, all the locals that are references to mutable objects are
> already effectively shared.  So it's only a few oddballs like ints
> that are an issue.

Simply look at a frame, what it is. What do you need to do to
run it again with a given state. You have to preserve the stack
variables. And you have to preserve the current locals, since
some of them might even have a copy on the stack, and we want
to stay consistent.
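
The difference between the two readings shows up in a small model.  This is
a sketch only: a plain dict stands in for the frame's locals, and
copy.copy() for the frame copy with its increfs.

```python
import copy

# A dict standing in for the locals held by a frame.
frame = {"count": 3, "items": [1, 2]}

# "Copy the frame and incref what it holds": a new container, same objects.
snapshot = copy.copy(frame)

frame["count"] = frame["count"] + 1  # rebinding: the snapshot keeps 3
frame["items"].append(3)             # mutation: visible through the snapshot

shared_list = snapshot["items"] is frame["items"]
```

Rebinding an immutable local such as an int is the only case where the copy
and the original drift apart; mutable objects stay shared either way.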

I believe it would become obvious if you tried to implement it.
Maybe I should close my ears and get something ready to show?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Fri May 21 10:00:26 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 11:00:26 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea30c$af7a1060$9d9e2299@tim>
Message-ID: <374520AA.2ADEA687@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian]
> [... clarified stuff ... thanks! ... much clearer ...]

But still not clear enough, I fear.

> > ...
> > If a frame is the current state, I make it two frames to have two
> > current states.  One will be saved, the other will be run. This is
> > what I call "splitting".  Actually, splitting must occur whenever
> > a frame can be reached twice, in order to keep elements alive.
> 
> That part doesn't compute:  if a frame can be reached by more than one path,
> its refcount must be at least equal to the number of its immediate
> predecessors, and its refcount won't fall to 0 before it becomes
> unreachable.  So while you may need to split stuff for *some* reasons, I
> can't see how keeping elements alive could be one of those reasons (unless
> you're zapping frame contents *before* the frame itself is garbage?).

I was saying that under the side condition that I don't want to
change frames as they are now. Maybe that's misconceived, but
this is what I did:

If a frame as we have it today shall be resumed twice, then
it has to be copied, since:
The stack is in it and has some state which will change
after resuming.

That was the whole problem with my first prototype, which
was done hoping that I don't need to change the interpreter
at all. Wrong, bad, however.

What I actually did was more than seems to be needed:
I made a copy of the whole current frame chain. Later on,
Guido said this can be done on demand. He's right.

[Scheme sample - understood]

> GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; they don't
> care where that points *to* (<wink> -- but, really, it could point into some
> other frame and ceval2 wouldn't know the difference).  Maybe a frame entered
> due to continuation needs extra setup work?  Scheme saves itself by putting
> name-resolution and continuation info into different structures; to mimic
> the semantics, Python would need to get the same end effect.

Point taken. The pointer doesn't save time of access, it just
saves allocating another structure.
So we can use something else without speed loss.

[have to cut a little]

> So recall your uses of Icon generators instead:  like Python, Icon does have
> loops, and two-level scoping, and I routinely build loopy Icon generators
> that keep state in locals.  Here's a dirt-simple example I emailed to Sam
> earlier this week:
> 
> procedure main()
>     every result := fib(0, 1) \ 10 do
>         write(result)
> end
> 
> procedure fib(i, j)
>     local temp
>     repeat {
>         suspend i
>         temp := i + j
>         i := j
>         j := temp
>     }
> end

[prints fib series]

> If Icon restored the locals (i, j, temp) upon each fib resumption, it would
> generate a zero followed by an infinite sequence of ones(!).

Now I'm completely missing the point. Why should I want
to restore anything? At a suspend, which when done by continuations
will be done by temporarily having two identical states, one
is saved and another is continued. The continued one in your example
just returns the current value and immediately forgets about
the locals. The other one is continued later, and of course with
the same locals which were active when going to sleep.
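
For comparison, the Icon fib example quoted above can be sketched with
a present-day Python generator. The yield statement did not exist when
this thread was written, so this is only an illustration of the
"paused computation" semantics under discussion: the locals i and j
stay in the suspended frame and are not restored to their initial
values on resumption.

```python
from itertools import islice

def fib(i, j):
    # i and j live in the suspended frame; resuming the generator
    # continues with whatever values they had at the last yield.
    while True:
        yield i              # plays the role of Icon's "suspend i"
        i, j = j, i + j

# Icon's "fib(0, 1) \ 10" limits the generator to ten results.
first_ten = list(islice(fib(0, 1), 10))
print(first_ten)
```

If the locals were reset on every resumption, the loop could never
advance past its starting values.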

> Think of a continuation as a *paused* computation (which it is) rather than
> an *independent* one (which it isn't <wink>), and I think it gets darned
> hard to argue.

No, you get me wrong. I understand what you mean. It is just
the decision whether a frame, which will be reactivated later
as a continuation, should use a reference to locals like
the reference which it has for the globals. This causes me
a major frame redesign.

Current design:
A frame is: back chain, state, code, unpacked locals, globals, stack.

Code and globals are shared. 
State, unpacked locals and stack are private.

Possible new design:
A frame is: back chain, state, code, variables, globals, stack.

variables is: unpacked locals.

This makes the variables into an extra structure which is shared.
Probably a list would be the thing, or abusing a tuple as
a mutable object.
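
The "possible new design" can be sketched in plain Python. This is a
toy model, not the interpreter's real frame object: the Frame class,
its field names and the split() method are assumptions made for
illustration only. The point is that copying a frame gives the copy a
private stack while the variables list stays shared.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Frame:
    code: Any                     # shared between copies
    globals: dict                 # shared between copies
    variables: list               # shared: the "unpacked locals" structure
    stack: list = field(default_factory=list)   # private per frame
    state: int = 0                # private (e.g. an instruction pointer)
    back: Optional["Frame"] = None

    def split(self):
        """Copy this frame for a continuation: new stack, same variables."""
        return Frame(self.code, self.globals, self.variables,
                     list(self.stack), self.state, self.back)

running = Frame(code="<code>", globals={}, variables=[0, 1], stack=["tmp"])
saved = running.split()

running.variables[0] = 42      # seen through both frames
running.stack.append("more")   # seen only by the running frame

print(saved.variables, saved.stack)
```

A list is used for the shared structure here because it is mutable in
place, which matches the "probably a list" remark above.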

Hmm. I think I should get something ready, and we should
keep this thread short, or we will lose the rest of
Guido's goodwill (if not already).

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From da@ski.org  Fri May 21 17:27:42 1999
From: da@ski.org (David Ascher)
Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim>
Message-ID: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>

On Fri, 21 May 1999, Tim Peters wrote:

> OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
> way to get pseudo-parallel semantics regardless of OS.

I read about coroutines years ago on c.l.py, but I admit I forgot it all.
Can you explain them briefly in pseudo-python? 

--david



From tim_one@email.msn.com  Sat May 22 05:22:50 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:50 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim>

[Tim]
> OK.  So how do you feel about coroutines?  Would sure be nice
> to have *some* way to get pseudo-parallel semantics regardless of OS.

[David Ascher]
> I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> Can you explain them briefly in pseudo-python?

How about real Python?  http://www.python.org/tim_one/000169.html contains a
complete coroutine implementation using threads under the covers (& exactly
5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
different object interface (making coroutines objects in their own right
instead of funneling everything through a "coroutine controller" object),
but the ideas are the same in every coroutine language.  The post contains
several executable examples, from simple to "literature standard".

I had forgotten all about this:  it contains solutions to the same "compare
tree fringes" problem Sam mentioned, *and* the generator-based building
block I posted three other solutions for in this thread.  That last looks
like:

# fringe visits a nested list in inorder, and detaches for each non-list
# element; raises EarlyExit after the list is exhausted
def fringe( co, list ):
    for x in list:
        if type(x) is type([]):
            fringe(co, x)
        else:
            co.detach(x)

def printinorder( list ):
    co = Coroutine()
    f = co.create(fringe, co, list)
    try:
        while 1:
            print co.tran(f),
    except EarlyExit:
        pass
    print

printinorder([1,2,3])  # 1 2 3
printinorder([[[[1,[2]]],3]]) # ditto
x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
printinorder(x) # 0 1 2 3 4 5 6
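
For readers without the Coroutine class from the linked post, roughly
the same traversal can be sketched with the generators of present-day
Python (yield from needs Python 3.3 or later). This is a modern
restatement for comparison, not the 1994 code; the parameter name
nested is an illustrative choice.

```python
def fringe(nested):
    """Yield the non-list elements of a nested list, left to right."""
    for x in nested:
        if isinstance(x, list):
            yield from fringe(x)   # delegate to the sub-generator
        else:
            yield x                # corresponds to co.detach(x)

print(list(fringe([[[[1, [2]]], 3]])))
x = [0, 1, [2, [3]], [4, 5], [[[6]]]]
print(list(fringe(x)))
```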

Generators are really "half a coroutine", so this doesn't show the full
power (other examples in the post do).  co.detach is a special way to deal
with this asymmetry.  In the general case you use co.tran all the time,
where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally
passing it the value w, and when somebody does a co.tran back to *me*,
resume me right here, binding v to the value *they* pass to co.tran".

Knuth complains several times that it's very hard to come up with a
coroutine example that's both simple and clear <0.5 wink>.  In a nutshell,
coroutines don't have a "caller/callee" relationship, they have "we're all
equal partners" relationship, where any coroutine is free to resume any
other one where it left off.  It's no coincidence that making coroutines
easy to use was pioneered by simulation languages!  Just try simulating a
marriage where one partner is the master and the other a slave <wink>.

i-may-be-a-bachelor-but-i-have-eyes-ly y'rs  - tim




From tim_one@email.msn.com  Sat May 22 05:22:55 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:55 -0400
Subject: [Python-Dev] Re: Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000501bea40a$c3d1fe20$659e2299@tim>

Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement
without major changes to the PVM.  Icon calls 'em generators, Sather calls
'em iterators, and they're exactly what you need to implement "for thing in
object:" when object represents a collection that's tricky to materialize.
Python needs something like that.  OTOH, generators are pretty much limited
to that.

+ Coroutines are more general but much harder to implement, because each
coroutine needs its own stack (a generator only has one stack *frame*-- its
own --to worry about), and C-calling-Python can get into the act.  As Sam
said, they're probably no easier to implement than call/cc (but trivial to
implement given call/cc).

+ What may be most *natural* is to forget all that and think about a
variation of Python threads implemented directly via the interpreter,
without using OS threads.  The PVM already knows how to handle thread-state
swapping.  Given Christian's stackless interpreter, and barring C->Python
cases, I suspect Python can fake threads all by itself, in the sense of
interleaving their executions within a single "real" (OS) thread.  Given the
global interpreter lock, Python effectively does only-one-at-a-time anyway.

Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake)
threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't,
namely "*anyone* else who can make progress now, feel encouraged to do so at
my expense".

E) For whatever reasons, in my experience people find threads much easier to
learn than call/cc -- perhaps because threads are *obviously* useful upon
first sight, while it takes a real Zen Experience before call/cc begins to
make sense.

F) Simulated threads could presumably produce much more informative error
msgs (about deadlocks and such) than OS threads, so even people using real
threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads
don't have to be.  Perhaps

x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

import config
if config.USE_REAL_THREADS:
    import threading
else:
    from simulated_threading import threading

from config.shared import msg_queue

class Y:
    def __init__(self, ...):
        self.ready = threading.Event()
        ...

    def SOME_ASYNC_CALL(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
        self.ready.wait()
        self.ready.clear()
        return result[0]

where some other simulated thread polls the msg_queue and does ready.set()
when it's done processing the msg enqueued by SOME_ASYNC_CALL.  For this to
scale nicely, it's probably necessary for the PVM to cooperate with the
simulated_threading implementation (e.g., a simulated thread that blocks
(like on self.ready.wait()) should be taken out of the collection of
simulated threads the PVM may attempt to resume -- else in Sam's case the
PVM would repeatedly attempt to wake up thousands of blocked threads, and
things would slow to a crawl).
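
The scaling concern can be made concrete with a toy scheduler. This is
a sketch under stated assumptions, not a real simulated_threading
module: the Event and Scheduler classes below are hypothetical, and
present-day generators stand in for simulated threads. A thread that
waits on an event is parked on that event and never polled; only
runnable threads sit in the run queue.

```python
from collections import deque

class Event:
    """Hypothetical stand-in for threading.Event among simulated threads."""
    def __init__(self):
        self.is_set = False
        self.waiters = []            # simulated threads parked here

class Scheduler:
    def __init__(self):
        self.runnable = deque()      # only threads that can make progress

    def spawn(self, thread):
        self.runnable.append(thread)

    def set(self, event):
        # Wake the parked threads by moving them back to the run queue.
        event.is_set = True
        self.runnable.extend(event.waiters)
        event.waiters.clear()

    def run(self):
        while self.runnable:
            thread = self.runnable.popleft()
            try:
                request = next(thread)
            except StopIteration:
                continue                          # thread finished
            if isinstance(request, Event) and not request.is_set:
                request.waiters.append(thread)    # blocked: not polled
            else:
                self.runnable.append(thread)      # plain yield: round robin

log = []
sched = Scheduler()
ready = Event()

def client():
    log.append("client: request sent")
    yield ready                  # like self.ready.wait()
    log.append("client: got reply")

def server():
    log.append("server: working")
    yield                        # give up the processor voluntarily
    sched.set(ready)             # like ready.set()
    log.append("server: done")

sched.spawn(client())
sched.spawn(server())
sched.run()
print(log)
```

With thousands of blocked clients, the run queue would still hold only
the threads that can actually make progress.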

Of course, simulated_threading could be built on top of call/cc or
coroutines too.  The point to making threads the core concept is keeping
Guido's brain from exploding.  Plus, as above, you can switch to "real
threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-
    renders-it-moot-for-real-threads<wink>-ly y'rs  - tim




From tismer@appliedbiometrics.com  Sat May 22 17:20:30 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 18:20:30 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim>
Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Tim]
> > OK.  So how do you feel about coroutines?  Would sure be nice
> > to have *some* way to get pseudo-parallel semantics regardless of OS.
> 
> [David Ascher]
> > I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> > Can you explain them briefly in pseudo-python?
> 
> How about real Python?  http://www.python.org/tim_one/000169.html contains a
> complete coroutine implementation using threads under the covers (& exactly
> 5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
> different object interface (making coroutines objects in their own right
> instead of funneling everything through a "coroutine controller" object),
> but the ideas are the same in every coroutine language.  The post contains
> several executable examples, from simple to "literature standard".

What an interesting thread! Unfortunately, all the examples are messed
up since some HTML formatter didn't take care of the python code,
rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in 
http://www.python.org/tim_one/ but it seems that only your messages
are archived?
Anyway, the citations in http://www.python.org/tim_one/000146.html
show me that you have been through all of this five years
ago, with a five years younger Guido who sounds a bit
different than today.
I would have understood him better if I had known that this
is a reiteration of a somehow dropped or entombed idea.

(If someone has the original archives from that epoch,
I'd be happy to get a copy. Actually, I'm missing everything up to
the end of 1996.)

A short snapshot:
Stackless Python is meanwhile nearly alive, with recursion
avoided in ceval. Of course, some modules are left which
still need work, but enough for a prototype. Frames now
contain all necessary state, are prepared for execution,
and are thrown back to the evaluator (elevator?).

The key idea was to change the deeply nested functions in such a
way that their last eval_code call happens to be tail recursive.
In ceval.c (and in other not yet changed places), functions
do a lot of preparation, build some parameter, call eval_code
and release the parameter. This was the crux, which I solved
with a new field in the frame object, where such references
can be stored. The routine can now return the ready-packaged
frame, instead of calling it.
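
The "return the packaged frame instead of calling it" idea is what is
usually called a trampoline. A minimal sketch in Python follows; the
tuple layout and the names countdown and dispatch are assumptions for
illustration and have nothing to do with the actual ceval.c changes.

```python
import sys

def countdown(n):
    # Instead of calling countdown(n - 1) itself (a nested call), the
    # function returns a packaged "frame": a callable plus arguments.
    if n == 0:
        return ("done", "landed")
    return ("call", countdown, (n - 1,))

def dispatch(func, args):
    """Run packaged frames in a flat loop; the call stack never grows."""
    result = ("call", func, args)
    while result[0] == "call":
        _, func, args = result
        result = func(*args)
    return result[1]

# Far deeper than ordinary recursion would be allowed to go.
depth = sys.getrecursionlimit() * 10
print(dispatch(countdown, (depth,)))
```

Because every step returns to the loop before the next one starts, the
nesting depth stays constant no matter how long the chain is.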

As a minimum facility for future co-anythings,
I provided a hook function for resuming frames, which causes no
overhead in the usual case but allows overriding what a frame
does when someone returns control to it. Implementing
this is left to some extension module; whether that is
coroutines or your nice nano-threads, it's possible.

threadedly yours - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sat May 22 20:04:43 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 21:04:43 +0200
Subject: [Python-Dev] How stackless can Python be?
Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com>

Hi,

to make the core interpreter stackless is one thing.
Turning functions which call the interpreter
from some deep nesting level into versions,
which return a frame object instead which is
to be called, is possible in many cases.

Internals like apply are rather uncomplicated to convert.
CallObjectWithKeywords is done.

What I have *no* good solution for is map.
Map does an iteration over evaluations and keeps
state while it is running. The same applies to reduce,
but it seems to be not used so much. Map is.

I don't see at the moment if map could be a killer
for Tim's nice mini-thread idea. How must map work,
if, for instance, a map is done with a function
which then begins to switch between threads,
before map is done? Can one imagine a problem?

Maybe it is no issue, but I'd really like to
know whether we need a stateless map.
(without replacing it by a for loop :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From tim_one@email.msn.com  Sat May 22 20:35:58 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 15:35:58 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <000501bea48a$51563980$119e2299@tim>

>> http://www.python.org/tim_one/000169.html

[Christian]
> What an interesting thread! Unfortunately, all the examples are messed
> up since some HTML formatter didn't take care of the python code,
> rendering it unreadable. Is there a different version available?
>
> Also, I'd like to read the rest of the threads in
> http://www.python.org/tim_one/ but it seems that only your messages
> are archived?

Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's
all me, all the time, no mercy, no escape <wink>.

It predates the DejaNews archive, but the context can still be found in

http://www.python.org/search/hypermail/python-1994q2/index.html

There's a lot in that quarter about continuations & coroutines, most from
Steven Majewski, who took a serious shot at implementing all this.

Don't have the code in a more usable form; when my then-employer died, most
of my files went with it.

You can save the file as text, though!  The structure of the code is intact,
it's simply that your browser squashes out the spaces when displaying it.
Nuke the <P> at the start of each code line and what remains is very close
to what was originally posted.

> Anyway, the citations in http://www.python.org/tim_one/000146.html
> show me that you have been through all of this five years
> ago, with a five years younger Guido who sounds a bit
> different than today.
> I would have understood him better if I had known that this
> is a reiteration of a somehow dropped or entombed idea.

You *used* to know that <wink>!  Thought you even got StevenM's old code
from him a year or so ago.  He went most of the way, up until hitting the
C<->Python stack intertwingling barrier, and then dropped it.  Plus Guido
wrote generator.py to shut me up, which works, but is about 3x clumsier to
use and runs about 50x slower than a generator should <wink>.

> ...
> Stackless Python is meanwhile nearly alive, with recursion
> avoided in ceval. Of course, some modules are left which
> still need work, but enough for a prototype. Frames contain
> now all necessary state and are now prepared for execution
> and thrown back to the evaluator (elevator?).
> ...

Excellent!  Running off to a movie & dinner now, but will give a more
careful reading tonight.

co-dependent-ly y'rs  - tim




From tismer@appliedbiometrics.com  Sun May 23 14:07:44 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 23 May 1999 15:07:44 +0200
Subject: [Python-Dev] How stackless can Python be?
References: <3746FFCB.CD506BE4@appliedbiometrics.com>
Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com>

After a good sleep, I can answer this one by myself.

I wrote:
> to make the core interpreter stackless is one thing.
...
> Internals like apply are rather uncomplicated to convert.
> CallObjectWithKeywords is done.
> 
> What I have *no* good solution for is map.
> Map does an iteration over evaluations and keeps
> state while it is running. The same applies to reduce,
> but it seems to be not used so much. Map is.
...

About stackless map,
and this applies to every extension module
which *wants* to be stackless. We don't have to force
everybody to be stackless, but there are a couple of
modules which would benefit from it.

The problem with map is, that it needs to keep state,
while repeatedly calling objects which might call
the interpreter. Even if we kept local variables
in the caller's frame, this would still be not
stateless. The info that a map is running is sitting
on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't
want to replace map by a slower Python function. But
that gave me the key idea to solve this:

C functions which cannot be tail-recursively unwound to
return an executable frame object must instead return
themselves as a frame object. That's it! Frames need
to be extended a little again. They have to name their
interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation:
A new frame is created, parameters are inserted,
the frame is returned to the frame dispatcher,
which runs the inner eval_code loop until it bails out.
On return, special cases of control flow are handled,
namely exceptions, returning, and now also calling.
This is an eval_code frame, since eval_code is its
execution handler.

Anatomy of a map frame invocation:
Map has several phases. The first phases do
argument checking and basic setup.
The last phase is iteration over function calls
and building the result. This phase must be split
off as a second function, eval_map.
A new frame is created, with all temporary variables
placed there. eval_map is inserted as the execution
handler.

Now, I think the analogy is obvious.
By building proper frames, it should be possible
to turn any extension function into a stackless function.

The overall protocol is:
A C function which does a simple computation which cannot
cause an interpreter invocation, may simply evaluate
and return a value.
A C function which might cause an interpreter invocation,
should return a freshly created frame as return value.
- This can be done either in a tail-recursive fashion,
  if the last action of the C function would basically 
  be calling the frame.
- If no tail-recursion is possible, the function must
  return a new frame for itself, with an executor
  for its purpose.
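
The protocol can be modelled in a few lines of Python. Everything here
is a hypothetical sketch: the Frame class, the "call"/"return" tuples
and the handler signature are assumptions for illustration, and in the
real design eval_call would not invoke the function on the hardware
stack as this toy does. What it shows is a map whose iteration state
lives in a frame with its own executor (eval_map), driven by a flat
dispatcher loop.

```python
class Frame:
    def __init__(self, handler, **temps):
        self.handler = handler     # the frame's executor ("interpreter")
        self.back = None           # frame to resume when this one is done
        self.__dict__.update(temps)

def call_function(func, args):
    # Tail-recursive case: hand back a frame instead of running it here.
    return Frame(eval_call, func=func, args=args)

def eval_call(frame, value):
    # Toy executor; a real one would run bytecode without nesting.
    return ("return", frame.func(*frame.args))

def stackless_map(func, seq):
    # Phase 1 (checks, setup) runs directly; phase 2 becomes eval_map
    # with all temporaries stored in a frame of its own.
    return Frame(eval_map, func=func, items=list(seq), index=0,
                 results=[], started=False)

def eval_map(frame, value):
    if frame.started:
        frame.results.append(value)    # result of the previous call
    frame.started = True
    if frame.index == len(frame.items):
        return ("return", frame.results)
    item = frame.items[frame.index]
    frame.index += 1
    return ("call", call_function(frame.func, (item,)))

def dispatch(frame):
    """The frame dispatcher: one flat loop for all execution handlers."""
    value = None
    while True:
        kind, payload = frame.handler(frame, value)
        if kind == "call":             # run the callee, come back later
            payload.back = frame
            frame, value = payload, None
        elif frame.back is None:       # outermost frame returned
            return payload
        else:                          # hand the value to the caller
            frame, value = frame.back, payload

squares = dispatch(stackless_map(lambda x: x * x, [1, 2, 3]))
print(squares)
```

Since the map state sits in the frame rather than on the hardware
stack, the dispatcher is free to switch to another frame between any
two function calls.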

A good stackless candidate is Fredrik's xmlop, which
calls back into the interpreter. If that worked
without the hardware stack, then we could build
ultra-fast XML processors with co-routines!

As a side note: 
The frame structure which I sketched
so far is still made for eval_code in the first place,
but it has all necessary flexibility for pluggable
interpreters. An extension module can now create
its own frame, with its own execution handler, and
throw it back to the frame dispatcher.
In other words: People can create extensions and
test their own VMs if they want.
This was not my primary intent, but comes for free
as a consequence of having a stackless map.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home


From fredrik@pythonware.com  Sun May 23 14:53:19 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 23 May 1999 15:53:19 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com>

Christian Tismer <tismer@appliedbiometrics.com> wrote:
> (If someone has the original archives from that epoch,
> I'd be happy to get a copy. Actually, I'm missing everything up to
> the end of 1996.)

http://www.egroups.com/group/python-list/info.html
has it all (almost), starting in 1991.

</F>



From da at ski.org  Sun May  2 00:15:32 1999
From: da at ski.org (David Ascher)
Date: Sat, 1 May 1999 15:15:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <37296856.5875AAAF@lemburg.com>
Message-ID: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>

> Since you put out to objectives, I'd like to propose a little
> different approach...
> 
> 1. Have eval/exec accept any mapping object as input
> 
> 2. Make those two copy the content of the mapping object into real
>    dictionaries
> 
> 3. Provide a hook into the dictionary implementation that can be
>    used to redirect KeyErrors and use that redirection to forward
>    the request to the original mapping objects

Interesting counterproposal.  I'm not sure whether any of the proposals on
the table really do what's needed for e.g. case-insensitive namespace
handling.  I can see how all of the proposals so far allow
case-insensitive reference name handling in the global namespace, but
don't we also need to hook into the local-namespace creation process to
allow case-insensitivity to work throughout? 

--david






From da at ski.org  Sun May  2 17:15:57 1999
From: da at ski.org (David Ascher)
Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9905020810270.152-100000@david.ski.org>

On Sun, 2 May 1999, Mark Hammond wrote:

> > I'm not sure whether any of the
> > proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation
> > process to
> > allow case-insensitivity to work throughout?
> 
> Why not?  I pictured case insensitive namespaces working so that they
> retain the case of the first assignment, but all lookups would be
> case-insensitive.
> 
> Ohh - right!  Python itself would need changing to support this.  I suppose
> that faced with code such as:
> 
> def func():
>   if spam:
>     Spam=1
> 
> Python would generate code that refers to "spam" as a local, and "Spam" as
> a global.
> 
> Is this why you feel it won't work?

I hadn't thought of that, to be truthful, but I think it's more generic.
[FWIW, I never much cared for the tag-variables-at-compile-time
optimization in CPython, and wouldn't miss it if it were lost.]

The point is that if I eval or exec code which calls a function specifying
some strange mapping as the namespaces (global and current-local) I
presumably want to also specify how local namespaces work for the
function calls within that code snippet.  That means that somehow Python
has to know what kind of namespace to use for local environments, and not
use the standard dictionary.  Maybe we can simply have it use a
'.clear()'ed .__copy__ of the specified environment.

  exec 'foo()' in globals(), mylocals

would then call foo and within foo, the local env't would be
mylocals.__copy__.clear().  
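
In present-day Python 3, exec is a function whose locals argument may
be any mapping, which makes it possible to try out the situation
described above. The CaseInsensitiveNamespace class is a hypothetical
example written for this sketch. It shows that top-level names in the
exec'd code go through the mapping, while names inside a called
function do not.

```python
from collections.abc import MutableMapping

class CaseInsensitiveNamespace(MutableMapping):
    """Hypothetical mapping that folds every key to lower case."""
    def __init__(self):
        self._data = {}
    def __getitem__(self, key):
        return self._data[key.lower()]   # KeyError falls back to globals
    def __setitem__(self, key, value):
        self._data[key.lower()] = value
    def __delitem__(self, key):
        del self._data[key.lower()]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

mylocals = CaseInsensitiveNamespace()

# Names at the top level of the exec'd code use the mapping.
exec("Spam = 1\nresult = SPAM + 1", {}, mylocals)
print(mylocals["result"])

# Inside foo, Spam is a compile-time local and spam a global lookup,
# so the custom mapping is never consulted.
exec("def foo():\n    Spam = 1\n    return spam\n", {}, mylocals)
try:
    mylocals["foo"]()
    folded_inside_function = True
except NameError:
    folded_inside_function = False
print(folded_inside_function)
```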

Anyway, something for those-with-the-patches to keep in mind.  

--david





From tismer at appliedbiometrics.com  Sun May  2 15:00:37 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 02 May 1999 15:00:37 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com>


David Ascher wrote:
[Marc:> 
> > Since you put out to objectives, I'd like to propose a little
> > different approach...
> >
> > 1. Have eval/exec accept any mapping object as input
> >
> > 2. Make those two copy the content of the mapping object into real
> >    dictionaries
> >
> > 3. Provide a hook into the dictionary implementation that can be
> >    used to redirect KeyErrors and use that redirection to forward
> >    the request to the original mapping objects

I don't think that this proposal would give so much new
value. Since a mapping can also be implemented in arbitrary
ways, say by functions, a mapping is not necessarily finite
and might not be changeable into a dict.
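
A mapping "implemented by functions" can be sketched as follows. The
Powers class is a made-up example; it answers for infinitely many
names, so it could not be copied into a real dictionary up front.
Present-day eval accepts such an object as its locals argument.

```python
class Powers:
    """Hypothetical mapping: the name 'pN' resolves to 2**N for any N."""
    def __getitem__(self, name):
        if name.startswith("p") and name[1:].isdigit():
            return 2 ** int(name[1:])
        raise KeyError(name)   # lets eval fall back to globals/builtins

value = eval("p3 + p10 + len('ab')", {}, Powers())
print(value)
```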

[David:>
> Interesting counterproposal.  I'm not sure whether any of the proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation process to
> allow case-insensitivity to work throughout?

Case-independent namespaces seem to be a minor point,
nice to have for interfacing to other products, but then,
in a function, I see no benefit in changing the semantics
of function locals? The lookup of foreign symbols would 
always be through a mapping object. If you take COM for 
instance, your access to a COM wrapper for an arbitrary
object would be through properties of this object. After
assignment to a local function variable, why should we
support case-insensitivity at all?

I would think mapping objects would be a great 
simplification of lazy imports in COM, where
we would like to avoid importing really huge
namespaces in one big slurp. Also the wrapper code
could be made quite a lot easier and faster without
so much getattr/setattr trapping.

Does btw. anybody really want to see case-insensitivity
in Python programs? I'm quite happy with it as it is,
and I would even force the user to always use the same
case style after he has touched an external property
once. Example for Excel: You may write "xl.workbooks"
in lowercase, but then you have to stay with it.
This would keep Python source clean for, say, PyLint.

my 0.02 Euro - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From MHammond at skippinet.com.au  Sun May  2 01:28:11 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 2 May 1999 09:28:11 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat>

> I'm not sure whether any of the
> proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation
> process to
> allow case-insensitivity to work throughout?

Why not?  I pictured case insensitive namespaces working so that they
retain the case of the first assignment, but all lookups would be
case-insensitive.
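
A minimal sketch of that behaviour as an ordinary mapping, assuming string
keys and a present-day Python. The class name `CasePreservingDict` is
hypothetical and not part of any proposal in this thread:

```python
class CasePreservingDict(dict):
    """Keep the spelling of the first assignment; look up case-insensitively."""

    def __init__(self):
        super().__init__()
        self._canonical = {}  # lowercased name -> spelling used at first assignment

    def __setitem__(self, key, value):
        # Reuse the first spelling if this name was already assigned in any case.
        real = self._canonical.setdefault(key.lower(), key)
        super().__setitem__(real, value)

    def __getitem__(self, key):
        # An unknown name raises KeyError, as with a plain dict.
        return super().__getitem__(self._canonical[key.lower()])


ns = CasePreservingDict()
ns["Spam"] = 1
ns["SPAM"] = 2  # rebinds the existing entry and keeps the spelling "Spam"
print(ns["spam"], list(ns))
```

Only item assignment and item lookup are covered here; `get`, `in`, `del` and
`update` would need the same treatment in a complete version.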

Ohh - right!  Python itself would need changing to support this.  I suppose
that faced with code such as:

def func():
  if spam:
    Spam=1

Python would generate code that refers to "Spam" as a local, and "spam" as
a global.
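
How the compiler classifies the two names can be checked on the code object
of a present-day CPython (an assumption; the attribute was spelled
differently in 1.5). The name that is assigned becomes a local, and the name
that is only read is looked up as a global:

```python
def func():
    if spam:
        Spam = 1


code = func.__code__
local_names = code.co_varnames   # names the compiler treats as locals
global_names = code.co_names     # names looked up as globals or builtins
print(local_names, global_names)
```

The function is never called, so the missing global `spam` does not matter;
only the compiled code object is inspected.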

Is this why you feel it wont work?

Mark.




From mal at lemburg.com  Sun May  2 21:24:54 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 02 May 1999 21:24:54 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org> <372C4C75.5B7CCAC8@appliedbiometrics.com>
Message-ID: <372CA686.215D71DF@lemburg.com>

Christian Tismer wrote:
> 
> David Ascher wrote:
> [Marc:>
> > > Since you put out the objectives, I'd like to propose a little
> > > different approach...
> > >
> > > 1. Have eval/exec accept any mapping object as input
> > >
> > > 2. Make those two copy the content of the mapping object into real
> > >    dictionaries
> > >
> > > 3. Provide a hook into the dictionary implementation that can be
> > >    used to redirect KeyErrors and use that redirection to forward
> > >    the request to the original mapping objects
> 
> I don't think that this proposal would give so much new
> value. Since a mapping can also be implemented in arbitrary
> ways, say by functions, a mapping is not necessarily finite
> and might not be changeable into a dict.

[Disclaimer: I'm not really keen on having the possibility of
 letting code execute in arbitrary namespace objects... it would
 make code optimizations even less manageable.]

You can easily support infinite mappings by wrapping the
function into an object which returns an empty list
for .items() and then use the hook mentioned in 3 to
redirect the lookup to that function.

The proposal allows one to use such a proxy to simulate any
kind of mapping -- it works much like the __getattr__ hook
provided for instances.
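
A minimal sketch of such a proxy, assuming a present-day Python rather than
1.5: dict subclasses can now define `__missing__`, which plays roughly the
role of the KeyError hook in point 3, and `eval` accepts such an object as
its locals. The names `FallbackDict` and `lookup` are hypothetical:

```python
class FallbackDict(dict):
    """A real (empty) dictionary that forwards failed lookups to a function."""

    def __init__(self, fallback):
        super().__init__()
        self._fallback = fallback

    def __missing__(self, key):
        # Called by dict item lookup when the key is not stored.
        return self._fallback(key)


def lookup(name):
    # Stand-in for an "infinite" mapping: every name has a value.
    return len(name)


ns = FallbackDict(lookup)
result = eval("spam + eggs", {}, ns)
print(result, list(ns.items()))
```

The dictionary itself stays empty, so `.items()` reports nothing, while every
name used by the evaluated expression is resolved through `lookup`.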
 
> [David:>
> > Interesting counterproposal.  I'm not sure whether any of the proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation process to
> > allow case-insensitivity to work throughout?
> 
> Case-independent namespaces seem to be a minor point,
> nice to have for interfacing to other products, but then,
> in a function, I see no benefit in changing the semantics
> of function locals? The lookup of foreign symbols would
> always be through a mapping object. If you take COM for
> instance, your access to a COM wrapper for an arbitrary
> object would be through properties of this object. After
> assignment to a local function variable, why should we
> support case-insensitivity at all?
>
> I would think mapping objects would be a great
> simplification of lazy imports in COM, where
> we would like to avoid importing really huge
> namespaces in one big slurp. Also the wrapper code
> could be made quite a lot easier and faster without
> so much getattr/setattr trapping.

What do lazy imports have to do with case [in]sensitive
namespaces ? Anyway, how about a simple lazy import
mechanism in the standard distribution, i.e. why not make
all imports lazy ? Since modules are first class objects
this should be easy to implement...
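
A minimal sketch of one way to defer an import until first use, assuming a
present-day Python with `importlib`. The class name `LazyModule` is
hypothetical, and this is a stand-in object, not a change to the `import`
statement itself:

```python
import importlib


class LazyModule:
    """Placeholder that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not found on the placeholder itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


json = LazyModule("json")
loaded_before = json._module is not None   # nothing imported yet
text = json.dumps([1, 2])                  # first use triggers the import
loaded_after = json._module is not None
print(loaded_before, text, loaded_after)
```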
 
> Does btw. anybody really want to see case-insensitivity
> in Python programs? I'm quite happy with it as it is,
> and I would even force the user to always use the same
> case style after he has touched an external property
> once. Example for Excel: You may write "xl.workbooks"
> in lowercase, but then you have to stay with it.
> This would keep Python source clean for, say, PyLint.

"No" and "me too" ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 243 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From MHammond at skippinet.com.au  Mon May  3 02:52:41 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Mon, 3 May 1999 10:52:41 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <372CA686.215D71DF@lemburg.com>
Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat>

[Marc]
> [Disclaimer: I'm not really keen on having the possibility of
>  letting code execute in arbitrary namespace objects... it would
>  make code optimizations even less manageable.]

Good point - although surely that would simply mean (certain) optimisations
can't be performed for code executing in that environment?  How to detect
this at "optimization time" may be a little difficult :-)

However, this is the primary purpose of this thread - to work out _if_ it is
a good idea, as much as working out _how_ to do it :-)

> The proposal allows one to use such a proxy to simulate any
> kind of mapping -- it works much like the __getattr__ hook
> provided for instances.

My only problem with Marc's proposal is that there already _is_ an
established mapping protocol, and this doesn't use it; instead it invents a
new one with the benefit being potentially less code breakage.

And without attempting to sound flippant, I wonder how many extension
modules will be affected?  Module init code certainly assumes the module
__dict__ is a dictionary, but none of my code assumes anything about other
namespaces.  Marc's extensions may be a special case, as AFAIK they inject
objects into other dictionaries (ie, new builtins?).  Again, not trying to
downplay this too much, but if it is only a problem for Marc's more
esoteric extensions, I don't feel that should hold up an otherwise solid
proposal.

[Chris, I think?]
> > Case-independent namespaces seem to be a minor point,
> > nice to have for interfacing to other products, but then,
> > in a function, I see no benefit in changing the semantics
> > of function locals? The lookup of foreign symbols would

I disagree here.  Consider Alice, and similar projects, where an (arguably
misplaced, but nonetheless) requirement is that the embedded language be
case-insensitive.  Period.  The Alice people are somewhat special in that
they had the resources to change the interpreter's guts.  Most people won't,
and will look for a different language to embed.

Of course, I agree with you for the specific cases you are talking about - COM,
Active Scripting etc.  Indeed, everything I would use this for would prefer
to keep the local function semantics identical.

> > Does btw. anybody really want to see case-insensitivity
> > in Python programs? I'm quite happy with it as it is,
> > and I would even force the user to always use the same
> > case style after he has touched an external property
> > once. Example for Excel: You may write "xl.workbooks"
> > in lowercase, but then you have to stay with it.
> > This would keep Python source clean for, say, PyLint.
>
> "No" and "me too" ;-)

I think we are missing the point a little.  If we focus on COM, we may come
up with a different answer.  Indeed, if we are to focus on COM integration
with Python, there are other areas I would prefer to start with :-)

IMO, we should attempt to come up with a more flexible namespace mechanism
that is in the style of Python, and will not noticeably slow down Python.
Then COM etc can take advantage of it - much in the same way that Python's
existing namespace model existed pre-COM, and COM had to take advantage of
what it could!

Of course, a key indicator of the likely success is how well COM _can_ take
advantage of it, and how much Alice could have taken advantage of it - I
can't think of any other yardsticks?

Mark.




From mal at lemburg.com  Mon May  3 09:56:53 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 03 May 1999 09:56:53 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <372D56C5.4738DE3D@lemburg.com>

Mark Hammond wrote:
> 
> [Marc]
> > [Disclaimer: I'm not really keen on having the possibility of
> >  letting code execute in arbitrary namespace objects... it would
> >  make code optimizations even less manageable.]
> 
> Good point - although surely that would simply mean (certain) optimisations
> can't be performed for code executing in that environment?  How to detect
> this at "optimization time" may be a little difficult :-)
> 
> However, this is the primary purpose of this thread - to work out _if_ it is
> a good idea, as much as working out _how_ to do it :-)
> 
> > The proposal allows one to use such a proxy to simulate any
> > kind of mapping -- it works much like the __getattr__ hook
> > provided for instances.
> 
> My only problem with Marc's proposal is that there already _is_ an
> established mapping protocol, and this doesn't use it; instead it invents a
> new one with the benefit being potentially less code breakage.

...and that's the key point: you get the intended features and
the core code will not have to be changed in significant ways.
Basically, I think these kinds of core extensions should be done
in generic ways, e.g. by letting the eval/exec machinery accept
subclasses of dictionaries, rather than trying to raise the
abstraction level used and slowing things down in general
just to be able to use the feature on very few occasions.

> And without attempting to sound flippant, I wonder how many extension
> modules will be affected?  Module init code certainly assumes the module
> __dict__ is a dictionary, but none of my code assumes anything about other
> namespaces.  Marc's extensions may be a special case, as AFAIK they inject
> objects into other dictionaries (ie, new builtins?).  Again, not trying to
> downplay this too much, but if it is only a problem for Marc's more
> esoteric extensions, I don't feel that should hold up an otherwise solid
> proposal.

My mxTools extension does the assignment in Python, so it wouldn't
be affected. The others only do the usual modinit() stuff.

Before going any further on this thread we may have to ponder a little
more on the objectives that we have. If it's only case-insensitive
lookups then I guess a simple compile time switch exchanging the
implementations of string hash and compare functions would do the
trick. If we're after doing wild things like lookups across
networks, then a more specific approach is needed.

So what is it that we want in 1.6 ?

> [Chris, I think?]
> > > Case-independent namespaces seem to be a minor point,
> > > nice to have for interfacing to other products, but then,
> > > in a function, I see no benefit in changing the semantics
> > > of function locals? The lookup of foreign symbols would
> 
> I disagree here.  Consider Alice, and similar projects, where an (arguably
> misplaced, but nonetheless) requirement is that the embedded language be
> case-insensitive.  Period.  The Alice people are somewhat special in that
> they had the resources to change the interpreter's guts.  Most people won't,
> and will look for a different language to embed.
> 
> Of course, I agree with you for the specific cases you are talking about - COM,
> Active Scripting etc.  Indeed, everything I would use this for would prefer
> to keep the local function semantics identical.

As I understand the needs in COM and AS you are talking about
object attributes, right ? Making these case-insensitive is
a job for a proxy or a __getattr__ hack.
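
A minimal sketch of such a proxy, assuming the wrapped object's attribute
names can be listed with `dir()`. The names `CaseInsensitiveProxy` and
`Application` are hypothetical stand-ins, not the win32com API:

```python
class CaseInsensitiveProxy:
    """Forward attribute reads to a target object, ignoring case."""

    def __init__(self, target):
        self._target = target
        # Map lowercased names to the real spelling.  If two attributes
        # differ only in case, the one listed last by dir() wins.
        self._names = {name.lower(): name for name in dir(target)}

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        try:
            real = self._names[name.lower()]
        except KeyError:
            raise AttributeError(name) from None
        return getattr(self._target, real)


class Application:
    # Stand-in for a wrapped COM object.
    Workbooks = ["Book1"]


xl = CaseInsensitiveProxy(Application())
print(xl.workbooks, xl.WORKBOOKS)
```

Function locals and module globals are untouched by this approach; only
attribute access on the wrapped object becomes case-insensitive.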
 
> > > Does btw. anybody really want to see case-insensitivity
> > > in Python programs? I'm quite happy with it as it is,
> > > and I would even force the user to always use the same
> > > case style after he has touched an external property
> > > once. Example for Excel: You may write "xl.workbooks"
> > > in lowercase, but then you have to stay with it.
> > > This would keep Python source clean for, say, PyLint.
> >
> > "No" and "me too" ;-)
> 
> I think we are missing the point a little.  If we focus on COM, we may come
> up with a different answer.  Indeed, if we are to focus on COM integration
> with Python, there are other areas I would prefer to start with :-)
> 
> IMO, we should attempt to come up with a more flexible namespace mechanism
> that is in the style of Python, and will not noticeably slow down Python.
> Then COM etc can take advantage of it - much in the same way that Python's
> existing namespace model existed pre-COM, and COM had to take advantage of
> what it could!
> 
> Of course, a key indicator of the likely success is how well COM _can_ take
> advantage of it, and how much Alice could have taken advantage of it - I
> can't think of any other yardsticks?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 242 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From fredrik at pythonware.com  Mon May  3 16:01:10 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 16:01:10 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com>

scriptics is positioning tcl as a perl killer:

    http://www.scriptics.com/scripting/perl.html

afaict, unicode and event handling are the two
main thingies missing from python 1.5.

-- unicode: is on its way.

-- event handling: asyncore/asynchat provides an
awesome framework for event-driven socket pro-
gramming.  however, Python still lacks good cross-
platform support for event-driven access to files
and pipes.  are threads good enough, or would it
be cool to have something similar to Tcl's fileevent
stuff in Python?

-- regexps: has anyone compared the new uni-
code-aware regexp package in Tcl with pcre?

comments?

</F>

btw, the rebol folks have reached 2.0:
    http://www.rebol.com/

maybe 1.6 should be renamed to Python 6.0?




From akuchlin at cnri.reston.va.us  Mon May  3 17:14:15 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:14:15 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us>

Fredrik Lundh writes:
>-- regexps: has anyone compared the new uni-
>code-aware regexp package in Tcl with pcre?

	I looked at it a bit when Tcl 8.1 was in beta; it derives from
Henry Spencer's 1998-vintage code, which seems to try to do a lot of
optimization and analysis.  It may even compile DFAs instead of NFAs
when possible, though it's hard for me to be sure.  This might give it
a substantial speed advantage over engines that do less analysis, but
I haven't benchmarked it.  The code is easy to read, but difficult to
understand because the theory underlying the analysis isn't explained
in the comments; one feels there should be an accompanying paper to
explain how everything works, and it's why I'm not sure if it really
is producing DFAs for some expressions.

	Tcl seems to represent everything as UTF-8 internally, so
there's only one regex engine; there's .  The code is scattered over
more files:

amarok generic>ls re*.[ch]
regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

	This would be an issue for using it with Python, since all
these files would wind up scattered around the Modules directory.  For
comparison, pypcre.c is around 4700 lines of code.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Things need not have happened to be true. Tales and dreams are the
shadow-truths that will endure when mere facts are dust and ashes, and forgot.
    -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_




From guido at CNRI.Reston.VA.US  Mon May  3 17:32:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 11:32:09 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT."
             <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com>  
            <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us>

> 	I looked at it a bit when Tcl 8.1 was in beta; it derives from
> Henry Spencer's 1998-vintage code, which seems to try to do a lot of
> optimization and analysis.  It may even compile DFAs instead of NFAs
> when possible, though it's hard for me to be sure.  This might give it
> a substantial speed advantage over engines that do less analysis, but
> I haven't benchmarked it.  The code is easy to read, but difficult to
> understand because the theory underlying the analysis isn't explained
> in the comments; one feels there should be an accompanying paper to
> explain how everything works, and it's why I'm not sure if it really
> is producing DFAs for some expressions.
> 
> 	Tcl seems to represent everything as UTF-8 internally, so
> there's only one regex engine; there's .

Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
point the regex engine was compiled twice, once for 8-bit chars and
once for 16-bit chars.  But this may have changed.

I've noticed that Perl is taking the same position (everything is
UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
from 8-bit bytes.  Python is currently in the Java camp.  This might
be a good time to make sure that we're still convinced that this is
the right thing to do!

> The code is scattered over
> more files:
> 
> amarok generic>ls re*.[ch]
> regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
> regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
> regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
> 
> 	This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory.  For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way.  Perhaps a more
interesting question is whether it is Perl5 compatible.  I contacted
Henry Spencer at the time and he was willing to let us use his code.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From akuchlin at cnri.reston.va.us  Mon May  3 17:56:46 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:56:46 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
	<14125.47524.196878.583460@amarok.cnri.reston.va.us>
	<199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
>point the regex engine was compiled twice, once for 8-bit chars and
>once for 16-bit chars.  But this may have changed.

	It doesn't seem to currently; the code in tclRegexp.c looks
like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets.
     */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);
    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

	ISTR the Spencer engine does, however, define a small and
large representation for NFAs and have two versions of the engine, one
for each representation.  Perhaps that's what you're thinking of.

>I've noticed that Perl is taking the same position (everything is
>UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
>from 8-bit bytes.  Python is currently in the Java camp.  This might
>be a good time to make sure that we're still convinced that this is
>the right thing to do!

	I don't know.  There's certainly the fundamental dichotomy
that strings are sometimes used to represent characters, where
changing encodings on input and output is reasonable, and sometimes
used to hold chunks of binary data, where any changes are incorrect.
Perhaps Paul Prescod is right, and we should try to get some other
data type (array.array()) for holding binary data, as distinct from
strings.

>I'm sure that if it's good code, we'll find a way.  Perhaps a more
>interesting question is whether it is Perl5 compatible.  I contacted
>Henry Spencer at the time and he was willing to let us use his code.

	Mostly Perl-compatible, though it doesn't look like the 5.005
features are there, and I haven't checked for every single 5.004
feature.  Adding missing features might be problematic, because I
don't really understand what the code is doing at a high level.  Also,
is there a user community for this code?  Do any other projects use
it?  Philip Hazel has been quite helpful with PCRE, an important thing
when making modifications to the code.
 
	Should I make a point of looking at what using the Spencer
engine would entail?  It might not be too difficult (an evening or
two, maybe?) to write a re.py that sat on top of the Spencer code;
that would at least let us do some benchmarking.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In Einstein's theory of relativity the observer is a man who sets out in quest
of truth armed with a measuring-rod. In quantum theory he sets out with a
sieve.
    -- Sir Arthur Eddington





From guido at CNRI.Reston.VA.US  Mon May  3 18:02:22 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 12:02:22 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
             <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>  
            <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us>

> 	Should I make a point of looking at what using the Spencer
> engine would entail?  It might not be too difficult (an evening or
> two, maybe?) to write a re.py that sat on top of the Spencer code;
> that would at least let us do some benchmarking.

Surely this would be more helpful than weeks of speculative emails --
go for it!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Mon May  3 19:10:55 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:10:55 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us>
Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com>

> Also, is there a user community for this code?

how about comp.lang.tcl ;-)

</F>




From fredrik at pythonware.com  Mon May  3 19:15:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:15:00 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>             <14125.49911.982236.754340@amarok.cnri.reston.va.us>  <199905031602.MAA05829@eric.cnri.reston.va.us>
Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com>

talking about regexps, here's another thing that
would be quite nice to have in 1.6 (available from
the Python level, that is).  or is it already in there
somewhere?

</F>

...

http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873

Tcl 8.1b3 Request:  Generated by Scriptics' bug entry form at

Submitted by:  Frederic BONNET
OperatingSystem:  Windows 98
CustomShell:  Applied patch to the regexp engine (the exec part)
Synopsis:  regexp improvements

DesiredBehavior:
    As previously requested by Don Libes:
    
    > I see no way for Tcl_RegExpExec to indicate "could match" meaning
    > "could match if more characters arrive that were suitable for a
    > match".  This is required for a class of applications involving
    > matching on a stream required by Expect's interact command.  Henry
    > assured me that this facility would be in the engine (I'm not the only
    > one that needs it).  Note that it is not sufficient to add one more
    > return value to Tcl_RegExpExec (i.e., 2) because one needs to know
    > both if something matches now and can match later.  I recommend
    > another argument (canMatch *int) be added to Tcl_RegExpExec.

/patch info follows/

...




From bwarsaw at cnri.reston.va.us  Tue May  4 00:28:23 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 3 May 1999 18:28:23 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us>

I've been using Jitterbug for a couple of weeks now as my bug database
for Mailman and JPython.  So it was easy enough for me to set up a
database for Python bug reports.  Guido is in the process of tailoring 
the Jitterbug web interface to his liking and will announce it to the
appropriate forums when he's ready.

In the meantime, I've created YAML that you might be interested in.
All bug reports entered into Jitterbug will be forwarded to
python-bugs-list at python.org.  You are invited to subscribe to the list 
by visiting

    http://www.python.org/mailman/listinfo/python-bugs-list

Enjoy,
-Barry



From jeremy at cnri.reston.va.us  Tue May  4 00:30:10 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon,  3 May 1999 18:30:10 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
References: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>

Pretty low volume list, eh?



From MHammond at skippinet.com.au  Tue May  4 01:28:39 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Tue, 4 May 1999 09:28:39 +1000
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>
Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat>

ha - we wish.  More likely to be full of detailed bug reports about how 1/2
!= 0.5, or that "def foo(baz=[])" is buggy, etc :-)

Mark.

> Pretty low volume list, eh?




From tim_one at email.msn.com  Tue May  4 07:16:17 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:16:17 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <000701be95ed$3d594180$dca22299@tim>

[Guido & Andrew on Tcl's new regexp code]
> I'm sure that if it's good code, we'll find a way.  Perhaps a more
> interesting question is whether it is Perl5 compatible.  I contacted
> Henry Spencer at the time and he was willing to let us use his code.

Haven't looked at the code, but did read the manpage just now:

    http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm

WRT Perl5 compatibility, it sez:

    Incompatibilities of note include `\b', `\B', the lack of special
    treatment for a trailing newline, the addition of complemented
    bracket expressions to the things affected by newline-sensitive
    matching, the restrictions on parentheses and back references in
    lookahead constraints, and the longest/shortest-match (rather than
    first-match) matching semantics.

So some gratuitous differences, and maybe a killer:  Guido hasn't had much
kind to say about "longest" (aka POSIX) matching semantics.  An example from
the page:

    (week|wee)(night|knights)
    matches all ten characters of `weeknights'

which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
'night'.
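
The Python half of that comparison can be checked with the standard `re`
module, which uses first-match (leftmost alternative) semantics. The
longest-match result needs a POSIX-style engine, which `re` does not provide,
so only one side is shown:

```python
import re

m = re.match(r"(week|wee)(night|knights)", "weeknights")
whole = m.group(0)    # text matched by the entire pattern
parts = m.groups()    # text matched by each parenthesised group
print(whole, parts)
```

The first alternative `week` succeeds, so `wee` is never tried and the match
stops one character short of the full string.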

It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
is correct; indeed, it's a pain to get that behavior any other way!

otoh-it's-potentially-very-much-faster-ly y'rs  - tim





From tim_one at email.msn.com  Tue May  4 07:51:01 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:51:01 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <000701be95ed$3d594180$dca22299@tim>
Message-ID: <000901be95f2$195556c0$dca22299@tim>

[Tim]
> ...
> It's the *natural* semantics if Andrew's suspicion that it's
> compiling a DFA is correct ...

More from the man page:

    AREs report the longest/shortest match for the RE, rather than
    the first found in a specified search order. This may affect some
    RREs which were written in the expectation that the first match
    would be reported. (The careful crafting of RREs to optimize the
    search order for fast matching is obsolete (AREs examine all possible
    matches in parallel, and their performance is largely insensitive to
    their complexity) but cases where the search order was exploited to
    deliberately find a match which was not the longest/shortest will
    need rewriting.)

Nails it, yes?  Now, in 10 seconds, try to remember a regexp where this
really matters <wink>.

Note in passing that IDLE's colorizer regexp *needs* to search for
triple-quoted strings before single-quoted ones, else the P/P semantics
would consider """ to be an empty single-quoted string followed by a double
quote.  This isn't a case where it matters in a bad way, though!  The
"longest" rule picks the correct alternative regardless of the order in
which they're written.
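
The ordering dependence can be reproduced with two deliberately simplified string patterns. These are assumptions made for illustration and are not the regexps IDLE actually uses.

```python
import re

# Simplified stand-ins for the colorizer's string patterns (hypothetical).
TRIPLE = r'"""[\s\S]*?"""'   # triple-quoted string
SINGLE = r'"[^"\n]*"'        # single-line double-quoted string

source = '"""doc"""'

wrong_order = re.compile(SINGLE + "|" + TRIPLE)
right_order = re.compile(TRIPLE + "|" + SINGLE)

# With the single-quoted alternative first, the first two quote characters
# are taken as an empty string and the match stops there.
wrong = wrong_order.match(source).group(0)

# With the triple-quoted alternative first, the whole literal is matched.
right = right_order.match(source).group(0)

print(repr(wrong), repr(right))
```

Under a longest-match rule both orderings would report the full literal, which is the point being made about this particular regexp.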

at-least-in-that-specific-regex<0.1-wink>-ly y'rs  - tim





From guido at CNRI.Reston.VA.US  Tue May  4 14:26:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 08:26:04 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT."
             <000701be95ed$3d594180$dca22299@tim> 
References: <000701be95ed$3d594180$dca22299@tim> 
Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us>

[Tim]
> So some gratuitous differences, and maybe a killer:  Guido hasn't had much
> kind to say about "longest" (aka POSIX) matching semantics.
> 
> An example from the page:
> 
>     (week|wee)(night|knights)
>     matches all ten characters of `weeknights'
> 
> which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
> 'night'.
> 
> It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
> is correct; indeed, it's a pain to get that behavior any other way!

Possibly contradicting what I once said about DFAs (I have no idea
what I said any more :-): I think we shouldn't be hung up about the
subtleties of DFA vs. NFA; for most people, the Perl-compatibility
simply means that they can use the same metacharacters.  My guess is
that people don't so much translate long Perl regexp's to Python but
simply transport their (always incomplete -- Larry Wall *wants* it
that way :-) knowledge of Perl regexps to Python.  My meta-guess is
that this is also Henry Spencer's and John Ousterhout's guess.  As for
Larry Wall, I guess he really doesn't care :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at cnri.reston.va.us  Tue May  4 18:14:41 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Tue,  4 May 1999 12:14:41 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
References: <000701be95ed$3d594180$dca22299@tim>
	<199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Possibly contradicting what I once said about DFAs (I have no idea
>what I said any more :-): I think we shouldn't be hung up about the
>subtleties of DFA vs. NFA; for most people, the Perl-compatibility
>simply means that they can use the same metacharacters.  My guess is

	I don't like slipping in such a change to the semantics with
no visible change to the module name or interface.  On the other hand,
if it's not NFA-based, then it can provide POSIX semantics without
danger of taking exponential time to determine the longest match.
BTW, there's an interesting reference, I assume to this code, in
_Mastering Regular Expressions_; Spencer is quoted on page 121 as
saying it's "at worst quadratic in text size.".

	Anyway, we can let it slide until a Python interface gets written.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In the black shadow of the Baba Yaga babies screamed and mothers miscarried;
milk soured and men went mad.
    -- In SANDMAN #38: "The Hunt"




From guido at CNRI.Reston.VA.US  Tue May  4 18:19:06 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 12:19:06 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT."
             <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us>  
            <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us>

> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

Not sure if that was the same code -- this is *new* code, not
Spencer's old code.  I think Friedl's book is older than the current
code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Wed May  5 07:37:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 01:37:02 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <000701be96b9$4e434460$799e2299@tim>

I've consistently found that the best way to kill a thread is to rename it
accurately <wink>.

Agree w/ Guido that few people really care about the differing semantics.

Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage
anyway:  code will definitely break.  Like

    \b(?:
        (?P<keyword>and|if|else|...) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b

The (special)|(general) idiom relies on left-to-right match-and-out
searching of alternatives to do its job correctly.  Not to mention that \b
is not a word-boundary assertion in the new pkg (talk about pointlessly
irritating differences!  at least this one could be easily hidden via
brainless preprocessing).
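
A runnable sketch of the idiom, assuming the standard `re` module and a keyword list cut down to the three keywords actually named above (the elided rest is left out). The helper name `classify` is made up for the example.

```python
import re

TOKEN = re.compile(r"""
    \b(?:
        (?P<keyword>and|if|else) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b
""", re.VERBOSE)

def classify(word):
    """Return the name of the group that matched the whole word, or None."""
    m = TOKEN.fullmatch(word)
    return m.lastgroup if m else None

# Without the \b anchors the dependence on alternative order is visible:
# the special case wins as soon as it matches, even though the general
# case could have matched more text.
unanchored = re.match(r"(?P<keyword>if)|(?P<identifier>[a-z]+)", "iffy")

print(classify("if"), classify("iffy"), unanchored.group(0))
```

In the unanchored case a longest-match engine would be expected to report `iffy` as an identifier instead, so any code relying on the left-to-right result would see different output.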

Over the long run, moving to a DFA locks Python out of the directions Perl
is *moving*, namely embedding all sorts of runtime gimmicks in regexps that
exploit knowing the "state of the match so far".  DFAs don't work that way.
I don't mind losing those possibilities, because I think the regexp
sublanguage is strained beyond its limits already.  But that's a decision
with Big Consequences, so deserves some thought.

I'd definitely like the (sometimes dramatically) increased speed a DFA can
offer (btw, this code appears to use a lazily-generated DFA, to avoid the
exponential *compile*-time a straightforward DFA implementation can
suffer -- the code is very complex and lacks any high-level internal docs,
so we better hope Henry stays in love with it <0.5 wink>).

> ...
> My guess is that people don't so much translate long Perl regexp's
> to Python but simply transport their (always incomplete -- Larry Wall
> *wants* it that way :-) knowledge of Perl regexps to Python.

This is directly proportional to the number of feeble CGI programmers Python
attracts <wink>.  The good news is that they wouldn't know an NFA from a DFA
if Larry bit Henry on the ass ...

> My meta-guess is that this is also Henry Spencer's and John
> Ousterhout's guess.

I think Spencer strongly favors DFA semantics regardless of fashion, and
Ousterhout is a pragmatist.  So I trust JO's judgment more <0.9 wink>.

> As for Larry Wall, I guess he really doesn't care :-)

I expect he cares a lot!  Because a DFA would prevent Perl from going even
more insane in its present direction.


About the age of the code, postings to comp.lang.tcl have Henry saying he
was working on the alpha version intensely as recently as December ('98).
A few complaints about the alpha release trickled in, about regexp compile
speed and regexp matching speed in specific cases.  Perhaps paradoxically,
the latter were about especially simple regexps with long fixed substrings
(where this mountain of sophisticated machinery is likely to get beat cold
by an NFA with some fixed-substring lookahead smarts -- which latter Henry
intended to graft into this pkg too).

[Andrew]
> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

[Guido]
> Not sure if that was the same code -- this is *new* code, not
> Spencer's old code.  I think Friedl's book is older than the current
> code.

I expect this is an invariant, though:  it's not natural for a DFA to know
where subexpression matches begin and end, and there's a pile of xxx_dissect
functions in regexec.c that use what strongly appear to be worst-case
quadratic-time algorithms for figuring that out after it's known that the
overall expression has *a* match.  Expect too, but don't know, that only
pathological cases are actually expensive.


Question:  has this package been released in any other context, or is it
unique to Tcl?  I searched in vain for an announcement (let alone code) from
Henry, or any discussion of this code outside the Tcl world.

whatever-happens-i-vote-we-let-them-debug-it<wink>-ly y'rs  - tim





From gstein at lyra.org  Wed May  5 08:22:20 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 4 May 1999 23:22:20 -0700 (PDT)
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <000701be96b9$4e434460$799e2299@tim>
Message-ID: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>

On Wed, 5 May 1999, Tim Peters wrote:
>...
> Question:  has this package been released in any other context, or is it
> unique to Tcl?  I searched in vain for an announcement (let alone code) from
> Henry, or any discussion of this code outside the Tcl world.

Apache uses it.

However, the Apache guys have considered possibly updating the thing. I
gather that they have a pretty old snapshot. Another guy mentioned PCRE
and I pointed out that Python uses it for its regex support. In other
words, if Apache *does* update the code, then it may be that Apache will
drop the HS engine in favor of PCRE.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/





From Ivan.Porres at abo.fi  Wed May  5 10:29:21 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 11:29:21 +0300
Subject: [Python-Dev] Python for Small Systems patch
Message-ID: <37300161.8DFD1D7F@abo.fi>

Python for Small Systems is a minimal version of the python interpreter,
intended to run on small embedded systems with a limited amount of
memory. 

Since there is some interest in the newsgroup, we have decided to release
an alpha version of the patch. You can download the patch from the
following page: 

http://www.abo.fi/~iporres/python

There is no documentation about the changes, but I guess that it is not
so difficult to figure out what Raul has been doing. 

There are some simple examples in the Demo/hitachi directory. The
configure scripts are broken. We plan to modify the configure scripts 
for cross-compilation. We are still testing, cleaning
and trying to reduce the memory requirements of the patched interpreter.
We also plan to write some documentation.

Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi),

Regards,
Ivan


-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033
Lemminkäinengatan 14A
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 13:52:24 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 13:52:24 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi>
Message-ID: <373030F8.21B73451@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Python for Small Systems is a minimal version of the python interpreter,
> intended to run on small embedded systems with a limited amount of
> memory.
> 
> Since there is some interest in the newsgroup, we have decided to release
> an alpha version of the patch. You can download the patch from the
> following page:
> 
> http://www.abo.fi/~iporres/python
> 
> There is no documentation about the changes, but I guess that it is not
> so difficult to figure out what Raul has been doing.

Ivan,
small Python is a very interesting thing,
thanks for the preview.

But, aren't 12600 lines of diff a little too much
to call it "not difficult to figure out"? :-)

The very last line was indeed helpful:

+++ Pss/miniconfigure	Tue Mar 16 16:59:42 1999
@@ -0,0 +1 @@
+./configure --prefix="/home/rparra/python/Python-1.5.1"
--without-complex --without-float --without-long --without-file
--without-libm --without-libc --without-fpectl --without-threads
--without-dec-threads --with-libs=

But I'd be interested in a brief list
of which other features are out, and even more which
structures were changed. Would that be possible?

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From Ivan.Porres at abo.fi  Wed May  5 15:17:17 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 16:17:17 +0300
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com>
Message-ID: <373044DD.FE4499E@abo.fi>

Christian Tismer wrote:
> Ivan,
> small Python is a very interesting thing,
> thanks for the preview.
> 
> But, aren't 12600 lines of diff a little too much
> to call it "not difficult to figure out"? :-)

Raul Parra (rpb), the author of the patch, got the "source scissors"
(#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
embedded system with some RAM, no keyboard, no screen and no OS. An
example application can be a printer where the print jobs are python
bytecompiled scripts (instead of postscript).

We plan to write some documentation about the patch. Meanwhile, here are
some of the changes:

WITHOUT_PARSER, WITHOUT_COMPILER
Defining WITHOUT_PARSER removes the parser. This has a lot of
implications (no eval() !) but saves a lot of memory. The interpreter
can only execute byte-compiled scripts, that is PyCodeObjects. 

Most embedded processors have poor floating point capabilities. (They
can not compete with DSP's):

WITHOUT-COMPLEX
Removes support for complex numbers

WITHOUT-LONG
Removes long numbers

WITHOUT-FLOAT
Removes floating point numbers

Dependencies on the OS:

WITHOUT-FILE
Removes file objects. No file, no print, no input, no interactive
prompt. This is not too bad in a device without hard disk, keyboard or
screen...

WITHOUT-GETPATH
Removes dependencies on os path. (Probably this change should be
integrated with WITHOUT-FILE)

These changes render most of the standard modules unusable.
There are no fundamental changes to the interpreter, just cut and cut....

Ivan
-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033
Lemminkäinengatan 14A
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 15:31:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 15:31:05 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi>
Message-ID: <37304819.AD636B67@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Christian Tismer wrote:
> > Ivan,
> > small Python is a very interesting thing,
> > thanks for the preview.
> >
> > But, aren't 12600 lines of diff a little too much
> > to call it "not difficult to figure out"? :-)
> 
> Raul Parra (rpb), the author of the patch, got the "source scissors"
> (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
> embedded system with some RAM, no keyboard, no screen and no OS. An
> example application can be a printer where the print jobs are python
> bytecompiled scripts (instead of postscript).
> 
> We plan to write some documentation about the patch. Meanwhile, here are
> some of the changes:

Many thanks, this is really interesting

> These changes render most of the standard modules unusable.
> There are no fundamental changes to the interpreter, just cut and cut....

I see. A last thing which I'm curious about is the executable
size. If this can be compared to a Windows dll at all. Did you 
compile without the changes for your target as well? 
How is the ratio? The python15.dll file contains everything
of core Python and is about 560 KB large.
If your engine goes down to, say below 200 KB, this could
be a great thing for embedding Python into other apps.

ciao & thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Wed May  5 16:55:40 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 5 May 1999 10:55:40 -0400 (EDT)
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
References: <199905041226.IAA07627@eric.cnri.reston.va.us>
	<000701be96b9$4e434460$799e2299@tim>
Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters <tim_one at email.msn.com> writes:

    TP> Over the long run, moving to a DFA locks Python out of the
    TP> directions Perl is *moving*, namely embedding all sorts of
    TP> runtime gimmicks in regexps that exploit knowing the "state of
    TP> the match so far".  DFAs don't work that way.  I don't mind
    TP> losing those possibilities, because I think the regexp
    TP> sublanguage is strained beyond its limits already.  But that's
    TP> a decision with Big Consequences, so deserves some thought.

I know zip about the internals of the various regexp packages.  But as
far as the Python level interface, would it be feasible to support
both as underlying regexp engines underneath re.py?  The idea would be 
that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
Then all the rest of the magic happens behind the scenes, with
appropriate exceptions thrown if there are syntax mismatches in the
regexp that can't be worked around by preprocessors, etc.

Or would that be more confusing than yet another different regexp
module?
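
One way such a front end could look, as a rough sketch only: the function name `compile_with_engine`, the `engine` keyword and the `ENGINES` table are all hypothetical and are not part of `re`. Only the backtracking engine that ships with Python is wired in; the longest-match slot is a placeholder.

```python
import re

def _compile_backtracking(pattern, flags):
    # Delegates to the engine that ships with Python.
    return re.compile(pattern, flags)

def _compile_longest(pattern, flags):
    # Placeholder: no longest-match (POSIX-style) engine is bundled here.
    raise NotImplementedError("no longest-match engine in this sketch")

# Hypothetical registry mapping an engine name to its compiler.
ENGINES = {
    "perl": _compile_backtracking,
    "posix": _compile_longest,
}

def compile_with_engine(pattern, flags=0, engine="perl"):
    """Compile pattern with the selected engine (illustrative API only)."""
    try:
        backend = ENGINES[engine]
    except KeyError:
        raise ValueError("unknown regexp engine: %r" % (engine,)) from None
    return backend(pattern, flags)

print(compile_with_engine("a|ab").match("ab").group(0))
```

A real version would also need the syntax preprocessing and mismatch errors mentioned above; the sketch only shows the dispatch.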

-Barry



From tim_one at email.msn.com  Wed May  5 17:55:20 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 11:55:20 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>
Message-ID: <000601be970f$adef5740$a59e2299@tim>

[Tim]
> Question:  has this package [Tcl's 8.1 regexp support] been released in
> any other context, or is it unique to Tcl?  I searched in vain for an
> announcement (let alone code) from Henry, or any discussion of this code
> outside the Tcl world.

[Greg Stein]
> Apache uses it.
>
> However, the Apache guys have considered possibly updating the thing. I
> gather that they have a pretty old snapshot. Another guy mentioned PCRE
> and I pointed out that Python uses it for its regex support. In other
> words, if Apache *does* update the code, then it may be that Apache will
> drop the HS engine in favor of PCRE.

Hmm.  I just downloaded the Apache 1.3.4 source to check on this, and it
appears to be using a lightly massaged version of Spencer's old (circa
'92-'94) just-POSIX regexp package.  Henry has been distributing regexp pkgs
for a loooong time <wink>.

The Tcl 8.1 regexp pkg is much hairier.  If the Apache folk want to switch
in order to get the Perl regexp syntax extensions, this Tcl version is worth
looking at too.  If they want to switch for some other reason, it would be
good to know what that is!

The base pkg Apache uses is easily available all over the web; the pkg Tcl
8.1 is using I haven't found anywhere except in the Tcl download (which is
why I'm wondering about it -- so far, it doesn't appear to be distributed by
Spencer himself, in a non-Tcl-customized form).

looks-like-an-entirely-new-pkg-to-me-ly y'rs  - tim





From beazley at cs.uchicago.edu  Wed May  5 18:54:45 1999
From: beazley at cs.uchicago.edu (David Beazley)
Date: Wed, 5 May 1999 11:54:45 -0500 (CDT)
Subject: [Python-Dev] My (possibly delusional) book project
Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu>

Although this is a little off-topic for the developer list, I want to
fill people in on a new Python book project.  A few months ago, 
I was approached about doing a new Python reference book and I've
since decided to proceed with the project (after all, an increased
presence at the bookstore is probably a good thing :-).

In any event, my "vision" for this book is to take the material in the
Python tutorial, language reference, library reference, and extension
guide and squeeze it into a compact book no longer than 300 pages (and
hopefully without having to use a 4-point font).  Actually, what I'm
really trying to do is write something in a style similar to the K&R C
Programming book (very terse, straight to the point, and technically
accurate). The book's target audience is experienced/expert
programmers.

With this said, I would really like to get feedback from the developer
community about this project in a few areas.  First, I want to make
sure the language reference is in sync with the latest version of
Python, that it is as accurate as possible, and that it doesn't leave
out any important topics or recent developments.  Second, I would be
interested in knowing how to emphasize certain topics (for instance,
should I emphasize class-based exceptions over string-based exceptions
even though most books only cover the latter case?).  The other big
area is the library reference.  Given the size of the library, I'm
going to cut a number of modules out.  However, the choice of what to
cut is not entirely clear (for now, it's a judgment call on my part).

All of the work in progress for this project is online at:

   http://rustler.cs.uchicago.edu/~beazley/essential/reference.html

I would love to get constructive feedback about this from other
developers.  Of course, I'll keep people posted in any case.

Cheers,

Dave




From tim_one at email.msn.com  Thu May  6 07:43:16 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 6 May 1999 01:43:16 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us>
Message-ID: <000d01be9783$57543940$2ca22299@tim>

[Tim notes that moving to a DFA regexp engine would rule out some future
 aping of Perl mistakes <wink>]

[Barry "The Great Compromiser" Warsaw]
> I know zip about the internals of the various regexp packages.  But as
> far as the Python level interface, would it be feasible to support
> both as underlying regexp engines underneath re.py?  The idea would be
> that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
> re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
> Then all the rest of the magic happens behind the scenes, with
> appropriate exceptions thrown if there are syntax mismatches in the
> regexp that can't be worked around by preprocessors, etc.
>
> Or would that be more confusing than yet another different regexp
> module?

It depends some on what percentage of the Python distribution Guido wants to
devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of
code in Modules/, where regexp packages already consume more than anything
else.

It's a lot of delicate, difficult code.  Someone would need to step up and
champion each alternative package.  I haven't asked Andrew lately, but I'd
bet half a buck the thrill of supporting pcre has waned.

If there were competing packages, your suggested interface is fine.  I just
doubt the Python developers will support more than one (Andrew may still be
young, but he can't possibly still be naive enough to sign up for two of
these nightmares <wink>).

i'm-so-old-i-never-signed-up-for-one-ly y'rs  - tim





From rushing at nightmare.com  Thu May 13 08:34:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 12 May 1999 23:34:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905070507.BAA22545@python.org>
References: <199905070507.BAA22545@python.org>
Message-ID: <14138.28243.553816.166686@seattle.nightmare.com>

[list has been quiet, thought I'd liven things up a bit. 8^)]

I'm not sure if this has been brought up before in other forums, but
has there been discussion of separating the Python and C invocation
stacks, (i.e., removing recursive calls to the interpreter) to
facilitate coroutines or first-class continuations?

One of the biggest barriers to getting others to use asyncore/medusa
is the need to program in continuation-passing-style (callbacks,
callbacks to callbacks, state machines, etc...).  Usually there has to
be an overriding requirement for speed/scalability before someone will
even look into it.  And even when you do 'get' it, there are limits to
how inside-out your thinking can go. 8^)

If Python had coroutines/continuations, it would be possible to hide
asyncore-style select()/poll() machinery 'behind the scenes'.  I
believe that Concurrent ML does exactly this...

Other advantages might be restartable exceptions, different threading
models, etc...

-Sam
rushing at nightmare.com
rushing at eGroups.net




From mal at lemburg.com  Thu May 13 10:23:13 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 13 May 1999 10:23:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <373A8BF1.AE124BF@lemburg.com>

rushing at nightmare.com wrote:
> 
> [list has been quiet, thought I'd liven things up a bit. 8^)]

Well, there certainly is enough on the todo list... it's probably
the usual "ain't got no time" thing.

> I'm not sure if this has been brought up before in other forums, but
> has there been discussion of separating the Python and C invocation
> stacks, (i.e., removing recursive calls to the interpreter) to
> facilitate coroutines or first-class continuations?

Wouldn't it be possible to move all the C variables passed to
eval_code() via the execution frame ? AFAIK, the frame is
generated on every call to eval_code() and thus could also
be generated *before* calling it.

> One of the biggest barriers to getting others to use asyncore/medusa
> is the need to program in continuation-passing-style (callbacks,
> callbacks to callbacks, state machines, etc...).  Usually there has to
> be an overriding requirement for speed/scalability before someone will
> even look into it.  And even when you do 'get' it, there are limits to
> how inside-out your thinking can go. 8^)
> 
> If Python had coroutines/continuations, it would be possible to hide
> asyncore-style select()/poll() machinery 'behind the scenes'.  I
> believe that Concurrent ML does exactly this...
> 
> Other advantages might be restartable exceptions, different threading
> models, etc...

Don't know if moving the C stack stuff into the frame objects
will get you the desired effect: what about other things having
state (e.g. connections or files), that are not even touched
by this mechanism ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 232 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From rushing at nightmare.com  Thu May 13 11:40:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 02:40:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373A8BF1.AE124BF@lemburg.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<373A8BF1.AE124BF@lemburg.com>
Message-ID: <14138.38550.89759.752058@seattle.nightmare.com>

M.-A. Lemburg writes:

 > Wouldn't it be possible to move all the C variables passed to
 > eval_code() via the execution frame ? AFAIK, the frame is
 > generated on every call to eval_code() and thus could also
 > be generated *before* calling it.

I think this solves half of the problem.  The C stack is both a value
stack and an execution stack (i.e., it holds variables and return
addresses).  Getting rid of arguments (and a return value!) gets rid
of the need for the 'value stack' aspect.

In aiming for an enter-once, exit-once VM, the thorniest part is to
somehow allow python->c->python calls.  The second invocation could
never save a continuation because its execution context includes a C
frame.  This is a general problem, not specific to Python; I probably
should have thought about it a bit before posting...

 > Don't know if moving the C stack stuff into the frame objects
 > will get you the desired effect: what about other things having
 > state (e.g. connections or files), that are not even touched
 > by this mechanism ?

I don't think either of those cause 'real' problems (i.e., nothing
should crash that assumes an open file or socket), but there may be
other stateful things that might.  I don't think that refcounts would
be a problem - a saved continuation wouldn't be all that different
from an exception traceback.

-Sam

p.s. Here's a tiny VM experiment I wrote a while back, to explain
what I mean by 'stackless':

http://www.nightmare.com/stuff/machine.h
http://www.nightmare.com/stuff/machine.c

Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context
onto heap-allocated data structures rather than calling the VM
recursively.
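
The same idea can be sketched in a few lines of Python. This is a toy written for illustration and is not a translation of machine.c: the opcode names, the `Frame` class and the `run` function are all made up here. A call pushes a heap-allocated frame onto an explicit list and the single loop carries on, so the host stack never grows with the depth of the interpreted program.

```python
class Frame:
    """One activation record, kept on the heap rather than the host stack."""
    def __init__(self, code, args):
        self.code = code
        self.args = args
        self.pc = 0
        self.stack = []

def run(functions, name, args):
    """Enter-once, exit-once loop: calls never recurse into run()."""
    frames = [Frame(functions[name], list(args))]
    while True:
        frame = frames[-1]
        op = frame.code[frame.pc]
        frame.pc += 1
        kind = op[0]
        if kind == "push":
            frame.stack.append(op[1])
        elif kind == "load":
            frame.stack.append(frame.args[op[1]])
        elif kind == "add":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a + b)
        elif kind == "sub":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a - b)
        elif kind == "jz":
            if frame.stack.pop() == 0:
                frame.pc = op[1]
        elif kind == "call":
            # Move the arguments into a new frame instead of recursing.
            split = len(frame.stack) - op[2]
            call_args = frame.stack[split:]
            del frame.stack[split:]
            frames.append(Frame(functions[op[1]], call_args))
        elif kind == "ret":
            value = frame.stack.pop()
            frames.pop()
            if not frames:
                return value
            frames[-1].stack.append(value)
        else:
            raise ValueError("unknown opcode: %r" % (kind,))

# sum_to(n) = 0 if n == 0 else n + sum_to(n - 1)
FUNCTIONS = {
    "sum_to": [
        ("load", 0), ("jz", 9),
        ("load", 0), ("load", 0), ("push", 1), ("sub",),
        ("call", "sum_to", 1), ("add",), ("ret",),
        ("push", 0), ("ret",),
    ],
}

print(run(FUNCTIONS, "sum_to", [5]))
```

Because the pending frames are ordinary objects in a list, the list could in principle be copied or swapped, which is what makes coroutines and continuations approachable in this style. The sketch does not attempt that, and it sidesteps the python->c->python problem entirely because it has no foreign calls.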




From skip at mojam.com  Thu May 13 13:38:39 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 13 May 1999 07:38:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>

    Sam> I'm not sure if this has been brought up before in other forums,
    Sam> but has there been discussion of separating the Python and C
    Sam> invocation stacks, (i.e., removing recursive calls to the
    Sam> interpreter) to facilitate coroutines or first-class continuations?

I thought Guido was working on that for the mobile agent stuff he was
working on at CNRI.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From bwarsaw at cnri.reston.va.us  Thu May 13 17:10:52 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 13 May 1999 11:10:52 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> I thought Guido was working on that for the mobile agent stuff
    SM> he was working on at CNRI.

Nope, we decided that we could accomplish everything we needed without 
this.  We occasionally revisit this but Guido keeps insisting it's a
lot of work for not enough benefit :-)

-Barry



From guido at CNRI.Reston.VA.US  Thu May 13 17:19:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 13 May 1999 11:19:10 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT."
             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>  
            <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us>

Interesting topic!  While I'm on the road, a few short notes.

> I thought Guido was working on that for the mobile agent stuff he was
> working on at CNRI.

Indeed.  At least I planned on working on it.  I ended up abandoning
the idea because I expected it would be a lot of work and I never had
the time (same old story indeed).

Sam also hit it on the nail: the hardest problem is what to do about
all the places where C calls back into Python.

I've come up with two partial solutions: (1) allow for a way to
arrange for a call to be made immediately after you return to the VM
from C; this would take care of apply() at least and a few other
"tail-recursive" cases; (2) invoke a new VM when C code needs a Python
result, requiring it to return.  The latter clearly breaks certain
uses of coroutines but could probably be made to work most of the
time.  Typical use of the 80-20 rule.

And I've just come up with a third solution: a variation on (1) where
you arrange *two* calls: one to Python and then one to C, with the
result of the first.  (And a bit saying whether you want the C call to 
be made even when an exception happened.)
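
One possible reading of solutions (1) and (3), sketched in present-day
Python.  Everything here is hypothetical: TailCall, vm_loop and the
field names are made up for illustration and do not correspond to any
real CPython API.  "C code" is modelled as a function that returns a
request object instead of calling back into the interpreter itself.

```python
# Illustrative only: TailCall, vm_loop and the field names are hypothetical.
import operator
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class TailCall:
    func: Callable[..., Any]             # call to make once control is back in the loop
    args: tuple = ()
    then: Optional[Callable] = None      # follow-up "C" call: then(value, error)
    call_then_on_error: bool = False     # the extra bit from the third variation


def vm_loop(request):
    """Perform requested calls from this one loop instead of nesting them."""
    pending = []                         # requests whose follow-up is still waiting
    value, error = request, None
    while True:
        if error is None and isinstance(value, TailCall):
            call = value
            if call.then is not None:
                pending.append(call)
            try:
                value = call.func(*call.args)
            except Exception as exc:
                value, error = None, exc
            continue
        if not pending:
            if error is not None:
                raise error
            return value
        call = pending.pop()
        if error is not None and not call.call_then_on_error:
            continue                     # skip this follow-up and keep unwinding
        try:
            value, error = call.then(value, error), None
        except Exception as exc:
            value, error = None, exc


def apply_later(func, args):
    """A stand-in for apply(): ask the loop to make the call, do not recurse."""
    return TailCall(func, args)


five = vm_loop(apply_later(operator.add, (2, 3)))
fifty = vm_loop(TailCall(operator.add, (2, 3), then=lambda v, e: v * 10))
```

The second request shows the two-call variation: the loop first calls
the function, then passes the result to the follow-up.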

In general, I still think it's a cool idea, but I also still think
that continuations are too complicated for most programmers.  (This
comes from the realization that they are too complicated for me!)
Corollary: even if we had continuations, I'm not sure if this would
take away the resistance against asyncore/asynchat.  Of course I could 
be wrong.

Different suggestion: it would be cool to work on completely
separating out the VM from the rest of Python, through some kind of
C-level API specification.  Two things should be possible with this
new architecture: (1) small platform ports could cut out the
interactive interpreter, the parser and compiler, and certain data
types such as long, complex and files; (2) there could be alternative
pluggable VMs with certain desirable properties such as
platform-specific optimization (Christian, are you listening? :-).

I think the most challenging part might be defining an API for passing 
in the set of supported object types and operations.  E.g. the
EXEC_STMT opcode needs to be implemented in a way that allows
"exec" to be absent from the language.  Perhaps an __exec__ function
(analogous to __import__) is the way to go.  The set of built-in
functions should also be passed in, so that e.g. one can easily leave
out open(), eval(), compile(), complex(), long(), float(), etc.
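
A small sketch of the "pass the built-ins in" idea using today's
exec(), which accepts a caller-supplied __builtins__ entry in the
globals dictionary.  The helper names make_builtins and run_small are
invented for this example, and the excluded list is only a sample.
This is not a security sandbox; it only shows that the set of
built-in names can be supplied from outside.

```python
# Sketch only: shows the "pass the builtins in" idea with today's exec().
# It is NOT a security sandbox.  make_builtins/run_small are made-up names.
import builtins


def make_builtins(excluded):
    """Return a dict of built-in names with the `excluded` names left out."""
    return {name: value for name, value in vars(builtins).items()
            if name not in excluded}


def run_small(source, excluded=("open", "eval", "compile", "complex", "float")):
    """Execute `source` in a namespace whose builtins omit `excluded`."""
    namespace = {"__builtins__": make_builtins(excluded)}
    exec(source, namespace)
    return namespace


ns = run_small("x = len('abc')")        # len() is still available

try:
    run_small("open('somefile')")       # open() was left out
    open_was_missing = False
except NameError:
    open_was_missing = True
```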

I think it would be ideal if no #ifdefs were needed to remove features
(at least not in the VM code proper).  Fortunately, the VM doesn't
really know about many object types -- frames, functions, methods,
classes, ints, strings, dictionaries, tuples, tracebacks, that may be
all it knows.  (Lists?)

Gotta run,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Thu May 13 21:50:44 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 13 May 1999 21:50:44 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>  <199905131519.LAA01097@eric.cnri.reston.va.us>
Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com>

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers.  (This
> comes from the realization that they are too complicated for me!)

in an earlier life, I used non-preemptive threads (that is,
explicit yields) and co-routines to do some really cool
stuff with very little code.  looks like a stack-less
interpreter would make it trivial to implement that.

might just be nostalgia, but I think I would give an arm
or two to get that (not necessarily my own, though ;-)

</F>




From rushing at nightmare.com  Fri May 14 04:00:09 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 19:00:09 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>

Guido van Rossum writes:
  > I've come up with two partial solutions: (1) allow for a way to
  > arrange for a call to be made immediately after you return to the
  > VM from C; this would take care of apply() at least and a few
  > other "tail-recursive" cases; (2) invoke a new VM when C code
  > needs a Python result, requiring it to return.  The latter clearly
  > breaks certain uses of coroutines but could probably be made to
  > work most of the time.  Typical use of the 80-20 rule.

I know this is disgusting, but could setjmp/longjmp 'automagically'
force a 'recursive call' to jump back into the top-level loop?  This
would put some serious restraint on what C called from Python could
do...

I think just about any Scheme implementation has to solve this same
problem... I'll dig through my collection of them for ideas.

  > In general, I still think it's a cool idea, but I also still think
  > that continuations are too complicated for most programmers.  (This
  > comes from the realization that they are too complicated for me!)
  > Corollary: even if we had continuations, I'm not sure if this would
  > take away the resistance against asyncore/asynchat.  Of course I could 
  > be wrong.

Theoretically, you could have a bit of code that looked just like
'normal' imperative code, that would actually be entering and exiting
the context for non-blocking i/o.  If it were done right, the same
exact code might even run under 'normal' threads.

Recently I've written an async server that needed to talk to several
other RPC servers, and a mysql server.  Pseudo-example, with
possibly-async calls in UPPERCASE:

  auth, archive = db.FETCH_USER_INFO (user)
  if verify_login(user,auth):
    rpc_server = self.archive_servers[archive]
    group_info = rpc_server.FETCH_GROUP_INFO (group)
    if valid (group_info):
      return rpc_server.FETCH_MESSAGE (message_number)
    else:
      ...
  else:
    ...

This code in CPS is a horrible, complicated mess, it takes something
like 8 callback methods, variables and exceptions have to be passed
around in 'continuation' objects.  It's hairy because there are three
levels of callback state.  Ugh.
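
For comparison, a sketch of how the straight-line version can be
written with coroutines in present-day Python (asyncio).  All names and
data below (fetch_user_info, fetch_group_info, fetch_message, the
USERS/GROUPS/MESSAGES tables) are hypothetical stand-ins for the
database and RPC servers of the pseudo-example; asyncio.sleep(0) merely
marks the places where a real server would wait on the network.

```python
import asyncio

# Hypothetical stand-ins for the database and the RPC archive servers.
USERS = {"sam": ("secret", "archive-1")}
GROUPS = {"archive-1": {"python-dev": {"open": True}}}
MESSAGES = {"archive-1": {42: "hello"}}


async def fetch_user_info(user):
    await asyncio.sleep(0)          # stands in for a non-blocking DB round trip
    return USERS[user]


async def fetch_group_info(archive, group):
    await asyncio.sleep(0)          # stands in for an RPC round trip
    return GROUPS[archive].get(group)


async def fetch_message(archive, number):
    await asyncio.sleep(0)
    return MESSAGES[archive][number]


async def get_message(user, password, group, number):
    """Reads top to bottom like blocking code; each await may switch tasks."""
    auth, archive = await fetch_user_info(user)
    if password != auth:
        raise PermissionError("bad login")
    group_info = await fetch_group_info(archive, group)
    if not group_info:
        raise LookupError("unknown group")
    return await fetch_message(archive, number)


message = asyncio.run(get_message("sam", "secret", "python-dev", 42))
```

Local variables and exceptions stay in one function body, which is the
property the callback version loses.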

If Python had closures, then it would be a *little* easier, but would
still make the average Pythoneer swoon.  Closures would let you put
the above logic all in one method, but the code would still be
'inside-out'.

  > Different suggestion: it would be cool to work on completely
  > separating out the VM from the rest of Python, through some kind of
  > C-level API specification.

I think this is a great idea.  I've been staring at python bytecodes a
bit lately thinking about how to do something like this, for some
subset of Python.

[...]

Ok, we've all seen the 'stick'.  I guess I should give an example of
the 'carrot': I think that a web server built on such a Python could
have the performance/scalability of thttpd, with the
ease-of-programming of Roxen.  As far as I know, there's nothing like
it out there.  Medusa would be put out to pasture. 8^)

-Sam




From guido at CNRI.Reston.VA.US  Fri May 14 14:03:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 08:03:31 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT."
             <14139.30970.644343.612721@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
            <14139.30970.644343.612721@seattle.nightmare.com> 
Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us>

> I know this is disgusting, but could setjmp/longjmp 'automagically'
> force a 'recursive call' to jump back into the top-level loop?  This
> would put some serious restraint on what C called from Python could
> do...

Forget about it.  setjmp/longjmp are invitations to problems.  I also
assume that they would interfere badly with C++.

> I think just about any Scheme implementation has to solve this same
> problem... I'll dig through my collection of them for ideas.

Anything that assumes knowledge about how the C compiler and/or the
CPU and OS lay out the stack is a no-no, because it means that the
first thing one has to do for a port to a new architecture is figure
out how the stack is laid out.  Another thread in this list is porting 
Python to microplatforms like PalmOS.  Typically the Scheme hackers
are not afraid to delve deep into the machine, but I refuse to do that
-- I think it's too risky.

>   > In general, I still think it's a cool idea, but I also still think
>   > that continuations are too complicated for most programmers.  (This
>   > comes from the realization that they are too complicated for me!)
>   > Corollary: even if we had continuations, I'm not sure if this would
>   > take away the resistance against asyncore/asynchat.  Of course I could 
>   > be wrong.
> 
> Theoretically, you could have a bit of code that looked just like
> 'normal' imperative code, that would actually be entering and exiting
> the context for non-blocking i/o.  If it were done right, the same
> exact code might even run under 'normal' threads.

Yes -- I remember in '92 or '93 I worked out a way to emulate coroutines
with regular threads.  (I think in cooperation with Steve Majewski.)
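
A minimal sketch of how such an emulation can look in present-day
Python.  It is not a reconstruction of the code referred to above; the
class ThreadCoroutine and its suspend()/resume() methods are invented
for this illustration.  Each coroutine body runs in its own thread, and
two one-slot queues make sure only one side runs at a time.

```python
import queue
import threading


class ThreadCoroutine:
    """Run body(co, first_value) in its own thread, one side at a time."""

    _DONE = object()                    # sentinel: the body has returned

    def __init__(self, body):
        self._to_co = queue.Queue(maxsize=1)        # caller -> coroutine
        self._to_caller = queue.Queue(maxsize=1)    # coroutine -> caller
        self._thread = threading.Thread(
            target=self._run, args=(body,), daemon=True)
        self._started = False
        self.finished = False

    def _run(self, body):
        first = self._to_co.get()       # block until the first resume()
        try:
            body(self, first)
        finally:
            # Always wake the caller, even if the body raised.
            self._to_caller.put(self._DONE)

    def suspend(self, value):
        """Called inside the body: hand `value` to the caller and wait."""
        self._to_caller.put(value)
        return self._to_co.get()

    def resume(self, value=None):
        """Called by the caller: run the body until it suspends or ends."""
        if self.finished:
            raise RuntimeError("coroutine already finished")
        if not self._started:
            self._started = True
            self._thread.start()
        self._to_co.put(value)
        result = self._to_caller.get()
        if result is self._DONE:
            self.finished = True
            return None
        return result


def counter(co, start):
    """Example body: count upward by whatever step the caller sends in."""
    n = start
    while n < start + 3:
        step = co.suspend(n)
        n += step or 1


co = ThreadCoroutine(counter)
values = [co.resume(10), co.resume(1), co.resume()]
```

One limitation of this sketch: an exception raised in the body is not
propagated to the caller, which only sees the coroutine finish.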

> Recently I've written an async server that needed to talk to several
> other RPC servers, and a mysql server.  Pseudo-example, with
> possibly-async calls in UPPERCASE:
> 
>   auth, archive = db.FETCH_USER_INFO (user)
>   if verify_login(user,auth):
>     rpc_server = self.archive_servers[archive]
>     group_info = rpc_server.FETCH_GROUP_INFO (group)
>     if valid (group_info):
>       return rpc_server.FETCH_MESSAGE (message_number)
>     else:
>       ...
>   else:
>     ...
> 
> This code in CPS is a horrible, complicated mess, it takes something
> like 8 callback methods, variables and exceptions have to be passed
> around in 'continuation' objects.  It's hairy because there are three
> levels of callback state.  Ugh.

Agreed.

> If Python had closures, then it would be a *little* easier, but would
> still make the average Pythoneer swoon.  Closures would let you put
> the above logic all in one method, but the code would still be
> 'inside-out'.

I forget how this worked :-(

>   > Different suggestion: it would be cool to work on completely
>   > separating out the VM from the rest of Python, through some kind of
>   > C-level API specification.
> 
> I think this is a great idea.  I've been staring at python bytecodes a
> bit lately thinking about how to do something like this, for some
> subset of Python.
> 
> [...]
> 
> Ok, we've all seen the 'stick'.  I guess I should give an example of
> the 'carrot': I think that a web server built on such a Python could
> have the performance/scalability of thttpd, with the
> ease-of-programming of Roxen.  As far as I know, there's nothing like
> it out there.  Medusa would be put out to pasture. 8^)

I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
they do that Apache doesn't?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Fri May 14 15:16:13 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 14 May 1999 15:16:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>             <14139.30970.644343.612721@seattle.nightmare.com>  <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>

> I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
> they do that Apache doesn't?

http://www.roxen.com/

a lean and mean secure web server written in Pike
(http://pike.idonex.se/), from a company here in
Linköping.

</F>




From tismer at appliedbiometrics.com  Fri May 14 17:15:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 14 May 1999 17:15:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
	            <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com>


Guido van Rossum wrote:

[setjmp/longjmp -no-no]

> Forget about it.  setjmp/longjmp are invitations to problems.  I also
> assume that they would interfere badly with C++.
> 
> > I think just about any Scheme implementation has to solve this same
> > problem... I'll dig through my collection of them for ideas.
> 
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the Scheme hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.
...

I agree that this is generally bad, although it's a cakewalk
to do a stack swap on the few (X86-based :) platforms that
I work with. Such a swap is much less than a thread switch.

But on the general issues:
Can the Python-calls-C and C-calls-Python problem just be solved
by turning the whole VM state into a data structure, including
a Python call stack which is independent? Maybe this has been
mentioned already.

This might give a little slowdown, but opens possibilities
like continuation-passing style, and context switches
between different interpreter states would be under direct
control.

Just a little dreaming: Not using threads, but just tiny
interpreter incarnations with local state, and a special
C call or better a new opcode which activates the next
state in some list (of course a Python list).
This would automagically produce ICON iterators (duck)
and coroutines (cover).
If I guess right, continuation passing could be done
by just shifting tiny tuples around. Well, Tim, help me :-)
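
The "activate the next state in some list" idea can be sketched with
generators as they exist in present-day Python.  This is only an
illustration of the scheduling pattern, not of any proposed opcode;
worker and run_round_robin are made-up names.

```python
from collections import deque


def worker(name, steps, log):
    """A tiny task: record a line, then give up control, `steps` times."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                       # explicit switch to the next task


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)              # run the task up to its next yield
        except StopIteration:
            continue                # finished: drop it from the list
        ready.append(task)          # still alive: go to the back of the line


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

Each task keeps its own local state between switches, and no threads
are involved.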

[closures]

> > I think this is a great idea.  I've been staring at python bytecodes a
> > bit lately thinking about how to do something like this, for some
> > subset of Python.

Lumberjack? How is it going? [to Sam]

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Fri May 14 17:32:51 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 14 May 1999 11:32:51 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
	<001701be9e0b$f1bc4930$f29b12c2@pythonware.com>
Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

    FL> a lean and mean secure web server written in Pike
    FL> (http://pike.idonex.se/), from a company here in
    FL> Linköping.

Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
originally came on board to add Pike support, which has a syntax similar 
enough to C to be easily integrated.  I think I've had as much success 
convincing him to use Python as he's had convincing me to use Pike :-)

-Barry



From gstein at lyra.org  Fri May 14 23:54:02 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 14 May 1999 14:54:02 -0700
Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?)
References: <199905070507.BAA22545@python.org>
		<14138.28243.553816.166686@seattle.nightmare.com>
		<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
		<14138.60284.584739.711112@anthem.cnri.reston.va.us>
		<14139.30970.644343.612721@seattle.nightmare.com>
		<199905141203.IAA01808@eric.cnri.reston.va.us>
		<001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us>
Message-ID: <373C9B7A.3676A910@lyra.org>

Barry A. Warsaw wrote:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:
> 
>     FL> a lean and mean secure web server written in Pike
>     FL> (http://pike.idonex.se/), from a company here in
>     FL> Linköping.
> 
> Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
> originally came on board to add Pike support, which has a syntax similar
> enough to C to be easily integrated.  I think I've had as much success
> convincing him to use Python as he's had convincing me to use Pike :-)

<HistoricalNote>

Heh. Pike is an outgrowth of the MUD world's LPC programming language. A
guy named "Profezzorn" started a project (in '94?) to redevelop an LPC
compiler/interpreter ("driver") from scratch to avoid some licensing
constraints. The project grew into a generalized network handler, since
MUDs' typical designs are excellent for these tasks. From there, you get
the Roxen web server.

</HistoricalNote>

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sat May 15 01:36:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Fri, 14 May 1999 16:36:11 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <14140.44469.848840.740112@seattle.nightmare.com>

Guido van Rossum writes:
 > > If Python had closures, then it would be a *little* easier, but would
 > > still make the average Pythoneer swoon.  Closures would let you put
 > > the above logic all in one method, but the code would still be
 > > 'inside-out'.
 > 
 > I forget how this worked :-(

[with a faked-up lambda-ish syntax]

def thing (a):
  return do_async_job_1 (a,
    lambda (b):
      if (a>1):
        do_async_job_2a (b,
          lambda (c):
            [...]
          )
      else:
        do_async_job_2b (a,b,
          lambda (d,e,f):
            [...]
          )
    )

The call to do_async_job_1 passes 'a', and a callback, which is
specified 'in-line'.  You can follow the logic of something like this
more easily than if each lambda is spun off into a different
function/method.
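
Python later gained nested scopes, so the same shape can be written
today with nested def statements in place of the faked-up multi-line
lambda.  In this sketch the do_async_job_* functions are stand-ins
that simply call the callback at once; a real asynchronous job would
store the callback and invoke it later.  The arithmetic inside them is
arbitrary.

```python
# Stand-ins for the asynchronous jobs: here they call the callback at once.
def do_async_job_1(a, callback):
    return callback(a * 2)


def do_async_job_2a(b, callback):
    return callback(b + 1)


def do_async_job_2b(a, b, callback):
    return callback(a, b, a + b)


def thing(a):
    def after_job_1(b):
        if a > 1:                       # `a` is captured from thing()
            def after_job_2a(c):
                return ("2a", c)
            return do_async_job_2a(b, after_job_2a)
        else:
            def after_job_2b(d, e, f):
                return ("2b", d, e, f)
            return do_async_job_2b(a, b, after_job_2b)
    return do_async_job_1(a, after_job_1)


big = thing(3)
small = thing(1)
```

The logic sits in one function, but it still reads "inside-out": each
step is defined before the call that triggers it.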

 > > I think that a web server built on such a Python could have the
 > > performance/scalability of thttpd, with the ease-of-programming
 > > of Roxen.  As far as I know, there's nothing like it out there.
 > > Medusa would be put out to pasture. 8^)
 > 
 > I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
 > they do that Apache doesn't?

thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance
and scalability, but suffer from the same programmability problem as
Medusa (only worse, 'cause they're in C).

Roxen is written in Pike, a c-like language with gc, threads,
etc... Roxen is I think now the official 'GNU Web Server'.

Here's an interesting web-server comparison chart:

http://www.acme.com/software/thttpd/benchmarks.html

-Sam




From guido at CNRI.Reston.VA.US  Sat May 15 04:23:24 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 22:23:24 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT."
             <14140.44469.848840.740112@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
            <14140.44469.848840.740112@seattle.nightmare.com> 
Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us>

> def thing (a):
>   return do_async_job_1 (a,
>     lambda (b):
>       if (a>1):
>         do_async_job_2a (b,
>           lambda (c):
>             [...]
>           )
>       else:
>         do_async_job_2b (a,b,
>           lambda (d,e,f):
>             [...]
>           )
>     )
> 
> The call to do_async_job_1 passes 'a', and a callback, which is
> specified 'in-line'.  You can follow the logic of something like this
> more easily than if each lambda is spun off into a different
> function/method.

I agree that it is still ugly.

> http://www.acme.com/software/thttpd/benchmarks.html

I see.  Any pointers to a graph of thttpd market share?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat May 15 09:51:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <000701be9ea7$acab7f40$159e2299@tim>

[GvR]
> ...
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the Scheme hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.

The Icon language needs a bit of platform-specific context-switching
assembly code to support its full coroutine features, although its
bread-and-butter generators ("semi coroutines") don't need anything special.

The result is that Icon ports sometimes limp for a year before they support
full coroutines, waiting for someone wizardly enough to write the necessary
code.  This can, in fact, be quite difficult; e.g., on machines with HW
register windows (where "the stack" can be a complicated beast half buried
in hidden machine state, sometimes needing kernel privilege to uncover).

Not attractive.  Generators are, though <wink>.

threads-too-ly y'rs  - tim





From tim_one at email.msn.com  Sat May 15 09:51:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com>
Message-ID: <000801be9ea7$ae45f560$159e2299@tim>

[Christian Tismer]
> ...
> But on the general issues:
> Can the Python-calls-C and C-calls-Python problem just be solved
> by turning the whole VM state into a data structure, including
> a Python call stack which is independent? Maybe this has been
> mentioned already.

The problem is that when C calls Python, any notion of continuation has to
include C's state too, else resuming the continuation won't return into C
correctly.  The C code that *implements* Python could be reworked to support
this, but in the general case you've got some external C extension module
calling into Python, and then Python hasn't a clue about its caller's state.

I'm not a fan of continuations myself; coroutines can be implemented
faithfully via threads (I posted a rather complete set of Python classes for
that in the pre-DejaNews days, a bit more flexible than Icon's coroutines);
and:

> This would automagically produce ICON iterators (duck)
> and coroutines (cover).

Icon iterators/generators could be implemented today if anyone bothered
(Majewski essentially implemented them back around '93 already, but seemed
to lose interest when he realized it couldn't be extended to full
continuations, because of C/Python stack intertwingling).

> If I guess right, continuation passing could be done
> by just shifting tiny tuples around. Well, Tim, help me :-)

Python-calling-Python continuations should be easily doable in a "stackless"
Python; the key ideas were already covered in this thread, I think.  The
thing that makes generators so much easier is that they always return
directly to their caller, at the point of call; so no C frame can get stuck
in the middle even under today's implementation; it just requires not
deleting the generator's frame object, and adding an opcode to *resume* the
frame's execution the next time the generator is called.  Unlike in Icon,
it wouldn't even need to be tied to a funky notion of goal-directed
evaluation.
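
Python did later grow generators along these lines.  A short sketch of
the tree-traversal case in present-day syntax (the Node class is made
up for the example): each generator's frame stays alive between
resumptions, and control always returns directly to whoever asked for
the next value.

```python
class Node:
    """A minimal binary tree node, invented for this example."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def inorder(node):
    """Yield values left to right; the frame is kept between resumes."""
    if node is None:
        return
    yield from inorder(node.left)
    yield node.value
    yield from inorder(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
walk = inorder(tree)
first = next(walk)          # runs only as far as the first yield
rest = list(walk)           # resumes the saved frames for the remainder
```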

don't-try-to-traverse-a-tree-without-it-ly y'rs  - tim





From gstein at lyra.org  Sat May 15 10:17:15 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 15 May 1999 01:17:15 -0700
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
	            <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us>
Message-ID: <373D2D8B.390C523C@lyra.org>

Guido van Rossum wrote:
> ...
> > http://www.acme.com/software/thttpd/benchmarks.html
> 
> I see.  Any pointers to a graph of thttpd market share?

thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That
puts it at #6. However, it is interesting to note that 60k of those
sites are in the .uk domain. I can't figure out who is running it, but I
would guess that a large UK-based ISP is hosting a bunch of domains on
thttpd.

It is somewhat difficult to navigate the various reports (and it never
fails that the one you want is not present), but the data is from
Netcraft's survey at: http://www.netcraft.com/survey/

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sun May 16 13:10:18 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Sun, 16 May 1999 04:10:18 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv>
Message-ID: <14142.40867.103424.764346@seattle.nightmare.com>

Tim Peters writes:
 > I'm not a fan of continuations myself; coroutines can be
 > implemented faithfully via threads (I posted a rather complete set
 > of Python classes for that in the pre-DejaNews days, a bit more
 > flexible than Icon's coroutines); and:

Continuations are more powerful than coroutines, though I admit
they're a bit esoteric.  I programmed in Scheme for years without
seeing the need for them.  But when you need 'em, you *really* need
'em.  No way around it.

For my purposes (massively scalable single-process servers and
clients) threads don't cut it... for example I have a mailing-list
exploder that juggles up to 2048 simultaneous SMTP connections.  I
think it can go higher - I've tested select() on FreeBSD with 16,000
file descriptors.

[...]

BTW, I have actually made progress borrowing a bit of code from SCM.
It uses the stack-copying technique, along with setjmp/longjmp.  It's
too ugly and unportable to be a real candidate for inclusion in
Official Python.  [i.e., if it could be made to work it should be
considered a stopgap measure for the desperate].

I haven't tested it thoroughly, but I have successfully saved and
invoked (and reinvoked) a continuation.  Caveat: I have to turn off
Py_DECREF in order to keep it from crashing.

  | >>> import callcc
  | >>> saved = None
  | >>> def thing(n):
  | ...     if n == 2:
  | ...             global saved
  | ...             saved = callcc.new()
  | ...     print 'n==',n
  | ...     if n == 0:
  | ...             print 'Done!'
  | ...     else:
  | ...             thing (n-1)
  | ... 
  | >>> thing (5)
  | n== 5
  | n== 4
  | n== 3
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved
  | <Continuation object at 80d30d0>
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> 

I will probably not be able to work on this for a while (baby due any
day now), so anyone is welcome to dive right in.  I don't have much
experience wading through gdb tracking down reference bugs, I'm hoping
a brave soul will pick up where I left off. 8^)

http://www.nightmare.com/stuff/python-callcc.tar.gz
ftp://www.nightmare.com/stuff/python-callcc.tar.gz

-Sam




From tismer at appliedbiometrics.com  Sun May 16 17:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 17:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com>


rushing at nightmare.com wrote:

[...]

> BTW, I have actually made progress borrowing a bit of code from SCM.
> It uses the stack-copying technique, along with setjmp/longjmp.  It's
> too ugly and unportable to be a real candidate for inclusion in
> Official Python.  [i.e., if it could be made to work it should be
> considered a stopgap measure for the desperate].

I tried it and built it as a Win32 .pyd file, and it seems to
work, but...

> I haven't tested it thoroughly, but I have successfully saved and
> invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> Py_DECREF in order to keep it from crashing.

Indeed, and this seems to be a problem too hard to solve
without lots of work.
Since you keep a snapshot of the current machine stack,
it contains a number of object references which have been
valid when the snapshot was taken, but many are most
probably invalid when you restart the continuation.
I guess, incref-ing all current alive objects on
the interpreter stack would be the minimum, maybe more.

A tuple of necessary references could be used as an
attribute of a Continuation object. I will look
how difficult this is.

ciao - chris


-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Sun May 16 20:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 20:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com>
Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com>


Christian Tismer wrote:
> 
> rushing at nightmare.com wrote:
[...]

> > I haven't tested it thoroughly, but I have successfully saved and
> > invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> > Py_DECREF in order to keep it from crashing.

It is possible, but a little hard.
To take a working snapshot of the current thread's
stack, one needs not only the stack snapshot which 
continue.c provides, but also a restorable copy of
all frame objects involved so far.
A copy of the current frame chain must be built, with
proper reference counting of all involved elements.
And this is the crux: The current stack pointer of the
VM is not present in the frame objects, but hangs
around somewhere on the machine stack.
Two solutions:

1) modify PyFrameObject by adding a field which holds
   the stack pointer, when a function is called. 
   I don't like to change the VM in any way for this.
2) use the lasti field which holds the last VM instruction
   offset. Then scan the opcodes of the code object
   and calculate the current stack level. This is possible
   since Guido's code generator creates code with the stack
   level lexically bound to the code offset.

Now we can incref all the referenced objects in the frame.
This must be done for the whole chain, which is copied and
relinked during that. This chain is then held as a
property of the continuation object.

To throw the continuation, the current frame chain must
be cleared, and the saved one is inserted, together with
the machine stack operation which Sam has already.
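
For a feel of what "the frame chain" and the lasti field look like from
Python itself, here is a minimal sketch.  It assumes present-day CPython 3
(sys._getframe, f_back and f_lasti are CPython details), and
snapshot_chain is a made-up helper name.  It only records the chain; it
does not copy the frames or their value stacks, which is the hard part
described above.

```python
import sys


def snapshot_chain(frame=None):
    """Return [(function name, last instruction offset), ...], newest first."""
    if frame is None:
        frame = sys._getframe(1)        # the caller's frame
    chain = []
    while frame is not None:
        # f_lasti is the offset of the last bytecode instruction executed
        chain.append((frame.f_code.co_name, frame.f_lasti))
        frame = frame.f_back            # step to the calling frame
    return chain


def inner():
    return snapshot_chain()


def outer():
    return inner()


chain = outer()
print(chain[:2])    # the two newest entries are for inner and outer
```

Each entry pairs a frame with the point it would resume from, which is
the information solution 2) would start with.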

A little hefty, isn't it?

ciao - chris




From tim_one at email.msn.com  Mon May 17 07:42:59 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 17 May 1999 01:42:59 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <000f01bea028$1f75c360$fb9e2299@tim>

[Sam]
> Continuations are more powerful than coroutines, though I admit
> they're a bit esoteric.

"More powerful" is a tedious argument you should always avoid <wink>.

> I programmed in Scheme for years without seeing the need for them.
> But when you need 'em, you *really* need 'em.  No way around it.
>
> For my purposes (massively scalable single-process servers and
> clients) threads don't cut it... for example I have a mailing-list
> exploder that juggles up to 2048 simultaneous SMTP connections.  I
> think it can go higher - I've tested select() on FreeBSD with 16,000
> file descriptors.

The other point being that you want to avoid "inside out" logic, though,
right?  Earlier you posted a kind of ideal:

    Recently I've written an async server that needed to talk to several
    other RPC servers, and a mysql server.  Pseudo-example, with
    possibly-async calls in UPPERCASE:

      auth, archive = db.FETCH_USER_INFO (user)
      if verify_login(user,auth):
          rpc_server = self.archive_servers[archive]
          group_info = rpc_server.FETCH_GROUP_INFO (group)
          if valid (group_info):
              return rpc_server.FETCH_MESSAGE (message_number)
          else:
              ...
      else:
          ...

I assume you want to capture a continuation object in the UPPERCASE methods,
store it away somewhere, run off to your select/poll/whatever loop, and have
it invoke the stored continuation objects as the data they're waiting for
arrives.

If so, that's got to be the nicest use for continuations I've seen!  All
invisible to the end user.  I don't know how to fake it pleasantly without
threads, either, and understand that threads aren't appropriate for resource
reasons.  So I don't have a nice alternative.
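
A rough sketch of that shape, for illustration only.  It assumes
present-day Python 3 and uses generators as a stand-in for continuations
(a generator can be resumed only forward, never re-entered, so this is
weaker than what callcc offers).  The names fetch, handle_login and run,
and the replies table that plays the part of the network, are all
invented here.

```python
from collections import deque


def fetch(key):
    """Stand-in for an UPPERCASE call: suspend until data for key arrives."""
    result = yield key              # hand the request to the loop and wait
    return result


def handle_login(user, log):
    # Reads top to bottom, like the pseudo-example above.
    auth = yield from fetch(('user_info', user))
    if auth == 'ok':
        message = yield from fetch(('message', 42))
        log.append((user, message))
    else:
        log.append((user, 'denied'))


def run(tasks, replies):
    """Toy select loop: replies maps each request key to its data."""
    waiting = deque()
    for task in tasks:
        waiting.append((task, task.send(None)))     # run to first request
    while waiting:
        task, key = waiting.popleft()
        try:
            next_key = task.send(replies[key])      # resume with the data
        except StopIteration:
            continue                                # this task finished
        waiting.append((task, next_key))


log = []
replies = {('user_info', 'sam'): 'ok',
           ('user_info', 'eve'): 'bad',
           ('message', 42): 'hello'}
run([handle_login('sam', log), handle_login('eve', log)], replies)
print(log)
```

The handler never mentions the loop; only fetch and run know that the
computation is being parked and resumed.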

> ...
>   | >>> import callcc
>   | >>> saved = None
>   | >>> def thing(n):
>   | ...     if n == 2:
>   | ...             global saved
>   | ...             saved = callcc.new()
>   | ...     print 'n==',n
>   | ...     if n == 0:
>   | ...             print 'Done!'
>   | ...     else:
>   | ...             thing (n-1)
>   | ...
>   | >>> thing (5)
>   | n== 5
>   | n== 4
>   | n== 3
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved
>   | <Continuation object at 80d30d0>
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>>

Suppose the driver were in a script instead:

thing(5)           # line 1
print repr(saved)  # line 2
saved.throw(0)     # line 3
saved.throw(0)     # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)"
and we'd get an infinite output tail of:

<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
...

and never reach line 4.  Right?  That's the part that Guido hates <wink>.

takes-one-to-know-one-ly y'rs  - tim





From tismer at appliedbiometrics.com  Mon May 17 09:07:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 09:07:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <373FC02A.69F2D912@appliedbiometrics.com>


Tim Peters wrote:

[to Sam]

> The other point being that you want to avoid "inside out" logic, though,
> right?  Earlier you posted a kind of ideal:
> 
>     Recently I've written an async server that needed to talk to several
>     other RPC servers, and a mysql server.  Pseudo-example, with
>     possibly-async calls in UPPERCASE:
> 
>       auth, archive = db.FETCH_USER_INFO (user)
>       if verify_login(user,auth):
>           rpc_server = self.archive_servers[archive]
>           group_info = rpc_server.FETCH_GROUP_INFO (group)
>           if valid (group_info):
>               return rpc_server.FETCH_MESSAGE (message_number)
>           else:
>               ...
>       else:
>           ...
> 
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
> 
> If so, that's got to be the nicest use for continuations I've seen!  All
> invisible to the end user.  I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons.  So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it
last night, with proper refcounting, and it wasn't too easy
since I had to duplicate the Python frame chain.

...

> Suppose the driver were in a script instead:
> 
> thing(5)           # line 1
> print repr(saved)  # line 2
> saved.throw(0)     # line 3
> saved.throw(0)     # line 4
> 
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
> 
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!

This is at the moment exactly what happens, with the difference that
after some repetitions we GPF due to dangling references
to too often decref'ed objects. My incref'ing prepares for
just one re-incarnation and should prevent a second call.
But this will be solved, soon.

> and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yup. With a little counting, it was easy to survive:

def main():
    global a
    a=2
    thing (5)
    a=a-1
    if a:
        saved.throw (0)

Weird enough and needs a much better interface.
But finally I'm quite happy that it worked so smoothly
after just a couple of hours (well, about six :)

ciao - chris




From rushing at nightmare.com  Mon May 17 11:46:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 02:46:29 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim>
References: <14142.40867.103424.764346@seattle.nightmare.com>
	<000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <14143.56604.21827.891993@seattle.nightmare.com>

Tim Peters writes:
 > [Sam]
 > > Continuations are more powerful than coroutines, though I admit
 > > they're a bit esoteric.
 > 
 > "More powerful" is a tedious argument you should always avoid <wink>.

More powerful in the sense that you can use continuations to build
lots of different control structures (coroutines, backtracking,
exceptions), but not vice versa.

Kinda like a better tool for blowing one's own foot off. 8^)

 > Suppose the driver were in a script instead:
 > 
 > thing(5)           # line 1
 > print repr(saved)  # line 2
 > saved.throw(0)     # line 3
 > saved.throw(0)     # line 4
 > 
 > Then the continuation would (eventually) "return to" the "print repr(saved)"
 > and we'd get an infinite output tail [...]
 > 
 > and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yes... the continuation object so far isn't very usable.  It needs a
driver of some kind around it.  In the Scheme world, there are two
common ways of using continuations - let/cc and call/cc.  [call/cc is what
is in the standard; its official name is call-with-current-continuation]

let/cc stores the continuation in a variable binding, while
introducing a new scope.  It requires a change to the underlying
language:

(+ 1
  (let/cc escape
    (...)
    (escape 34)))
=> 35

'escape' is a function that when called will 'resume' with whatever
follows the let/cc clause.  In this case it would continue with the
addition...

call/cc is a little trickier, but doesn't require any change to the
language...  instead of making a new binding directly, you pass in
a function that will receive the binding:

(+ 1
   (call/cc
     (lambda (escape)
       (...)
       (escape 34))))
=> 35

In words, it's much more frightening: "call/cc is a function, that
when called with a function as an argument, will pass that function an
argument that is a new function, which when called with a value will
resume the computation with that value as the result of the entire
expression"  Phew.
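
The escape-only half of that sentence can be imitated in present-day
Python 3 with an exception.  This is just a sketch under that assumption:
the callcc below is a hypothetical stand-in, not the callcc module from
the tarball, and its escape function works only while the callcc call is
still on the stack.  It cannot re-enter a computation that has already
returned, which is exactly what a real continuation adds.

```python
class _Escape(Exception):
    """Carries a value from escape() back to the matching callcc call."""

    def __init__(self, tag, value):
        super().__init__()
        self.tag = tag
        self.value = value


def callcc(func):
    """Call func(escape); escape(v) makes this callcc call return v."""
    tag = object()                  # unique marker for this particular call

    def escape(value):
        raise _Escape(tag, value)

    try:
        return func(escape)
    except _Escape as exc:
        if exc.tag is tag:
            return exc.value
        raise                       # belongs to an enclosing callcc


# Mirrors (+ 1 (call/cc (lambda (escape) ... (escape 34)))).
# The "+ 1000" is never evaluated because escape() does not return.
result = 1 + callcc(lambda escape: escape(34) + 1000)
print(result)
```

If func returns normally without calling escape, callcc simply returns
that value.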

In Python, an example might look like this:

SAVED = None
def save_continuation (k):
  global SAVED
  SAVED = k

def thing():
  [...]
  value = callcc (lambda k: save_continuation(k))

# or more succinctly:
def thing():
  [...]
  value = callcc (save_continuation)

In order to do useful work like passing values back and forth between
coroutines, we have to have some way of returning a value from the
continuation when it is reinvoked.

I should emphasize that most folks will never see call/cc 'in the
raw', it will usually have some nice wrapper around to implement
whatever construct is needed.

-Sam




From arw at ifu.net  Mon May 17 20:06:18 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 14:06:18 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <37405A99.1DBAF399@ifu.net>

The illustrious Sam Rushing avers:
>Continuations are more powerful than coroutines, though I admit
>they're a bit esoteric.  I programmed in Scheme for years without
>seeing the need for them.  But when you need 'em, you *really* need
>'em.  No way around it.

Frankly, I think I thought I understood this once but now I know I
don't.  How're continuations more powerful than coroutines?
And why can't they be implemented using threads (and semaphores etc)?

...I'm not promising I'll understand the answer...
    -- Aaron Watters

===
I taught I taw a putty-cat!





From gmcm at hypernet.com  Mon May 17 21:18:43 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 14:18:43 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
Message-ID: <1285153546-166193857@hypernet.com>

The estimable Aaron Watters queries:
> The illustrious Sam Rushing avers:
> >Continuations are more powerful than coroutines, though I admit
> >they're a bit esoteric.  I programmed in Scheme for years without
> >seeing the need for them.  But when you need 'em, you *really* need
> >'em.  No way around it.
> 
> Frankly, I think I thought I understood this once but now I know I
> don't. How're continuations more powerful than coroutines? And why
> can't they be implemented using threads (and semaphores etc)?

I think Sam's (immediate <wink>) problem is that he can't afford 
threads - he may have hundreds to thousands of these suckers.

As a fuddy-duddy old imperative programmer, I'm inclined to think 
"state machine". But I'd guess that functional-ophiles probably see 
that as inelegant. (Safe guess - they see _anything_ that isn't 
functional as inelegant!).

crude-but-not-rude-ly y'rs

- Gordon



From jeremy at cnri.reston.va.us  Mon May 17 20:43:34 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 17 May 1999 14:43:34 -0400 (EDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
References: <37405A99.1DBAF399@ifu.net>
Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us>

>>>>> "AW" == Aaron Watters <arw at ifu.net> writes:

  AW> The illustrious Sam Rushing avers:
  >> Continuations are more powerful than coroutines, though I admit
  >> they're a bit esoteric.  I programmed in Scheme for years without
  >> seeing the need for them.  But when you need 'em, you *really*
  >> need 'em.  No way around it.

  AW> Frankly, I think I thought I understood this once but now I know
  AW> I don't.  How're continuations more powerful than coroutines?
  AW> And why can't they be implemented using threads (and semaphores
  AW> etc)?

I think I understood, too.  I'm hoping that someone will debug my
answer and enlighten us both.

A continuation is a mechanism for making control flow explicit.  A
continuation is a means of naming and manipulating "the rest of the
program."   In Scheme terms, the continuation is the function that the 
value of the current expression should be passed to.  The call/cc
mechanism lets you capture the current continuation and explicitly
call on it.  The most typical use of call/cc is non-local exits, but
it gives you incredible flexibility for implementing your control
flow.

I'm fuzzy on coroutines, as I've only seen them in "Structured
Programming" (which is as old as I am :-) and never actually used
them.  The basic idea is that when a coroutine calls another
coroutine, control is transferred to the second coroutine at the point
at which it last left off (by itself calling another coroutine or by
detaching, which returns control to the lexically enclosing scope).
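
As an illustration of "control is transferred to the second coroutine at
the point at which it last left off", here is a small sketch.  It assumes
present-day Python 3 generators plus a tiny dispatcher, since generators
can only yield to their caller; producer, consumer and run are made-up
names, and each coroutine names the one it wants to transfer to.

```python
def producer(out):
    for item in ('a', 'b', 'c'):
        out.append('produce ' + item)
        yield ('consumer', item)            # transfer control, passing item
    yield ('consumer', None)                # tell the consumer we are done


def consumer(out):
    item = yield                            # wait for the first transfer
    while item is not None:
        out.append('consume ' + item)
        item = yield ('producer', None)     # transfer back, then wait again


def run(out):
    coros = {'producer': producer(out), 'consumer': consumer(out)}
    next(coros['consumer'])                 # advance it to its first wait
    name, value = 'producer', None
    while True:
        try:
            # Resume the named coroutine where it last left off.
            name, value = coros[name].send(value)
        except StopIteration:
            return


trace = []
run(trace)
print(trace)
```

Neither function is the caller of the other; each picks up in the middle
of its own loop every time control comes back.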

It seems to me that coroutines are an example of the kind of control
structure that you could build with continuations.  It's not clear
that the reverse is true.

I have to admit that I'm a bit unclear on the motivation for all
this.  As Gordon said, the state machine approach seems like it would
be a good approach.

Jeremy



From klm at digicool.com  Mon May 17 21:08:57 1999
From: klm at digicool.com (Ken Manheimer)
Date: Mon, 17 May 1999 15:08:57 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com>

Jeremy Hylton:

> I have to admit that I'm a bit unclear on the motivation for all
> this.  As Gordon said, the state machine approach seems like it would
> be a good approach.

If i understand what you mean by state machine programming, it's pretty
inherently uncompartmented, all the combinations of state variables need
to be accounted for, so the number of states grows factorially on the
number of state vars, in general it's awkward.  The advantage of going
with what functional folks come up with, like continuations, is that it
tends to be well compartmented - functional.  (Come to think of it, i
suppose that compartmentalization as opposed to state is their mania.)

As abstract as i can be (because i hardly know what i'm talking about)
(but i have done some specifically finite state machine programming, and
did not enjoy it),

Ken
klm at digicool.com



From arw at ifu.net  Mon May 17 21:20:13 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 15:20:13 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com>
Message-ID: <37406BED.95AEB896@ifu.net>

The ineffable Gordon McMillan retorts:

> As a fuddy-duddy old imperative programmer, I'm inclined to think
> "state machine". But I'd guess that functional-ophiles probably see
> that as inelegant. (Safe guess - they see _anything_ that isn't
> functional as inelegant!).

As a fellow fuddy-duddy I'd agree except that if you write properly layered
software you have to unroll and reroll all those layers for every
transition of the multi-level state machine, and even though with proper
discipline it can be implemented without becoming hideous, it still adds
significant overhead compared to "stop right here and come back later"
which could be implemented using threads/coroutines(?)/continuations.
I think this is particularly true in Python with the relatively high
function call overhead.  Or maybe I'm out in left field doing cartwheels...

I guess the question of interest is why are threads insufficient?  I guess
they have system limitations on the number of threads or other limitations
that wouldn't be a problem with continuations?  If there aren't a *lot* of
situations where coroutines are vital, I'd be hesitant to do major
surgery.  But I'm a fuddy-duddy.

   -- Aaron Watters

===
I did! I did!





From tismer at appliedbiometrics.com  Mon May 17 22:03:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 22:03:01 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net>
Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com>


Aaron Watters wrote:
> 
> The ineffable Gordon McMillan retorts:
> 
> > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > "state machine". But I'd guess that functional-ophiles probably see
> > that as inelegant. (Safe guess - they see _anything_ that isn't
> > functional as inelegant!).
> 
> As a fellow fuddy-duddy I'd agree except that if you write properly layered
> software you have to unroll and reroll all those layers for every
> transition of the multi-level state machine, and even though with proper
> discipline it can be implemented without becoming hideous, it still adds
> significant overhead compared to "stop right here and come back later"
> which could be implemented using threads/coroutines(?)/continuations.

Coroutines are most elegant here, since (for a simple example)
they are a symmetric pair of functions which call each other.
There is neither the one-pulls, the other pushes asymmetry, nor
the need to maintain state and be controlled by a supervisor
function.

> I think this is particularly true in Python with the relatively high
> function call overhead.  Or maybe I'm out in left field doing cartwheels...
> I guess the question of interest is why are threads insufficient?  I guess
> they have system limitations on the number of threads or other limitations
> that wouldn't be a problem with continuations?  If there aren't a *lot* of
> situations where coroutines are vital, I'd be hesitant to do major
> surgery.

For me (as always) most interesting is the possible speed of
coroutines. They involve no threads overhead, no locking,
no nothing. Python supports it better than expected. If the
stack level of two code objects is the same at a switching point,
the whole switch is nothing more than swapping two frame objects,
and we're done. This might be even cheaper than general call/cc,
like a function call. Sam's prototype works already, with no change to
the interpreter (but knowledge of Python frames, and a .dll of course).

I think we'll continue a while.

continuously - chris




From gmcm at hypernet.com  Tue May 18 00:17:25 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 17:17:25 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com>
Message-ID: <1285142823-166838954@hypernet.com>

Co-Christian-routines Tismer continues:

> Aaron Watters wrote:
> > 
> > The ineffable Gordon McMillan retorts:
> > 
> > > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > > "state machine". But I'd guess that functional-ophiles probably see
> > > that as inelegant. (Safe guess - they see _anything_ that isn't
> > > functional as inelegant!).
> > 
> > As a fellow fuddy-duddy I'd agree except that if you write properly layered
> > software you have to unroll and reroll all those layers for every
> > transition of the multi-level state machine, and even though with proper
> > discipline it can be implemented without becoming hideous, it still adds
> > significant overhead compared to "stop right here and come back later"
> > which could be implemented using threads/coroutines(?)/continuations.
> 
> Coroutines are most elegant here, since (for a simple example)
> they are a symmetric pair of functions which call each other.
> There is neither the one-pulls, the other pushes asymmetry, nor the
> need to maintain state and be controlled by a supervisor function.

Well, the state maintains you, instead of the other way 'round. (Any 
other ex-Big-Blue-ers out there that used to play these games with 
checkpoint and SyncSort?).

I won't argue elegance. Just a couple points:

- there's an art to writing state machines which is largely 
unrecognized (most of them are unnecessarily horrid).

- a multiplexed solution (vs a threaded solution) requires that 
something be inside out. In one case it's your code, in the other, 
your understanding of the problem. Neither is trivial.
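
To make the "inside out" point concrete, here is a toy comparison.  It is
a sketch only, assuming present-day Python 3; the two-step mail-like
handshake, the reply strings, and the names HandshakeMachine and
handshake are invented for the example.  The first version is the state
machine a multiplexing loop would drive; the second keeps its state in
the position of the code, using a generator as a stand-in for "stop right
here and come back later".

```python
class HandshakeMachine:
    """Inside-out version: the loop calls feed() once per incoming line."""

    def __init__(self):
        self.state = 'want_helo'

    def feed(self, line):
        if self.state == 'want_helo':
            if line.startswith('HELO'):
                self.state = 'want_mail'
                return '250 hello'
            return '503 say HELO first'
        if self.state == 'want_mail':
            if line.startswith('MAIL FROM:'):
                self.state = 'done'
                return '250 sender ok'
            return '503 need MAIL FROM'
        return '221 bye'


def handshake():
    """Straight-line version: where we are in the code is the state."""
    line = yield
    while not line.startswith('HELO'):
        line = yield '503 say HELO first'
    line = yield '250 hello'
    while not line.startswith('MAIL FROM:'):
        line = yield '503 need MAIL FROM'
    line = yield '250 sender ok'
    while True:
        line = yield '221 bye'


lines = ['MAIL FROM: x', 'HELO there', 'MAIL FROM: sam', 'QUIT']

machine = HandshakeMachine()
machine_replies = [machine.feed(line) for line in lines]

session = handshake()
next(session)                       # run to the first wait for input
generator_replies = [session.send(line) for line in lines]

print(machine_replies == generator_replies)
```

Both give the same replies for the same input; which one reads better is
the matter of taste being argued here.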

Not to be discouraging - as long as your solution doesn't involve 
using regexps on bytecode <wink>, I say go for it!

- Gordon



From guido at CNRI.Reston.VA.US  Tue May 18 06:03:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT."
             <14143.56604.21827.891993@seattle.nightmare.com> 
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
            <14143.56604.21827.891993@seattle.nightmare.com> 
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of
what you can do with them so far don't clarify the matter at all.

Perhaps it would help to explain what a continuation actually does
with the run-time environment, instead of giving examples of how to
use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k
connection which makes my ordinary typing habits in Emacs very
painful).

1. All program state is somehow contained in a single execution stack.
This includes globals (which are simply name bindings in the bottom
stack frame).  It also includes a code pointer for each stack frame
indicating where the function corresponding to that stack frame is
executing (this is the return address if there is a newer stack frame, 
or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the
entire execution stack.  This can probably be done lazily.  There are
probably lots of details.  I also expect that Scheme's semantic model
is different than Python here -- e.g. does it matter whether deep or
shallow copies are made?  I.e. are there mutable *objects* in Scheme?
(I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the
execution stack the current execution state; I presume there's also a
way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping
between two (or more) continuations.

5. Other control constructs can be done by various manipulations of
continuations.  I presume that in many situations the saved
continuation becomes the main control locus permanently, and the
(previously) current stack is simply garbage-collected.  Of course the 
lazy copy makes this efficient.
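
The deep-versus-shallow question in point 2 can be made concrete with a
toy model.  This is only a sketch in present-day Python 3: Frame,
capture_shallow and capture_deep are hypothetical names, a dict stands in
for a frame's name bindings, and real frame objects are not copied this
way.

```python
import copy


class Frame:
    def __init__(self, name, local_vars, back=None):
        self.name = name
        self.locals = local_vars    # name bindings of this frame
        self.back = back            # the caller's frame, or None


def capture_shallow(frame):
    """Copy every frame's bindings but share the objects they refer to."""
    if frame is None:
        return None
    return Frame(frame.name, dict(frame.locals), capture_shallow(frame.back))


def capture_deep(frame):
    """Copy the bindings and every object reachable from them."""
    return copy.deepcopy(frame)


main = Frame('main', {'a': 2, 'seen': []})
thing = Frame('thing', {'n': 2}, back=main)

shallow = capture_shallow(thing)
deep = capture_deep(thing)

# The program keeps running after the capture.
main.locals['a'] = 1                # rebinds a name
main.locals['seen'].append('x')     # mutates an object

print(shallow.back.locals)          # old binding for a, mutated list
print(deep.back.locals)             # old binding for a, old list
```

With the shallow capture, resuming would see "a" reset to its old value
but would still see the appended list element; with the deep capture both
changes are undone.  Which of these a continuation should do is the
semantic question raised in point 2.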



If this all is close enough to the truth, I think that continuations
involving C stack frames are definitely out -- as Tim Peters
mentioned, you don't know what the stuff on the C stack of extensions
refers to.  (My guess would be that Scheme implementations assume that
any pointers on the C stack point to Scheme objects, so that C stack
frames can be copied and conservative GC can be used -- this will
never happen in Python.)

Continuations involving only Python stack frames might be supported,
if we can agree on the sharing / copying semantics.  This is where
I don't know enough (see questions at #2 above).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Tue May 18 06:46:12 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient?  I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?

Sam is mucking with thousands of simultaneous I/O-bound socket connections,
and makes a good case that threads simply don't fly here (each one consumes
a stack, kernel resources, etc).  It's unclear (to me) that thousands of
continuations would be *much* better, though, by the time Christian gets
done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery.  But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the
docs.  They're very well written and describe the problem space exquisitely.
I don't have any problems like that I need to solve, but it's interesting to
ponder!

alas-no-time-for-it-now-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here?  I hope you see the same behavior
without the "global a"; e.g., this Scheme:

(define -cont- #f)

(define thing
  (lambda (n)
    (if (= n 2) (call/cc (lambda (k) (set! -cont- k))))
    (display "n == ") (display n) (newline)
    (if (= n 0)
	(begin (display "Done!") (newline))
	(thing (- n 1)))))

(define main
  (lambda ()
    (let ((a 2))
      (thing 5)
      (display "a is ") (display a) (newline)
      (set! a (- a 1))
      (if (> a 0)
	  (-cont- #f)))))

(main)

prints:

n == 5
n == 4
n == 3
n == 2
n == 1
n == 0
Done!
a is 2
n == 2
n == 1
n == 0
Done!
a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to
2 each time?

> Weird enough

Par for the continuation course!  They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads <wink>.

> But finally I'm quite happy that it worked so smoothly
> after just a couple of hours (well, about six :)

Yup!  Playing with Python internals is a treat.

to-be-continued-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:57 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:57 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <000401bea0e9$51e467e0$829e2299@tim>

[Sam]
>>> Continuations are more powerful than coroutines, though I admit
>>> they're a bit esoteric.

[Tim]
>> "More powerful" is a tedious argument you should always avoid <wink>.

[Sam]
> More powerful in the sense that you can use continuations to build
> lots of different control structures (coroutines, backtracking,
> exceptions), but not vice versa.

"More powerful" is a tedious argument you should always avoid <frown -- I'm
not touching this, but you can fight it out now with Aaron et alia <wink>>.

>> Then the continuation would (eventually) "return to" the
>> "print repr(saved)" and we'd get an infinite output tail [...]
>> and never reach line 4.  Right?

> Yes... the continuation object so far isn't very usable.

But it's proper behavior for a continuation all the same!  So this aspect
shouldn't be "fixed".

> ...
> let/cc stores the continuation in a variable binding, while
> introducing a new scope.  It requires a change to the underlying
> language:

Isn't this often implemented via a macro, though, so that

   (let/cc name code)

"acts like"

    (call/cc (lambda (name) code))

?  I haven't used a Scheme with native let/cc, but poking around it appears
that the real intent is to support exception-style function exits with a
mechanism cheaper than 1st-class continuations:  twice saw the let/cc object
(the thingie bound to "name") defined as being invalid the instant after
"code" returns, so it's an "up the call stack" gimmick.  That doesn't sound
powerful enough for what you're after.

> [nice let/cc call/cc tutorialette]
> ...
> In order to do useful work like passing values back and forth between
> coroutines, we have to have some way of returning a value from the
> continuation when it is reinvoked.

Somehow, I suspect that's the least of our problems <0.5 wink>.  If
continuations are in Python's future, though, I agree with the need as
stated.

> I should emphasize that most folks will never see call/cc 'in the
> raw', it will usually have some nice wrapper around to implement
> whatever construct is needed.

Python already has well-developed exception and thread facilities, so it's
hard to make a case for continuations as a catch-all implementation
mechanism.  That may be the rub here:  while any number of things *can* be
implemented via continuations, I think very few *need* to be implemented
that way, and full-blown continuations aren't easy to implement efficiently
& portably.

The Icon language was particularly concerned with backtracking searches, and
came up with generators as another clearer/cheaper implementation technique.
When it went on to full-blown coroutines, it's hard to say whether
continuations would have been a better approach.  But the coroutine
implementation it has is sluggish and buggy and hard to port, so I doubt
they could have done noticeably worse.

Would full-blown coroutines be powerful enough for your needs?

assuming-the-practical-defn-of-"powerful-enough"-ly y'rs  - tim





From rushing at nightmare.com  Tue May 18 07:18:06 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:18:06 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim>
References: <14143.56604.21827.891993@seattle.nightmare.com>
	<000401bea0e9$51e467e0$829e2299@tim>
Message-ID: <14144.61765.308962.101884@seattle.nightmare.com>

Tim Peters writes:
 > Isn't this often implemented via a macro, though, so that
 > 
 >    (let/cc name code)
 > 
 > "acts like"
 > 
 >     (call/cc (lambda (name) code))

Yup, they're equivalent, in the sense that given one you can make a
macro to do the other.  call/cc is preferred because it doesn't
require a new binding construct.

 > ?  I haven't used a Scheme with native let/cc, but poking around it
 > appears that the real intent is to support exception-style function
 > exits with a mechanism cheaper than 1st-class continuations: twice
 > saw the let/cc object (the thingie bound to "name") defined as
 > being invalid the instant after "code" returns, so it's an "up the
 > call stack" gimmick.  That doesn't sound powerful enough for what
 > you're after.

Except that since the escape procedure is 'first-class' it can be
stored away and invoked (and reinvoked) later.  [that's all that
'first-class' means: a thing that can be stored in a variable,
returned from a function, used as an argument, etc..]

I've never seen a let/cc that wasn't full-blown, but it wouldn't
surprise me.

 > The Icon language was particularly concerned with backtracking
 > searches, and came up with generators as another clearer/cheaper
 > implementation technique.  When it went on to full-blown
 > coroutines, it's hard to say whether continuations would have been
 > a better approach.  But the coroutine implementation it has is
 > sluggish and buggy and hard to port, so I doubt they could have
 > done noticeably worse.

Many Scheme implementors either skip it, or only support non-escaping
call/cc (i.e., exceptions in Python).

 > Would full-blown coroutines be powerful enough for your needs?

Yes, I think they would be.  But I think with Python it's going to
be just about as hard, either way.

-Sam




From rushing at nightmare.com  Tue May 18 07:48:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:48:29 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <51325225@toto.iv>
Message-ID: <14144.63787.502454.111804@seattle.nightmare.com>

Aaron Watters writes:
 > Frankly, I think I thought I understood this once but now I know I
 > don't.

8^)  That's what I said when I backed into the idea via medusa a
couple of years ago.

 > How're continuations more powerful than coroutines?  And why can't
 > they be implemented using threads (and semaphores etc)?

My understanding of the original 'coroutine' (from Pascal?) was that
it allows two procedures to 'resume' each other.  The classic
coroutine example is the 'samefringe' problem: given two trees of
differing structure, are they equal in the sense that a traversal of
the leaves results in the same list?  Coroutines let you do this
efficiently, comparing leaf-by-leaf without storing the whole tree.
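
A minimal sketch of the samefringe idea, assuming modern Python generators
(which did not exist when this thread was written) as a stand-in for
coroutines.  The nested-tuple tree encoding and the names leaves and
same_fringe are illustrative assumptions, not anything from Medusa or the
Scheme example:

```python
from itertools import zip_longest

def leaves(tree):
    """Yield the leaves of a nested-tuple tree, left to right."""
    for node in tree:
        if isinstance(node, tuple):
            yield from leaves(node)
        else:
            yield node

def same_fringe(t1, t2):
    """Compare two trees leaf by leaf, stopping at the first mismatch."""
    missing = object()  # sentinel: one fringe ran out before the other
    for a, b in zip_longest(leaves(t1), leaves(t2), fillvalue=missing):
        if a is missing or b is missing or a != b:
            return False
    return True

print(same_fringe((1, (2, 3)), ((1, 2), 3)))   # True
print(same_fringe((1, (2, 3)), ((1, 2), 4)))   # False
```

Each traversal is suspended between leaves, so neither fringe is ever built
as a whole list.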

continuations can do coroutines, but can also be used to implement
backtracking, exceptions, threads... probably other stuff I've never
heard of or needed.

The reason that Scheme and ML are such big fans of continuations is
because they can be used to implement all these other features.  Look
at how much try/except and threads complicate other language
implementations.  It's like a super-tool-widget - if you make sure
it's in your toolbox, you can use it to build your circular saw and
lathe from scratch.

Unfortunately there aren't many good sites on the web with good
explanatory material.  The best reference I have is "Essentials of
Programming Languages".  For those that want to play with some of
these ideas using little VM's written in Python:

  http://www.nightmare.com/software.html#EOPL

-Sam




From rushing at nightmare.com  Tue May 18 07:56:37 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:56:37 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <13631823@toto.iv>
Message-ID: <14144.65355.400281.123856@seattle.nightmare.com>

Jeremy Hylton writes:
 > I have to admit that I'm a bit unclear on the motivation for all
 > this.  As Gordon said, the state machine approach seems like it would
 > be a good approach.

For simple problems, state machines are ideal.  Medusa uses state
machines that are built out of Python methods.  But past a certain
level of complexity, they get too hairy to understand.  A really good
example can be found in /usr/src/linux/net/ipv4.  8^)
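
A rough sketch of what "state machines built out of Python methods" can look
like.  This is not Medusa's code; the Channel class, its method names, and
the toy DATA/"." protocol are all assumptions made for illustration:

```python
class Channel:
    """Toy protocol handler: the current state is a bound method."""

    def __init__(self):
        self.state = self.want_command   # method to call on the next line
        self.log = []

    def feed(self, line):
        self.state(line)

    def want_command(self, line):
        if line == "DATA":
            self.state = self.want_body  # transition by rebinding the method
        else:
            self.log.append(("command", line))

    def want_body(self, line):
        if line == ".":
            self.state = self.want_command
        else:
            self.log.append(("body", line))

ch = Channel()
for line in ["HELO", "DATA", "hello", ".", "QUIT"]:
    ch.feed(line)
print(ch.log)   # [('command', 'HELO'), ('body', 'hello'), ('command', 'QUIT')]
```

With two states this is easy to follow; every extra piece of context has to
become another attribute or another method, which is where the hair comes
from.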

-Sam




From rushing at nightmare.com  Tue May 18 09:05:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:05:20 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <60057226@toto.iv>
Message-ID: <14145.927.588572.113256@seattle.nightmare.com>

Guido van Rossum writes:
 > Perhaps it would help to explain what a continuation actually does
 > with the run-time environment, instead of giving examples of how to
 > use them and what the result is?

This helped me a lot, and is the angle used in "Essentials of
Programming Languages":

Usually when folks refer to a 'stack', they're referring to an
*implementation* of the stack data type: really an optimization that
assumes an upper bound on stack size, and that things will only be
pushed and popped in order.

If you were to implement a language's variable and execution stacks
with actual data structures (linked lists), then it's easy to see
what's needed: the head of the list represents the current state.  As
functions exit, they pop things off the list.
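
The linked-list view can be sketched in a few lines of Python.  The Frame
class and chain helper below are hypothetical names for illustration only,
not CPython's frame objects:

```python
class Frame:
    """One activation record; `back` links to the caller's frame."""
    def __init__(self, name, back=None):
        self.name = name
        self.back = back

def chain(frame):
    """List frame names from the current frame down to the bottom."""
    names = []
    while frame is not None:
        names.append(frame.name)
        frame = frame.back
    return names

# "Push" three calls: each new frame becomes the head of the list.
current = Frame("main")
current = Frame("thing(5)", back=current)
saved = current                  # capture: just keep a reference to the head
current = Frame("thing(4)", back=current)

# "Pop" back to main; the saved chain is untouched because frames are
# heap objects kept alive by the reference in `saved`.
current = current.back.back
print(chain(current))   # ['main']
print(chain(saved))     # ['thing(5)', 'main']
```

Popping only moves the head pointer, so anything that still holds a
reference to an older head keeps that whole chain reachable.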

The reason I brought this up (during a lull!) was that Python is
already paying all of the cost of heap-allocated frames, and it didn't
seem to me too much of a leap from there.

 > 1. All program state is somehow contained in a single execution stack.
Yup.

 > 2. A continuation does something equivalent to making a copy of the
 > entire execution stack.
Yup.
 > I.e. are there mutable *objects* in Scheme?
 > (I know there are mutable and immutable *name bindings* -- I think.)

Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!,
all the things that make it 'impure'.

I think shallow copies are what's expected.  In the examples I have,
the continuation is kept in a 'register', and call/cc merely packages
it up with a little function wrapper.  You are allowed to stomp all
over lexical variables with "set!".

 > 3. Calling a continuation probably makes the saved copy of the
 > execution stack the current execution state; I presume there's also a
 > way to pass an extra argument.
Yup.
 > 4. Coroutines (which I *do* understand) are probably done by swapping
 > between two (or more) continuations.
Yup.  Here's an example in Scheme:

http://www.nightmare.com/stuff/samefringe.scm

Somewhere I have an example of coroutines being used for parsing, very
elegant.  Something like one coroutine does lexing, and passes tokens
one-by-one to the next level, which passes parsed expressions to a
compiler, or whatever.  Kinda like pipes.
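
A small sketch of that pipeline shape, assuming modern Python generators in
place of true coroutines.  The three stage names and the toy "number operator
number" grammar are assumptions for illustration:

```python
def lex(text):
    """First stage: yield tokens one at a time."""
    for word in text.split():
        yield int(word) if word.isdigit() else word

def parse(tokens):
    """Second stage: group tokens into (left, op, right) expressions.

    Assumes well-formed input; a truncated expression would raise.
    """
    tokens = iter(tokens)
    for left in tokens:
        op = next(tokens)
        right = next(tokens)
        yield (left, op, right)

def evaluate(exprs):
    """Third stage: stand-in for the compiler; just computes each result."""
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    for left, op, right in exprs:
        yield ops[op](left, right)

print(list(evaluate(parse(lex("1 + 2 3 * 4")))))   # [3, 12]
```

Each stage pulls one item at a time from the stage before it, so no stage
ever holds the full token or expression list.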

 > 5. Other control constructs can be done by various manipulations of
 > continuations.  I presume that in many situations the saved
 > continuation becomes the main control locus permanently, and the
 > (previously) current stack is simply garbage-collected.  Of course
 > the lazy copy makes this efficient.

Yes... I think backtracking would be an example of this.  You're doing
a search on a large space (say a chess game).  After a certain point
you want to try a previous fork, to see if it's promising, but you
don't want to throw away your current work.  Save it, then unwind back
to the previous fork, try that option out... if it turns out to be
better then toss the original.

 > If this all is close enough to the truth, I think that
 > continuations involving C stack frames are definitely out -- as Tim
 > Peters mentioned, you don't know what the stuff on the C stack of
 > extensions refers to.  (My guess would be that Scheme
 > implementations assume that any pointers on the C stack point to
 > Scheme objects, so that C stack frames can be copied and
 > conservative GC can be used -- this will never happen in Python.)

I think you're probably right here - usually there are heavy
restrictions on what kind of data can pass through the C interface.
But I know of at least one Scheme (mzscheme/PLT) that uses
conservative gc and has c/c++ interfaces. [... dig dig ...]


From rushing at nightmare.com  Tue May 18 09:17:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:17:11 -0700 (PDT)
Subject: [Python-Dev] another good motivation
Message-ID: <14145.4917.164756.300678@seattle.nightmare.com>

"Escaping the event loop: an alternative control structure for multi-threaded GUIs"

http://cs.nyu.edu/phd_students/fuchs/
http://cs.nyu.edu/phd_students/fuchs/gui.ps

-Sam




From tismer at appliedbiometrics.com  Tue May 18 15:46:53 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 15:46:53 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <000901bea0e9$5aa2dec0$829e2299@tim>
Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Aaron Watters]
> > ...
> > I guess the question of interest is why are threads insufficient?  I
> > guess they have system limitations on the number of threads or other
> > limitations that wouldn't be a problem with continuations?
> 
> Sam is mucking with thousands of simultaneous I/O-bound socket connections,
> and makes a good case that threads simply don't fly here (each one consumes
> a stack, kernel resources, etc).  It's unclear (to me) that thousands of
> continuations would be *much* better, though, by the time Christian gets
> done making thousands of copies of the Python stack chain.

Well, what he needs here are coroutines and just a single frame
object for every minithread (I think this is a "fiber"?).
If these fibers later do deep function calls before they switch,
there will of course be more frames then.

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Tue May 18 16:35:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 16:35:30 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
	            <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <37417AB2.80920595@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> Sam (& others),
> 
> I thought I understood what continuations were, but the examples of
> what you can do with them so far don't clarify the matter at all.
> 
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?
> 
> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
> 
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).  It also includes a code pointer for each stack frame
> indicating where the function corresponding to that stack frame is
> executing (this is the return address if there is a newer stack frame,
> or the current instruction for the newest frame).

Right. For now, this information is on the C stack for each called
function, although almost completely available in the frame chain.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.  I also expect that Scheme's semantic model
> is different than Python here -- e.g. does it matter whether deep or
> shallow copies are made?  I.e. are there mutable *objects* in Scheme?
> (I know there are mutable and immutable *name bindings* -- I think.)

To make it lazy, a gatekeeper must be put on top of the two
split frames, which catches the event that one of them
returns. It appears to me that this is the same callcc.new()
object which catches this, splitting frames when hit by a return.

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.
> 
> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.

Right, which is just two or three assignments.

> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

Yes, great. It looks like switching continuations
is not more expensive than a single Python function call.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

This would mean to avoid creating incompatible continuations.
A continuation may not switch to a frame chain which was created
by a different VM incarnation since this would later on
corrupt the machine stack. One way to assure that would be
a thread-safe function in sys, similar to sys.exc_info()
which gives an id for the current interpreter. continuations
living somewhere in globals would be marked by the interpreter
which created them, and refuse to be thrown if they don't match.

The necessary interpreter support appears to be small:

Extend the PyFrame structure by two fields:
  - interpreter ID  (addr of some local variable would do)
  - stack pointer at current instruction.

Change the CALL_FUNCTION opcode to avoid calling eval recursively
in the case of a Python function/method, but keep the current frame,
build the new one and start over.
RETURN will pop a frame and reload its local variables instead
of returning, as long as there is a frame to pop.
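
A toy sketch of that dispatch scheme, written in Python rather than as a
patch to ceval.c.  The Frame class, the four opcodes, and the run function
are assumptions invented for this illustration and do not correspond to the
real interpreter's data structures:

```python
class Frame:
    def __init__(self, code, back=None):
        self.code = code      # list of (opcode, argument) pairs
        self.pc = 0           # index of the next instruction
        self.stack = []       # this frame's value stack
        self.back = back      # caller's frame, or None at the bottom

def run(functions, entry):
    """Run `entry` with one loop; calls never recurse into run()."""
    frame = Frame(functions[entry])
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "ADD":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a + b)
        elif op == "CALL":
            # Build the callee's frame and start over in the same loop.
            frame = Frame(functions[arg], back=frame)
        elif op == "RETURN":
            value = frame.stack.pop()
            if frame.back is None:
                return value          # no frame left to pop
            frame = frame.back        # resume the caller where it left off
            frame.stack.append(value)

functions = {
    "main": [("PUSH", 1), ("CALL", "two"), ("ADD", None), ("RETURN", None)],
    "two":  [("PUSH", 2), ("RETURN", None)],
}
print(run(functions, "main"))   # 3
```

Because all call state lives in the frame chain rather than on the host
stack, saving or switching execution contexts comes down to holding on to a
frame reference.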

I'm unclear how exceptions should be handled. Are they currently
propagated up across different C calls other than ceval2
recursions?

ciao - chris




From jeremy at cnri.reston.va.us  Tue May 18 17:05:39 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 18 May 1999 11:05:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com>
References: <60057226@toto.iv>
	<14145.927.588572.113256@seattle.nightmare.com>
Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us>

>>>>> "SR" == rushing  <rushing at nightmare.com> writes:

  SR> Somewhere I have an example of coroutines being used for
  SR> parsing, very elegant.  Something like one coroutine does
  SR> lexing, and passes tokens one-by-one to the next level, which
  SR> passes parsed expressions to a compiler, or whatever.  Kinda
  SR> like pipes.

This is the first example that's used in Structured Programming (Dahl,
Dijkstra, and Hoare).  I'd be happy to loan a copy to any of the
Python-dev people who sit nearby.

Jeremy



From tismer at appliedbiometrics.com  Tue May 18 17:31:11 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 17:31:11 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <374187BF.36CC65E7@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

(Ahem) Actually, I inserted the "global" later. It worked as well
with a local variable, but I didn't understand it. Still don't :-)

> Or does brute-force frame-copying cause the continuation to set "a" back to
> 2 each time?

No, it doesn't. Behavior is exactly the same with or without
global. I'm not sure whether this is a bug or a feature.
I *think* 'a' as a local has a slot in the frame, so it's
actually a different 'a' living in both copies. But this
would not have worked.
Can it be that before a function call, the interpreter
turns its locals into a dict, using fast_to_locals?
That would explain it.
This is not what I think it should be! Locals need to be
copied.

> > and needs a much better interface.
> 
> Ya, like screw 'em and use threads <wink>.

Never liked threads. These fibers are so neat since
they don't need threads, no locking, and they are
available on systems without threads.

> > But finally I'm quite happy that it worked so smoothly
> > after just a couple of hours (well, about six :)
> 
> Yup!  Playing with Python internals is a treat.
> 
> to-be-continued-ly y'rs  - tim

throw(42) - chris




From skip at mojam.com  Tue May 18 17:49:42 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 18 May 1999 11:49:42 -0400
Subject: [Python-Dev] Is there another way to solve the continuation problem?
Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>

Okay, from my feeble understanding of the problem it appears that
coroutines/continuations and threads are going to be problematic at best for 
Sam's needs.  Are there other "solutions"?  We know about state machines.
They have the problem that the number of states grows exponentially (?) as
the number of state variables increases.

Can exceptions be coerced into providing the necessary structure without
botching up the application too badly?  Seems that at some point where you
need to do some I/O, you could raise an exception whose second expression
contains the necessary state to get back to where you need to be once the
I/O is ready to go.  The controller that catches the exceptions would use
select or poll to prepare for the I/O then dispatch back to the handlers
using the information from exceptions.

class IOSetup:
    pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(self, r, w, e):
        pass

    def remember(self, info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
	r, w, e = select([...], [...], [...])
	# using r,w,e, select a waiter to call
	func, place = waiters.choose_one(r,w,e)
	try:
	    func(place)
	except IOSetup, info:
	    waiters.remember(info)


def spam_func(place):
    if place == "spam":
	# whatever I/O we needed to do is ready to go
	bytes = read(some_fd)
	process(bytes)
	# need to read some more from some_fd. args are:
	#    function, target, fd category (r, w), selectable object, 
	raise IOSetup, (spam_func, "eggs" , "r", some_fd)

    elif place == "eggs":
	# that next chunk is ready - get it and proceed...

    elif yadda, yadda, yadda...


One thread, some craftiness needed to construct things.  Seems like it might
isolate some of the statefulness to smaller functional units than a pure
state machine.  Clearly not as clean as continuations would be.  Totally
bogus?  Totally inadequate?  Maybe Sam already does things this way?


Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Tue May 18 19:23:08 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this 
all behave correctly. Since I didn't change the interpreter,
the ceval.c incarnations still had copies to the old frames.
The only effect which I achieved with frame copying was
that the refcounts were increased correctly.

I have to remove the hardware stack copying now.
Will try to create a non-recursive version of the interpreter.

ciao - chris




From MHammond at skippinet.com.au  Wed May 19 01:16:54 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Wed, 19 May 1999 09:16:54 +1000
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>
Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat>

> Sam's needs.  Are there other "solutions"?  We know about
> state machines.
> They have the problem that the number of states grows
> exponentially (?) as
> the number of state variables increases.

Well, I can give you my feeble understanding of "IO Completion Ports", the
technique Win32 provides to "solve" this problem.

My experience is limited to how we used these in a server product designed
to maintain thousands of long-term client connections each spooling large
chunks of data (MSOffice docs - yes, that large :-).  We too could
obviously not afford a thread per connection.  Searching through NT's
documentation, completion ports are the technique they recommend for
high-performance IO, and it appears to deliver.

NT has the concept of a completion port, which in many ways is like an
"inverted semaphore".  You create a completion port with a "max number of
threads" value.  Then, for every IO object you need to use (files, sockets,
pipes etc) you "attach" it to the completion port, along with an integer
key.  This key is (presumably) unique to the file, and usually a pointer to
some structure maintaining the state of the file (i.e., the connection).

The general programming model is that you have a small number of threads
(possibly 1), and a large number of io objects (eg files).  Each of these
threads is executing a state machine.  When IO is "ready" for a particular
file, one of the available threads is woken, and passed the "key"
associated with the file.  This key identifies the file, and more
importantly the state of that file.  The thread uses the state to perform
the next IO operation, then immediately go back to sleep.  When that IO
operation completes, some other thread is woken to handle that state
change.  What makes this work of course is that _all_ IO is asynch - not a
single IO call in this whole model can afford to block.  NT provides asynch
IO natively.
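
The "key identifies the state" idea can be sketched with Python's standard
selectors module.  This is only an analogy: selectors reports readiness
rather than completion, it is not the Win32 completion-port API, and the
socketpair and the state dictionary are assumptions made for the example:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()           # stand-in for a client connection

# Attach per-connection state when registering, much like a completion key.
state = {"name": "conn-1", "received": b""}
sel.register(b, selectors.EVENT_READ, data=state)

a.sendall(b"hello")

# Wake up only when some I/O is ready; the key tells us whose state to use.
for key, events in sel.select(timeout=1):
    conn_state = key.data
    conn_state["received"] += key.fileobj.recv(1024)

print(state)   # {'name': 'conn-1', 'received': b'hello'}

sel.unregister(b)
a.close()
b.close()
sel.close()
```

The handler never needs to know which connection it is serving ahead of
time; whatever was attached at registration comes back with the event.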

This sounds very similar to what Medusa does internally, although the NT
model provides a "thread pooling" scheme built-in.  Although our server
performed very well with a single thread and hundreds of high-volume
connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state
machine I described above implemented in C.  Most of the work is
responsible for spooling the client request data (possibly 100s of kbs)
before handing that data off to the real server.  When the C code
transitions the client through the state of "send/get from the real
server", we actually set a different completion port.  This other
completion port wakes a thread written in Python.  So our architecture
consists of a C implemented thread-pool managing client connections, and a
different Python implemented thread pool that does the real work for each
of these client connections. (The Python side of the world is bound by the
server we are talking to, so Python performance doesn't matter as much - C
wouldn't buy enough.)

This means that our state machines are not that complex.  Each "thread
pool" is managing its own, fairly simple state.  NT automatically allows
you to associate state with the IO object, and as we have multiple thread
pools, each one is simple - the one spooling client data is simple, the one
doing the actual server work is simple.  If we had to have a single,
monolithic state machine managing all aspects of the client spooling, _and_
the server work, it would be horrid.

This is all in a shrink-wrapped relatively cheap "Document Management"
product being targeted (successfully, it appears) at huge NT/Exchange
based sites.  Australia's largest Telco are implementing it, and indeed the
company has VC from Intel!  Lots of support from MS, as it helps compete
with Domino.  Not bad for a little startup - now they are wondering what to
do with this Python-thingy they now have in their product that no one else
has ever heard of; but they are planning on keeping it for now :-)
[Funnily, when they started, they didn't think they even _needed_ a server,
so I said "I'll just knock up a little one in Python", and we haven't looked
back :-]

Mark.




From tim_one at email.msn.com  Wed May 19 02:48:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme
and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.  Is Steven (Majewski) on this list?
He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links.  Lexical
closures with indefinite extent are common in Scheme, so much so that name
resolution is (at least conceptually) best viewed as distinct from execution
stacks.

Here's a key:  continuations are entirely about capturing control flow
state, and nothing about capturing binding or data state.  Indeed, mutating
bindings and/or non-local data are the ways distinct invocations of a
continuation communicate with each other, and for this reason true
functional languages generally don't support continuations of the call/cc
flavor.
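
A rough analogy in present-day Python, as a sketch only: a generator is far
weaker than a call/cc continuation (it can only be resumed forward, from one
place), but it does show the split described above.  What gets saved is the
point of control; the data it touches is shared and is not rolled back.  The
names `counter` and `worker` are made up for the illustration.

```python
# Shared, mutable data: NOT captured when the generator suspends.
counter = {"n": 0}

def worker():
    # The suspended position inside this loop is the control-flow state.
    while True:
        counter["n"] += 1
        yield counter["n"]

g = worker()
first = next(g)        # runs to the first yield; counter["n"] is now 1
counter["n"] = 100     # mutate the shared data while worker is suspended
second = next(g)       # resumes after the yield and sees the mutation

print(first, second)
```

The second resumption picks up where the first left off, yet it observes
the value written in between, which is exactly the channel through which
separate invocations communicate.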

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current
frame's continuation object -- continuations are used internally for
"regular calls" too.  When a function returns, it passes control thru its
continuation object.  That process restores-- from the continuation
object --what the caller needs to know (in concept:  a pointer to *its*
continuation object, its PC, its name-resolution chain pointer, and its
local eval stack).

Another key point:  a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.

The point of the above is to get across that for Scheme-calling-Scheme,
creating a continuation object copies just a small, fixed number of pointers
(the current continuation pointer, the current name-resolution chain
pointer, the PC), plus the local eval stack.  This is for a "stackless"
interpreter that heap-allocates name-mapping and execution-frame and
continuation objects.  Half the literature is devoted to optimizing one or
more of those away in special cases (e.g., for continuations provably
"up-level", using a stack + setjmp/longjmp instead).
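
To make the bookkeeping concrete, here is a minimal sketch in present-day
Python of what such a continuation record might hold.  The class and field
names (`Continuation`, `parent`, `pc`, `env`, `eval_stack`, `capture`) are
invented for illustration and do not describe any particular Scheme
implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass(frozen=True)              # frozen: the record itself is immutable
class Continuation:
    parent: Optional["Continuation"]  # the caller's continuation
    pc: int                           # where to resume
    env: dict                         # name-resolution chain: shared, not copied
    eval_stack: Tuple[Any, ...]       # shallow snapshot of the local eval stack

def capture(parent, pc, env, eval_stack):
    """Copy a fixed number of pointers plus a shallow copy of the eval stack."""
    return Continuation(parent, pc, env, tuple(eval_stack))

env = {"x": 1}
k = capture(None, 7, env, [10, 20])
env["x"] = 2     # later mutation of a binding is visible through k.env
```

Only references are copied, so capturing costs the same regardless of how
deep the call chain is; the bindings stay shared.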

> I also expect that Scheme's semantic model is different than Python
> here -- e.g. does it matter whether deep or shallow copies are made?
> I.e. are there mutable *objects* in Scheme? (I know there are mutable
> and immutable *name bindings* -- I think.)

Same as Python here; Scheme isn't a functional language; has mutable
bindings and mutable objects; any copies needed should be shallow, since
it's "a feature" that invoking a continuation doesn't restore bindings or
object values (see above re communication).

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.

Right, except "stack" is the wrong mental model in the presence of
continuations; it's a general rooted graph (A calls B, B saves a
continuation pointing back to A, B goes on to call A, A saves a continuation
pointing back to B, etc).  If the explicitly saved continuations are never
*invoked*, control will eventually pop back to the root of the graph, so in
that sense there's *a* stack implicit at any given moment.

> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.
>
> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think;
other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to.  (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with
varying semantics, and a quick tour of the web suggests that cross-language
Schemes generally put severe restrictions on continuations across language
boundaries.  Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for <wink> -- but
fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
and "lambdaness" and syntax-extension facilities, I'm unsure they're going
to work out reasonably in Python practice; it's not enough that they can be
very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim





From tismer at appliedbiometrics.com  Wed May 19 03:10:15 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>


Tim Peters wrote:
...

> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics.  This is where
> > I don't know enough (see questions at #2 above).
> 
> I'd like to go back to examples of what they'd be used for <wink> -- but
> fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
> 
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim

I've already put quite a few hours into a non-recursive ceval.c.
Should I continue? At least this would be a little
improvement, even if the continuation thing is never born.

- chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From rushing at nightmare.com  Wed May 19 04:52:04 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:
 > Can exceptions be coerced into providing the necessary structure
 > without botching up the application too badly?  Seems that at some
 > point where you need to do some I/O, you could raise an exception
 > whose second expression contains the necessary state to get back to
 > where you need to be once the I/O is ready to go.  The controller
 > that catches the exceptions would use select or poll to prepare for
 > the I/O then dispatch back to the handlers using the information
 > from exceptions.

 > [... code ...]

Well, you just re-invented the 'Reactor' pattern! 8^)

http://www.cs.wustl.edu/~schmidt/patterns-ace.html

 > One thread, some craftiness needed to construct things.  Seems like
 > it might isolate some of the statefulness to smaller functional
 > units than a pure state machine.  Clearly not as clean as
 > continuations would be.  Totally bogus?  Totally inadequate?  Maybe
 > Sam already does things this way?

What you just described is what Medusa does (well, actually, 'Python'
does it now, because the two core libraries that implement this are
now in the library - asyncore.py and asynchat.py).  asyncore doesn't
really use exceptions exactly that way, and asynchat allows you to add 
another layer of processing (basically, dividing the input into
logical 'lines' or 'records' depending on a 'line terminator').

The same technique is at the heart of many well-known network servers,
including INND, BIND, X11, Squid, etc..  It's really just a state
machine underneath (with python functions or methods implementing the
'states').  As long as things don't get too complex.  Python
simplifies things enough to allow one to 'push the difficulty
envelope' a bit further than one could reasonably tolerate in C.  For
example, Squid implements async HTTP (server and client, because it's
a proxy) - but stops short of trying to implement async FTP.  Medusa
implements async FTP, but it's the largest file in the Medusa
distribution, weighing in at a hefty 32KB.
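
The shape of such a loop can be sketched in a few lines of present-day
Python.  This is not Medusa or asyncore code; it uses the standard
`selectors` and `socket` modules, and the names `run_once`, `on_readable`
and `received` are made up for the example.

```python
import selectors
import socket

def run_once(selector, timeout=1.0):
    """One pass of the event loop: wait for ready sockets, call their handlers."""
    for key, _events in selector.select(timeout):
        handler = key.data          # the callback registered for this socket
        handler(key.fileobj)

received = []

def on_readable(sock):
    # One 'state' of the state machine: consume whatever has arrived.
    received.append(sock.recv(1024))

sel = selectors.DefaultSelector()
a, b = socket.socketpair()          # a connected pair, standing in for a client
a.setblocking(False)
sel.register(a, selectors.EVENT_READ, on_readable)

b.sendall(b"hello\n")
run_once(sel)                       # dispatches to on_readable(a)

sel.unregister(a)
a.close()
b.close()
sel.close()
```

Every handler has to return to the loop before anything else can run, which
is why the per-connection state ends up in objects rather than on the stack.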

The hard part comes when you want to plug different pieces and
protocols together.  For example, building a simple HTTP or FTP server
is relatively easy, but building an HTTP server *that proxied to an
FTP server* is much more difficult.  I've done these kinds of things,
viewing each as a challenge; but past a certain point it boggles.

The paper I posted about earlier by Matthew Fuchs has a really good
explanation of this, but in the context of GUI event loops... I think
it ties in neatly with this discussion because at the heart of any X11
app is a little guy manipulating a file descriptor.

-Sam




From tim_one at email.msn.com  Wed May 19 07:41:39 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later.  [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations
valid only during let/cc's dynamic extent, so that, sure, you could store
them away, but trying to invoke one later could be an error.  It's in that
sense I meant they weren't "first class".

Other flavors of Scheme appear to call this concept "weak continuation", and
use a different verb to invoke it (like call-with-escaping-continuation, or
call/ec).  Suspect the let/cc oddballs I found were simply confused
implementations (there are a lot of amateur Scheme implementations out
there!).
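
An escape-only continuation of this kind can be imitated in Python with an
exception, as in the sketch below.  `call_ec` here is a hypothetical helper
written for this illustration, not a Scheme or Python library function; the
escape works only while the call that created it is still active.

```python
def call_ec(func):
    """Call func(escape).  Calling escape(v) makes call_ec return v at once,
    but only during call_ec's dynamic extent."""
    active = True

    class Escape(Exception):        # a fresh class per call, so nested uses don't clash
        pass

    def escape(value=None):
        if not active:
            raise RuntimeError("escape used outside its dynamic extent")
        raise Escape(value)

    try:
        return func(escape)
    except Escape as exc:
        return exc.args[0]
    finally:
        active = False

def first_negative(items):
    def body(escape):
        for x in items:
            if x < 0:
                escape(x)           # jump straight out of the loop and of body
        return None
    return call_ec(body)

saved = []
call_ec(saved.append)               # store the escape procedure away...
try:
    saved[0](1)                     # ...and invoking it later is an error
    late_use_failed = False
except RuntimeError:
    late_use_failed = True
```

The stored procedure is still an ordinary object, but it is no longer
usable once its creating call has returned.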

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be.  But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because
they already understand them -- Jeremy can even reach across the hall and
hand Guido a helpful book <wink>.  So pondering coroutines increases the
number of brain cells willing to think about the implementation.

continuation-examples-leave-people-still-going-"huh?"-after-an-
    hour-of-explanation-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here?  I hope you see the same behavior
>> without the "global a";
[which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had copies to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right!  Now you're closer to the real solution <wink>; i.e., copying
wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
when we entered main originally a set of bindings was created for its
locals, and it is that very same set of bindings to which the continuation
returns.  So the continuation *should* reuse them -- making a copy of the
locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info
(bindings follow a chain of static links, independent of the current "call
stack"), so there's no temptation to run off copying bindings too.

elegant-and-baffling-for-the-price-of-one<wink>-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 <wink>?

> Should I continue? At least this would be a little improvement, even
> if the continuation thing is never born.

Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim





From arw at ifu.net  Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (e.g., list.sort calls back to
myclass.__cmp__, which decides to do a call/cc)?  One of the really
big advantages of Python in my book is the relative simplicity of
embedding and extensions, and this is generally one of the failings of
Lisp implementations.  I understand lots of Scheme implementations purport
to be extensible and embeddable, but in practice you can't do it with
*existing* code -- there is always a show stopper involving having to
change the way some Oracle library which you don't have the source for
does memory management or something... I've known several grad students
who have been bitten by this...  I think having to unroll the C stack
safely might be one problem area.

With, e.g., a Netscape NSAPI embedding you can actually get into Netscape
code calls my code calls Netscape code calls my code... suspends in a
continuation?  How would that work?  [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least
it's better understood, I think.  Just kvetching.  Sorry.
   -- Aaron Watters

ps: Of course there are valid reasons and excellent advantages
to having continuations, but it's also interesting to consider the
possible cost.  There ain't no free lunch.





From tismer at appliedbiometrics.com  Wed May 19 21:30:18 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 21:30:18 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
Message-ID: <3743114A.220FFA0B@appliedbiometrics.com>


Tim Peters wrote:
...
> [Christian]
> > Actually, the frame-copying was not enough to make this
> > all behave correctly. Since I didn't change the interpreter,
> > the ceval.c incarnations still had copies to the old frames.
> > The only effect which I achieved with frame copying was
> > that the refcounts were increased correctly.
> 
> All right!  Now you're closer to the real solution <wink>; i.e., copying
> wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
> when we entered main originally a set of bindings was created for its
> locals, and it is that very same set of bindings to which the continuation
> returns.  So the continuation *should* reuse them -- making a copy of the
> locals is semantically hosed.

I tried the most simple thing, and this seemed to be duplicating
the current state of the machine. The frame holds the stack,
and references to all objects.
By chance, the locals are not in a dict, but unpacked into
the frame. (Sometimes I agree with Guido, that optimization
is considered harmful :-)

> This is clearer in Scheme because its "stack" holds *only* control-flow info
> (bindings follow a chain of static links, independent of the current "call
> stack"), so there's no temptation to run off copying bindings too.

The Python stack, besides its intermingledness with the machine
stack, is basically its chain of frames. The value stack pointer
still hides in the machine stack, but that's easy to change.
So the real Scheme-like part is this chain, methinks, with
the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase
the refcount of all objects in a frame. Would it be correct
to undo the effect of fast locals before splitting, and redoing
it on activation?

Or do I need to rethink the whole structure? What should
be natural for Python, if at all?

ciao - chris




From jeremy at cnri.reston.va.us  Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
	<3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  [Tim Peters]
  >> This is clearer in Scheme because its "stack" holds *only*
  >> control-flow info (bindings follow a chain of static links,
  >> independent of the current "call stack"), so there's no
  >> temptation to run off copying bindings too.

  CT> The Python stack, besides its intermingledness with the machine
  CT> stack, is basically its chain of frames. The value stack pointer
  CT> still hides in the machine stack, but that's easy to change.  So
  CT> the real Scheme-like part is this chain, methinks, with the
  CT> current bytecode offset and value stack info.

  CT> Making a copy of this in a restartable way means to increase the
  CT> refcount of all objects in a frame. Would it be correct to undo
  CT> the effect of fast locals before splitting, and redoing it on
  CT> activation?

Wouldn't it be easier to increase the refcount on the frame object?
Then you wouldn't need to worry about the refcounts on all the objects
in the frame, because they would only be decrefed when the frame is
deallocated. 
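
The effect can be observed from Python itself.  The sketch below is for
present-day CPython and relies on `sys._getframe`, which is an
implementation detail; `Payload` and `make_frame` are names invented for the
example.  It shows only that a single reference to the frame keeps the
frame's locals alive, not how a restartable frame would be built.

```python
import gc
import sys
import weakref

class Payload:
    pass

def make_frame():
    payload = Payload()                 # a local owned by this frame
    probe = weakref.ref(payload)        # lets us watch it without owning it
    return sys._getframe(), probe       # hand out a reference to our own frame

frame, probe = make_frame()
alive_while_frame_held = probe() is not None   # the frame still owns payload

del frame                               # drop the one reference to the frame
gc.collect()
alive_after_release = probe() is not None      # frame gone, so payload is gone
```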

It seems like the two other things you would need are some way to get
a copy of the current frame and a means to invoke eval_code2 with an
already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong.  I'm just not sure
where.  Is the problem that you really need a separate stack/graph to
hold the frames?  If we leave them on the Python stack, it could be
hard to dis-entangle value objects from control objects.)

Jeremy



From tismer at appliedbiometrics.com  Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
		<3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>


Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are
two incarnations of interpreters working on it: the original one,
and later, when it is thrown, another one (or the same one, in
principle).
The frame could have been in any state, with a couple
of objects on the stack. My splitting function can be invoked
in some nested context, so I have a current opcode position,
and a current stack position.
Running this once leaves the stack empty, since all the objects are
decrefed. Running this a second time gives a GPF, since the stack is
empty.
Therefore, I made a copy, which means creating a duplicate frame
with an extra refcount for all the objects. This makes sure
that both can be restarted at any time.

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong.  I'm just not sure
> where.  Is the problem that you really need a separate stack/graph to
> hold the frames?  If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit more clearly?
What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment:
The stack is the linked list of frames. Every frame has a
local Python evaluation stack. Calls of Python functions produce
a new frame, and the old one is put beneath. This is the control
stack. The additional info on the hardware stack happens to be
a parallel friend of this chain, and currently holds extra info,
but this is an artifact. Adding the current Python stack level
to the frame makes the hardware stack totally unnecessary.
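
The chain of frames is visible from Python.  As a small sketch for
present-day CPython (again using the implementation-specific
`sys._getframe`; the function names are made up), each frame's `f_back`
points to the frame beneath it:

```python
import sys

def frame_chain_names():
    """Walk from the current frame down to the bottom via f_back."""
    frame = sys._getframe()
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back        # the caller's frame, or None at the bottom
    return names

def inner():
    return frame_chain_names()

def outer():
    return inner()

chain = outer()
```

The first entries of `chain` are the three functions above, innermost
first; whatever called `outer` follows.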

There is a possible speed loss, anyway.
Today, the recursive call of eval_code2 is optimized and quite
fast. The non-recursive version will have to copy variables
in and out from the frames, instead, so there is of course
a little speed penalty to pay.

ciao - chris




From tismer at appliedbiometrics.com  Wed May 19 23:38:07 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
> 
> Does that mean 6 or 600 <wink>?

6, or 10, or 20, if I count the time from the first
start with Sam's code, maybe.

> 
> > Should I continue? At least this would be a little improvement, even
> > if the continuation thing is never born.
> 
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
> 
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim

Right. Whose faces? :-)

On the stackless thing, what should I do?
I started to insert minimum patches, but it turns out
that I have to change frames a little (extending).

I can make quite small changes to the interpreter to replace
the recursive calls, but this involves extra flags in some cases,
where the interpreter is called the first time and so on.

What has more probability of being included in a future Python:
tweaking the current thing only minimally, to keep it as similar
as possible to the former?
Or doing as much redesign as I think is needed to do it in
a clean way? This would mean splitting eval_code2 into two functions,
where one is the interpreter kernel, and one is the frame manager.

There are also other places which do quite deep function calls
and finally call eval_code2. I think these should return a frame
object now. I could convince them to call or return frame,
depending on a flag, but it would be clean to rename the functions,
let them always deal with frames, and put the original function
on top of it.

In short, I can make larger changes which clean this all up a bit,
or I can make small changes which are more tricky to grasp,
but give just small diffs.

How best to touch untouchable code? :-)




From jeremy at cnri.reston.va.us  Wed May 19 23:49:38 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
	<37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear in order to
minimize the size of the patch or the diff.  Realistically, it's
unlikely that anything like your original patch is going to make it
into the CVS tree.  Its primary value is as proof of concept and as
code that the rest of us can try out.  If you make large changes, but
they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the
codebase later, after everyone has figured out what's going on and
agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy




From tismer at appliedbiometrics.com  Thu May 20 00:25:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
		<37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> I think it makes sense to avoid being obscure or unclear in order to
> minimize the size of the patch or the diff.  Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree.  Its primary value is as proof of concept and as
> code that the rest of us can try out.  If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice.
I will make absolutely clear what's going on, keep
parts as untouched as possible, cut out parts which must
change, and I will not look into speed too much.

Better have a function call more and a bit less optimization,
but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase later, after everyone has figured out what's going on and
> agrees that it's worth doing.
> 
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the 
interpreter happens to have the name "continuation".
Maybe I'd better rename it to "activation record"?

Now, there is no longer a recursive call. Instead, a frame
object is returned, which is waiting to be activated
by a dispatcher.

Some more ideas are popping up. Right now, only the recursive
calls can vanish. Callbacks from C code which is called by
the interpreter which is called by... are still a problem.

But it might perhaps vanish completely. We have to see
how much the cost is. But if I can manage to let the interpreter
duck and cover also on every call to a builtin? The interpreter
again returns to the dispatcher which then calls the builtin.
Well, if that builtin happens to call to the interpreter again,
it will be a dispatcher again. The machine stack grows a little,
but since everything is saved in the frames, these stacks are
no longer related. This means, the principle works with existing
extension modules, since interpreter-world and C-stack world
are decoupled.
To avoid stack growth, of course a number of builtins would
be better changed, but it is no must in the first place.
execfile for instance is a candidate which needn't call the
interpreter. It could equally parse the file, generate the
code object, build a frame and just return it. This is what
the dispatcher likes: returned frames are put on the chain
and fired.
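
The dispatcher idea can be sketched at the Python level, with generators
standing in for frames.  This is an illustration of the principle only, in
present-day Python, and not the ceval.c change itself; `countdown` and
`dispatch` are names invented for the example.  A "function" never calls
another one directly: it hands the callee back to the dispatcher, which
keeps the chain and fires whatever is on top.

```python
def countdown(n):
    # Yielding a generator means "call this and send me its result".
    if n == 0:
        return 0
    below = yield countdown(n - 1)
    return below + 1

def dispatch(root):
    """Run root to completion without nesting Python-level calls."""
    chain = [root]              # suspended activations, newest last
    result = None
    while chain:
        try:
            request = chain[-1].send(result)   # resume the newest activation
        except StopIteration as done:
            chain.pop()                         # it returned: pass value down
            result = done.value
        else:
            chain.append(request)               # it asked for a call: fire it
            result = None
    return result

depth = dispatch(countdown(50000))
```

A nesting depth of 50000 would overflow the default recursion limit if
written as ordinary recursive calls; here the chain lives in a list and the
machine stack stays flat.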

waah, my bus - running - ciao - chris




From tim_one at email.msn.com  Thu May 20 01:56:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 19:56:33 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000701bea253$3a182a00$179e2299@tim>

I'm home sick today, so tortured myself <0.9 wink>.

Sam mentioned using coroutines to compare the fringes of two trees, and I
picked a simpler problem:  given a nested list structure, generate the leaf
elements one at a time, in left-to-right order.  A solution to Sam's problem
can be built on that, by getting a generator for each tree and comparing the
leaves a pair at a time until there's a difference.

Attached are solutions in Icon, Python and Scheme.  I have the least
experience with Scheme, but browsing around didn't find a better Scheme
approach than this.

The Python solution is the least satisfactory, using an explicit stack to
simulate recursion by hand; if you didn't know the routine's purpose in
advance, you'd have a hard time guessing it.

The Icon solution is very short and simple, and I'd guess obvious to an
average Icon programmer.  It uses the subset of Icon ("generators") that
doesn't require any C-stack trickery.  However, alone of the three, it
doesn't create a function that could be explicitly called from several
locations to produce "the next" result; Icon's generators are tied into
Icon's unique control structures to work their magic, and breaking that
connection requires moving to full-blown Icon coroutines.  It doesn't need
to be that way, though.

The Scheme solution was the hardest to write, but is a largely mechanical
transformation of a recursive fringe-lister that constructs the entire
fringe in one shot.  Continuations are used twice:  to enable the recursive
routine to resume itself where it left off, and to get each leaf value back
to the caller.  Getting that to work required rebinding non-local
identifiers in delicate ways.  I doubt the intent would be clear to an
average Scheme programmer.

So what would this look like in Continuation Python?  Note that each place
the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and
up-level references are very common.  Two functions are defined at top
level, but seven more at various levels of nesting; the latter can't be
pulled up to the top because they refer to vrbls local to the top-level
functions.  Another (at least initially) discouraging thing to note is that
Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro
facilities.

may-not-be-as-fun-as-it-sounds<wink>-ly y'rs  - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()

            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break

        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]

for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f) ; set to return-to continuation
             (looper
              (lambda (x)
                (cond
                  ((null? x) 'nada) ; ignore null
                  ((list? x)
                   (looper (car x))
                   (looper (cdr x)))
                  (else
                   ; want to produce this non-list fringe elt,
                   ; and also resume here
                   (call/cc
                    (lambda (here)
                      (set! getnext
                            (lambda () (here 'keep-going)))
                      (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}

      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin
                      (display thiselt) (display " ")
                      (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)





From MHammond at skippinet.com.au  Thu May 20 02:14:24 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has
reminded me of one of my biggest gripes about Python.  When I say "biggest
gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic.  However, when I am
debugging Python, it seems to lose this.  There is no way for me to
effectively change a running program.  Now with VC6, I can do this with C.
Although it is slow and a little dumb, I can change the C side of my Python
world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running
under the debugger.  Presumably this would require a way of recompiling the
current block of code, patching this code back into the object, and somehow
tricking the stack frame to use this new block of code; even if a first-cut
had to restart the block or somesuch...

Any thoughts on this?

Mark.




From tim_one at email.msn.com  Thu May 20 04:41:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave
them alone <wink>.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack".  The latter
means the C stack?  Then I don't know which values you have in mind that are
hiding in it (the locals are, as you say, unpacked in the frame, and the
evaluation stack too).  By "evaluation stack" I mean specifically
f->f_valuestack; the current *top* of stack pointer (specifically
stack_pointer) lives in the C stack -- is that what we're talking about?
Whichever, when we're talking about the code, let's use the names the code
uses <wink>.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in
order to support tracing.  So if capturing a continuation is done via a
function call (hard to see any other way it could be done <wink>), a
bytecode offset is already getting saved in the frame object.
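
A small sketch of where that saved offset can be seen from Python code,
assuming a present-day CPython in which sys._getframe and the frame attributes
f_lasti and f_back are available.  The helper name capture_points is made up
for illustration, and nothing here captures a continuation; it only reads the
state that a capture would need.

```python
import sys

def capture_points():
    """Return (function name, f_lasti) for each frame on the caller's chain."""
    frame = sys._getframe(1)      # the frame that called this helper
    points = []
    while frame is not None:
        points.append((frame.f_code.co_name, frame.f_lasti))
        frame = frame.f_back      # walk toward the outermost frame
    return points

def inner():
    return capture_points()

def outer():
    return inner()

chain = outer()
print(chain[:2])   # entries for inner and outer, each with its bytecode offset
```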

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think.
Whichever part the locals live in should not be copied at all, but merely
have its (single) refcount increased.  The other part hinges on details of
your approach I don't know.  The nastiest part seems to be f->f_valuestack,
which conceptually needs to be (shallow) copied in the current frame and in
all other frames reachable from the current frame's continuation (the chain
rooted at f->f_back today); that's the sum total (along with the same
frames' bytecode offsets) of capturing the control flow state.
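
A toy model of that bookkeeping, offered as a sketch only.  ToyFrame is a
hypothetical stand-in that borrows the field names f_back, f_lasti and
f_valuestack; it is not CPython's frame layout, and capture and invoke are
invented names.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame object; not CPython's real layout."""
    def __init__(self, name, back=None):
        self.name = name
        self.f_back = back        # the caller's frame
        self.f_lasti = 0          # bytecode offset to resume at
        self.f_valuestack = []    # evaluation stack
        self.locals = {}          # shared with every snapshot, never copied

def capture(frame):
    """Snapshot control state for frame and every frame reachable via f_back."""
    snapshot = []
    while frame is not None:
        # shallow copy of the evaluation stack; locals are only referenced
        snapshot.append((frame, frame.f_lasti, list(frame.f_valuestack)))
        frame = frame.f_back
    return snapshot

def invoke(snapshot):
    """Restore offsets and evaluation stacks; locals are deliberately left alone."""
    for frame, lasti, stack in snapshot:
        frame.f_lasti = lasti
        frame.f_valuestack = list(stack)   # copy again so the snapshot stays reusable
    return snapshot[0][0]

caller = ToyFrame("caller")
callee = ToyFrame("callee", back=caller)
callee.f_lasti, callee.f_valuestack, callee.locals["x"] = 12, ["a"], 1

k = capture(callee)
callee.f_lasti = 30
callee.f_valuestack.append("b")
callee.locals["x"] = 2

resumed = invoke(k)
print(resumed.f_lasti, resumed.f_valuestack, resumed.locals["x"])  # 12 ['a'] 2
```

The offset and evaluation stack go back to their captured values, while the
local x keeps the value it was last given.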

> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason
for doing anything to the locals.  Their values aren't *supposed* to get
restored upon continuation invocation, so there's no reason to do anything
with their values upon continuation creation either.  Right?  Or are we
talking about different things?

almost-as-good-as-pantomimem<wink>-ly y'rs  - tim





From rushing at nightmare.com  Thu May 20 06:04:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
 > The Scheme solution was the hardest to write, but is a largely
 > mechanical transformation of a recursive fringe-lister that
 > constructs the entire fringe in one shot.  Continuations are used
 > twice: to enable the recursive routine to resume itself where it
 > left off, and to get each leaf value back to the caller.  Getting
 > that to work required rebinding non-local identifiers in delicate
 > ways.  I doubt the intent would be clear to an average Scheme
 > programmer.

It's the only way to do it - every example I've seen of using call/cc
looks just like it.

I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
people.  The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))

    (define (looper x)
      (cond ((null? x) 'nada)
	    ((list? x)
	     (looper (car x))
	     (looper (cdr x)))
	    (else
	     (call/cc
	      (lambda (here)
		(set! getnext (lambda () (here 'keep-going)))
		(produce-value x))))))

    (define (getnext)
      (looper x)
      (produce-value #f))

    (lambda ()
      (call/cc
       (lambda (k)
	 (set! produce-value k)
	 (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
	  (begin
             (display elt)
             (display " ")
             (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))
(display-fringe test-case)

 > So what would this look like in Continuation Python?

Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
do this without having the feature to play with.  This presumes a
function "call_cc" that behaves like Scheme's.  I believe the extra
level of indirection is necessary. (i.e., call_cc takes a function as
an argument that takes a continuation function)

class list_generator:

    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables that hold continuations have a 'k_' prefix.  In real life it
might be possible to put the suspend/call/resume machinery in a base
class (Generator?), and override 'walk' as you please.

-Sam




From tim_one at email.msn.com  Thu May 20 09:21:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam!  I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.

Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
> people.  The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, and largely
because the std decreed that *so* many forms were optional.  Your rework is
certainly nicer, but internal defines and named let are two that R4RS
refused to require, so I always avoided them.  BTW, I *am* a compiler, so
that never bothered me <wink>.

>> So what would this look like in Continuation Python?

> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
> do this without having the feature to play with.

Fully understood.  It's also really hard to implement the feature without
knowing how someone who wants it would like it to behave.  But I don't think
anyone is getting graded on this, so let's have fun <wink>.

Ack!  I have to sleep.  Will study the code in detail later, but first
impression was it looked good!  Especially nice that it appears possible to
package up most of the funky call_cc magic in a base class, so that
non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-
    from-scratch-every-time-ly y'rs  - tim





From skip at mojam.com  Thu May 20 15:27:59 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv>
	<14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

    Sam> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
    Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

    >> So what would this look like in Continuation Python?

    Sam> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
    Sam> do this without having the feature to play with.

The thought that it's unlikely one could arrive at a reasonable
approximation of a correct solution for such a small problem without the
ability to "play with" it is sort of scary.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Thu May 20 16:10:32 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 16:10:32 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <008b01bea255$b80cf790$0801a8c0@bobcat>
Message-ID: <374417D8.8DBCB617@appliedbiometrics.com>


Mark Hammond wrote:
> 
> All this talk about stack frames and manipulating them at runtime has
> reminded me of one of my biggest gripes about Python.  When I say "biggest
> gripe", I really mean "biggest surprise" or "biggest shame".
> 
> That is, Python is very interactive and dynamic.  However, when I am
> debugging Python, it seems to lose this.  There is no way for me to
> effectively change a running program.  Now with VC6, I can do this with C.
> Although it is slow and a little dumb, I can change the C side of my Python
> world while my program is running, but not the Python side of the world.
> 
> I'm wondering how feasible it would be to change Python code _while_ running
> under the debugger.  Presumably this would require a way of recompiling the
> current block of code, patching this code back into the object, and somehow
> tricking the stack frame to use this new block of code; even if a first-cut
> had to restart the block or somesuch...
> 
> Any thoughts on this?

I'm writing a prototype of a stackless Python, which means that
you will be able to access the current state of the interpreter
completely.
The inner interpreter loop will be isolated from the frame
dispatcher. It will break whenever the ticker goes zero.
If you set the ticker to one, you will be able to single
step on every opcode, have the value stack, the frame chain,
everything.
I think you can do quite a lot with this.
But tell me if you want a callback hook somewhere.
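
A toy sketch of that arrangement, under heavy assumptions: the two-opcode
instruction set, the dictionary used as a frame, and the names run_slice and
single_step are all invented here and say nothing about how the prototype is
actually built.

```python
def run_slice(frame, ticker):
    """Inner loop: execute up to `ticker` opcodes, then hand control back."""
    code, stack = frame["code"], frame["stack"]
    while ticker > 0 and frame["pc"] < len(code):
        op, arg = code[frame["pc"]]
        frame["pc"] += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        ticker -= 1
    return frame["pc"] < len(code)   # True while work remains

def single_step(frame):
    """Dispatcher: a ticker of one makes every opcode boundary observable."""
    trace = []
    more = True
    while more:
        more = run_slice(frame, ticker=1)
        trace.append((frame["pc"], list(frame["stack"])))
    return trace

frame = {"code": [("PUSH", 2), ("PUSH", 3), ("ADD", None)], "pc": 0, "stack": []}
trace = single_step(frame)
print(trace)   # [(1, [2]), (2, [2, 3]), (3, [5])]
```

All interpreter state lives in the frame rather than in the C-level locals of
the loop, which is what lets the dispatcher inspect it between slices.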

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Thu May 20 18:52:21 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
> 
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
> 
> I don't see that the locals are a problem here -- provided you simply leave
> them alone <wink>.

This depends on whether I have to duplicate frames
or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
> 
> Right.
> 
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
> 
> I'm not sure what "value stack" means here, or "machine stack".  The latter
> means the C stack?  Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).  By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses <wink>.

The evaluation stack pointer is a local variable in the
C stack and must be written to the frame to become independent
from the C stack. Sounds better now?

> 
> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
> 
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing.  So if capturing a continuation is done via a
> function call (hard to see any other way it could be done <wink>), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
> 
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the
current state, I make it two frames to have two current states.
One will be saved, the other will be run. This is what I call
"splitting".
Actually, splitting must occur whenever a frame can be reached twice,
in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased.  The other part hinges on details of
> your approach I don't know.  The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two
incarnations. Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
> 
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals.  Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either.  Right?  Or are we
> talking about different things?

Let me explain. What Python does right now is:
When a function is invoked, all local variables are copied
into fast_locals, well of course just references are copied
and counts increased. These fast locals give a lot of speed
today, we must have them.
You are saying I have to share locals between frames. Besides
that will be a reasonable slowdown, since an extra structure
must be built and accessed indirectly (right now, it's all fast,
living in the one frame buffer), I cannot say that I'm convinced
that this is what we need.

Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot  # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my
snapshot. But with shared locals, my parameter x of the
snapshot would have changed to x+1, which I don't find useful.
I want to fix a state of the current frame and still think
it should "own" its locals. Globals are borrowed, anyway.
Class instances will anyway do what you want, since
the local "self" is a mutable object.

How do you want to keep computations independent
when locals are shared? For me it's just easier to
implement and also to think with the shallow copy.
Otherwise, where is my private place?
Open for becoming convinced, of course :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Thu May 20 21:26:30 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Thu, 20 May 1999 15:26:30 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
References: <000901bea26a$34526240$179e2299@tim>
	<37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  CT> What I want to achieve is that I can run this again, from my
  CT> snapshot. But with shared locals, my parameter x of the snapshot
  CT> would have changed to x+1, which I don't find useful.  I want to
  CT> fix a state of the current frame and still think it should "own"
  CT> its locals. Globals are borrowed, anyway.  Class instances will
  CT> anyway do what you want, since the local "self" is a mutable
  CT> object.

  CT> How do you want to keep computations independent when locals are
  CT> shared? For me it's just easier to implement and also to think
  CT> with the shallow copy.  Otherwise, where is my private place?
  CT> Open for becoming convinced, of course :-)

I think you're making things a lot more complicated by trying to
instantiate new variable bindings for locals every time you create a
continuation.  Can you give an example of why that would be helpful?
(Ok.  I'm not sure I can offer a good example of why it would be
helpful to share them, but it makes intuitive sense to me.)

The call_cc mechanism is going to let you capture the current
continuation, save it somewhere, and call on it again as often as you
like.  Would you get a fresh locals each time you used it?  or just
the first time?  If only the first time, it doesn't seem that you've
gained a whole lot.

Also, all the locals that are references to mutable objects are
already effectively shared.  So it's only a few oddballs like ints
that are an issue.
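
A short sketch of that distinction, using a plain dictionary as a stand-in
for a frame's locals (the names frame_locals and snapshot are illustrative):

```python
import copy

frame_locals = {"count": 1, "items": ["a"]}
snapshot = copy.copy(frame_locals)   # shallow copy: new dict, same objects inside

frame_locals["items"].append("b")    # mutation: visible through both dicts
frame_locals["count"] += 1           # rebinding an int: visible in one dict only

print(snapshot)       # {'count': 1, 'items': ['a', 'b']}
print(frame_locals)   # {'count': 2, 'items': ['a', 'b']}
```

Only the rebound integer differs between the two copies; the list is shared
either way.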

Jeremy



From tim_one at email.msn.com  Fri May 21 00:04:04 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:04 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>
Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it.  Most likely wrong.  It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is.  But while the problem is small, it's not easy, and only the Icon
solution wrote itself (not a surprise -- Icon was designed for expressing
this kind of algorithm, and the entire language is actually warped towards
it).  My first stab at the Python stack-fiddling solution had bugs too, but
I conveniently didn't post that <wink>.

After studying Sam's code, I expect it *would* work as written, so it's a
decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using

    Demo/threads/Generator.py

from the source distribution.  To make that a fair comparison, I would have
to post the supporting machinery from Generator.py too -- and we can ask
Guido whether Generator.py worked right the first time he tried it <wink>.

The continuation solution is subtle, requiring real expertise; but the
threads solution doesn't fare any better on that count (building the support
machinery with threads is also a baffler if you don't have thread
expertise).  If we threw Python metaclasses into the pot too, they'd be a
third kind of nightmare for the non-expert.

So, if you're faced with this kind of task, there's simply no easy way to
get it done.  Thread- and (it appears) continuation- based machinery can be
crafted once by an expert, then packaged into an easy-to-use protocol for
non-experts.

All in all, I view continuations as a feature most people should actively
avoid!  I think it has that status in Scheme too (e.g., the famed Schemer's
SICP textbook doesn't even mention call/cc).  Its real value (if any <wink>)
is as a Big Invisible Hammer for certified wizards.  Where call_cc leaks
into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and
hides the call_cc business.  Do enough of this, and we'll rediscover why
Scheme demands that tail calls not push a new stack frame <0.9 wink>.
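
A sketch of such a protocol.  Since call_cc does not exist, this version swaps
in a worker thread and a one-slot queue to get the same user-visible
behavior; it is not the code from Demo/threads/Generator.py, and the class
names, run, and _drive are hypothetical.  Only walk and put follow the shape
shown above.

```python
import queue
import threading

class Generator:
    """Hypothetical base class: subclasses override run() and call self.put()."""
    _DONE = object()                      # marks the end of the sequence

    def __init__(self):
        self._results = queue.Queue(maxsize=1)
        threading.Thread(target=self._drive, daemon=True).start()

    def _drive(self):
        try:
            self.run()
        finally:
            self._results.put(self._DONE)  # always signal the end, even on error

    def put(self, value):
        self._results.put(value)          # blocks until the previous value is taken

    def __iter__(self):
        return self

    def __next__(self):
        value = self._results.get()
        if value is self._DONE:
            self._results.put(self._DONE)  # keep later calls from blocking
            raise StopIteration
        return value

class FringeGenerator(Generator):
    def __init__(self, tree):
        self.tree = tree
        Generator.__init__(self)          # starts the worker, so set tree first

    def run(self):
        self.walk(self.tree)

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

print(list(FringeGenerator([[1, [[2, 3]]], [4], [], [[[5]], 6]])))
# [1, 2, 3, 4, 5, 6]
```

The subclass author writes an ordinary recursive walk and never sees the
suspension machinery, which is the point of the protocol.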

the-tradeoffs-are-murky-ly y'rs  - tim





From tim_one at email.msn.com  Fri May 21 00:04:09 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]
> ...
> If a frame is the current state, I make it two frames to have two
> current states.  One will be saved, the other will be run. This is
> what I call "splitting".  Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute:  if a frame can be reached by more than one path,
its refcount must be at least equal to the number of its immediate
predecessors, and its refcount won't fall to 0 before it becomes
unreachable.  So while you may need to split stuff for *some* reasons, I
can't see how keeping elements alive could be one of those reasons (unless
you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does.  Since they've been doing this forever, I
don't want to toss their semantics on a whim <wink>.  It's at least a
conceptual thing:  why *should* locals follow different rules than globals?
If Python2 grows lexical closures, the only thing special about today's
"locals" is that they happen to be the first guys found on the search path.
Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

(define k #f)

(define (printi i)
  (display "i is ") (display i) (newline))

(define (test n)
  (let ((i n))
    (printi i)
    (set! i (- i 1))
    (printi i)
    (display "saving continuation") (newline)
    (call/cc (lambda (here) (set! k here)))
    (set! i (- i 1))
    (printi i)
    (set! i (- i 1))
    (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops.
Here's some output:

> (test 5)
i is 5
i is 4
saving continuation
i is 3
i is 2
> (k #f)
i is 1
i is 0
> (k #f)
i is -1
i is -2
> (k #f)
i is -3
i is -4
>

So there's no question about what Scheme thinks is proper behavior here.

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset
indexing.

> You are saying I have to share locals between frames. Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't
care where that points *to* <wink -- but, really, it could point into some
other frame and ceval2 wouldn't know the difference>.  Maybe a frame entered
due to continuation needs extra setup work?  Scheme saves itself by putting
name-resolution and continuation info into different structures; to mimic
the semantics, Python would need to get the same end effect.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
> def f(x):
>     # do something
>     ...
>     # in some context, wanna have a snapshot
>     global snapshot  # initialized to None
>     if not snapshot:
>         snapshot = callcc.new()
>     # continue computation
>     x = x+1
>     ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here:  the use of
call/cc is subtle, hinging on details, and fragments ignore too much.  If
you do want the same x,

    commonx = x
    if not snapshot:
         # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it.  But if you *do* want to see changes to the
locals (which is one way for those distinct continuation invocations to
*cooperate* in solving a task -- see below), but the implementation doesn't
allow for it, I don't know what you can do to worm around it short of making
x global too.  But then different *top* level invocations of f will stomp on
that shared global, so that's not a solution either.  Maybe forget functions
entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops:
communication among "iterations" is via function arguments or up-level
lexical vrbls.

So recall your uses of Icon generators instead:  like Python, Icon does have
loops, and two-level scoping, and I routinely build loopy Icon generators
that keep state in locals.  Here's a dirt-simple example I emailed to Sam
earlier this week:

procedure main()
    every result := fib(0, 1) \ 10 do
        write(result)
end

procedure fib(i, j)
    local temp
    repeat {
        suspend i
        temp := i + j
        i := j
        j := temp
    }
end

which prints

0
1
1
2
3
5
8
13
21
34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would
generate a zero followed by an infinite sequence of ones(!).
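
For comparison, roughly the same generator can be sketched in Python,
assuming a yield-based generator facility (which the Python under
discussion does not have) and itertools.islice standing in for Icon's
"\ 10" limit.  The locals i and j survive each suspension:

```python
from itertools import islice

def fib(i, j):
    # i and j live in the paused frame; they are not reset on resumption.
    while True:
        yield i              # plays the role of Icon's "suspend i"
        i, j = j, i + j

# Take the first ten results, as "fib(0, 1) \ 10" does in Icon.
first_ten = list(islice(fib(0, 1), 10))
print(first_ten)
```

If i and j were restored to their entry values on every resumption, the
updates after the yield would be lost each time.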

Think of a continuation as a *paused* computation (which it is) rather than
an *independent* one (which it isn't <wink>), and I think it gets darned
hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs  - tim





From MHammond at skippinet.com.au  Fri May 21 01:01:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already
references it.  I don't think the structure of the frames is as important as
the general concept.  But while we were talking frame-fiddling it seemed a
good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (eg, just the
current function or method) and patch it back in such a way that the
current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class
implementation for an existing instance.  I know there have been hacks
around this before but they aren't completely reliable and IMO it would be
nice if the core Python made it easier to change already running code -
whether that code is in an existing stack frame, or just in an already
created instance, it is very difficult to do.

This is meant to deflect some conversation away from changing
Python as such towards an attempt at enhancing its _environment_.  To
paraphrase many people before me, even if we completely froze the language
now there would still be plenty of work ahead of us :-)

Mark.




From guido at CNRI.Reston.VA.US  Fri May 21 02:06:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 20:06:51 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000."
             <00c001bea314$aefc5b40$0801a8c0@bobcat> 
References: <00c001bea314$aefc5b40$0801a8c0@bobcat> 
Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it.  I dont think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

This topic sounds mostly unrelated to the stackless discussion -- in
either case you need to be able to fiddle the contents of the frame
and the bytecode pointer to reflect the changed function.

Some issues:

  - The slots containing local variables may be renumbered after
    recompilation; fortunately we know the name--number mapping so we can
    move them to their new location.  But it is still tricky.

  - Should you be able to edit functions that are present on the call
    stack below the top?  Suppose we have two functions:

	def f():
	    return 1 + g()

	def g():
	    return 0

    Suppose we set a break in g(), and then edit the source of f().  We can
    do all sorts of evil to f(): e.g. we could change it to

	    return g() + 2

    which affects the contents of the value stack when g() returns
    (originally, the value stack contained the value 1, now it is empty).
    Or we could even change f() to

	    return 3

    thereby eliminating the call to g() altogether!
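
    The value-stack point can be checked by disassembling f().  This is a
    sketch using the dis module of a much later CPython (an assumption;
    opcode names differ between versions), so it only compares the order
    in which the constant 1 and the global g are loaded:

```python
import dis

def g():
    return 0

def f():
    return 1 + g()

instructions = list(dis.get_instructions(f))

# Index of the instruction that pushes the constant 1 onto the value stack.
push_one = next(n for n, ins in enumerate(instructions)
                if ins.opname.startswith("LOAD_") and ins.argval == 1)

# Index of the instruction that loads the global g prior to calling it.
load_g = next(n for n, ins in enumerate(instructions)
              if ins.argval == "g")

# True when 1 is already on the stack by the time g is looked up.
print(push_one < load_g)
```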

What kind of limitations do other systems that support modifying a
"live" program being debugged impose?  Only allowing modification of
the function at the top of the stack might eliminate some problems,
although there are still ways to mess up.  The value stack is not 
always empty even when we only stop at statement boundaries -- e.g. it 
contains 'for' loop indices, and there's also the 'block' stack, which 
contains try-except information.  E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

???

(Ditto for removing or adding a try/except block.)
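
The loop state can be made visible the same way.  A sketch, again
assuming the dis module of a later CPython, where the loop keeps an
iterator on the value stack and FOR_ITER advances it on each pass:

```python
import dis

def f():
    for i in range(10):
        print(1)

opnames = [ins.opname for ins in dis.get_instructions(f)]

# FOR_ITER reads the loop's iterator from the value stack, so that entry
# is live whenever execution is stopped inside the loop body.
has_loop_state = "FOR_ITER" in opnames
print(has_loop_state)
```

Replacing the body with a bare print leaves nothing in the new code that
expects or removes that stack entry.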

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they arent completly reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.  Function objects now have
mutable func_code attributes (and also func_defaults), I think we can
use this.
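
A minimal sketch of what a mutable code attribute allows.  It is
written for Python 3, where func_code is spelled __code__; the function
names are made up for illustration:

```python
def greet():
    return "old"

def _edited_greet():
    # Stands in for the recompiled version of greet.
    return "new"

alias = greet                               # a reference held elsewhere
greet.__code__ = _edited_greet.__code__     # swap the code in place

# The alias sees the new behaviour because the function object is unchanged.
print(alias())
```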

The hard part is to do the analysis needed to decide which functions
to recompile!  Ideally, we would simply edit a file and tell the
programming environment "recompile this".  The programming environment
would compare the changed file with the old version that it had saved
for this purpose, and notice (for example) that we changed two methods
of class C.  It would then recompile those methods only and stuff the
new code objects in the corresponding function objects.

But what would it do when we changed a global variable?  Say a module
originally contains a statement "x = 0".  Now we change the source
code to say "x = 100".  Should we change the variable x?  Suppose that
x is modified by some of the computations in the module, and that,
after some computations, the actual value of x was 50.  Should the
"recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and
def statements so that they modify an existing class or function
rather than using assignment.  Effectively, this proposal would change
the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...
        
This is somewhat similar to the way the module or package commands in
some other dynamic languages work, I think; and I don't think this
would break too much existing code.

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want
different semantics (I don't want the second f's code to be tacked
onto the end of the first f's code).

If we understand that def f(): ... really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...

    f.func_code = ...other code...

i.e. there is no assignment of a new function object for the second
def.

Of course if there is a variable f but it is not a function, it would
have to be assigned a new function object first.
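
The proposed rule can be sketched as an ordinary function.  Everything
here is hypothetical: define_in_place is an illustrative helper rather
than an existing API, a plain dict stands in for the namespace, and the
attribute names are the Python 3 spellings (__code__, __defaults__):

```python
import types

def define_in_place(namespace, name, new_func):
    """Bind name the way the proposed def semantics would."""
    existing = namespace.get(name)
    if isinstance(existing, types.FunctionType):
        # Reuse the existing function object; only its code changes.
        existing.__code__ = new_func.__code__
        existing.__defaults__ = new_func.__defaults__
    else:
        # No function there yet (or a non-function): plain assignment.
        namespace[name] = new_func
    return namespace[name]

def v1():
    return "some code"

def v2():
    return "other code"

ns = {"h": 3}
first = define_in_place(ns, "f", v1)     # first def: assigns
second = define_in_place(ns, "f", v2)    # second def: patches in place
third = define_in_place(ns, "h", v2)     # h was not a function: assigns
print(first is second, second())
```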

But in the case of def, this *does* break existing code.  E.g.

# module A
from B import f
.
.
.
if ...some test...:
    def f(): ...some code...

This idiom conditionally redefines a function that was also imported
from some other module.  The proposed new semantics would change B.f
in place!
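
A sketch of that breakage in Python 3.  The module B is built on the fly
with types.ModuleType and a dict stands in for module A's namespace;
both are assumptions made to keep the example self-contained:

```python
import types

# Stand-in for module B.
B = types.ModuleType("B")
exec("def f():\n    return 'B version'\n", B.__dict__)

# Module A: "from B import f" binds A's name f to B's function object.
A_namespace = {"f": B.f}

def _local_f():
    return "A version"

# Today's semantics: def rebinds A's name only, so B.f is untouched.
rebound = dict(A_namespace)
rebound["f"] = _local_f
after_rebinding = B.f()

# Proposed semantics: def overwrites the code of the existing object,
# which is the very object B still calls f.
A_namespace["f"].__code__ = _local_f.__code__
after_in_place = B.f()

print(after_rebinding, "->", after_in_place)
```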

So perhaps these new semantics should only be invoked when a special
"reload-compile" is asked for...  Or perhaps the programming
environment could do this through source parsing as I proposed
before...

> This has come to try and deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_.  To
> paraphrase many people before me, even if we completely froze the language
> now there would still plenty of work ahead of us :-)

Please, no more posts about Scheme.  Each new post mentioning call/cc
makes it *less* likely that something like that will ever be part of
Python.  "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Fri May 21 03:13:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
	<199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

    Guido> What kind of limitations do other systems that support modifying
    Guido> a "live" program being debugged impose?  Only allowing
    Guido> modification of the function at the top of the stack might
    Guido> eliminate some problems, although there are still ways to mess
    Guido> up.

Frame objects maintain pointers to the active code objects, locals and
globals, so modifying a function object's code or globals shouldn't have any
effect on currently executing frames, right?  I assume frame objects do the
usual INCREF/DECREF dance, so the old code object won't get deleted before
the frame object is tossed.
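
This can be checked with a small experiment.  It is a sketch for a
later CPython, assuming sys._getframe() for access to the running frame
and __code__ as the spelling of func_code:

```python
import sys

def replacement():
    return ("new body", sys._getframe().f_code.co_name)

def original():
    # Swap the function object's code while this frame is still running.
    original.__code__ = replacement.__code__
    # The running frame keeps its own reference to the old code object,
    # so execution continues here rather than in the new body.
    return ("old body", sys._getframe().f_code.co_name)

first_call = original()     # started before the swap took effect
second_call = original()    # a fresh frame picks up the new code
print(first_call, second_call)
```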

    Guido> But what would it do when we changed a global variable?  Say a
    Guido> module originally contains a statement "x = 0".  Now we change
    Guido> the source code to say "x = 100".  Should we change the variable
    Guido> x?  Suppose that x is modified by some of the computations in the
    Guido> module, and the that, after some computations, the actual value
    Guido> of x was 50.  Should the "recompile" reset x to 100 or leave it
    Guido> alone?

I think you should note the change for users and give them some way to
easily pick between old initial value, new initial value or current value.

    Guido> Please, no more posts about Scheme.  Each new post mentioning
    Guido> call/cc makes it *less* likely that something like that will ever
    Guido> be part of Python.  "What if Guido's brain exploded?" :-)

I agree.  I see call/cc or set! and my eyes just glaze over...

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From MHammond at skippinet.com.au  Fri May 21 03:42:14 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it <wink>

> Some issues:
>
>   - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a number
of limitations.  For example, I would find a first cut completely
acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the
code reflected while executing.  This function also must be restarted after
such an edit.  If the function uses global variables or makes calls that
restarting will screw-up, then either a) make the code changes _before_
doing this stuff, or b) live with it for now, and help us remove the
limitation :-)

That may make the locals being renumbered easier to deal with, and also
remove some of the problems you discussed about editing functions below the
top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?  Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted
to find documentation on it.

It accepts most changes while running.  The current line is fine.  If you
create or change the definition of globals (and possibly even the type of
locals?), the "incremental compilation" fails, and you are given the option
of continuing with the old code, or stopping the process and doing a full
build.

When the debug session terminates, some link process (and maybe even
compilation?) is done to bring the .exe on disk up to date with the
changes.

If you do weird stuff like delete the line being executed, it usually gives
you some warning message before either restarting the function or trying to
pick a line somewhere near the line you deleted.  Either way, it can screw
up, moving the "current" line somewhere else - it doesn't crash the
debugger, but may not do exactly what you expected.  It is still a _huge_
win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions.  Although
changing the C code is great, in 99% of the cases I also need to change
some .py code, and as existing instances are affected I need to restart the
app anyway - so I may as well do a normal build at that time.  ie, C now
lets me debug incrementally, but a far more dynamic language prevents this
feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up.  The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better?  Can we reliably reset the
stack to the start of the current function?

> I've been thinking a bit about this.  Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile!  Ideally, we would simply edit a file and tell the
> programming environment "recompile this".  The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C.  It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the
impact be of doing it for _every_ function (changed or not)?  Then the
analysis can drop to the module level which is much easier.  I don't think a
slight performance hit is a problem at all when doing this stuff.
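
A rough sketch of the module-level variant in Python 3.  The helper
patch_functions is hypothetical, it handles only top-level functions
(not methods or closures), and it deliberately leaves module globals
alone:

```python
import types

def patch_functions(module, new_source):
    """Recompile new_source and push new code into existing functions."""
    fresh = {"__name__": module.__name__}
    exec(compile(new_source, "<edited>", "exec"), fresh)
    patched = []
    for name, new_obj in fresh.items():
        old_obj = getattr(module, name, None)
        if (isinstance(old_obj, types.FunctionType)
                and isinstance(new_obj, types.FunctionType)):
            # Same function object, new code: existing references follow.
            old_obj.__code__ = new_obj.__code__
            old_obj.__defaults__ = new_obj.__defaults__
            patched.append(name)
    return sorted(patched)

demo = types.ModuleType("demo")
exec("x = 0\ndef f():\n    return 'v1'\n", demo.__dict__)
held = demo.f          # a reference taken before the edit
demo.x = 50            # the program has changed x since import

patched = patch_functions(demo, "x = 100\ndef f():\n    return 'v2'\n")
print(patched, held(), demo.x)
```

Every function is patched whether or not its source changed, so no
per-function diffing is needed.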

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment.  Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...more code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?)
# .\package\__init__.py
class BigMutha:
  pass

# .\package\something.py
class package.BigMutha:
  def some_category_of_methods():
    ...

# .\package\other.py
class package.BigMutha:
  def other_category_of_methods():
    ...
[Of course, this won't fly as it stands; just a conceptual possibility]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for...  Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...


From guido at CNRI.Reston.VA.US  Fri May 21 05:02:49 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000."
             <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations.  For example, I would find a first cut completely
> acceptable and a great improvement on today if:
> 
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing.  This function also must be restarted after
> such an edit.  If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would
seem relatively easy to implement.  Not *real* easy though: it turns
out that eval_code2() is called with a code object as argument, and
it's not entirely trivial to figure out the corresponding function
object from which to grab the new code object.  But it could be done
-- give it a try.  (Don't wait for me, I'm ducking for cover until at
least mid June.)
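
One possible way to go from a code object back to its function objects,
sketched at the Python level rather than inside eval_code2().  It
assumes CPython's gc.get_referrers(), which is a later addition, and
the helper name functions_using is made up:

```python
import gc
import types

def functions_using(code):
    """Return the function objects whose code is this code object."""
    return [ref for ref in gc.get_referrers(code)
            if isinstance(ref, types.FunctionType) and ref.__code__ is code]

def sample():
    return 42

owners = functions_using(sample.__code__)
print(len(owners))
```

The search walks every tracked object, so it is slow, and it can return
more than one function when several share a code object.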

> Ironically, I turn this feature _off_ for Python extensions.  Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.  ie, C now
> lets me debug incrementally, but a far more dynamic language prevents this
> feature being useful ;-)

I hear you.

> If we forced a restart would this be better?  Can we reliably reset the
> stack to the start of the current function?

Yes, no problem.

> If this would work for the few changed functions/methods, what would the
> impact be of doing it for _every_ function (changed or not)?  Then the
> analysis can drop to the module level which is much easier.  I dont think a
> slight performace hit is a problem at all when doing this stuff.

Yes, this would be fine too.

> >"What if Guido's brain exploded?" :-)
> 
> At least on that particular topic I didnt even consider I was the only one
> in fear of that!  But it is good to know that you specifically are too :-)

Have no fear.  I've learned to say no. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Fri May 21 07:36:44 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 21 May 1999 01:36:44 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <000401bea34b$e93fcda0$d89e2299@tim>

[GvR]
> ...
> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?

As an ex-compiler guy, I should have something wise to say about that.
Alas, I've never used a system that allowed more than poking new values into
vrbls, and the thought of any more than that makes me vaguely ill!  Oh,
that's right -- I'm vaguely ill anyway today.  Still-- oooooh -- the
problems.

This later got reduced to restarting the topmost function from scratch.
That has some attraction, especially on the bang-for-buck-o-meter.

> ...
> Please, no more posts about Scheme.  Each new post mentioning call/cc
> makes it *less* likely that something like that will ever be part of
> Python.  "What if Guido's brain exploded?" :-)

What a pussy <wink>.  Really, overall continuations are much less trouble to
understand than threads -- there's only one function in the entire
interface!

OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs  - tim





From tismer at appliedbiometrics.com  Fri May 21 09:12:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:12:05 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <37450745.21D63A5@appliedbiometrics.com>


Mark Hammond wrote:
> 
> > I'm writing a prototype of a stackless Python, which means that
> > you will be able to access the current state of the interpreter
> > completely.
> > The inner interpreter loop will be isolated from the frame
> > dispatcher. It will break whenever the ticker goes zero.
> > If you set the ticker to one, you will be able to single
> > step on every opcode, have the value stack, the frame chain,
> > everything.
> 
> I think the main point is how to change code when a Python frame already
> references it.  I dont think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

Sure. Since the frame holds a pointer to the code, and the current
IP and SP, your code can easily change it (with care, or GPF:) .
It could even create a fresh code object and let it run only
for the running instance. By instance, I mean a frame which is
running a code object.

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they arent completly reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I think this has been difficult only because information was hidden
in the inner interpreter loop. That is going to change now.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Fri May 21 09:21:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:21:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
		<37443DC5.1330EAC6@appliedbiometrics.com> <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>
Message-ID: <37450972.D19E160@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:
> 
>   CT> What I want to achieve is that I can run this again, from my
>   CT> snapshot. But with shared locals, my parameter x of the snapshot
>   CT> would have changed to x+1, which I don't find useful.  I want to
>   CT> fix a state of the current frame and still think it should "own"
>   CT> its locals. Globals are borrowed, anyway.  Class instances will
>   CT> anyway do what you want, since the local "self" is a mutable
>   CT> object.
> 
>   CT> How do you want to keep computations independent when locals are
>   CT> shared? For me it's just easier to implement and also to think
>   CT> with the shallow copy.  Otherwise, where is my private place?
>   CT> Open for becoming convinced, of course :-)
> 
> I think you're making things a lot more complicated by trying to
> instantiate new variable bindings for locals every time you create a
> continuation.  Can you give an example of why that would be helpful?

I'm not sure whether you all understand me, and vice versa.
There is no copying at all, but for the frame.
I copy the frame, which means I also incref all the
objects which it holds. Done. This is the bare minimum
which I must do.

> (Ok.  I'm not sure I can offer a good example of why it would be
> helpful to share them, but it makes intuitive sense to me.)
> 
> The call_cc mechanism is going to let you capture the current
> continuation, save it somewhere, and call on it again as often as you
> like.  Would you get a fresh locals each time you used it?  or just
> the first time?  If only the first time, it doesn't seem that you've
> gained a whole lot.

call_cc does a copy of the state which is the frame. This is
stored away until it is revived. Nothing else happens.
As Guido pointed out, virtually the whole frame chain is
duplicated, but only on demand.

> Also, all the locals that are references to mutable objects are
> already effectively shared.  So it's only a few oddballs like ints
> that are an issue.

Simply look at a frame, what it is. What do you need to do to
run it again with a given state. You have to preserve the stack
variables. And you have to preserve the current locals, since
some of them might even have a copy on the stack, and we want
to stay consistent.
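
A toy model of that copy in Python 3.  It assumes a dict can stand in
for the frame's locals (real frames keep them in an array) and uses
copy.copy for the shallow copy:

```python
import copy

frame_locals = {"x": 1, "items": ["a"]}
snapshot = copy.copy(frame_locals)    # new container, same objects

# The running frame carries on after the snapshot was taken.
frame_locals["x"] = frame_locals["x"] + 1    # rebinding: private to this copy
frame_locals["items"].append("b")            # mutation: visible to both

print(snapshot["x"], snapshot["items"])
```

The snapshot keeps its own binding for x but sees the appended item,
since both copies refer to the same list object.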

I believe it would become obvious if you tried to implement it.
Maybe I should close my ears and get something ready to show?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Fri May 21 11:00:26 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 11:00:26 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea30c$af7a1060$9d9e2299@tim>
Message-ID: <374520AA.2ADEA687@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian]
> [... clarified stuff ... thanks! ... much clearer ...]

But still not clear enough, I fear.

> > ...
> > If a frame is the current state, I make it two frames to have two
> > current states.  One will be saved, the other will be run. This is
> > what I call "splitting".  Actually, splitting must occour whenever
> > a frame can be reached twice, in order to keep elements alive.
> 
> That part doesn't compute:  if a frame can be reached by more than one path,
> its refcount must be at least equal to the number of its immediate
> predecessors, and its refcount won't fall to 0 before it becomes
> unreachable.  So while you may need to split stuff for *some* reasons, I
> can't see how keeping elements alive could be one of those reasons (unless
> you're zapping frame contents *before* the frame itself is garbage?).

I was saying that under the side condition that I don't want to
change frames as they are now. Maybe that's misconceived, but
this is what I did:

If a frame as we have it today shall be resumed twice, then
it has to be copied, since:
The stack is in it and has some state which will change
after resuming.

That was the whole problem with my first prototype, which
was done hoping that I don't need to change the interpreter
at all. Wrong, bad, however.

What I actually did was more than seems to be needed:
I made a copy of the whole current frame chain. Later on,
Guido said this can be done on demand. He's right.

[Scheme sample - understood]

> GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't
> care where that points *to* <wink -- but, really, it could point into some
> other frame and ceval2 wouldn't know the difference).  Maybe a frame entered
> due to continuation needs extra setup work?  Scheme saves itself by putting
> name-resolution and continuation info into different structures; to mimic
> the semantics, Python would need to get the same end effect.

Point taken. The pointer doesn't save time of access, it just
saves allocating another structure.
So we can use something else without speed loss.

[have to cut a little]

> So recall your uses of Icon generators instead:  like Python, Icon does have
> loops, and two-level scoping, and I routinely build loopy Icon generators
> that keep state in locals.  Here's a dirt-simple example I emailed to Sam
> earlier this week:
> 
> procedure main()
>     every result := fib(0, 1) \ 10 do
>         write(result)
> end
> 
> procedure fib(i, j)
>     local temp
>     repeat {
>         suspend i
>         temp := i + j
>         i := j
>         j := temp
>     }
> end

[prints fib series]

> If Icon restored the locals (i, j, temp) upon each fib resumption, it would
> generate a zero followed by an infinite sequence of ones(!).

Now I'm completely missing the point. Why should I want
to restore anything? At a suspend, which when done by continuations
will be done by temporarily having two identical states, one
is saved and another is continued. The continued one in your example
just returns the current value and immediately forgets about
the locals. The other one is continued later, and of course with
the same locals which were active when going asleep.

> Think of a continuation as a *paused* computation (which it is) rather than
> an *independent* one (which it isn't <wink>), and I think it gets darned
> hard to argue.

No, you get me wrong. I understand what you mean. It is just
the decision whether a frame, which will be reactivated later
as a continuation, should use a reference to locals like
the reference which it has for the globals. This causes me
a major frame redesign.

Current design:
A frame is: back chain, state, code, unpacked locals, globals, stack.

Code and globals are shared. 
State, unpacked locals and stack are private.

Possible new design:
A frame is: back chain, state, code, variables, globals, stack.

variables is: unpacked locals.

This makes the variables into an extra structure which is shared.
Probably a list would be the thing, or abusing a tuple as
a mutable object.
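
The two designs can be contrasted with a toy model.  This is a sketch
only: the class names are invented, lists stand in for the unpacked
locals, and back chain, state and globals are left out:

```python
from dataclasses import dataclass, field

@dataclass
class PrivateFrame:
    """Current design: a split copies the locals along with the stack."""
    code: str
    locals_: list
    stack: list = field(default_factory=list)

    def split(self):
        return PrivateFrame(self.code, list(self.locals_), list(self.stack))

@dataclass
class SharedFrame:
    """Possible new design: 'variables' is a separate, shared structure."""
    code: str
    variables: list
    stack: list = field(default_factory=list)

    def split(self):
        # Only the stack is copied; both frames point at one variables list.
        return SharedFrame(self.code, self.variables, list(self.stack))

private = PrivateFrame("fib", [0, 1])
private_copy = private.split()
private.locals_[0] = 99          # not seen by the copy

shared = SharedFrame("fib", [0, 1])
shared_copy = shared.split()
shared.variables[0] = 99         # seen by the copy

print(private_copy.locals_, shared_copy.variables)
```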

Hmm. I think I should get something ready, and we should
keep this thread short, or we will lose the rest of
Guido's goodwill (if not already).

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From da at ski.org  Fri May 21 18:27:42 1999
From: da at ski.org (David Ascher)
Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim>
Message-ID: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>

On Fri, 21 May 1999, Tim Peters wrote:

> OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
> way to get pseudo-parallel semantics regardless of OS.

I read about coroutines years ago on c.l.py, but I admit I forgot it all.
Can you explain them briefly in pseudo-python? 

--david




From tim_one at email.msn.com  Sat May 22 06:22:50 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:50 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim>

[Tim]
> OK.  So how do you feel about coroutines?  Would sure be nice
> to have *some* way to get pseudo-parallel semantics regardless of OS.

[David Ascher]
> I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> Can you explain them briefly in pseudo-python?

How about real Python?  http://www.python.org/tim_one/000169.html contains a
complete coroutine implementation using threads under the covers (& exactly
5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
different object interface (making coroutines objects in their own right
instead of funneling everything through a "coroutine controller" object),
but the ideas are the same in every coroutine language.  The post contains
several executable examples, from simple to "literature standard".

I had forgotten all about this:  it contains solutions to the same "compare
tree fringes" problem Sam mentioned, *and* the generator-based building
block I posted three other solutions for in this thread.  That last looks
like:

# fringe visits a nested list in inorder, and detaches for each non-list
# element; raises EarlyExit after the list is exhausted
def fringe( co, list ):
    for x in list:
        if type(x) is type([]):
            fringe(co, x)
        else:
            co.detach(x)

def printinorder( list ):
    co = Coroutine()
    f = co.create(fringe, co, list)
    try:
        while 1:
            print co.tran(f),
    except EarlyExit:
        pass
    print

printinorder([1,2,3])  # 1 2 3
printinorder([[[[1,[2]]],3]]) # ditto
x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
printinorder(x) # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full
power (other examples in the post do).  co.detach is a special way to deal
with this asymmetry.  In the general case you use co.tran all the time,
where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally
passing it the value w, and when somebody does a co.tran back to *me*,
resume me right here, binding v to the value *they* pass to co.tran".

Knuth complains several times that it's very hard to come up with a
coroutine example that's both simple and clear <0.5 wink>.  In a nutshell,
coroutines don't have a "caller/callee" relationship, they have "we're all
equal partners" relationship, where any coroutine is free to resume any
other one where it left off.  It's no coincidence that making coroutines
easy to use was pioneered by simulation languages!  Just try simulating a
marriage where one partner is the master and the other a slave <wink>.
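
For readers on a modern interpreter, the fringe example above maps closely
onto a Python 3 generator.  This is only a sketch under that assumption:
generators and "yield from" did not exist in the Python of this thread, and
the Coroutine class from the linked post is not used here.  "yield" stands in
for co.detach, and the loop hidden inside print(*...) stands in for the
repeated co.tran(f).

```python
# Python 3 sketch: a generator standing in for the "half a coroutine" above.
def fringe(nested):
    """Yield the non-list leaves of a nested list, left to right."""
    for x in nested:
        if isinstance(x, list):
            # Delegate to the sub-generator; each leaf goes straight to our caller.
            yield from fringe(x)
        else:
            yield x  # plays the role of co.detach(x)


def printinorder(nested):
    # Unpacking resumes the generator once per leaf, like repeated co.tran(f).
    print(*fringe(nested))


printinorder([1, 2, 3])
printinorder([[[[1, [2]]], 3]])
printinorder([0, 1, [2, [3]], [4, 5], [[[6]]]])
```

Like co.detach, yield can only hand control back to whoever resumed the
generator, which is the asymmetry described above; a full coroutine could
instead transfer to any other coroutine.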

i-may-be-a-bachelor-but-i-have-eyes-ly y'rs  - tim





From tim_one at email.msn.com  Sat May 22 06:22:55 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:55 -0400
Subject: [Python-Dev] Re: Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000501bea40a$c3d1fe20$659e2299@tim>

Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement
without major changes to the PVM.  Icon calls 'em generators, Sather calls
'em iterators, and they're exactly what you need to implement "for thing in
object:" when object represents a collection that's tricky to materialize.
Python needs something like that.  OTOH, generators are pretty much limited
to that.

+ Coroutines are more general but much harder to implement, because each
coroutine needs its own stack (a generator only has one stack *frame*-- its
own --to worry about), and C-calling-Python can get into the act.  As Sam
said, they're probably no easier to implement than call/cc (but trivial to
implement given call/cc).

+ What may be most *natural* is to forget all that and think about a
variation of Python threads implemented directly via the interpreter,
without using OS threads.  The PVM already knows how to handle thread-state
swapping.  Given Christian's stackless interpreter, and barring C->Python
cases, I suspect Python can fake threads all by itself, in the sense of
interleaving their executions within a single "real" (OS) thread.  Given the
global interpreter lock, Python effectively does only-one-at-a-time anyway.

Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake)
threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't,
namely "*anyone* else who can make progress now, feel encouraged to do so at
my expense".

E) For whatever reasons, in my experience people find threads much easier to
learn than call/cc -- perhaps because threads are *obviously* useful upon
first sight, while it takes a real Zen Experience before call/cc begins to
make sense.

F) Simulated threads could presumably produce much more informative error
msgs (about deadlocks and such) than OS threads, so even people using real
threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads
don't have to be.  Perhaps

x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

import config
if config.USE_REAL_THREADS:
    import threading
else:
    from simulated_threading import threading

from config.shared import msg_queue

class Y:
    def __init__(self, ...):
        self.ready = threading.Event()
        ...

    def SOME_ASYNC_CALL(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
        self.ready.wait()
        self.ready.clear()
        return result[0]

where some other simulated thread polls the msg_queue and does ready.set()
when it's done processing the msg enqueued by SOME_ASYNC_CALL.  For this to
scale nicely, it's probably necessary for the PVM to cooperate with the
simulated_threading implementation (e.g., a simulated thread that blocks
(like on self.ready.wait()) should be taken out of the collection of
simulated threads the PVM may attempt to resume -- else in Sam's case the
PVM would repeatedly attempt to wake up thousands of blocked threads, and
things would slow to a crawl).
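
A minimal runnable sketch of the scheme above, using real threads from the
Python 3 standard library (threading and queue) rather than a
simulated_threading module, which remains hypothetical.  The server function,
the None shutdown sentinel, and the addition used as the "work" are
illustrative assumptions; server_of_the_day is left out for brevity.

```python
import queue
import threading

msg_queue = queue.Queue()


def server():
    """Poll msg_queue, process each request, then wake the waiting caller."""
    while True:
        msg = msg_queue.get()
        if msg is None:            # assumed sentinel: shut the server down
            break
        r, s, t, ready, result = msg
        result[0] = r + s + t      # stand-in for the real work
        ready.set()                # unblock the caller waiting on this event


class Y:
    def __init__(self):
        self.ready = threading.Event()

    def some_async_call(self, r, s, t):
        result = [None]            # mutable container the server fills in
        msg_queue.put((r, s, t, self.ready, result))
        self.ready.wait()          # block until the server calls ready.set()
        self.ready.clear()
        return result[0]


worker = threading.Thread(target=server)
worker.start()
x = Y().some_async_call(1, 2, 3)
msg_queue.put(None)                # ask the server thread to exit
worker.join()
print(x)
```

The caller sees an ordinary blocking call; only the server side knows that a
queue and an event are involved.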

Of course, simulated_threading could be built on top of call/cc or
coroutines too.  The point to making threads the core concept is keeping
Guido's brain from exploding.  Plus, as above, you can switch to "real
threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-
    renders-it-moot-for-real-threads<wink>-ly y'rs  - tim





From tismer at appliedbiometrics.com  Sat May 22 18:20:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 18:20:30 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim>
Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Tim]
> > OK.  So how do you feel about coroutines?  Would sure be nice
> > to have *some* way to get pseudo-parallel semantics regardless of OS.
> 
> [David Ascher]
> > I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> > Can you explain them briefly in pseudo-python?
> 
> How about real Python?  http://www.python.org/tim_one/000169.html contains a
> complete coroutine implementation using threads under the covers (& exactly
> 5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
> different object interface (making coroutines objects in their own right
> instead of funneling everything through a "coroutine controller" object),
> but the ideas are the same in every coroutine language.  The post contains
> several executable examples, from simple to "literature standard".

What an interesting thread! Unfortunately, all the examples are messed
up since some HTML formatter didn't take care of the python code,
rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in 
http://www.python.org/tim_one/ but it seems that only your messages
are archived?
Anyway, the citations in http://www.python.org/tim_one/000146.html
show me that you have been through all of this five years
ago, with a five years younger Guido which sounds a bit
different than today.
I would have understood him better if I had known that this
is a re-iteration of a somehow dropped or entombed idea.

(If someone has the original archives from that epoch,
I'd be happy to get a copy. Actually, I'm missing all up to
end of 1996.)

A short snapshot:
Stackless Python is meanwhile nearly alive, with recursion
avoided in ceval. Of course, some modules are left which
still need work, but enough for a prototype. Frames contain
now all necessary state and are now prepared for execution
and thrown back to the evaluator (elevator?). 

The key idea was to change the deeply nested functions in a 
way, that their last eval_code call happens to be tail recursive.
In ceval.c (and in other not yet changed places), functions
do a lot of preparation, build some parameter, call eval_code
and release the parameter. This was the crux, which I solved
by a new field in the frame object, where such references
can be stored. The routine can now return with the ready packaged
frame, instead of calling it.

As a minimum facility for future co-anythings,
I provided a hook function for resuming frames, which causes no
overhead in the usual case but allows overriding what a frame
does when someone returns control to it. Implementing
this is left to some extension module; whether that is
coroutines or your nice nano-threads, it's possible.

threadedly yours - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Sat May 22 21:04:43 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 21:04:43 +0200
Subject: [Python-Dev] How stackless can Python be?
Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com>

Hi,

to make the core interpreter stackless is one thing.
Turning functions which call the interpreter
from some deep nesting level into versions,
which return a frame object instead which is
to be called, is possible in many cases.

Internals like apply are rather uncomplicated to convert.
CallObjectWithKeywords is done.

What I have *no* good solution for is map.
Map does an iteration over evaluations and keeps
state while it is running. The same applies to reduce,
but it seems to be not used so much. Map is.

I don't see at the moment if map could be a killer
for Tim's nice mini-thread idea. How must map work,
if, for instance, a map is done with a function
which then begins to switch between threads,
before map is done? Can one imagine a problem?

Maybe it is no issue, but I'd really like to
know whether we need a stateless map.
(without replacing it by a for loop :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tim_one at email.msn.com  Sat May 22 21:35:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 15:35:58 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <000501bea48a$51563980$119e2299@tim>

>> http://www.python.org/tim_one/000169.html

[Christian]
> What an interesting thread! Unfortunately, all the examples are messed
> up since some HTML formatter didn't take care of the python code,
> rendering it unreadable. Is there a different version available?
>
> Also, I'd like to read the rest of the threads in
> http://www.python.org/tim_one/ but it seems that only your messages
> are archived?

Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's
all me, all the time, no mercy, no escape <wink>.

It predates the DejaNews archive, but the context can still be found in

http://www.python.org/search/hypermail/python-1994q2/index.html

There's a lot in that quarter about continuations & coroutines, most from
Steven Majewski, who took a serious shot at implementing all this.

Don't have the code in a more usable form; when my then-employer died, most
of my files went with it.

You can save the file as text, though!  The structure of the code is intact,
it's simply that your browser squashes out the spaces when displaying it.
Nuke the <P> at the start of each code line and what remains is very close
to what was originally posted.

> Anyway, the citations in http://www.python.org/tim_one/000146.html
> show me that you have been through all of this five years
> ago, with a five years younger Guido which sounds a bit
> different than today.
> I would have understood him better if I had known that this
> is a re-iteration of a somehow dropped or entombed idea.

You *used* to know that <wink>!  Thought you even got StevenM's old code
from him a year or so ago.  He went most of the way, up until hitting the
C<->Python stack intertwingling barrier, and then dropped it.  Plus Guido
wrote generator.py to shut me up, which works, but is about 3x clumsier to
use and runs about 50x slower than a generator should <wink>.

> ...
> Stackless Python is meanwhile nearly alive, with recursion
> avoided in ceval. Of course, some modules are left which
> still need work, but enough for a prototype. Frames contain
> now all necessary state and are now prepared for execution
> and thrown back to the evaluator (elevator?).
> ...

Excellent!  Running off to a movie & dinner now, but will give a more
careful reading tonight.

co-dependent-ly y'rs  - tim





From tismer at appliedbiometrics.com  Sun May 23 15:07:44 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 23 May 1999 15:07:44 +0200
Subject: [Python-Dev] How stackless can Python be?
References: <3746FFCB.CD506BE4@appliedbiometrics.com>
Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com>

After a good sleep, I can answer this one by myself.

I wrote:
> to make the core interpreter stackless is one thing.
...
> Internals like apply are rather uncomplicated to convert.
> CallObjectWithKeywords is done.
> 
> What I have *no* good solution for is map.
> Map does an iteration over evaluations and keeps
> state while it is running. The same applies to reduce,
> but it seems to be not used so much. Map is.
...

About stackless map,
and this applies to every extension module
which *wants* to be stackless. We don't have to force
everybody to be stackless, but there are a couple of
modules which would benefit from it.

The problem with map is, that it needs to keep state,
while repeatedly calling objects which might call
the interpreter. Even if we kept local variables
in the caller's frame, this would still be not
stateless. The info that a map is running is sitting
on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't
want to replace map by a slower Python function. But
that gave me the key idea to solve this:

C functions which cannot be tail-recursively unwound to
return an executable frame object must instead return
themselves as a frame object. That's it! Frames again need
to be extended a little. They have to specify their
interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation:
A new frame is created, parameters are inserted,
the frame is returned to the frame dispatcher,
which runs the inner eval_code loop until it bails out.
On return, special cases of control flow are handled,
as there are exception, returning, and now also calling.
This is an eval_code frame, since eval_code is its
execution handler.

Anatomy of a map frame invocation:
Map has several phases. The first phases do
argument checking and basic setup.
The last phase is iteration over function calls
and building the result. This phase must be split
off as a second function, eval_map.
A new frame is created, with all temporary variables
placed there. eval_map is inserted as the execution
handler.

Now, I think the analogy is obvious.
By building proper frames, it should be possible
to turn any extension function into a stackless function.

The overall protocol is:
A C function which does a simple computation which cannot
cause an interpreter invocation, may simply evaluate
and return a value.
A C function which might cause an interpreter invocation,
should return a freshly created frame as return value.
- This can be done either in a tail-recursive fashion,
  if the last action of the C function would basically 
  be calling the frame.
- If no tail-recursion is possible, the function must
  return a new frame for itself, with an executor
  for its purpose.
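
The protocol above can be modelled in a few lines of Python.  This is a toy
sketch, not the actual Stackless patch: Frame, Return, dispatch, eval_map,
eval_square and stackless_map are hypothetical names chosen for illustration,
and exceptions and the tail-recursive shortcut are left out.  It shows a
"map" whose iteration state lives in its own frame, driven by a flat
dispatcher loop instead of nested calls.

```python
class Frame:
    """Hypothetical frame: state plus the handler ("interpreter") that runs it."""

    def __init__(self, execute, **state):
        self.execute = execute   # execution handler, e.g. eval_map
        self.back = None         # frame to resume when this one returns
        self.state = state


class Return:
    """Marker a handler returns to hand a value back to its caller's frame."""

    def __init__(self, value):
        self.value = value


def dispatch(frame):
    """Frame dispatcher: run frames in one flat loop, never recursing."""
    value = None
    while frame is not None:
        outcome = frame.execute(frame, value)
        if isinstance(outcome, Frame):
            # "calling": run the new frame, remember where to come back to
            outcome.back = frame
            frame, value = outcome, None
        else:
            # "returning": resume the caller's frame with the result
            frame, value = frame.back, outcome.value
    return value


def eval_map(frame, value):
    """Iteration phase of map; all loop state is kept in the frame."""
    st = frame.state
    if st["started"]:
        st["results"].append(value)      # result of the call we just made
    st["started"] = True
    if st["index"] < len(st["items"]):
        item = st["items"][st["index"]]
        st["index"] += 1
        return st["make_call"](item)     # a fresh frame for the dispatcher
    return Return(st["results"])


def eval_square(frame, value):
    """Stand-in for an ordinary function body that needs no further calls."""
    return Return(frame.state["x"] ** 2)


def stackless_map(make_call, items):
    """Setup phase of map: build a frame for itself instead of looping."""
    return Frame(eval_map, make_call=make_call, items=list(items),
                 index=0, results=[], started=False)


result = dispatch(stackless_map(lambda x: Frame(eval_square, x=x), [1, 2, 3]))
print(result)
```

Because eval_map keeps its index and partial results in the frame rather than
in C locals, the dispatcher could in principle switch to some other frame
between two items and come back later.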

A good stackless candidate is Fredrik's xmlop, which
calls back into the interpreter. If that worked
without the hardware stack, then we could build
ultra-fast XML processors with co-routines!

As a side note: 
The frame structure which I sketched
so far is still made for eval_code in the first place,
but it has all necessary flexibility for pluggable
interpreters. An extension module can now create
its own frame, with its own execution handler, and
throw it back to the frame dispatcher.
In other words: People can create extensions and
test their own VMs if they want.
This was not my primary intent, but comes for free
as a consequence of having a stackless map.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From fredrik at pythonware.com  Sun May 23 15:53:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 23 May 1999 15:53:19 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com>

Christian Tismer <tismer at appliedbiometrics.com> wrote:
> (If someone has the original archives from that epoch,
> I'd be happy to get a copy. Actually, I'm missing all up to
> end of 1996.)

http://www.egroups.com/group/python-list/info.html
has it all (almost), starting in 1991.

</F>




From tim_one at email.msn.com  Sat May  1 10:32:30 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 1 May 1999 04:32:30 -0400
Subject: [Python-Dev] Speed (was RE: [Python-Dev] More flexible namespaces.)
In-Reply-To: <14121.55659.754846.708467@amarok.cnri.reston.va.us>
Message-ID: <000801be93ad$27772ea0$7a9e2299@tim>

[Andrew M. Kuchling]
> ...
> A performance improvement project would definitely be a good idea
> for 1.6, and a good sub-topic for python-dev.

To the extent that optimization requires uglification, optimization got
pushed beyond Guido's comfort zone back around 1.4 -- little has made it in
since then.

Not griping; I'm just trying to avoid enduring the same discussions for the
third to twelfth times <wink>.

Anywho, on the theory that a sweeping speedup patch has no chance of making
it in regardless, how about focusing on one subsystem?  In my experience,
the speed issue Python gets beat up the most for is the relative slowness of
function calls.  It would be very good if eval_code2 somehow or other could
manage to invoke a Python function without all the hair of a recursive C
call, and I believe Guido intends to move in that direction for Python2
anyway.  This would be a good time to start exploring that seriously.

inspirationally y'rs  - tim





From da at ski.org  Sun May  2 00:15:32 1999
From: da at ski.org (David Ascher)
Date: Sat, 1 May 1999 15:15:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <37296856.5875AAAF@lemburg.com>
Message-ID: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>

> Since you put out the objectives, I'd like to propose a little
> different approach...
> 
> 1. Have eval/exec accept any mapping object as input
> 
> 2. Make those two copy the content of the mapping object into real
>    dictionaries
> 
> 3. Provide a hook into the dictionary implementation that can be
>    used to redirect KeyErrors and use that redirection to forward
>    the request to the original mapping objects

Interesting counterproposal.  I'm not sure whether any of the proposals on
the table really do what's needed for e.g. case-insensitive namespace
handling.  I can see how all of the proposals so far allow
case-insensitive reference name handling in the global namespace, but
don't we also need to hook into the local-namespace creation process to
allow case-insensitivity to work throughout? 

--david






From da at ski.org  Sun May  2 17:15:57 1999
From: da at ski.org (David Ascher)
Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9905020810270.152-100000@david.ski.org>

On Sun, 2 May 1999, Mark Hammond wrote:

> > I'm not sure whether any of the
> > proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation
> > process to
> > allow case-insensitivity to work throughout?
> 
> Why not?  I pictured case insensitive namespaces working so that they
> retain the case of the first assignment, but all lookups would be
> case-insensitive.
> 
> Ohh - right!  Python itself would need changing to support this.  I suppose
> that faced with code such as:
> 
> def func():
>   if spam:
>     Spam=1
> 
> Python would generate code that refers to "spam" as a global, and "Spam" as
> a local.
> 
> Is this why you feel it won't work?

I hadn't thought of that, to be truthful, but I think it's more generic.
[FWIW, I never much cared for the tag-variables-at-compile-time
optimization in CPython, and wouldn't miss it if it were lost.]

The point is that if I eval or exec code which calls a function specifying
some strange mapping as the namespaces (global and current-local) I
presumably want to also specify how local namespaces work for the
function calls within that code snippet.  That means that somehow Python
has to know what kind of namespace to use for local environments, and not
use the standard dictionary.  Maybe we can simply have it use a
'.clear()'ed .__copy__ of the specified environment.

  exec 'foo()' in globals(), mylocals

would then call foo and within foo, the local env't would be
mylocals.__copy__.clear().  

Anyway, something for those-with-the-patches to keep in mind.  

--david





From tismer at appliedbiometrics.com  Sun May  2 15:00:37 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 02 May 1999 15:00:37 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com>


David Ascher wrote:
[Marc:> 
> > Since you put out the objectives, I'd like to propose a little
> > different approach...
> >
> > 1. Have eval/exec accept any mapping object as input
> >
> > 2. Make those two copy the content of the mapping object into real
> >    dictionaries
> >
> > 3. Provide a hook into the dictionary implementation that can be
> >    used to redirect KeyErrors and use that redirection to forward
> >    the request to the original mapping objects

I don't think that this proposal would give so much new
value. Since a mapping can also be implemented in arbitrary
ways, say by functions, a mapping is not necessarily finite
and might not be changeable into a dict.

[David:>
> Interesting counterproposal.  I'm not sure whether any of the proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation process to
> allow case-insensitivity to work throughout?

Case-independent namespaces seem to be a minor point,
nice to have for interfacing to other products, but then,
in a function, I see no benefit in changing the semantics
of function locals? The lookup of foreign symbols would 
always be through a mapping object. If you take COM for 
instance, your access to a COM wrapper for an arbitrary
object would be through properties of this object. After
assignment to a local function variable, why should we
support case-insensitivity at all?

I would think mapping objects would be a great 
simplification of lazy imports in COM, where
we would like to avoid to import really huge
namespaces in one big slurp. Also the wrapper code
could be made quite a lot easier and faster without
so much getattr/setattr trapping.

Does btw. anybody really want to see case-insensitivity
in Python programs? I'm quite happy with it as it is,
and I would even force the user to always use the same
case style after he has touched an external property
once. Example for Excel: You may write "xl.workbooks"
in lowercase, but then you have to stay with it.
This would keep Python source clean for, say, PyLint.

my 0.02 Euro - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From MHammond at skippinet.com.au  Sun May  2 01:28:11 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 2 May 1999 09:28:11 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat>

> I'm not sure whether any of the
> proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation
> process to
> allow case-insensitivity to work throughout?

Why not?  I pictured case insensitive namespaces working so that they
retain the case of the first assignment, but all lookups would be
case-insensitive.
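
A minimal sketch of such a namespace, assuming Python 3, where exec accepts
any mapping as its locals argument.  CaseInsensitiveNamespace is a
hypothetical class written for illustration; it keeps the spelling of the
first assignment and folds case on every lookup.

```python
from collections.abc import MutableMapping


class CaseInsensitiveNamespace(MutableMapping):
    """Keys keep the spelling of their first assignment; lookups ignore case."""

    def __init__(self):
        self._data = {}   # lowercased name -> (original spelling, value)

    def __getitem__(self, name):
        # Raises KeyError for unknown names, so exec falls back to globals.
        return self._data[name.lower()][1]

    def __setitem__(self, name, value):
        folded = name.lower()
        # Keep the first spelling if the name already exists.
        spelling = self._data[folded][0] if folded in self._data else name
        self._data[folded] = (spelling, value)

    def __delitem__(self, name):
        del self._data[name.lower()]

    def __iter__(self):
        return (spelling for spelling, _ in self._data.values())

    def __len__(self):
        return len(self._data)


ns = CaseInsensitiveNamespace()
exec("Spam = 1\nSPAM = Spam + 1\nresult = spam * 10", {}, ns)
print(list(ns), ns["RESULT"])
```

This only covers code executed directly in the namespace.  A function defined
by that code still uses ordinary local variables, which is the gap David
points at.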

Ohh - right!  Python itself would need changing to support this.  I suppose
that faced with code such as:

def func():
  if spam:
    Spam=1

Python would generate code that refers to "spam" as a global, and "Spam" as
a local.

Is this why you feel it won't work?
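
How the compiler classifies the two names can be checked on the function's
code object.  A small sketch, assuming CPython 3: co_varnames lists the
function's local variables and co_names lists the names that are looked up at
run time (globals, builtins, attributes).  The function is never called here,
so the undefined global does no harm.

```python
def func():
    if spam:        # only read in this function: compiled as a global lookup
        Spam = 1    # assigned in this function: compiled as a local


print(func.__code__.co_varnames)  # local variable names
print(func.__code__.co_names)     # names resolved outside the fast locals
```

The decision is made per spelling at compile time, before any namespace
object gets a chance to fold case.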

Mark.




From mal at lemburg.com  Sun May  2 21:24:54 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 02 May 1999 21:24:54 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org> <372C4C75.5B7CCAC8@appliedbiometrics.com>
Message-ID: <372CA686.215D71DF@lemburg.com>

Christian Tismer wrote:
> 
> David Ascher wrote:
> [Marc:>
> > > Since you put out the objectives, I'd like to propose a little
> > > different approach...
> > >
> > > 1. Have eval/exec accept any mapping object as input
> > >
> > > 2. Make those two copy the content of the mapping object into real
> > >    dictionaries
> > >
> > > 3. Provide a hook into the dictionary implementation that can be
> > >    used to redirect KeyErrors and use that redirection to forward
> > >    the request to the original mapping objects
> 
> I don't think that this proposal would give so much new
> value. Since a mapping can also be implemented in arbitrary
> ways, say by functions, a mapping is not necessarily finite
> and might not be changeable into a dict.

[Disclaimer: I'm not really keen on having the possibility of
 letting code execute in arbitrary namespace objects... it would
 make code optimizations even less manageable.]

You can easily support infinite mappings by wrapping the
function into an object which returns an empty list
for .items() and then use the hook mentioned in 3 to
redirect the lookup to that function.

The proposal allows one to use such a proxy to simulate any
kind of mapping -- it works much like the __getattr__ hook
provided for instances.
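
Later Python versions grew a hook of roughly this shape: a dict subclass may
define __missing__, which dict lookup calls when a key is absent.  The sketch
below assumes Python 3 and uses that hook to approximate the three steps;
ForwardingDict and squares are hypothetical names, and the proxy is passed as
the locals argument of eval, which accepts any mapping.

```python
class ForwardingDict(dict):
    """A real dict that forwards failed lookups to the original mapping."""

    def __init__(self, source_items, fallback):
        super().__init__(source_items)   # step 2: copy what can be copied
        self._fallback = fallback        # step 3: where misses are redirected

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return self._fallback(key)


def squares(name):
    """An "infinite mapping" implemented by a function: 'sq7' -> 49."""
    if name.startswith("sq") and name[2:].isdigit():
        return int(name[2:]) ** 2
    raise KeyError(name)   # let the lookup fall through to globals/builtins


ns = ForwardingDict([], squares)     # nothing to copy: items() is empty
value = eval("sq3 + sq4", {}, ns)    # step 1: a mapping as the namespace
print(value)
```

The fallback raises KeyError for names it does not know, so builtins remain
reachable from the evaluated code.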
 
> [David:>
> > Interesting counterproposal.  I'm not sure whether any of the proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation process to
> > allow case-insensitivity to work throughout?
> 
> Case-independent namespaces seem to be a minor point,
> nice to have for interfacing to other products, but then,
> in a function, I see no benefit in changing the semantics
> of function locals? The lookup of foreign symbols would
> always be through a mapping object. If you take COM for
> instance, your access to a COM wrapper for an arbitrary
> object would be through properties of this object. After
> assignment to a local function variable, why should we
> support case-insensitivity at all?
>
> I would think mapping objects would be a great
> simplification of lazy imports in COM, where
> we would like to avoid to import really huge
> namespaces in one big slurp. Also the wrapper code
> could be made quite a lot easier and faster without
> so much getattr/setattr trapping.

What do lazy imports have to do with case [in]sensitive
namespaces ? Anyway, how about a simple lazy import
mechanism in the standard distribution, i.e. why not make
all imports lazy ? Since modules are first class objects
this should be easy to implement...
 
> Does btw. anybody really want to see case-insensitivity
> in Python programs? I'm quite happy with it as it is,
> and I would even force the user to always use the same
> case style after he has touched an external property
> once. Example for Excel: You may write "xl.workbooks"
> in lowercase, but then you have to stay with it.
> This would keep Python source clean for, say, PyLint.

"No" and "me too" ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 243 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From MHammond at skippinet.com.au  Mon May  3 02:52:41 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Mon, 3 May 1999 10:52:41 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <372CA686.215D71DF@lemburg.com>
Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat>

[Marc]
> [Disclaimer: I'm not really keen on having the possibility of
>  letting code execute in arbitrary namespace objects... it would
>  make code optimizations even less manageable.]

Good point - although surely that would simply mean (certain) optimisations
can't be performed for code executing in that environment?  How to detect
this at "optimization time" may be a little difficult :-)

However, this is the primary purpose of this thread - to work out _if_ it is
a good idea, as much as working out _how_ to do it :-)

> The proposal allows one to use such a proxy to simulate any
> kind of mapping -- it works much like the __getattr__ hook
> provided for instances.

My only problem with Marc's proposal is that there already _is_ an
established mapping protocol, and this doesn't use it; instead it invents a
new one with the benefit being potentially less code breakage.

And without attempting to sound flippant, I wonder how many extension
modules will be affected?  Module init code certainly assumes the module
__dict__ is a dictionary, but none of my code assumes anything about other
namespaces.  Marc's extensions may be a special case, as AFAIK they inject
objects into other dictionaries (ie, new builtins?).  Again, not trying to
downplay this too much, but if it is only a problem for Marc's more
esoteric extensions, I don't feel that should hold up an otherwise solid
proposal.

[Chris, I think?]
> > Case-independent namespaces seem to be a minor point,
> > nice to have for interfacing to other products, but then,
> > in a function, I see no benefit in changing the semantics
> > of function locals? The lookup of foreign symbols would

I disagree here.  Consider Alice, and similar projects, where a (arguably
misplaced, but nonetheless) requirement is that the embedded language be
case-insensitive.  Period.  The Alice people are somewhat special in that
they had the resources to change the interpreter's guts.  Most people won't,
and will look for a different language to embed.

Of course, I agree with you for the specific cases you are talking - COM,
Active Scripting etc.  Indeed, everything I would use this for would prefer
to keep the local function semantics identical.

> > Does btw. anybody really want to see case-insensitivity
> > in Python programs? I'm quite happy with it as it is,
> > and I would even force the user to always use the same
> > case style after he has touched an external property
> > once. Example for Excel: You may write "xl.workbooks"
> > in lowercase, but then you have to stay with it.
> > This would keep Python source clean for, say, PyLint.
>
> "No" and "me too" ;-)

I think we are missing the point a little.  If we focus on COM, we may come
up with a different answer.  Indeed, if we are to focus on COM integration
with Python, there are other areas I would prefer to start with :-)

IMO, we should attempt to come up with a more flexible namespace mechanism
that is in the style of Python, and will not noticeably slow down Python.
Then COM etc can take advantage of it - much in the same way that Python's
existing namespace model existed pre-COM, and COM had to take advantage of
what it could!

Of course, a key indicator of the likely success is how well COM _can_ take
advantage of it, and how much Alice could have taken advantage of it - I
can't think of any other yardsticks?

Mark.




From mal at lemburg.com  Mon May  3 09:56:53 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 03 May 1999 09:56:53 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <372D56C5.4738DE3D@lemburg.com>

Mark Hammond wrote:
> 
> [Marc]
> > [Disclaimer: I'm not really keen on having the possibility of
> >  letting code execute in arbitrary namespace objects... it would
> >  make code optimizations even less manageable.]
> 
> Good point - although surely that would simply mean (certain) optimisations
> can't be performed for code executing in that environment?  How to detect
> this at "optimization time" may be a little difficult :-)
> 
> However, this is the primary purpose of this thread - to work out _if_ it is
> a good idea, as much as working out _how_ to do it :-)
> 
> > The proposal allows one to use such a proxy to simulate any
> > kind of mapping -- it works much like the __getattr__ hook
> > provided for instances.
> 
> My only problem with Marc's proposal is that there already _is_ an
> established mapping protocol, and this doesn't use it; instead it invents a
> new one with the benefit being potentially less code breakage.

...and that's the key point: you get the intended features and
the core code will not have to be changed in significant ways.
Basically, I think these kinds of core extensions should be done
in generic ways, e.g. by letting the eval/exec machinery accept
subclasses of dictionaries, rather than trying to raise the
abstraction level used and slowing things down in general
just to be able to use the feature on very few occasions.
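
For illustration only, a minimal sketch of what "letting the eval/exec
machinery accept subclasses of dictionaries" can look like. It assumes
present-day Python 3, where exec() accepts any mapping as its locals
argument; it is not the 1.6-era design under discussion. The class name
FoldingNamespace is hypothetical.

```python
class FoldingNamespace(dict):
    """Hypothetical dict subclass that folds every name to lower case."""

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __delitem__(self, key):
        super().__delitem__(key.lower())


ns = FoldingNamespace()
# Module-level code run by exec() stores and loads its names through the
# mapping passed as locals, so the case folding above applies to them.
exec("Total = 40\nresult = TOTAL + 2", {}, ns)
print(ns["RESULT"])  # 42
```

Only code executed directly at the top level of the exec'd source goes
through the mapping; function bodies defined inside it keep their usual
fast locals, which matches the wish expressed elsewhere in this thread to
leave function-local semantics alone.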

> And without attempting to sound flippant, I wonder how many extension
> modules will be affected?  Module init code certainly assumes the module
> __dict__ is a dictionary, but none of my code assumes anything about other
> namespaces.  Marc's extensions may be a special case, as AFAIK they inject
> objects into other dictionaries (ie, new builtins?).  Again, not trying to
> downplay this too much, but if it is only a problem for Marc's more
> esoteric extensions, I don't feel that should hold up an otherwise solid
> proposal.

My mxTools extension does the assignment in Python, so it wouldn't
be affected. The others only do the usual modinit() stuff.

Before going any further on this thread we may have to ponder a little
more on the objectives that we have. If it's only case-insensitive
lookups then I guess a simple compile time switch exchanging the
implementations of string hash and compare functions would do the
trick. If we're after doing wild things like lookups across
networks, then a more specific approach is needed.

So what is it that we want in 1.6 ?

> [Chris, I think?]
> > > Case-independent namespaces seem to be a minor point,
> > > nice to have for interfacing to other products, but then,
> > > in a function, I see no benefit in changing the semantics
> > > of function locals? The lookup of foreign symbols would
> 
> I disagree here.  Consider Alice, and similar projects, where a (arguably
> misplaced, but nonetheless) requirement is that the embedded language be
> case-insensitive.  Period.  The Alice people are somewhat special in that
> they had the resources to change the interpreter's guts.  Most people won't,
> and will look for a different language to embed.
> 
> Of course, I agree with you for the specific cases you are talking - COM,
> Active Scripting etc.  Indeed, everything I would use this for would prefer
> to keep the local function semantics identical.

As I understand the needs in COM and AS you are talking about
object attributes, right ? Making these case-insensitive is
a job for a proxy or a __getattr__ hack.
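
A minimal sketch of such a proxy, assuming present-day Python 3. The
names CaseInsensitiveProxy and Workbook are hypothetical; Workbook is
only a stand-in for a COM object and uses no real COM API.

```python
class CaseInsensitiveProxy:
    """Forward attribute access to `target`, ignoring the case of the name."""

    def __init__(self, target):
        # Bypass our own __setattr__ while storing the bookkeeping fields.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_names", {n.lower(): n for n in dir(target)})

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails.
        try:
            real = self._names[name.lower()]
        except KeyError:
            raise AttributeError(name) from None
        return getattr(self._target, real)

    def __setattr__(self, name, value):
        # Reuse the target's own spelling when the name is already known.
        real = self._names.get(name.lower(), name)
        setattr(self._target, real, value)


class Workbook:  # hypothetical stand-in for a COM object
    def __init__(self):
        self.ActiveSheet = "Sheet1"


xl = CaseInsensitiveProxy(Workbook())
print(xl.activesheet)   # Sheet1
xl.ACTIVESHEET = "Sheet2"
print(xl.ActiveSheet)   # Sheet2
```

The case folding stays confined to the wrapped object; ordinary Python
names in the surrounding code remain case-sensitive.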
 
> > > Does btw. anybody really want to see case-insensitivity
> > > in Python programs? I'm quite happy with it as it is,
> > > and I would even force the user to always use the same
> > > case style after he has touched an external property
> > > once. Example for Excel: You may write "xl.workbooks"
> > > in lowercase, but then you have to stay with it.
> > > This would keep Python source clean for, say, PyLint.
> >
> > "No" and "me too" ;-)
> 
> I think we are missing the point a little.  If we focus on COM, we may come
> up with a different answer.  Indeed, if we are to focus on COM integration
> with Python, there are other areas I would prefer to start with :-)
> 
> IMO, we should attempt to come up with a more flexible namespace mechanism
> that is in the style of Python, and will not noticeably slow down Python.
> Then COM etc can take advantage of it - much in the same way that Python's
> existing namespace model existed pre-COM, and COM had to take advantage of
> what it could!
> 
> Of course, a key indicator of the likely success is how well COM _can_ take
> advantage of it, and how much Alice could have taken advantage of it - I
> can't think of any other yardsticks?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 242 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From fredrik at pythonware.com  Mon May  3 16:01:10 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 16:01:10 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com>

scriptics is positioning tcl as a perl killer:

    http://www.scriptics.com/scripting/perl.html

afaict, unicode and event handling are the two
main thingies missing from python 1.5.

-- unicode: is on its way.

-- event handling: asyncore/asynchat provides an
awesome framework for event-driven socket
programming.  however, Python still lacks good
cross-platform support for event-driven access to
files and pipes.  are threads good enough, or would
it be cool to have something similar to Tcl's
fileevent stuff in Python?

-- regexps: has anyone compared the new
unicode-aware regexp package in Tcl with pcre?

comments?

</F>

btw, the rebol folks have reached 2.0:
    http://www.rebol.com/

maybe 1.6 should be renamed to Python 6.0?




From akuchlin at cnri.reston.va.us  Mon May  3 17:14:15 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:14:15 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us>

Fredrik Lundh writes:
>-- regexps: has anyone compared the new
>unicode-aware regexp package in Tcl with pcre?

	I looked at it a bit when Tcl 8.1 was in beta; it derives from
Henry Spencer's 1998-vintage code, which seems to try to do a lot of
optimization and analysis.  It may even compile DFAs instead of NFAs
when possible, though it's hard for me to be sure.  This might give it
a substantial speed advantage over engines that do less analysis, but
I haven't benchmarked it.  The code is easy to read, but difficult to
understand because the theory underlying the analysis isn't explained
in the comments; one feels there should be an accompanying paper to
explain how everything works, and it's why I'm not sure if it really
is producing DFAs for some expressions.

	Tcl seems to represent everything as UTF-8 internally, so
there's only one regex engine; there's .  The code is scattered over
more files:

amarok generic>ls re*.[ch]
regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

	This would be an issue for using it with Python, since all
these files would wind up scattered around the Modules directory.  For
comparison, pypcre.c is around 4700 lines of code.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Things need not have happened to be true. Tales and dreams are the
shadow-truths that will endure when mere facts are dust and ashes, and forgot.
    -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_




From guido at CNRI.Reston.VA.US  Mon May  3 17:32:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 11:32:09 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT."
             <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com>  
            <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us>

> 	I looked at it a bit when Tcl 8.1 was in beta; it derives from
> Henry Spencer's 1998-vintage code, which seems to try to do a lot of
> optimization and analysis.  It may even compile DFAs instead of NFAs
> when possible, though it's hard for me to be sure.  This might give it
> a substantial speed advantage over engines that do less analysis, but
> I haven't benchmarked it.  The code is easy to read, but difficult to
> understand because the theory underlying the analysis isn't explained
> in the comments; one feels there should be an accompanying paper to
> explain how everything works, and it's why I'm not sure if it really
> is producing DFAs for some expressions.
> 
> 	Tcl seems to represent everything as UTF-8 internally, so
> there's only one regex engine; there's .

Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
point the regex engine was compiled twice, once for 8-bit chars and
once for 16-bit chars.  But this may have changed.

I've noticed that Perl is taking the same position (everything is
UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
from 8-bit bytes.  Python is currently in the Java camp.  This might
be a good time to make sure that we're still convinced that this is
the right thing to do!

> The code is scattered over
> more files:
> 
> amarok generic>ls re*.[ch]
> regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
> regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
> regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
> 
> 	This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory.  For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way.  Perhaps a more
interesting question is whether it is Perl5 compatible.  I contacted
Henry Spencer at the time and he was willing to let us use his code.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From akuchlin at cnri.reston.va.us  Mon May  3 17:56:46 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:56:46 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
	<14125.47524.196878.583460@amarok.cnri.reston.va.us>
	<199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
>point the regex engine was compiled twice, once for 8-bit chars and
>once for 16-bit chars.  But this may have changed.

	It doesn't seem to currently; the code in tclRegexp.c looks
like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets.
     */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);
    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

	ISTR the Spencer engine does, however, define a small and
large representation for NFAs and have two versions of the engine, one
for each representation.  Perhaps that's what you're thinking of.

>I've noticed that Perl is taking the same position (everything is
>UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
>from 8-bit bytes.  Python is currently in the Java camp.  This might
>be a good time to make sure that we're still convinced that this is
>the right thing to do!

	I don't know.  There's certainly the fundamental dichotomy
that strings are sometimes used to represent characters, where
changing encodings on input and output is reasonable, and sometimes
used to hold chunks of binary data, where any changes are incorrect.
Perhaps Paul Prescod is right, and we should try to get some other
data type (array.array()) for holding binary data, as distinct from
strings.

>I'm sure that if it's good code, we'll find a way.  Perhaps a more
>interesting question is whether it is Perl5 compatible.  I contacted
>Henry Spencer at the time and he was willing to let us use his code.

	Mostly Perl-compatible, though it doesn't look like the 5.005
features are there, and I haven't checked for every single 5.004
feature.  Adding missing features might be problematic, because I
don't really understand what the code is doing at a high level.  Also,
is there a user community for this code?  Do any other projects use
it?  Philip Hazel has been quite helpful with PCRE, an important thing
when making modifications to the code.
 
	Should I make a point of looking at what using the Spencer
engine would entail?  It might not be too difficult (an evening or
two, maybe?) to write a re.py that sat on top of the Spencer code;
that would at least let us do some benchmarking.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In Einstein's theory of relativity the observer is a man who sets out in quest
of truth armed with a measuring-rod. In quantum theory he sets out with a
sieve.
    -- Sir Arthur Eddington





From guido at CNRI.Reston.VA.US  Mon May  3 18:02:22 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 12:02:22 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
             <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>  
            <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us>

> 	Should I make a point of looking at what using the Spencer
> engine would entail?  It might not be too difficult (an evening or
> two, maybe?) to write a re.py that sat on top of the Spencer code;
> that would at least let us do some benchmarking.

Surely this would be more helpful than weeks of speculative emails --
go for it!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Mon May  3 19:10:55 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:10:55 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us>
Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com>

> Also, is there a user community for this code?

how about comp.lang.tcl ;-)

</F>




From fredrik at pythonware.com  Mon May  3 19:15:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:15:00 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>             <14125.49911.982236.754340@amarok.cnri.reston.va.us>  <199905031602.MAA05829@eric.cnri.reston.va.us>
Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com>

talking about regexps, here's another thing that
would be quite nice to have in 1.6 (available from
the Python level, that is).  or is it already in there
somewhere?

</F>

...

http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873

Tcl 8.1b3 Request:  Generated by Scriptics' bug entry form at

Submitted by:  Frederic BONNET
OperatingSystem:  Windows 98
CustomShell:  Applied patch to the regexp engine (the exec part)
Synopsis:  regexp improvements

DesiredBehavior:
    As previously requested by Don Libes:
    
    > I see no way for Tcl_RegExpExec to indicate "could match" meaning
    > "could match if more characters arrive that were suitable for a
    > match".  This is required for a class of applications involving
    > matching on a stream required by Expect's interact command.  Henry
    > assured me that this facility would be in the engine (I'm not the only
    > one that needs it).  Note that it is not sufficient to add one more
    > return value to Tcl_RegExpExec (i.e., 2) because one needs to know
    > both if something matches now and can match later.  I recommend
    > another argument (canMatch *int) be added to Tcl_RegExpExec.

/patch info follows/

...




From bwarsaw at cnri.reston.va.us  Tue May  4 00:28:23 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 3 May 1999 18:28:23 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us>

I've been using Jitterbug for a couple of weeks now as my bug database
for Mailman and JPython.  So it was easy enough for me to set up a
database for Python bug reports.  Guido is in the process of tailoring 
the Jitterbug web interface to his liking and will announce it to the
appropriate forums when he's ready.

In the meantime, I've created YAML that you might be interested in.
All bug reports entered into Jitterbug will be forwarded to
python-bugs-list at python.org.  You are invited to subscribe to the list 
by visiting

    http://www.python.org/mailman/listinfo/python-bugs-list

Enjoy,
-Barry



From jeremy at cnri.reston.va.us  Tue May  4 00:30:10 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon,  3 May 1999 18:30:10 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
References: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>

Pretty low volume list, eh?



From MHammond at skippinet.com.au  Tue May  4 01:28:39 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Tue, 4 May 1999 09:28:39 +1000
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>
Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat>

ha - we wish.  More likely to be full of detailed bug reports about how 1/2
!= 0.5, or that "def foo(baz=[])" is buggy, etc :-)

Mark.

> Pretty low volume list, eh?




From tim_one at email.msn.com  Tue May  4 07:16:17 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:16:17 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <000701be95ed$3d594180$dca22299@tim>

[Guido & Andrew on Tcl's new regexp code]
> I'm sure that if it's good code, we'll find a way.  Perhaps a more
> interesting question is whether it is Perl5 compatible.  I contacted
> Henry Spencer at the time and he was willing to let us use his code.

Haven't looked at the code, but did read the manpage just now:

    http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm

WRT Perl5 compatibility, it sez:

    Incompatibilities of note include `\b', `\B', the lack of special
    treatment for a trailing newline, the addition of complemented
    bracket expressions to the things affected by newline-sensitive
    matching, the restrictions on parentheses and back references in
    lookahead constraints, and the longest/shortest-match (rather than
    first-match) matching semantics.

So some gratuitous differences, and maybe a killer:  Guido hasn't had much
kind to say about "longest" (aka POSIX) matching semantics.  An example from
the page:

    (week|wee)(night|knights)
    matches all ten characters of `weeknights'

which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
'night'.

It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
is correct; indeed, it's a pain to get that behavior any other way!
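
A small sketch of the difference, assuming present-day Python 3 and its
standard re module. The "longest" result is emulated by brute force over
the alternatives, not computed by a POSIX or Tcl engine.

```python
import re
from itertools import product

text = "weeknights"

# Python (like Perl) takes the first alternative that lets the match succeed.
m = re.match(r"(week|wee)(night|knights)", text)
print(m.group(0), m.groups())  # weeknight ('week', 'night')

# Emulate "longest match": try every combination of alternatives and keep
# the longest one that is a prefix of the text.
candidates = ["".join(p) for p in product(("week", "wee"), ("night", "knights"))]
longest = max((c for c in candidates if text.startswith(c)), key=len)
print(longest)  # weeknights
```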

otoh-it's-potentially-very-much-faster-ly y'rs  - tim





From tim_one at email.msn.com  Tue May  4 07:51:01 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:51:01 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <000701be95ed$3d594180$dca22299@tim>
Message-ID: <000901be95f2$195556c0$dca22299@tim>

[Tim]
> ...
> It's the *natural* semantics if Andrew's suspicion that it's
> compiling a DFA is correct ...

More from the man page:

    AREs report the longest/shortest match for the RE, rather than
    the first found in a specified search order. This may affect some
    RREs which were written in the expectation that the first match
    would be reported. (The careful crafting of RREs to optimize the
    search order for fast matching is obsolete (AREs examine all possible
    matches in parallel, and their performance is largely insensitive to
    their complexity) but cases where the search order was exploited to
    deliberately find a match which was not the longest/shortest will
    need rewriting.)

Nails it, yes?  Now, in 10 seconds, try to remember a regexp where this
really matters <wink>.

Note in passing that IDLE's colorizer regexp *needs* to search for
triple-quoted strings before single-quoted ones, else the P/P semantics
would consider """ to be an empty single-quoted string followed by a double
quote.  This isn't a case where it matters in a bad way, though!  The
"longest" rule picks the correct alternative regardless of the order in
which they're written.
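
To make the ordering issue concrete, here is a minimal sketch assuming
present-day Python 3. The two patterns are simplified stand-ins, not
IDLE's actual colorizer regexp.

```python
import re

SINGLE = r'"[^"\n]*"'        # simplified one-line double-quoted string
TRIPLE = r'"""[\s\S]*?"""'   # simplified triple-quoted string
source = '"""doc"""'

# First-match semantics: the order of the alternatives decides the result.
triple_first = re.match(f"{TRIPLE}|{SINGLE}", source)
single_first = re.match(f"{SINGLE}|{TRIPLE}", source)

print(triple_first.group(0))  # """doc"""
print(single_first.group(0))  # ""
```

With the single-quoted alternative first, the match stops after the empty
string `""`, leaving a stray double quote behind. A longest-match engine
would prefer the nine-character triple-quoted match in either order.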

at-least-in-that-specific-regex<0.1-wink>-ly y'rs  - tim





From guido at CNRI.Reston.VA.US  Tue May  4 14:26:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 08:26:04 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT."
             <000701be95ed$3d594180$dca22299@tim> 
References: <000701be95ed$3d594180$dca22299@tim> 
Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us>

[Tim]
> So some gratuitous differences, and maybe a killer:  Guido hasn't had much
> kind to say about "longest" (aka POSIX) matching semantics.
> 
> An example from the page:
> 
>     (week|wee)(night|knights)
>     matches all ten characters of `weeknights'
> 
> which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
> 'night'.
> 
> It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
> is correct; indeed, it's a pain to get that behavior any other way!

Possibly contradicting what I once said about DFAs (I have no idea
what I said any more :-): I think we shouldn't be hung up about the
subtleties of DFA vs. NFA; for most people, the Perl-compatibility
simply means that they can use the same metacharacters.  My guess is
that people don't so much translate long Perl regexp's to Python but
simply transport their (always incomplete -- Larry Wall *wants* it
that way :-) knowledge of Perl regexps to Python.  My meta-guess is
that this is also Henry Spencer's and John Ousterhout's guess.  As for
Larry Wall, I guess he really doesn't care :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at cnri.reston.va.us  Tue May  4 18:14:41 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Tue,  4 May 1999 12:14:41 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
References: <000701be95ed$3d594180$dca22299@tim>
	<199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Possibly contradicting what I once said about DFAs (I have no idea
>what I said any more :-): I think we shouldn't be hung up about the
>subtleties of DFA vs. NFA; for most people, the Perl-compatibility
>simply means that they can use the same metacharacters.  My guess is

	I don't like slipping in such a change to the semantics with
no visible change to the module name or interface.  On the other hand,
if it's not NFA-based, then it can provide POSIX semantics without
danger of taking exponential time to determine the longest match.
BTW, there's an interesting reference, I assume to this code, in
_Mastering Regular Expressions_; Spencer is quoted on page 121 as
saying it's "at worst quadratic in text size.".

	Anyway, we can let it slide until a Python interface gets written.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In the black shadow of the Baba Yaga babies screamed and mothers miscarried;
milk soured and men went mad.
    -- In SANDMAN #38: "The Hunt"




From guido at CNRI.Reston.VA.US  Tue May  4 18:19:06 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 12:19:06 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT."
             <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us>  
            <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us>

> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

Not sure if that was the same code -- this is *new* code, not
Spencer's old code.  I think Friedl's book is older than the current
code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Wed May  5 07:37:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 01:37:02 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <000701be96b9$4e434460$799e2299@tim>

I've consistently found that the best way to kill a thread is to rename it
accurately <wink>.

Agree w/ Guido that few people really care about the differing semantics.

Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage
anyway:  code will definitely break.  Like

    \b(?:
        (?P<keyword>and|if|else|...) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b

The (special)|(general) idiom relies on left-to-right match-and-out
searching of alternatives to do its job correctly.  Not to mention that \b
is not a word-boundary assertion in the new pkg (talk about pointlessly
irritating differences!  at least this one could be easily hidden via
brainless preprocessing).
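
A runnable sketch of the idiom, assuming present-day Python 3; the
keyword list is cut down to three words and the helper name classify is
hypothetical.

```python
import re

# (special)|(general): keywords are tried before the catch-all identifier.
TOKEN = re.compile(r"""
    \b(?:
        (?P<keyword>and|if|else) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b
""", re.VERBOSE)

# Same alternatives in the opposite order, for comparison.
REVERSED = re.compile(
    r"\b(?:(?P<identifier>[a-zA-Z_]\w*)|(?P<keyword>and|if|else))\b"
)


def classify(word, pattern=TOKEN):
    """Return the name of the group that matched the whole word, or None."""
    m = pattern.fullmatch(word)
    return m.lastgroup if m else None


print(classify("if"), classify("iffy"))  # keyword identifier
print(classify("if", REVERSED))          # identifier
```

Both alternatives match "if" with the same length, so only the
left-to-right order tells the engine to report it as a keyword.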

Over the long run, moving to a DFA locks Python out of the directions Perl
is *moving*, namely embedding all sorts of runtime gimmicks in regexps that
exploit knowing the "state of the match so far".  DFAs don't work that way.
I don't mind losing those possibilities, because I think the regexp
sublanguage is strained beyond its limits already.  But that's a decision
with Big Consequences, so deserves some thought.

I'd definitely like the (sometimes dramatically) increased speed a DFA can
offer (btw, this code appears to use a lazily-generated DFA, to avoid the
exponential *compile*-time a straightforward DFA implementation can
suffer -- the code is very complex and lacks any high-level internal docs,
so we better hope Henry stays in love with it <0.5 wink>).

> ...
> My guess is that people don't so much translate long Perl regexp's
> to Python but simply transport their (always incomplete -- Larry Wall
> *wants* it that way :-) knowledge of Perl regexps to Python.

This is directly proportional to the number of feeble CGI programmers Python
attracts <wink>.  The good news is that they wouldn't know an NFA from a DFA
if Larry bit Henry on the ass ...

> My meta-guess is that this is also Henry Spencer's and John
> Ousterhout's guess.

I think Spencer strongly favors DFA semantics regardless of fashion, and
Ousterhout is a pragmatist.  So I trust JO's judgment more <0.9 wink>.

> As for Larry Wall, I guess he really doesn't care :-)

I expect he cares a lot!  Because a DFA would prevent Perl from going even
more insane in its present direction.


About the age of the code, postings to comp.lang.tcl have Henry saying he
was working on the alpha version intensely as recently as December ('98).
A few complaints about the alpha release trickled in, about regexp compile
speed and regexp matching speed in specific cases.  Perhaps paradoxically,
the latter were about especially simple regexps with long fixed substrings
(where this mountain of sophisticated machinery is likely to get beat cold
by an NFA with some fixed-substring lookahead smarts -- which latter Henry
intended to graft into this pkg too).

[Andrew]
> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

[Guido]
> Not sure if that was the same code -- this is *new* code, not
> Spencer's old code.  I think Friedl's book is older than the current
> code.

I expect this is an invariant, though:  it's not natural for a DFA to know
where subexpression matches begin and end, and there's a pile of xxx_dissect
functions in regexec.c that use what strongly appear to be worst-case
quadratic-time algorithms for figuring that out after it's known that the
overall expression has *a* match.  Expect too, but don't know, that only
pathological cases are actually expensive.


Question:  has this package been released in any other context, or is it
unique to Tcl?  I searched in vain for an announcement (let alone code) from
Henry, or any discussion of this code outside the Tcl world.

whatever-happens-i-vote-we-let-them-debug-it<wink>-ly y'rs  - tim





From gstein at lyra.org  Wed May  5 08:22:20 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 4 May 1999 23:22:20 -0700 (PDT)
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <000701be96b9$4e434460$799e2299@tim>
Message-ID: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>

On Wed, 5 May 1999, Tim Peters wrote:
>...
> Question:  has this package been released in any other context, or is it
> unique to Tcl?  I searched in vain for an announcement (let alone code) from
> Henry, or any discussion of this code outside the Tcl world.

Apache uses it.

However, the Apache guys have considered possibly updating the thing. I
gather that they have a pretty old snapshot. Another guy mentioned PCRE
and I pointed out that Python uses it for its regex support. In other
words, if Apache *does* update the code, then it may be that Apache will
drop the HS engine in favor of PCRE.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/





From Ivan.Porres at abo.fi  Wed May  5 10:29:21 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 11:29:21 +0300
Subject: [Python-Dev] Python for Small Systems patch
Message-ID: <37300161.8DFD1D7F@abo.fi>

Python for Small Systems is a minimal version of the python interpreter,
intended to run on small embedded systems with a limited amount of
memory. 

Since there is some interest in the newsgroup, we have decided to release
an alpha version of the patch. You can download the patch from the
following page: 

http://www.abo.fi/~iporres/python

There is no documentation about the changes, but I guess that it is not
so difficult to figure out what Raul has been doing. 

There are some simple examples in the Demo/hitachi directory. The
configure scripts are broken. We plan to modify the configure scripts 
for cross-compilation. We are still testing, cleaning
and trying to reduce the memory requirements of the patched interpreter.
We also plan to write some documentation.

Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi),

Regards,
Ivan


-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033
Lemminkäinengatan 14A
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 13:52:24 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 13:52:24 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi>
Message-ID: <373030F8.21B73451@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Python for Small Systems is a minimal version of the python interpreter,
> intended to run on small embedded systems with a limited amount of
> memory.
> 
> Since there is some interest in the newsgroup, we have decided to release
> an alpha version of the patch. You can download the patch from the
> following page:
> 
> http://www.abo.fi/~iporres/python
> 
> There is no documentation about the changes, but I guess that it is not
> so difficult to figure out what Raul has been doing.

Ivan,
small Python is a very interesting thing,
thanks for the preview.

But, aren't 12600 lines of diff a little too much
to call it "not difficult to figure out"? :-)

The very last line was indeed helpful:

+++ Pss/miniconfigure	Tue Mar 16 16:59:42 1999
@@ -0,0 +1 @@
+./configure --prefix="/home/rparra/python/Python-1.5.1"
--without-complex --without-float --without-long --without-file
--without-libm --without-libc --without-fpectl --without-threads
--without-dec-threads --with-libs=

But I'd be interested in a brief list
of which other features are out, and even more which
structures were changed. Would that be possible?

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From Ivan.Porres at abo.fi  Wed May  5 15:17:17 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 16:17:17 +0300
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com>
Message-ID: <373044DD.FE4499E@abo.fi>

Christian Tismer wrote:
> Ivan,
> small Python is a very interesting thing,
> thanks for the preview.
> 
> But, aren't 12600 lines of diff a little too much
> to call it "not difficult to figure out"? :-)

Raul Parra (rpb), the author of the patch, got the "source scissors"
(#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
embedded system with some RAM, no keyboard, no screen and no OS. An
example application can be a printer where the print jobs are python
bytecompiled scripts (instead of postscript).

We plan to write some documentation about the patch. Meanwhile, here are
some of the changes:

WITHOUT_PARSER, WITHOUT_COMPILER
Defining WITHOUT_PARSER removes the parser. This has a lot of
implications (no eval() !) but saves a lot of memory. The interpreter
can only execute byte-compiled scripts, that is PyCodeObjects. 

Most embedded processors have poor floating-point capabilities (they
cannot compete with DSPs):

WITHOUT-COMPLEX
Removes support for complex numbers

WITHOUT-LONG
Removes long numbers

WITHOUT-FLOAT
Removes floating point numbers

Dependencies on the OS:

WITHOUT-FILE
Removes file objects. No file, no print, no input, no interactive
prompt. This is not too bad in a device without hard disk, keyboard or
screen...

WITHOUT-GETPATH
Removes dependencies on the OS path. (Probably this change should be
integrated with WITHOUT-FILE)

These changes render most of the standard modules unusable.
There are no fundamental changes to the interpreter, just cut and cut....
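
A minimal sketch of the "byte-compiled scripts only" workflow, written for a stock modern Python rather than the patched interpreter: the source is compiled and serialized on a development host, and the target only ever loads and runs the code object. The job source, the `<print-job>` filename and `run_job` are made up for illustration.

```python
import marshal

# On the development host: compile source text and serialize the code object.
source = "result = sum(range(5))\n"
blob = marshal.dumps(compile(source, "<print-job>", "exec"))


# On the (hypothetical) target: no source text, only the code object.
def run_job(blob):
    namespace = {}
    exec(marshal.loads(blob), namespace)  # executing a code object, no parsing
    return namespace["result"]
```

Note that marshal data is tied to the Python version that produced it, so host and target would need matching interpreters.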

Ivan
-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033
Lemminkäinengatan 14A
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 15:31:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 15:31:05 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi>
Message-ID: <37304819.AD636B67@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Christian Tismer wrote:
> > Ivan,
> > small Python is a very interesting thing,
> > thanks for the preview.
> >
> > But, aren't 12600 lines of diff a little too much
> > to call it "not difficult to figure out"? :-)
> 
> Raul Parra (rpb), the author of the patch, got the "source scissors"
> (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
> embedded system with some RAM, no keyboard, no screen and no OS. An
> example application can be a printer where the print jobs are python
> bytecompiled scripts (instead of postscript).
> 
> We plan to write some documentation about the patch. Meanwhile, here are
> some of the changes:

Many thanks, this is really interesting

> These changes render most of the standard modules unusable.
> There are no fundamental changes to the interpreter, just cut and cut....

I see. A last thing which I'm curious about is the executable
size. If this can be compared to a Windows dll at all. Did you 
compile without the changes for your target as well? 
How is the ratio? The python15.dll file contains everything
of core Python and is about 560 KB large.
If your engine goes down to, say below 200 KB, this could
be a great thing for embedding Python into other apps.

ciao & thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Wed May  5 16:55:40 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 5 May 1999 10:55:40 -0400 (EDT)
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
References: <199905041226.IAA07627@eric.cnri.reston.va.us>
	<000701be96b9$4e434460$799e2299@tim>
Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters <tim_one at email.msn.com> writes:

    TP> Over the long run, moving to a DFA locks Python out of the
    TP> directions Perl is *moving*, namely embedding all sorts of
    TP> runtime gimmicks in regexps that exploit knowing the "state of
    TP> the match so far".  DFAs don't work that way.  I don't mind
    TP> losing those possibilities, because I think the regexp
    TP> sublanguage is strained beyond its limits already.  But that's
    TP> a decision with Big Consequences, so deserves some thought.

I know zip about the internals of the various regexp packages.  But as
far as the Python level interface, would it be feasible to support
both as underlying regexp engines underneath re.py?  The idea would be 
that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
Then all the rest of the magic happens behind the scenes, with
appropriate exceptions thrown if there are syntax mismatches in the
regexp that can't be worked around by preprocessors, etc.

Or would that be more confusing than yet another different regexp
module?
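
One possible shape for such an interface, as a sketch only: a registry maps an engine name to a compile function, and a mismatch raises an exception. Everything here is hypothetical (`ENGINES`, `UnsupportedEngine`, `compile_with`), a keyword argument stands in for the proposed flag, and only the stock `re` engine is registered.

```python
import re

# Hypothetical registry: engine name -> function that compiles a pattern.
ENGINES = {"nfa": re.compile}  # only the stock backtracking engine exists here


class UnsupportedEngine(ValueError):
    """Raised when the requested engine has not been registered."""


def compile_with(pattern, engine="nfa", flags=0):
    try:
        backend = ENGINES[engine]
    except KeyError:
        raise UnsupportedEngine("no regexp engine registered as %r" % engine)
    return backend(pattern, flags)
```

A second engine would be added by registering another compile function whose result offers the same `match`/`search` methods; syntax mismatches would surface as exceptions from that function.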

-Barry



From tim_one at email.msn.com  Wed May  5 17:55:20 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 11:55:20 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>
Message-ID: <000601be970f$adef5740$a59e2299@tim>

[Tim]
> Question:  has this package [Tcl's 8.1 regexp support] been released in
> any other context, or is it unique to Tcl?  I searched in vain for an
> announcement (let alone code) from Henry, or any discussion of this code
> outside the Tcl world.

[Greg Stein]
> Apache uses it.
>
> However, the Apache guys have considered possibly updating the thing. I
> gather that they have a pretty old snapshot. Another guy mentioned PCRE
> and I pointed out that Python uses it for its regex support. In other
> words, if Apache *does* update the code, then it may be that Apache will
> drop the HS engine in favor of PCRE.

Hmm.  I just downloaded the Apache 1.3.4 source to check on this, and it
appears to be using a lightly massaged version of Spencer's old (circa
'92-'94) just-POSIX regexp package.  Henry has been distributing regexp pkgs
for a loooong time <wink>.

The Tcl 8.1 regexp pkg is much hairier.  If the Apache folk want to switch
in order to get the Perl regexp syntax extensions, this Tcl version is worth
looking at too.  If they want to switch for some other reason, it would be
good to know what that is!

The base pkg Apache uses is easily available all over the web; the pkg Tcl
8.1 is using I haven't found anywhere except in the Tcl download (which is
why I'm wondering about it -- so far, it doesn't appear to be distributed by
Spencer himself, in a non-Tcl-customized form).

looks-like-an-entirely-new-pkg-to-me-ly y'rs  - tim





From beazley at cs.uchicago.edu  Wed May  5 18:54:45 1999
From: beazley at cs.uchicago.edu (David Beazley)
Date: Wed, 5 May 1999 11:54:45 -0500 (CDT)
Subject: [Python-Dev] My (possibly delusional) book project
Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu>

Although this is a little off-topic for the developer list, I want to
fill people in on a new Python book project.  A few months ago, 
I was approached about doing a new Python reference book and I've
since decided to proceed with the project (after all, an increased
presence at the bookstore is probably a good thing :-).

In any event, my "vision" for this book is to take the material in the
Python tutorial, language reference, library reference, and extension
guide and squeeze it into a compact book no longer than 300 pages (and
hopefully without having to use a 4-point font).  Actually, what I'm
really trying to do is write something in a style similar to the K&R C
Programming book (very terse, straight to the point, and technically
accurate). The book's target audience is experienced/expert
programmers.

With this said, I would really like to get feedback from the developer
community about this project in a few areas.  First, I want to make
sure the language reference is in sync with the latest version of
Python, that it is as accurate as possible, and that it doesn't leave
out any important topics or recent developments.  Second, I would be
interested in knowing how to emphasize certain topics (for instance,
should I emphasize class-based exceptions over string-based exceptions
even though most books only cover the latter case?).  The other big
area is the library reference.  Given the size of the library, I'm
going to cut a number of modules out.  However, the choice of what to
cut is not entirely clear (for now, it's a judgment call on my part).

All of the work in progress for this project is online at:

   http://rustler.cs.uchicago.edu/~beazley/essential/reference.html

I would love to get constructive feedback about this from other
developers.  Of course, I'll keep people posted in any case.

Cheers,

Dave




From tim_one at email.msn.com  Thu May  6 07:43:16 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 6 May 1999 01:43:16 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us>
Message-ID: <000d01be9783$57543940$2ca22299@tim>

[Tim notes that moving to a DFA regexp engine would rule out some future
 aping of Perl mistakes <wink>]

[Barry "The Great Compromiser" Warsaw]
> I know zip about the internals of the various regexp packages.  But as
> far as the Python level interface, would it be feasible to support
> both as underlying regexp engines underneath re.py?  The idea would be
> that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
> re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
> Then all the rest of the magic happens behind the scenes, with
> appropriate exceptions thrown if there are syntax mismatches in the
> regexp that can't be worked around by preprocessors, etc.
>
> Or would that be more confusing than yet another different regexp
> module?

It depends some on what percentage of the Python distribution Guido wants to
devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of
code in Modules/, where regexp packages already consume more than anything
else.

It's a lot of delicate, difficult code.  Someone would need to step up and
champion each alternative package.  I haven't asked Andrew lately, but I'd
bet half a buck the thrill of supporting pcre has waned.

If there were competing packages, your suggested interface is fine.  I just
doubt the Python developers will support more than one (Andrew may still be
young, but he can't possibly still be naive enough to sign up for two of
these nightmares <wink>).

i'm-so-old-i-never-signed-up-for-one-ly y'rs  - tim





From rushing at nightmare.com  Thu May 13 08:34:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 12 May 1999 23:34:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905070507.BAA22545@python.org>
References: <199905070507.BAA22545@python.org>
Message-ID: <14138.28243.553816.166686@seattle.nightmare.com>

[list has been quiet, thought I'd liven things up a bit. 8^)]

I'm not sure if this has been brought up before in other forums, but
has there been discussion of separating the Python and C invocation
stacks, (i.e., removing recursive calls to the interpreter) to
facilitate coroutines or first-class continuations?

One of the biggest barriers to getting others to use asyncore/medusa
is the need to program in continuation-passing-style (callbacks,
callbacks to callbacks, state machines, etc...).  Usually there has to
be an overriding requirement for speed/scalability before someone will
even look into it.  And even when you do 'get' it, there are limits to
how inside-out your thinking can go. 8^)

If Python had coroutines/continuations, it would be possible to hide
asyncore-style select()/poll() machinery 'behind the scenes'.  I
believe that Concurrent ML does exactly this...

Other advantages might be restartable exceptions, different threading
models, etc...

-Sam
rushing at nightmare.com
rushing at eGroups.net




From mal at lemburg.com  Thu May 13 10:23:13 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 13 May 1999 10:23:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <373A8BF1.AE124BF@lemburg.com>

rushing at nightmare.com wrote:
> 
> [list has been quiet, thought I'd liven things up a bit. 8^)]

Well, there certainly is enough on the todo list... it's probably
the usual "ain't got no time" thing.

> I'm not sure if this has been brought up before in other forums, but
> has there been discussion of separating the Python and C invocation
> stacks, (i.e., removing recursive calls to the interpreter) to
> facilitate coroutines or first-class continuations?

Wouldn't it be possible to move all the C variables passed to
eval_code() via the execution frame ? AFAIK, the frame is
generated on every call to eval_code() and thus could also
be generated *before* calling it.

> One of the biggest barriers to getting others to use asyncore/medusa
> is the need to program in continuation-passing-style (callbacks,
> callbacks to callbacks, state machines, etc...).  Usually there has to
> be an overriding requirement for speed/scalability before someone will
> even look into it.  And even when you do 'get' it, there are limits to
> how inside-out your thinking can go. 8^)
> 
> If Python had coroutines/continuations, it would be possible to hide
> asyncore-style select()/poll() machinery 'behind the scenes'.  I
> believe that Concurrent ML does exactly this...
> 
> Other advantages might be restartable exceptions, different threading
> models, etc...

Don't know if moving the C stack stuff into the frame objects
will get you the desired effect: what about other things having
state (e.g. connections or files), that are not even touched
by this mechanism ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 232 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From rushing at nightmare.com  Thu May 13 11:40:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 02:40:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373A8BF1.AE124BF@lemburg.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<373A8BF1.AE124BF@lemburg.com>
Message-ID: <14138.38550.89759.752058@seattle.nightmare.com>

M.-A. Lemburg writes:

 > Wouldn't it be possible to move all the C variables passed to
 > eval_code() via the execution frame ? AFAIK, the frame is
 > generated on every call to eval_code() and thus could also
 > be generated *before* calling it.

I think this solves half of the problem.  The C stack is both a value
stack and an execution stack (i.e., it holds variables and return
addresses).  Getting rid of arguments (and a return value!) gets rid
of the need for the 'value stack' aspect.

In aiming for an enter-once, exit-once VM, the thorniest part is to
somehow allow python->c->python calls.  The second invocation could
never save a continuation because its execution context includes a C
frame.  This is a general problem, not specific to Python; I probably
should have thought about it a bit before posting...

 > Don't know if moving the C stack stuff into the frame objects
 > will get you the desired effect: what about other things having
 > state (e.g. connections or files), that are not even touched
 > by this mechanism ?

I don't think either of those cause 'real' problems (i.e., nothing
should crash that assumes an open file or socket), but there may be
other stateful things that might.  I don't think that refcounts would
be a problem - a saved continuation wouldn't be all that different
from an exception traceback.

-Sam

p.s. Here's a tiny VM experiment I wrote a while back, to explain
what I mean by 'stackless':

http://www.nightmare.com/stuff/machine.h
http://www.nightmare.com/stuff/machine.c

Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context
onto heap-allocated data structures rather than calling the VM
recursively.
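
The same idea can be sketched in a few lines of Python. This is not the machine.c code, only an illustration under assumed names (`Frame`, `run`, and a made-up six-instruction bytecode): a CALL pushes a heap-allocated frame onto a list and the single loop carries on, so the host language's own call stack never grows.

```python
class Frame:
    """Heap-allocated activation record: code, argument, pc, value stack."""
    def __init__(self, code, arg):
        self.code, self.arg, self.pc, self.stack = code, arg, 0, []


def run(functions, name, arg):
    frames = [Frame(functions[name], arg)]      # explicit execution stack
    while True:
        frame = frames[-1]
        op = frame.code[frame.pc]
        frame.pc += 1
        kind = op[0]
        if kind == "PUSH":
            frame.stack.append(op[1])
        elif kind == "ARG":
            frame.stack.append(frame.arg)
        elif kind == "ADD":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a + b)
        elif kind == "JUMP_IF_ZERO":
            if frame.stack.pop() == 0:
                frame.pc = op[1]
        elif kind == "CALL":
            # No recursive run() call: just push a new frame.
            frames.append(Frame(functions[op[1]], frame.stack.pop()))
        elif kind == "RET":
            value = frame.stack.pop()
            frames.pop()
            if not frames:
                return value
            frames[-1].stack.append(value)
        else:
            raise ValueError("unknown opcode %r" % (kind,))


# depth(n) = 0 if n == 0 else 1 + depth(n - 1)
FUNCTIONS = {
    "depth": [
        ("ARG",), ("JUMP_IF_ZERO", 9),
        ("PUSH", 1), ("ARG",), ("PUSH", -1), ("ADD",),
        ("CALL", "depth"), ("ADD",), ("RET",),
        ("PUSH", 0), ("RET",),
    ],
}
```

Because the pending calls live in the `frames` list, `run(FUNCTIONS, "depth", 50000)` goes far deeper than a recursive evaluator could under Python's default recursion limit, and the list itself is an ordinary object that could in principle be saved and resumed.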




From skip at mojam.com  Thu May 13 13:38:39 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 13 May 1999 07:38:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>

    Sam> I'm not sure if this has been brought up before in other forums,
    Sam> but has there been discussion of separating the Python and C
    Sam> invocation stacks, (i.e., removing recursive calls to the
    Sam> interpreter) to facilitate coroutines or first-class continuations?

I thought Guido was working on that for the mobile agent stuff he was
working on at CNRI.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From bwarsaw at cnri.reston.va.us  Thu May 13 17:10:52 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 13 May 1999 11:10:52 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> I thought Guido was working on that for the mobile agent stuff
    SM> he was working on at CNRI.

Nope, we decided that we could accomplish everything we needed without 
this.  We occasionally revisit this but Guido keeps insisting it's a
lot of work for not enough benefit :-)

-Barry



From guido at CNRI.Reston.VA.US  Thu May 13 17:19:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 13 May 1999 11:19:10 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT."
             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>  
            <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us>

Interesting topic!  While I'm on the road, a few short notes.

> I thought Guido was working on that for the mobile agent stuff he was
> working on at CNRI.

Indeed.  At least I planned on working on it.  I ended up abandoning
the idea because I expected it would be a lot of work and I never had
the time (same old story indeed).

Sam also hit the nail on the head: the hardest problem is what to do about
all the places where C calls back into Python.

I've come up with two partial solutions: (1) allow for a way to
arrange for a call to be made immediately after you return to the VM
from C; this would take care of apply() at least and a few other
"tail-recursive" cases; (2) invoke a new VM when C code needs a Python
result, requiring it to return.  The latter clearly breaks certain
uses of coroutines but could probably be made to work most of the
time.  Typical use of the 80-20 rule.

And I've just come up with a third solution: a variation on (1) where
you arrange *two* calls: one to Python and then one to C, with the
result of the first.  (And a bit saying whether you want the C call to 
be made even when an exception happened.)
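
A sketch of how (1) and the two-call variation might look, modelled in Python rather than C and with invented names (`CallNext`, `apply_like`, `call_then`, `dispatch`): instead of calling back into the interpreter, a builtin returns a request, and the main loop makes the call after the builtin has returned. The exception bit is left out.

```python
class CallNext:
    """Request from a builtin: 'call func(*args) for me after I return'."""
    def __init__(self, func, args, then=None):
        self.func, self.args, self.then = func, args, then


def apply_like(func, args):
    # Variant (1): hand the call back to the loop instead of making it here.
    return CallNext(func, args)


def call_then(func, args, then):
    # Two-call variant: call func, then pass its result on to `then`.
    return CallNext(func, args, then)


def dispatch(func, *args):
    pending = []                    # 'then' callbacks waiting for a result
    result = func(*args)
    while True:
        if isinstance(result, CallNext):
            if result.then is not None:
                pending.append(result.then)
            result = result.func(*result.args)
        elif pending:
            result = pending.pop()(result)
        else:
            return result


def countdown(n):
    # A tail call expressed as a request, so the host stack stays flat.
    return 0 if n == 0 else CallNext(countdown, (n - 1,))
```

For example, `dispatch(call_then, len, ("abc",), lambda n: n * 2)` first has the loop call `len`, then feeds the result to the second callable.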

In general, I still think it's a cool idea, but I also still think
that continuations are too complicated for most programmers.  (This
comes from the realization that they are too complicated for me!)
Corollary: even if we had continuations, I'm not sure if this would
take away the resistance against asyncore/asynchat.  Of course I could 
be wrong.

Different suggestion: it would be cool to work on completely
separating out the VM from the rest of Python, through some kind of
C-level API specification.  Two things should be possible with this
new architecture: (1) small platform ports could cut out the
interactive interpreter, the parser and compiler, and certain data
types such as long, complex and files; (2) there could be alternative
pluggable VMs with certain desirable properties such as
platform-specific optimization (Christian, are you listening? :-).

I think the most challenging part might be defining an API for passing 
in the set of supported object types and operations.  E.g. the
EXEC_STMT opcode needs to be implemented in a way that allows
"exec" to be absent from the language.  Perhaps an __exec__ function
(analogous to __import__) is the way to go.  The set of built-in
functions should also be passed in, so that e.g. one can easily leave
out open(), eval() and compile(), complex(), long(), float(), etc.
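
Current CPython already allows a rough version of "passing in the set of built-in functions" at the namespace level, which may help picture the idea. The sketch below assumes a modern Python 3; `run_with_builtins` and `SMALL` are illustrative names, and this is not a security sandbox.

```python
def run_with_builtins(source, allowed):
    """Execute source with only the given names visible as built-ins."""
    namespace = {"__builtins__": dict(allowed)}
    exec(source, namespace)
    return namespace


# A cut-down built-in set: no open(), eval(), compile(), complex(), float().
SMALL = {"len": len, "range": range}
```

Code run this way sees `len` and `range` but gets a `NameError` for `open`, which is the user-visible effect of leaving a built-in out.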

I think it would be ideal if no #ifdefs were needed to remove features
(at least not in the VM code proper).  Fortunately, the VM doesn't
really know about many object types -- frames, functions, methods,
classes, ints, strings, dictionaries, tuples, tracebacks, that may be
all it knows.  (Lists?)

Gotta run,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Thu May 13 21:50:44 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 13 May 1999 21:50:44 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>  <199905131519.LAA01097@eric.cnri.reston.va.us>
Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com>

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers.  (This
> comes from the realization that they are too complicated for me!)

in an earlier life, I used non-preemptive threads (that is,
explicit yields) and co-routines to do some really cool
stuff with very little code.  looks like a stack-less inter-
preter would make it trivial to implement that.

might just be nostalgia, but I think I would give an arm
or two to get that (not necessarily my own, though ;-)
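
For readers who have not seen the style, here is a small sketch of non-preemptive threads with explicit yields, using generators from later Python versions (they did not exist when this was written). `worker` and `run_round_robin` are made-up names.

```python
from collections import deque


def worker(name, steps, log):
    for i in range(steps):
        log.append("%s:%d" % (name, i))
        yield                      # explicit yield: let the other tasks run


def run_round_robin(tasks):
    """Give each task one turn at a time until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue               # finished: do not reschedule
        queue.append(task)
```

Two workers scheduled this way interleave their log entries one step at a time, with the switch points fully under the program's control.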

</F>




From rushing at nightmare.com  Fri May 14 04:00:09 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 19:00:09 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>

Guido van Rossum writes:
  > I've come up with two partial solutions: (1) allow for a way to
  > arrange for a call to be made immediately after you return to the
  > VM from C; this would take care of apply() at least and a few
  > other "tail-recursive" cases; (2) invoke a new VM when C code
  > needs a Python result, requiring it to return.  The latter clearly
  > breaks certain uses of coroutines but could probably be made to
  > work most of the time.  Typical use of the 80-20 rule.

I know this is disgusting, but could setjmp/longjmp 'automagically'
force a 'recursive call' to jump back into the top-level loop?  This
would put some serious restraint on what C called from Python could
do...

I think just about any Scheme implementation has to solve this same
problem... I'll dig through my collection of them for ideas.

  > In general, I still think it's a cool idea, but I also still think
  > that continuations are too complicated for most programmers.  (This
  > comes from the realization that they are too complicated for me!)
  > Corollary: even if we had continuations, I'm not sure if this would
  > take away the resistance against asyncore/asynchat.  Of course I could 
  > be wrong.

Theoretically, you could have a bit of code that looked just like
'normal' imperative code, that would actually be entering and exiting
the context for non-blocking i/o.  If it were done right, the same
exact code might even run under 'normal' threads.

Recently I've written an async server that needed to talk to several
other RPC servers, and a mysql server.  Pseudo-example, with
possibly-async calls in UPPERCASE:

  auth, archive = db.FETCH_USER_INFO (user)
  if verify_login(user,auth):
    rpc_server = self.archive_servers[archive]
    group_info = rpc_server.FETCH_GROUP_INFO (group)
    if valid (group_info):
      return rpc_server.FETCH_MESSAGE (message_number)
    else:
      ...
  else:
    ...

This code in CPS is a horrible, complicated mess, it takes something
like 8 callback methods, variables and exceptions have to be passed
around in 'continuation' objects.  It's hairy because there are three
levels of callback state.  Ugh.

If Python had closures, then it would be a *little* easier, but would
still make the average Pythoneer swoon.  Closures would let you put
the above logic all in one method, but the code would still be
'inside-out'.

  > Different suggestion: it would be cool to work on completely
  > separating out the VM from the rest of Python, through some kind of
  > C-level API specification.

I think this is a great idea.  I've been staring at python bytecodes a
bit lately thinking about how to do something like this, for some
subset of Python.

[...]

Ok, we've all seen the 'stick'.  I guess I should give an example of
the 'carrot': I think that a web server built on such a Python could
have the performance/scalability of thttpd, with the
ease-of-programming of Roxen.  As far as I know, there's nothing like
it out there.  Medusa would be put out to pasture. 8^)

-Sam




From guido at CNRI.Reston.VA.US  Fri May 14 14:03:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 08:03:31 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT."
             <14139.30970.644343.612721@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
            <14139.30970.644343.612721@seattle.nightmare.com> 
Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us>

> I know this is disgusting, but could setjmp/longjmp 'automagically'
> force a 'recursive call' to jump back into the top-level loop?  This
> would put some serious restraint on what C called from Python could
> do...

Forget about it.  setjmp/longjmp are invitations to problems.  I also
assume that they would interfere badly with C++.

> I think just about any Scheme implementation has to solve this same
> problem... I'll dig through my collection of them for ideas.

Anything that assumes knowledge about how the C compiler and/or the
CPU and OS lay out the stack is a no-no, because it means that the
first thing one has to do for a port to a new architecture is figure
out how the stack is laid out.  Another thread in this list is porting 
Python to microplatforms like PalmOS.  Typically the scheme Hackers
are not afraid to delve deep into the machine, but I refuse to do that
-- I think it's too risky.

>   > In general, I still think it's a cool idea, but I also still think
>   > that continuations are too complicated for most programmers.  (This
>   > comes from the realization that they are too complicated for me!)
>   > Corollary: even if we had continuations, I'm not sure if this would
>   > take away the resistance against asyncore/asynchat.  Of course I could 
>   > be wrong.
> 
> Theoretically, you could have a bit of code that looked just like
> 'normal' imperative code, that would actually be entering and exiting
> the context for non-blocking i/o.  If it were done right, the same
> exact code might even run under 'normal' threads.

Yes -- I remember in 92 or 93 I worked out a way to emulate coroutines
with regular threads.  (I think in cooperation with Steve Majewski.)
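
A minimal sketch of the general idea of emulating a coroutine with a
regular thread. This is not the scheme worked out back then; the class
name ThreadCoroutine and its emit/resume methods are invented here, and
it uses today's threading and queue modules. The semaphore makes sure
only one side runs at a time.

```python
import queue
import threading

class ThreadCoroutine:
    """Run func(emit) in its own thread, with strict hand-offs so that the
    caller and the coroutine never run at the same time."""

    _DONE = object()

    def __init__(self, func):
        self._to_caller = queue.Queue(maxsize=1)
        self._wakeup = threading.Semaphore(0)
        self._thread = threading.Thread(
            target=self._run, args=(func,), daemon=True)
        self._started = False
        self._finished = False

    def _run(self, func):
        self._wakeup.acquire()            # wait for the first resume()
        func(self._emit)
        self._to_caller.put(self._DONE)   # tell the caller we are finished

    def _emit(self, value):
        self._to_caller.put(value)        # hand a value to the caller...
        self._wakeup.acquire()            # ...and sleep until resumed again

    def resume(self):
        """Run the coroutine up to its next emit() and return that value,
        or None once the coroutine has finished."""
        if self._finished:
            return None
        if not self._started:
            self._started = True
            self._thread.start()
        self._wakeup.release()
        value = self._to_caller.get()
        if value is self._DONE:
            self._finished = True
            return None
        return value

def count_to_three(emit):
    for i in (1, 2, 3):
        emit(i)

co = ThreadCoroutine(count_to_three)
results = [co.resume() for _ in range(4)]
```

Each coroutine costs a whole OS thread here, which is the resource
problem that keeps coming up in this thread.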

> Recently I've written an async server that needed to talk to several
> other RPC servers, and a mysql server.  Pseudo-example, with
> possibly-async calls in UPPERCASE:
> 
>   auth, archive = db.FETCH_USER_INFO (user)
>   if verify_login(user,auth):
>     rpc_server = self.archive_servers[archive]
>     group_info = rpc_server.FETCH_GROUP_INFO (group)
>     if valid (group_info):
>       return rpc_server.FETCH_MESSAGE (message_number)
>     else:
>       ...
>    else:
>      ...
> 
> This code in CPS is a horrible, complicated mess, it takes something
> like 8 callback methods, variables and exceptions have to be passed
> around in 'continuation' objects.  It's hairy because there are three
> levels of callback state.  Ugh.

Agreed.

> If Python had closures, then it would be a *little* easier, but would
> still make the average Pythoneer swoon.  Closures would let you put
> the above logic all in one method, but the code would still be
> 'inside-out'.

I forget how this worked :-(

>   > Different suggestion: it would be cool to work on completely
>   > separating out the VM from the rest of Python, through some kind of
>   > C-level API specification.
> 
> I think this is a great idea.  I've been staring at python bytecodes a
> bit lately thinking about how to do something like this, for some
> subset of Python.
> 
> [...]
> 
> Ok, we've all seen the 'stick'.  I guess I should give an example of
> the 'carrot': I think that a web server built on such a Python could
> have the performance/scalability of thttpd, with the
> ease-of-programming of Roxen.  As far as I know, there's nothing like
> it out there.  Medusa would be put out to pasture. 8^)

I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
they do that Apache doesn't?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Fri May 14 15:16:13 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 14 May 1999 15:16:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>             <14139.30970.644343.612721@seattle.nightmare.com>  <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>

> I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
> they do that Apache doesn't?

http://www.roxen.com/

a lean and mean secure web server written in Pike
(http://pike.idonex.se/), from a company here in
Linköping.

</F>




From tismer at appliedbiometrics.com  Fri May 14 17:15:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 14 May 1999 17:15:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
	            <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com>


Guido van Rossum wrote:

[setjmp/longjmp -no-no]

> Forget about it.  setjmp/longjmp are invitations to problems.  I also
> assume that they would interfere badly with C++.
> 
> > I think just about any Scheme implementation has to solve this same
> > problem... I'll dig through my collection of them for ideas.
> 
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the scheme Hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.
...

I agree that this is generally bad, although it's a cakewalk
to do a stack swap on the few (x86-based :) platforms
I work with. This is much less than a thread switch.

But on the general issues:
Can the Python-calls-C and C-calls-Python problem just be solved
by turning the whole VM state into a data structure, including
a Python call stack which is independent? Maybe this has been
mentioned already.

This might give a little slowdown, but opens possibilities
like continuation-passing style, and context switches
between different interpreter states would be under direct
control.

Just a little dreaming: Not using threads, but just tiny
interpreter incarnations with local state, and a special
C call or better a new opcode which activates the next
state in some list (of course a Python list).
This would automagically produce ICON iterators (duck)
and coroutines (cover).
If I guess right, continuation passing could be done
by just shifting tiny tuples around. Well, Tim, help me :-)
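
The "list of tiny interpreter states" can be approximated, very roughly,
with generators standing in for the suspended states and an ordinary
loop standing in for the proposed opcode. This is a sketch under those
assumptions only: worker and run_round_robin are invented names, and
generators did not exist in Python when this was written.

```python
from collections import deque

def worker(name, steps, log):
    # Each yield suspends this state so the scheduler can pick the next one.
    for i in range(steps):
        log.append((name, i))
        yield

def run_round_robin(states):
    ready = deque(states)
    while ready:
        state = ready.popleft()
        try:
            next(state)        # "activate the next state in the list"
        except StopIteration:
            continue           # a finished state simply drops out
        ready.append(state)    # still alive: back to the end of the list

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

The two workers take turns until the shorter one finishes, with no
threads and no machine-stack tricks involved.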

[closures]

> > I think this is a great idea.  I've been staring at python bytecodes a
> > bit lately thinking about how to do something like this, for some
> > subset of Python.

Lumberjack? How is it going? [to Sam]

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Fri May 14 17:32:51 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 14 May 1999 11:32:51 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
	<001701be9e0b$f1bc4930$f29b12c2@pythonware.com>
Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

    FL> a lean and mean secure web server written in Pike
    FL> (http://pike.idonex.se/), from a company here in
    FL> Linköping.

Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
originally came on board to add Pike support, which has a syntax similar 
enough to C to be easily integrated.  I think I've had as much success 
convincing him to use Python as he's had convincing me to use Pike :-)

-Barry



From gstein at lyra.org  Fri May 14 23:54:02 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 14 May 1999 14:54:02 -0700
Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?)
References: <199905070507.BAA22545@python.org>
		<14138.28243.553816.166686@seattle.nightmare.com>
		<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
		<14138.60284.584739.711112@anthem.cnri.reston.va.us>
		<14139.30970.644343.612721@seattle.nightmare.com>
		<199905141203.IAA01808@eric.cnri.reston.va.us>
		<001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us>
Message-ID: <373C9B7A.3676A910@lyra.org>

Barry A. Warsaw wrote:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:
> 
>     FL> a lean and mean secure web server written in Pike
>     FL> (http://pike.idonex.se/), from a company here in
>     FL> Linköping.
> 
> Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
> originally came on board to add Pike support, which has a syntax similar
> enough to C to be easily integrated.  I think I've had as much success
> convincing him to use Python as he's had convincing me to use Pike :-)

<HistoricalNote>

Heh. Pike is an outgrowth of the MUD world's LPC programming language. A
guy named "Profezzorn" started a project (in '94?) to redevelop an LPC
compiler/interpreter ("driver") from scratch to avoid some licensing
constraints. The project grew into a generalized network handler, since
MUDs' typical designs are excellent for these tasks. From there, you get
the Roxen web server.

</HistoricalNote>

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sat May 15 01:36:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Fri, 14 May 1999 16:36:11 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <14140.44469.848840.740112@seattle.nightmare.com>

Guido van Rossum writes:
 > > If Python had closures, then it would be a *little* easier, but would
 > > still make the average Pythoneer swoon.  Closures would let you put
 > > the above logic all in one method, but the code would still be
 > > 'inside-out'.
 > 
 > I forget how this worked :-(

[with a faked-up lambda-ish syntax]

def thing (a):
  return do_async_job_1 (a,
    lambda (b):
      if (a>1):
        do_async_job_2a (b,
          lambda (c):
            [...]
          )
      else:
        do_async_job_2b (a,b,
          lambda (d,e,f):
            [...]
          )
     )

The call to do_async_job_1 passes 'a', and a callback, which is
specified 'in-line'.  You can follow the logic of something like this
more easily than if each lambda is spun off into a different
function/method.
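
A runnable approximation of the sketch above, assuming nested functions
in place of the faked multi-line lambda. The names do_async_job_1,
do_async_job_2a and do_async_job_2b come from the sketch; their bodies
here are invented stand-ins that call the callback immediately, and the
nested scopes they rely on are a later Python feature.

```python
# Stand-ins for the asynchronous jobs. A real server would register the
# callback with its select()/poll() loop instead of calling it at once.
def do_async_job_1(a, callback):
    return callback(a * 10)

def do_async_job_2a(b, callback):
    return callback(b + 1)

def do_async_job_2b(a, b, callback):
    return callback(a, b, a + b)

def thing(a):
    # Each nested function plays the role of one in-line lambda and can
    # see 'a' (and 'b') from the enclosing scope.
    def after_job_1(b):
        if a > 1:
            def after_job_2a(c):
                return ("2a", c)
            return do_async_job_2a(b, after_job_2a)
        else:
            def after_job_2b(d, e, f):
                return ("2b", d, e, f)
            return do_async_job_2b(a, b, after_job_2b)
    return do_async_job_1(a, after_job_1)
```

All the logic lives in one function, but it is still written
'inside-out': what happens next is always buried in a callback.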

 > > I think that a web server built on such a Python could have the
 > > performance/scalability of thttpd, with the ease-of-programming
 > > of Roxen.  As far as I know, there's nothing like it out there.
 > > Medusa would be put out to pasture. 8^)
 > 
 > I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
 > they do that Apache doesn't?

thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance
and scalability, but suffer from the same programmability problem as
Medusa (only worse, 'cause they're in C).

Roxen is written in Pike, a c-like language with gc, threads,
etc... Roxen is I think now the official 'GNU Web Server'.

Here's an interesting web-server comparison chart:

http://www.acme.com/software/thttpd/benchmarks.html

-Sam




From guido at CNRI.Reston.VA.US  Sat May 15 04:23:24 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 22:23:24 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT."
             <14140.44469.848840.740112@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
            <14140.44469.848840.740112@seattle.nightmare.com> 
Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us>

> def thing (a):
>   return do_async_job_1 (a,
>     lambda (b):
>       if (a>1):
>         do_async_job_2a (b,
>           lambda (c):
>             [...]
>           )
>       else:
>         do_async_job_2b (a,b,
>           lambda (d,e,f):
>             [...]
>           )
>      )
> 
> The call to do_async_job_1 passes 'a', and a callback, which is
> specified 'in-line'.  You can follow the logic of something like this
> more easily than if each lambda is spun off into a different
> function/method.

I agree that it is still ugly.

> http://www.acme.com/software/thttpd/benchmarks.html

I see.  Any pointers to a graph of thttpd market share?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat May 15 09:51:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <000701be9ea7$acab7f40$159e2299@tim>

[GvR]
> ...
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the scheme Hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.

The Icon language needs a bit of platform-specific context-switching
assembly code to support its full coroutine features, although its
bread-and-butter generators ("semi coroutines") don't need anything special.

The result is that Icon ports sometimes limp for a year before they support
full coroutines, waiting for someone wizardly enough to write the necessary
code.  This can, in fact, be quite difficult; e.g., on machines with HW
register windows (where "the stack" can be a complicated beast half buried
in hidden machine state, sometimes needing kernel privilege to uncover).

Not attractive.  Generators are, though <wink>.

threads-too-ly y'rs  - tim





From tim_one at email.msn.com  Sat May 15 09:51:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com>
Message-ID: <000801be9ea7$ae45f560$159e2299@tim>

[Christian Tismer]
> ...
> But on the general issues:
> Can the Python-calls-C and C-calls-Python problem just be solved
> by turning the whole VM state into a data structure, including
> a Python call stack which is independent? Maybe this has been
> mentioned already.

The problem is that when C calls Python, any notion of continuation has to
include C's state too, else resuming the continuation won't return into C
correctly.  The C code that *implements* Python could be reworked to support
this, but in the general case you've got some external C extension module
calling into Python, and then Python hasn't a clue about its caller's state.

I'm not a fan of continuations myself; coroutines can be implemented
faithfully via threads (I posted a rather complete set of Python classes for
that in the pre-DejaNews days, a bit more flexible than Icon's coroutines);
and:

> This would automagically produce ICON iterators (duck)
> and coroutines (cover).

Icon iterators/generators could be implemented today if anyone bothered
(Majewski essentially implemented them back around '93 already, but seemed
to lose interest when he realized it couldn't be extended to full
continuations, because of C/Python stack intertwingling).

> If I guess right, continuation passing could be done
> by just shifting tiny tuples around. Well, Tim, help me :-)

Python-calling-Python continuations should be easily doable in a "stackless"
Python; the key ideas were already covered in this thread, I think.  The
thing that makes generators so much easier is that they always return
directly to their caller, at the point of call; so no C frame can get stuck
in the middle even under today's implementation; it just requires not
deleting the generator's frame object, and adding an opcode to *resume* the
frame's execution the next time the generator is called.  Unlike as in Icon,
it wouldn't even need to be tied to a funky notion of goal-directed
evaluation.
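
What a kept-alive, resumable frame buys can be sketched with the
yield-based generators that Python acquired later. Nothing like this
existed at the time of writing, and the Node class and inorder function
are made up for illustration.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def inorder(node):
    # Each yield hands a value straight back to the caller and keeps this
    # frame alive; the next request resumes right after the yield.
    if node is not None:
        yield from inorder(node.left)
        yield node.value
        yield from inorder(node.right)

tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
values = list(inorder(tree))
```

The traversal keeps its position in the tree between values without any
explicit stack in user code.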

don't-try-to-traverse-a-tree-without-it-ly y'rs  - tim





From gstein at lyra.org  Sat May 15 10:17:15 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 15 May 1999 01:17:15 -0700
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
	            <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us>
Message-ID: <373D2D8B.390C523C@lyra.org>

Guido van Rossum wrote:
> ...
> > http://www.acme.com/software/thttpd/benchmarks.html
> 
> I see.  Any pointers to a graph of thttpd market share?

thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That
puts it at #6. However, it is interesting to note that 60k of those
sites are in the .uk domain. I can't figure out who is running it, but I
would guess that a large UK-based ISP is hosting a bunch of domains on
thttpd.

It is somewhat difficult to navigate the various reports (and it never
fails that the one you want is not present), but the data is from
Netcraft's survey at: http://www.netcraft.com/survey/

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sun May 16 13:10:18 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Sun, 16 May 1999 04:10:18 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv>
Message-ID: <14142.40867.103424.764346@seattle.nightmare.com>

Tim Peters writes:
 > I'm not a fan of continuations myself; coroutines can be
 > implemented faithfully via threads (I posted a rather complete set
 > of Python classes for that in the pre-DejaNews days, a bit more
 > flexible than Icon's coroutines); and:

Continuations are more powerful than coroutines, though I admit
they're a bit esoteric.  I programmed in Scheme for years without
seeing the need for them.  But when you need 'em, you *really* need
'em.  No way around it.

For my purposes (massively scalable single-process servers and
clients) threads don't cut it... for example I have a mailing-list
exploder that juggles up to 2048 simultaneous SMTP connections.  I
think it can go higher - I've tested select() on FreeBSD with 16,000
file descriptors.

[...]

BTW, I have actually made progress borrowing a bit of code from SCM.
It uses the stack-copying technique, along with setjmp/longjmp.  It's
too ugly and unportable to be a real candidate for inclusion in
Official Python.  [i.e., if it could be made to work it should be
considered a stopgap measure for the desperate].

I haven't tested it thoroughly, but I have successfully saved and
invoked (and reinvoked) a continuation.  Caveat: I have to turn off
Py_DECREF in order to keep it from crashing.

  | >>> import callcc
  | >>> saved = None
  | >>> def thing(n):
  | ...     if n == 2:
  | ...             global saved
  | ...             saved = callcc.new()
  | ...     print 'n==',n
  | ...     if n == 0:
  | ...             print 'Done!'
  | ...     else:
  | ...             thing (n-1)
  | ... 
  | >>> thing (5)
  | n== 5
  | n== 4
  | n== 3
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved
  | <Continuation object at 80d30d0>
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> 

I will probably not be able to work on this for a while (baby due any
day now), so anyone is welcome to dive right in.  I don't have much
experience wading through gdb tracking down reference bugs, I'm hoping
a brave soul will pick up where I left off. 8^)

http://www.nightmare.com/stuff/python-callcc.tar.gz
ftp://www.nightmare.com/stuff/python-callcc.tar.gz

-Sam




From tismer at appliedbiometrics.com  Sun May 16 17:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 17:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com>


rushing at nightmare.com wrote:

[...]

> BTW, I have actually made progress borrowing a bit of code from SCM.
> It uses the stack-copying technique, along with setjmp/longjmp.  It's
> too ugly and unportable to be a real candidate for inclusion in
> Official Python.  [i.e., if it could be made to work it should be
> considered a stopgap measure for the desperate].

I tried it and built it as a Win32 .pyd file, and it seems to
work, but...

> I haven't tested it thoroughly, but I have successfully saved and
> invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> Py_DECREF in order to keep it from crashing.

Indeed, and this seems to be a problem too hard to solve
without lots of work.
Since you keep a snapshot of the current machine stack,
it contains a number of object references which were
valid when the snapshot was taken, but many are most
probably invalid when you restart the continuation.
I guess incref-ing all currently live objects on
the interpreter stack would be the minimum, maybe more.

A tuple of necessary references could be used as an
attribute of a Continuation object. I will look
how difficult this is.

ciao - chris


-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Sun May 16 20:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 20:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com>
Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com>


Christian Tismer wrote:
> 
> rushing at nightmare.com wrote:
[...]

> > I haven't tested it thoroughly, but I have successfully saved and
> > invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> > Py_DECREF in order to keep it from crashing.

It is possible, but a little hard.
To take a working snapshot of the current thread's
stack, one needs not only the stack snapshot which 
continue.c provides, but also a restorable copy of
all frame objects involved so far.
A copy of the current frame chain must be built, with
proper reference counting of all involved elements.
And this is the crux: The current stack pointer of the
VM is not present in the frame objects, but hangs
around somewhere on the machine stack.
Two solutions:

1) modify PyFrameObject by adding a field which holds
   the stack pointer, when a function is called. 
   I don't like to change the VM in any way for this.
2) use the lasti field which holds the last VM instruction
   offset. Then scan the opcodes of the code object
   and calculate the current stack level. This is possible
   since Guido's code generator creates code with the stack
   level lexically bound to the code offset.
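
Option 2 can be illustrated on a toy instruction set. This is a sketch
only: the opcode names and their stack effects below are hypothetical,
not the real bytecode, and it handles straight-line code with no jumps.

```python
# Hypothetical mini instruction set: name -> net change in stack depth.
STACK_EFFECT = {
    "LOAD": +1,         # push one value
    "BINARY_ADD": -1,   # pop two values, push one
    "CALL_1": -1,       # pop a function and one argument, push the result
    "POP": -1,          # discard the top value
}

def stack_level_before(code, lasti):
    """Depth of the value stack just before the instruction at index lasti
    runs, i.e. the values still live while that instruction (for example a
    call) is in progress. Only valid for code without jumps."""
    depth = 0
    for opname in code[:lasti]:
        depth += STACK_EFFECT[opname]
    return depth

# f(a + b), result discarded:
code = ["LOAD", "LOAD", "LOAD", "BINARY_ADD", "CALL_1", "POP"]
levels = [stack_level_before(code, i) for i in range(len(code))]
```

While the call at index 4 is in progress, two values (the function and
its argument) are on the stack; those are the ones a snapshot would have
to incref.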

Now we can incref all the referenced objects in the frame.
This must be done for the whole chain, which is copied and
relinked during that. This chain is then held as a
property of the continuation object.

To throw the continuation, the current frame chain must
be cleared, and the saved one is inserted, together with
the machine stack operation which Sam has already.

A little hefty, isn't it?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tim_one at email.msn.com  Mon May 17 07:42:59 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 17 May 1999 01:42:59 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <000f01bea028$1f75c360$fb9e2299@tim>

[Sam]
> Continuations are more powerful than coroutines, though I admit
> they're a bit esoteric.

"More powerful" is a tedious argument you should always avoid <wink>.

> I programmed in Scheme for years without seeing the need for them.
> But when you need 'em, you *really* need 'em.  No way around it.
>
> For my purposes (massively scalable single-process servers and
> clients) threads don't cut it... for example I have a mailing-list
> exploder that juggles up to 2048 simultaneous SMTP connections.  I
> think it can go higher - I've tested select() on FreeBSD with 16,000
> file descriptors.

The other point being that you want to avoid "inside out" logic, though,
right?  Earlier you posted a kind of ideal:

    Recently I've written an async server that needed to talk to several
    other RPC servers, and a mysql server.  Pseudo-example, with
    possibly-async calls in UPPERCASE:

      auth, archive = db.FETCH_USER_INFO (user)
      if verify_login(user,auth):
          rpc_server = self.archive_servers[archive]
          group_info = rpc_server.FETCH_GROUP_INFO (group)
          if valid (group_info):
              return rpc_server.FETCH_MESSAGE (message_number)
          else:
              ...
      else:
          ...

I assume you want to capture a continuation object in the UPPERCASE methods,
store it away somewhere, run off to your select/poll/whatever loop, and have
it invoke the stored continuation objects as the data they're waiting for
arrives.

If so, that's got to be the nicest use for continuations I've seen!  All
invisible to the end user.  I don't know how to fake it pleasantly without
threads, either, and understand that threads aren't appropriate for resource
reasons.  So I don't have a nice alternative.
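
One way to picture the scheme described above, as a minimal sketch and not
anything Sam's code actually does: Python generators stand in for the captured
continuations (an assumption; generators are a later addition to the language
and are weaker than true continuations, since a suspended generator can be
resumed only once from each suspension point).  The names fetch_user_info,
handle, run, and the canned replies table are all hypothetical, and a
dictionary lookup stands in for the select/poll loop.

```python
from collections import deque


def fetch_user_info(user):
    # Hypothetical "UPPERCASE" call: hand a request to the loop,
    # then get the reply sent back in when it is available.
    reply = yield ("db", user)
    return reply


def handle(user):
    # Reads top to bottom, like the pseudo-example above.
    auth, archive = yield from fetch_user_info(user)
    if auth == "ok":
        message = yield ("rpc", archive)
        return message
    return None


def run(tasks, replies):
    """Toy event loop: park each task on its request, resume it with the reply."""
    results = {}
    ready = deque((name, task, None) for name, task in tasks.items())
    while ready:
        name, task, value = ready.popleft()
        try:
            request = task.send(value)
        except StopIteration as stop:
            results[name] = stop.value
            continue
        # A real loop would wait in select()/poll() here; this one looks
        # the answer up in a table and requeues the task.
        ready.append((name, task, replies[request]))
    return results


replies = {
    ("db", "alice"): ("ok", "arch1"),
    ("db", "bob"): ("bad", "arch2"),
    ("rpc", "arch1"): "message 42",
}
results = run({"alice": handle("alice"), "bob": handle("bob")}, replies)
```

The handler never mentions the loop; all the parking and resuming happens in
run(), which is the "invisible to the end user" property.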

> ...
>   | >>> import callcc
>   | >>> saved = None
>   | >>> def thing(n):
>   | ...     if n == 2:
>   | ...             global saved
>   | ...             saved = callcc.new()
>   | ...     print 'n==',n
>   | ...     if n == 0:
>   | ...             print 'Done!'
>   | ...     else:
>   | ...             thing (n-1)
>   | ...
>   | >>> thing (5)
>   | n== 5
>   | n== 4
>   | n== 3
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved
>   | <Continuation object at 80d30d0>
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>>

Suppose the driver were in a script instead:

thing(5)           # line 1
print repr(saved)  # line 2
saved.throw(0)     # line 3
saved.throw(0)     # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)"
and we'd get an infinite output tail of:

<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
...

and never reach line 4.  Right?  That's the part that Guido hates <wink>.

takes-one-to-know-one-ly y'rs  - tim





From tismer at appliedbiometrics.com  Mon May 17 09:07:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 09:07:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <373FC02A.69F2D912@appliedbiometrics.com>


Tim Peters wrote:

[to Sam]

> The other point being that you want to avoid "inside out" logic, though,
> right?  Earlier you posted a kind of ideal:
> 
>     Recently I've written an async server that needed to talk to several
>     other RPC servers, and a mysql server.  Pseudo-example, with
>     possibly-async calls in UPPERCASE:
> 
>       auth, archive = db.FETCH_USER_INFO (user)
>       if verify_login(user,auth):
>           rpc_server = self.archive_servers[archive]
>           group_info = rpc_server.FETCH_GROUP_INFO (group)
>           if valid (group_info):
>               return rpc_server.FETCH_MESSAGE (message_number)
>           else:
>               ...
>       else:
>           ...
> 
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
> 
> If so, that's got to be the nicest use for continuations I've seen!  All
> invisible to the end user.  I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons.  So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it
last night, with proper refcounting, and it wasn't too easy
since I had to duplicate the Python frame chain.

...

> Suppose the driver were in a script instead:
> 
> thing(5)           # line 1
> print repr(saved)  # line 2
> saved.throw(0)     # line 3
> saved.throw(0)     # line 4
> 
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
> 
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!

This is at the moment exactly what happens, with the difference that
after some repetitions we GPF due to dangling references
to too often decref'ed objects. My incref'ing prepares for
just one re-incarnation and should prevent a second call.
But this will be solved, soon.

> and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yup. With a little counting, it was easy to survive:

def main():
    global a
    a=2
    thing (5)
    a=a-1
    if a:
        saved.throw (0)

Weird enough and needs a much better interface.
But finally I'm quite happy that it worked so smoothly
after just a couple of hours (well, about six :)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From rushing at nightmare.com  Mon May 17 11:46:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 02:46:29 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim>
References: <14142.40867.103424.764346@seattle.nightmare.com>
	<000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <14143.56604.21827.891993@seattle.nightmare.com>

Tim Peters writes:
 > [Sam]
 > > Continuations are more powerful than coroutines, though I admit
 > > they're a bit esoteric.
 > 
 > "More powerful" is a tedious argument you should always avoid <wink>.

More powerful in the sense that you can use continuations to build
lots of different control structures (coroutines, backtracking,
exceptions), but not vice versa.

Kinda like a better tool for blowing one's own foot off. 8^)

 > Suppose the driver were in a script instead:
 > 
 > thing(5)           # line 1
 > print repr(saved)  # line 2
 > saved.throw(0)     # line 3
 > saved.throw(0)     # line 4
 > 
 > Then the continuation would (eventually) "return to" the "print repr(saved)"
 > and we'd get an infinite output tail [...]
 > 
 > and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yes... the continuation object so far isn't very usable.  It needs a
driver of some kind around it.  In the Scheme world, there are two
common ways of using continuations - let/cc and call/cc.  [call/cc is what
is in the standard, its official name is call-with-current-continuation]

let/cc stores the continuation in a variable binding, while
introducing a new scope.  It requires a change to the underlying
language:

(+ 1
  (let/cc escape
    (...)
    (escape 34)))
=> 35

'escape' is a function that when called will 'resume' with whatever
follows the let/cc clause.  In this case it would continue with the
addition...

call/cc is a little trickier, but doesn't require any change to the
language...  instead of making a new binding directly, you pass in
a function that will receive the binding:

(+ 1
   (call/cc
     (lambda (escape)
       (...)
       (escape 34))))
=> 35

In words, it's much more frightening: "call/cc is a function, that
when called with a function as an argument, will pass that function an
argument that is a new function, which when called with a value will
resume the computation with that value as the result of the entire
expression"  Phew.

In Python, an example might look like this:

SAVED = None
def save_continuation (k):
  global SAVED
  SAVED = k

def thing():
  [...]
  value = callcc (lambda k: save_continuation(k))

# or more succinctly:
def thing():
  [...]
  value = callcc (save_continuation)

In order to do useful work like passing values back and forth between
coroutines, we have to have some way of returning a value from the
continuation when it is reinvoked.

I should emphasize that most folks will never see call/cc 'in the
raw', it will usually have some nice wrapper around to implement
whatever construct is needed.

-Sam




From arw at ifu.net  Mon May 17 20:06:18 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 14:06:18 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <37405A99.1DBAF399@ifu.net>

The illustrious Sam Rushing avers:
>Continuations are more powerful than coroutines, though I admit
>they're a bit esoteric.  I programmed in Scheme for years without
>seeing the need for them.  But when you need 'em, you *really* need
>'em.  No way around it.

Frankly, I think I thought I understood this once but now I know I
don't.
How're continuations more powerful than coroutines?
And why can't they be implemented using threads (and semaphores etc)?

...I'm not promising I'll understand the answer...
    -- Aaron Watters

===
I taught I taw a putty-cat!





From gmcm at hypernet.com  Mon May 17 21:18:43 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 14:18:43 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
Message-ID: <1285153546-166193857@hypernet.com>

The estimable Aaron Watters queries:
> The illustrious Sam Rushing avers:
> >Continuations are more powerful than coroutines, though I admit
> >they're a bit esoteric.  I programmed in Scheme for years without
> >seeing the need for them.  But when you need 'em, you *really* need
> >'em.  No way around it.
> 
> Frankly, I think I thought I understood this once but now I know I
> don't. How're continuations more powerful than coroutines? And why
> can't they be implemented using threads (and semaphores etc)?

I think Sam's (immediate <wink>) problem is that he can't afford 
threads - he may have hundreds to thousands of these suckers.

As a fuddy-duddy old imperative programmer, I'm inclined to think 
"state machine". But I'd guess that functional-ophiles probably see 
that as inelegant. (Safe guess - they see _anything_ that isn't 
functional as inelegant!).

crude-but-not-rude-ly y'rs

- Gordon



From jeremy at cnri.reston.va.us  Mon May 17 20:43:34 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 17 May 1999 14:43:34 -0400 (EDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
References: <37405A99.1DBAF399@ifu.net>
Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us>

>>>>> "AW" == Aaron Watters <arw at ifu.net> writes:

  AW> The illustrious Sam Rushing avers:
  >> Continuations are more powerful than coroutines, though I admit
  >> they're a bit esoteric.  I programmed in Scheme for years without
  >> seeing the need for them.  But when you need 'em, you *really*
  >> need 'em.  No way around it.

  AW> Frankly, I think I thought I understood this once but now I know
  AW> I don't.  How're continuations more powerful than coroutines?
  AW> And why can't they be implemented using threads (and semaphores
  AW> etc)?

I think I understood, too.  I'm hoping that someone will debug my
answer and enlighten us both.

A continuation is a mechanism for making control flow explicit.  A
continuation is a means of naming and manipulating "the rest of the
program."   In Scheme terms, the continuation is the function that the 
value of the current expression should be passed to.  The call/cc
mechanism lets you capture the current continuation and explicitly
call on it.  The most typical use of call/cc is non-local exits, but
it gives you incredible flexibility for implementing your control
flow.

I'm fuzzy on coroutines, as I've only seen them in "Structured
Programming" (which is as old as I am :-) and never actually used
them.  The basic idea is that when a coroutine calls another
coroutine, control is transferred to the second coroutine at the point
at which it last left off (by itself calling another coroutine or by
detaching, which returns control to the lexically enclosing scope).

It seems to me that coroutines are an example of the kind of control
structure that you could build with continuations.  It's not clear
that the reverse is true.

I have to admit that I'm a bit unclear on the motivation for all
this.  As Gordon said, the state machine approach seems like it would
be a good approach.

Jeremy



From klm at digicool.com  Mon May 17 21:08:57 1999
From: klm at digicool.com (Ken Manheimer)
Date: Mon, 17 May 1999 15:08:57 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com>

Jeremy Hylton:

> I have to admit that I'm a bit unclear on the motivation for all
> this.  As Gordon said, the state machine approach seems like it would
> be a good approach.

If i understand what you mean by state machine programming, it's pretty
inherently uncompartmented, all the combinations of state variables need
to be accounted for, so the number of states grows factorially on the
number of state vars, in general it's awkward.  The advantage of going
with what functional folks come up with, like continuations, is that it
tends to be well compartmented - functional.  (Come to think of it, i
suppose that compartmentalization as opposed to state is their mania.)

As abstract as i can be (because i hardly know what i'm talking about)
(but i have done some specifically finite state machine programming, and
did not enjoy it),

Ken
klm at digicool.com



From arw at ifu.net  Mon May 17 21:20:13 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 15:20:13 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com>
Message-ID: <37406BED.95AEB896@ifu.net>

The ineffable Gordon McMillan retorts:

> As a fuddy-duddy old imperative programmer, I'm inclined to think
> "state machine". But I'd guess that functional-ophiles probably see
> that as inelegant. (Safe guess - they see _anything_ that isn't
> functional as inelegant!).

As a fellow fuddy-duddy I'd agree except that if you write properly layered
software you have to unroll and reroll all those layers for every
transition of the multi-level state machine, and even though with proper
discipline it can be implemented without becoming hideous, it still adds
significant overhead compared to "stop right here and come back later"
which could be implemented using threads/coroutines(?)/continuations.
I think this is particularly true in Python with the relatively high
function call overhead.  Or maybe I'm out in left field doing cartwheels...

I guess the question of interest is why are threads insufficient?  I guess
they have system limitations on the number of threads or other limitations
that wouldn't be a problem with continuations?  If there aren't a *lot* of
situations where coroutines are vital, I'd be hesitant to do major
surgery.  But I'm a fuddy-duddy.

   -- Aaron Watters

===
I did! I did!





From tismer at appliedbiometrics.com  Mon May 17 22:03:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 22:03:01 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net>
Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com>


Aaron Watters wrote:
> 
> The ineffable Gordon McMillan retorts:
> 
> > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > "state machine". But I'd guess that functional-ophiles probably see
> > that as inelegant. (Safe guess - they see _anything_ that isn't
> > functional as inelegant!).
> 
> As a fellow fuddy-duddy I'd agree except that if you write properly layered
> software you have to unroll and reroll all those layers for every
> transition of the multi-level state machine, and even though with proper
> discipline it can be implemented without becoming hideous, it still adds
> significant overhead compared to "stop right here and come back later"
> which could be implemented using threads/coroutines(?)/continuations.

Coroutines are most elegant here, since (for a simple example)
they are a symmetric pair of functions which call each other.
There is neither the one-pulls, the other pushes asymmetry, nor
the need to maintain state and be controlled by a supervisor
function.

> I think this is particularly true in Python with the relatively high
> function call overhead.  Or maybe I'm out in left field doing cartwheels...
> I guess the question of interest is why are threads insufficient?  I guess
> they have system limitations on the number of threads or other limitations
> that wouldn't be a problem with continuations?  If there aren't a *lot* of
> situations where coroutines are vital, I'd be hesitant to do major
> surgery.

For me (as always) most interesting is the possible speed of
coroutines. They involve no threads overhead, no locking,
no nothing. Python supports it better than expected. If the
stack level of two code objects is the same at a switching point,
the whole switch is nothing more than swapping two frame objects,
and we're done. This might be even cheaper than general call/cc,
like a function call. Sam's prototype works already, with no change to
the interpreter (but knowledge of Python frames, and a .dll of course).

I think we'll continue a while.

continuously - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From gmcm at hypernet.com  Tue May 18 00:17:25 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 17:17:25 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com>
Message-ID: <1285142823-166838954@hypernet.com>

Co-Christian-routines Tismer continues:

> Aaron Watters wrote:
> > 
> > The ineffable Gordon McMillan retorts:
> > 
> > > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > > "state machine". But I'd guess that functional-ophiles probably see
> > > that as inelegant. (Safe guess - they see _anything_ that isn't
> > > functional as inelegant!).
> > 
> > As a fellow fuddy-duddy I'd agree except that if you write properly layered
> > software you have to unroll and reroll all those layers for every
> > transition of the multi-level state machine, and even though with proper
> > discipline it can be implemented without becoming hideous, it still adds
> > significant overhead compared to "stop right here and come back later"
> > which could be implemented using threads/coroutines(?)/continuations.
> 
> Coroutines are most elegant here, since (for a simple example)
> they are a symmetric pair of functions which call each other.
> There is neither the one-pulls, the other pushes asymmetry, nor the
> need to maintain state and be controlled by a supervisor function.

Well, the state maintains you, instead of the other way 'round. (Any 
other ex-Big-Blue-ers out there that used to play these games with 
checkpoint and SyncSort?).

I won't argue elegance. Just a couple points:

- there's an art to writing state machines which is largely 
unrecognized (most of them are unnecessarily horrid).

- a multiplexed solution (vs a threaded solution) requires that 
something be inside out. In one case it's your code, in the other, 
your understanding of the problem. Neither is trivial.

Not to be discouraging - as long as your solution doesn't involve 
using regexps on bytecode <wink>, I say go for it!

- Gordon



From guido at CNRI.Reston.VA.US  Tue May 18 06:03:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT."
             <14143.56604.21827.891993@seattle.nightmare.com> 
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
            <14143.56604.21827.891993@seattle.nightmare.com> 
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of
what you can do with them so far don't clarify the matter at all.

Perhaps it would help to explain what a continuation actually does
with the run-time environment, instead of giving examples of how to
use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k
connection which makes my ordinary typing habits in Emacs very
painful).

1. All program state is somehow contained in a single execution stack.
This includes globals (which are simply name bindings in the bottom
stack frame).  It also includes a code pointer for each stack frame
indicating where the function corresponding to that stack frame is
executing (this is the return address if there is a newer stack frame, 
or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the
entire execution stack.  This can probably be done lazily.  There are
probably lots of details.  I also expect that Scheme's semantic model
is different than Python here -- e.g. does it matter whether deep or
shallow copies are made?  I.e. are there mutable *objects* in Scheme?
(I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the
execution stack the current execution state; I presume there's also a
way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping
between two (or more) continuations.

5. Other control constructs can be done by various manipulations of
continuations.  I presume that in many situations the saved
continuation becomes the main control locus permanently, and the
(previously) current stack is simply garbage-collected.  Of course the 
lazy copy makes this efficient.
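
The model in points 1 to 5 can be made concrete with a toy, assuming a frame
is nothing more than a name, a code pointer, a dictionary of name bindings,
and a link to its caller.  Frame and capture are hypothetical names with no
relation to CPython's real frame objects; the sketch only illustrates the deep
versus shallow question raised in point 2.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """Toy stand-in for an interpreter frame (not CPython's frame object)."""
    name: str
    pc: int = 0                                   # next instruction to run
    local_vars: dict = field(default_factory=dict)
    back: Optional["Frame"] = None                # the caller's frame


def capture(top, deep=False):
    """Copy the whole chain of frames starting at `top` (eagerly, not lazily)."""
    frames = []
    frame = top
    while frame is not None:
        frames.append(frame)
        frame = frame.back
    new_back = None
    for frame in reversed(frames):                # rebuild from the bottom up
        if deep:
            local_vars = copy.deepcopy(frame.local_vars)
        else:
            local_vars = dict(frame.local_vars)   # copy bindings, share objects
        new_back = Frame(frame.name, frame.pc, local_vars, new_back)
    return new_back


shared = []                                       # mutable object held by a local
main = Frame("main", pc=3, local_vars={"a": 2, "log": shared})
thing = Frame("thing", pc=7, local_vars={"n": 2}, back=main)

shallow = capture(thing)
deep = capture(thing, deep=True)

# The running program moves on after the capture.
main.local_vars["a"] = 1
shared.append("changed")
```

With the shallow copy, the saved chain does not see the rebinding of "a" but
does see the mutation of the list; with the deep copy it sees neither.  Which
of these a Python continuation should do is exactly the open question.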



If this all is close enough to the truth, I think that continuations
involving C stack frames are definitely out -- as Tim Peters
mentioned, you don't know what the stuff on the C stack of extensions
refers to.  (My guess would be that Scheme implementations assume that
any pointers on the C stack point to Scheme objects, so that C stack
frames can be copied and conservative GC can be used -- this will
never happen in Python.)

Continuations involving only Python stack frames might be supported,
if we can agree on the sharing / copying semantics.  This is where
I don't know enough (see questions at #2 above).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Tue May 18 06:46:12 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient?  I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?

Sam is mucking with thousands of simultaneous I/O-bound socket connections,
and makes a good case that threads simply don't fly here (each one consumes
a stack, kernel resources, etc).  It's unclear (to me) that thousands of
continuations would be *much* better, though, by the time Christian gets
done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery.  But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the
docs.  They're very well written and describe the problem space exquisitely.
I don't have any problems like that I need to solve, but it's interesting to
ponder!

alas-no-time-for-it-now-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here?  I hope you see the same behavior
without the "global a"; e.g., this Scheme:

(define -cont- #f)

(define thing
  (lambda (n)
    (if (= n 2) (call/cc (lambda (k) (set! -cont- k))))
    (display "n == ") (display n) (newline)
    (if (= n 0)
	(begin (display "Done!") (newline))
	(thing (- n 1)))))

(define main
  (lambda ()
    (let ((a 2))
      (thing 5)
      (display "a is ") (display a) (newline)
      (set! a (- a 1))
      (if (> a 0)
	  (-cont- #f)))))

(main)

prints:

n == 5
n == 4
n == 3
n == 2
n == 1
n == 0
Done!
a is 2
n == 2
n == 1
n == 0
Done!
a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to
2 each time?

> Weird enough

Par for the continuation course!  They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads <wink>.

> But finally I'm quite happy that it worked so smoothly
> after just a couple of hours (well, about six :)

Yup!  Playing with Python internals is a treat.

to-be-continued-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:57 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:57 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <000401bea0e9$51e467e0$829e2299@tim>

[Sam]
>>> Continuations are more powerful than coroutines, though I admit
>>> they're a bit esoteric.

[Tim]
>> "More powerful" is a tedious argument you should always avoid <wink>.

[Sam]
> More powerful in the sense that you can use continuations to build
> lots of different control structures (coroutines, backtracking,
> exceptions), but not vice versa.

"More powerful" is a tedious argument you should always avoid <frown -- I'm
not touching this, but you can fight it out now with Aaron et alia <wink>>.

>> Then the continuation would (eventually) "return to" the
>> "print repr(saved)" and we'd get an infinite output tail [...]
>> and never reach line 4.  Right?

> Yes... the continuation object so far isn't very usable.

But it's proper behavior for a continuation all the same!  So this aspect
shouldn't be "fixed".

> ...
> let/cc stores the continuation in a variable binding, while
> introducing a new scope.  It requires a change to the underlying
> language:

Isn't this often implemented via a macro, though, so that

   (let/cc name code)

"acts like"

    (call/cc (lambda (name) code))

?  I haven't used a Scheme with native let/cc, but poking around it appears
that the real intent is to support exception-style function exits with a
mechanism cheaper than 1st-class continuations:  twice saw the let/cc object
(the thingie bound to "name") defined as being invalid the instant after
"code" returns, so it's an "up the call stack" gimmick.  That doesn't sound
powerful enough for what you're after.

> [nice let/cc call/cc tutorialette]
> ...
> In order to do useful work like passing values back and forth between
> coroutines, we have to have some way of returning a value from the
> continuation when it is reinvoked.

Somehow, I suspect that's the least of our problems <0.5 wink>.  If
continuations are in Python's future, though, I agree with the need as
stated.

> I should emphasize that most folks will never see call/cc 'in the
> raw', it will usually have some nice wrapper around to implement
> whatever construct is needed.

Python already has well-developed exception and thread facilities, so it's
hard to make a case for continuations as a catch-all implementation
mechanism.  That may be the rub here:  while any number of things *can* be
implemented via continuations, I think very few *need* to be implemented
that way, and full-blown continuations aren't easy to implement efficiently
& portably.

The Icon language was particularly concerned with backtracking searches, and
came up with generators as another clearer/cheaper implementation technique.
When it went on to full-blown coroutines, it's hard to say whether
continuations would have been a better approach.  But the coroutine
implementation it has is sluggish and buggy and hard to port, so I doubt
they could have done noticeably worse.

Would full-blown coroutines be powerful enough for your needs?

assuming-the-practical-defn-of-"powerful-enough"-ly y'rs  - tim





From rushing at nightmare.com  Tue May 18 07:18:06 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:18:06 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim>
References: <14143.56604.21827.891993@seattle.nightmare.com>
	<000401bea0e9$51e467e0$829e2299@tim>
Message-ID: <14144.61765.308962.101884@seattle.nightmare.com>

Tim Peters writes:
 > Isn't this often implemented via a macro, though, so that
 > 
 >    (let/cc name code)
 > 
 > "acts like"
 > 
 >     (call/cc (lambda (name) code))

Yup, they're equivalent, in the sense that given one you can make a
macro to do the other.  call/cc is preferred because it doesn't
require a new binding construct.

 > ?  I haven't used a Scheme with native let/cc, but poking around it
 > appears that the real intent is to support exception-style function
 > exits with a mechanism cheaper than 1st-class continuations: twice
 > saw the let/cc object (the thingie bound to "name") defined as
 > being invalid the instant after "code" returns, so it's an "up the
 > call stack" gimmick.  That doesn't sound powerful enough for what
 > you're after.

Except that since the escape procedure is 'first-class' it can be
stored away and invoked (and reinvoked) later.  [that's all that
'first-class' means: a thing that can be stored in a variable,
returned from a function, used as an argument, etc..]

I've never seen a let/cc that wasn't full-blown, but it wouldn't
surprise me.

 > The Icon language was particularly concerned with backtracking
 > searches, and came up with generators as another clearer/cheaper
 > implementation technique.  When it went on to full-blown
 > coroutines, it's hard to say whether continuations would have been
 > a better approach.  But the coroutine implementation it has is
 > sluggish and buggy and hard to port, so I doubt they could have
 > done noticeably worse.

Many Scheme implementors either skip it, or only support non-escaping
call/cc (i.e., exceptions in Python).
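
That restricted form can be sketched in Python with an ordinary exception.
This is a minimal illustration under stated assumptions, not the callcc module
from earlier in the thread: callcc_escape_only and _Escape are hypothetical
names, and the escape function is only valid while the function that received
it is still running, so it cannot be stored away and re-entered later.

```python
class _Escape(Exception):
    """Carries the value handed to an escape function (hypothetical helper)."""

    def __init__(self, tag, value):
        super().__init__()
        self.tag = tag
        self.value = value


def callcc_escape_only(func):
    """Call func(escape); escape(v) makes this call return v immediately."""
    tag = object()                      # identifies this particular call

    def escape(value):
        raise _Escape(tag, value)

    try:
        return func(escape)
    except _Escape as exc:
        if exc.tag is not tag:          # belongs to an enclosing call
            raise
        return exc.value


def body(escape):
    escape(34)
    return "never reached"


# Mirrors the Scheme example: (+ 1 (call/cc (lambda (escape) ... (escape 34))))
result = 1 + callcc_escape_only(body)
```

Calling escape after callcc_escape_only has returned just raises an uncaught
exception, which is the "up the call stack" limitation Tim describes.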

 > Would full-blown coroutines be powerful enough for your needs?

Yes, I think they would be.  But I think with Python it's going to
be just about as hard, either way.

-Sam




From rushing at nightmare.com  Tue May 18 07:48:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:48:29 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <51325225@toto.iv>
Message-ID: <14144.63787.502454.111804@seattle.nightmare.com>

Aaron Watters writes:
 > Frankly, I think I thought I understood this once but now I know I
 > don't.

8^)  That's what I said when I backed into the idea via medusa a
couple of years ago.

 > How're continuations more powerful than coroutines?  And why can't
 > they be implemented using threads (and semaphores etc)?

My understanding of the original 'coroutine' (from Pascal?) was that
it allows two procedures to 'resume' each other.  The classic
coroutine example is the 'samefringe' problem: given two trees of
differing structure, are they equal in the sense that a traversal of
the leaves results in the same list?  Coroutines let you do this
efficiently, comparing leaf-by-leaf without storing the whole tree.
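
A minimal sketch of samefringe, assuming trees are nested tuples with
non-tuple leaves, and using generators as a stand-in for coroutines (an
assumption; generators are a later addition to Python).  The names fringe and
same_fringe are hypothetical.

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-tuple tree, left to right, one at a time."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from fringe(child)
    else:
        yield tree


def same_fringe(left, right):
    """Compare two trees leaf by leaf without building either leaf list."""
    missing = object()                  # marks "this tree ran out of leaves"
    pairs = zip_longest(fringe(left), fringe(right), fillvalue=missing)
    return all(a == b for a, b in pairs)
```

Each traversal is suspended between leaves, and all() stops at the first
mismatch, so neither tree is walked further than necessary.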

continuations can do coroutines, but can also be used to implement
backtracking, exceptions, threads... probably other stuff I've never
heard of or needed.

The reason that Scheme and ML are such big fans of continuations is
because they can be used to implement all these other features.  Look
at how much try/except and threads complicate other language
implementations.  It's like a super-tool-widget - if you make sure
it's in your toolbox, you can use it to build your circular saw and
lathe from scratch.

Unfortunately there aren't many good sites on the web with good
explanatory material.  The best reference I have is "Essentials of
Programming Languages".  For those that want to play with some of
these ideas using little VM's written in Python:

  http://www.nightmare.com/software.html#EOPL

-Sam




From rushing at nightmare.com  Tue May 18 07:56:37 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:56:37 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <13631823@toto.iv>
Message-ID: <14144.65355.400281.123856@seattle.nightmare.com>

Jeremy Hylton writes:
 > I have to admit that I'm a bit unclear on the motivation for all
 > this.  As Gordon said, the state machine approach seems like it would
 > be a good approach.

For simple problems, state machines are ideal.  Medusa uses state
machines that are built out of Python methods.  But past a certain
level of complexity, they get too hairy to understand.  A really good
example can be found in /usr/src/linux/net/ipv4.  8^)

-Sam




From rushing at nightmare.com  Tue May 18 09:05:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:05:20 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <60057226@toto.iv>
Message-ID: <14145.927.588572.113256@seattle.nightmare.com>

Guido van Rossum writes:
 > Perhaps it would help to explain what a continuation actually does
 > with the run-time environment, instead of giving examples of how to
 > use them and what the result is?

This helped me a lot, and is the angle used in "Essentials of
Programming Languages":

Usually when folks refer to a 'stack', they're referring to an
*implementation* of the stack data type: really an optimization that
assumes an upper bound on stack size, and that things will only be
pushed and popped in order.

If you were to implement a language's variable and execution stacks
with actual data structures (linked lists), then it's easy to see
what's needed: the head of the list represents the current state.  As
functions exit, they pop things off the list.
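
A rough sketch of that picture, assuming a hypothetical Frame class
with a 'back' link (loosely modelled on the f_back pointer of Python
frame objects).  The point is only that with a linked list, saving the
current state amounts to keeping a reference to the head:

```python
class Frame:
    """One activation record; `back` links to the caller's frame."""

    def __init__(self, name, back=None):
        self.name = name
        self.back = back


def push(head, name):
    """Enter a function: the new frame becomes the head of the list."""
    return Frame(name, head)


def pop(head):
    """Exit a function: the caller's frame becomes the head again."""
    return head.back


def chain(head):
    """List frame names from the current frame back to the root."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.back
    return names


head = push(push(push(None, "main"), "f"), "g")
saved = head             # capturing the state is one pointer copy
head = pop(pop(head))    # g and f return; only main is left on this path
head = saved             # the saved chain is intact and can be resumed
```

Popping never destroys anything that is still referenced, which is
what a contiguous, array-style stack cannot offer.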

The reason I brought this up (during a lull!) was that Python is
already paying all of the cost of heap-allocated frames, and it didn't
seem to me too much of a leap from there.

 > 1. All program state is somehow contained in a single execution stack.
Yup.

 > 2. A continuation does something equivalent to making a copy of the
 > entire execution stack.
Yup.
 > I.e. are there mutable *objects* in Scheme?
 > (I know there are mutable and immutable *name bindings* -- I think.)

Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!,
all the things that make it 'impure'.

I think shallow copies are what's expected.  In the examples I have,
the continuation is kept in a 'register', and call/cc merely packages
it up with a little function wrapper.  You are allowed to stomp all
over lexical variables with "set!".

 > 3. Calling a continuation probably makes the saved copy of the
 > execution stack the current execution state; I presume there's also a
 > way to pass an extra argument.
Yup.
 > 4. Coroutines (which I *do* understand) are probably done by swapping
 > between two (or more) continuations.
Yup.  Here's an example in Scheme:

http://www.nightmare.com/stuff/samefringe.scm

Somewhere I have an example of coroutines being used for parsing, very
elegant.  Something like one coroutine does lexing, and passes tokens
one-by-one to the next level, which passes parsed expressions to a
compiler, or whatever.  Kinda like pipes.
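
Something along these lines, as a sketch only: generators stand in
for the coroutines, and the toy grammar (sums of integers separated by
semicolons) plus the names lex, parse and evaluate are assumptions
made up for illustration:

```python
import re


def lex(text):
    """Lexer stage: hand out tokens one by one."""
    for token in re.findall(r"\d+|[+;]", text):
        yield token


def parse(tokens):
    """Parser stage: group tokens into one list of terms per statement."""
    terms = []
    for token in tokens:
        if token == ";":
            yield terms
            terms = []
        elif token != "+":
            terms.append(int(token))
    if terms:
        yield terms


def evaluate(statements):
    """Back-end stage: consume parsed statements as they arrive."""
    for terms in statements:
        yield sum(terms)


results = list(evaluate(parse(lex("1+2; 30+4+5; 7"))))
```

Each stage runs only when the next one asks for a value, so no stage
ever holds the whole input.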

 > 5. Other control constructs can be done by various manipulations of
 > continuations.  I presume that in many situations the saved
 > continuation becomes the main control locus permanently, and the
 > (previously) current stack is simply garbage-collected.  Of course
 > the lazy copy makes this efficient.

Yes... I think backtracking would be an example of this.  You're doing
a search on a large space (say a chess game).  After a certain point
you want to try a previous fork, to see if it's promising, but you
don't want to throw away your current work.  Save it, then unwind back
to the previous fork, try that option out... if it turns out to be
better then toss the original.

 > If this all is close enough to the truth, I think that
 > continuations involving C stack frames are definitely out -- as Tim
 > Peters mentioned, you don't know what the stuff on the C stack of
 > extensions refers to.  (My guess would be that Scheme
 > implementations assume that any pointers on the C stack point to
 > Scheme objects, so that C stack frames can be copied and
 > conservative GC can be used -- this will never happen in Python.)

I think you're probably right here - usually there are heavy
restrictions on what kind of data can pass through the C interface.
But I know of at least one Scheme (mzscheme/PLT) that uses
conservative gc and has c/c++ interfaces. [... dig dig ...]


From rushing at nightmare.com  Tue May 18 09:17:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:17:11 -0700 (PDT)
Subject: [Python-Dev] another good motivation
Message-ID: <14145.4917.164756.300678@seattle.nightmare.com>

"Escaping the event loop: an alternative control structure for multi-threaded GUIs"

http://cs.nyu.edu/phd_students/fuchs/
http://cs.nyu.edu/phd_students/fuchs/gui.ps

-Sam




From tismer at appliedbiometrics.com  Tue May 18 15:46:53 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 15:46:53 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <000901bea0e9$5aa2dec0$829e2299@tim>
Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Aaron Watters]
> > ...
> > I guess the question of interest is why are threads insufficient?  I
> > guess they have system limitations on the number of threads or other
> > limitations that wouldn't be a problem with continuations?
> 
> Sam is mucking with thousands of simultaneous I/O-bound socket connections,
> and makes a good case that threads simply don't fly here (each one consumes
> a stack, kernel resources, etc).  It's unclear (to me) that thousands of
> continuations would be *much* better, though, by the time Christian gets
> done making thousands of copies of the Python stack chain.

Well, what he needs here are coroutines and just a single frame
object for every minithread (I think this is a "fiber"?).
If these fibers later do deep function calls before they switch,
there will of course be more frames then.

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Tue May 18 16:35:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 16:35:30 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
	            <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <37417AB2.80920595@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> Sam (& others),
> 
> I thought I understood what continuations were, but the examples of
> what you can do with them so far don't clarify the matter at all.
> 
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?
> 
> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
> 
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).  It also includes a code pointer for each stack frame
> indicating where the function corresponding to that stack frame is
> executing (this is the return address if there is a newer stack frame,
> or the current instruction for the newest frame).

Right. For now, this information is on the C stack for each called
function, although almost completely available in the frame chain.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.  I also expect that Scheme's semantic model
> is different than Python here -- e.g. does it matter whether deep or
> shallow copies are made?  I.e. are there mutable *objects* in Scheme?
> (I know there are mutable and immutable *name bindings* -- I think.)

To make it lazy, a gatekeeper must be put on top of the two
split frames, which catches the event that one of them
returns. It appears to me that this is the same callcc.new()
object which catches this, splitting frames when hit by a return.

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.
> 
> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.

Right, which is just two or three assignments.

> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

Yes, great. It looks like switching continuations
is no more expensive than a single Python function call.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

This would mean avoiding the creation of incompatible continuations.
A continuation may not switch to a frame chain which was created
by a different VM incarnation, since this would later on
corrupt the machine stack. One way to ensure that would be
a thread-safe function in sys, similar to sys.exc_info(),
which gives an id for the current interpreter. Continuations
living somewhere in globals would be marked by the interpreter
which created them, and refuse to be thrown if they don't match.

The necessary interpreter support appears to be small:

Extend the PyFrame structure by two fields:
  - interpreter ID  (addr of some local variable would do)
  - stack pointer at current instruction.

Change the CALL_FUNCTION opcode to avoid calling eval recursively
in the case of a Python function/method; instead, save the current frame,
build the new one and start over.
RETURN will pop a frame and reload its local variables instead
of returning, as long as there is a frame to pop.
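
To make the idea concrete, here is a toy sketch, not ceval.c: a
made-up six-opcode VM whose CALL builds a new frame and whose RETURN
pops one, all inside a single loop.  The opcode names, the Frame
layout and the sum_to program are assumptions for illustration only:

```python
class Frame:
    """Per-call state kept on the heap instead of the host (C) stack."""

    def __init__(self, code, args, back):
        self.code = code          # list of instruction tuples
        self.locals = list(args)  # argument values
        self.stack = []           # local evaluation stack
        self.pc = 0               # index of the next instruction
        self.back = back          # caller's frame, or None at the root


def run(program, entry, args):
    """Run `entry` to completion without recursing in the host language."""
    frame = Frame(program[entry], args, None)
    while True:
        op, *operands = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(operands[0])
        elif op == "LOAD":
            frame.stack.append(frame.locals[operands[0]])
        elif op in ("ADD", "SUB"):
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right if op == "ADD" else left - right)
        elif op == "JZ":
            if frame.stack.pop() == 0:
                frame.pc = operands[0]
        elif op == "CALL":
            name, nargs = operands
            call_args = [frame.stack.pop() for _ in range(nargs)][::-1]
            frame = Frame(program[name], call_args, frame)  # no recursion
        elif op == "RETURN":
            value = frame.stack.pop()
            frame = frame.back        # pop a frame instead of returning
            if frame is None:
                return value
            frame.stack.append(value)
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# sum_to(n) = 0 if n == 0 else n + sum_to(n - 1)
PROGRAM = {
    "sum_to": [
        ("LOAD", 0), ("JZ", 9),
        ("LOAD", 0), ("LOAD", 0), ("PUSH", 1), ("SUB",),
        ("CALL", "sum_to", 1), ("ADD",), ("RETURN",),
        ("PUSH", 0), ("RETURN",),
    ],
}
```

With this layout run(PROGRAM, "sum_to", [5000]) nests 5000 toy calls
deep while the host stack stays flat; the depth is limited only by
heap memory.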

I'm unclear how exceptions should be handled. Are they currently
propagated up across different C calls other than ceval2
recursions?

ciao - chris




From jeremy at cnri.reston.va.us  Tue May 18 17:05:39 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 18 May 1999 11:05:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com>
References: <60057226@toto.iv>
	<14145.927.588572.113256@seattle.nightmare.com>
Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us>

>>>>> "SR" == rushing  <rushing at nightmare.com> writes:

  SR> Somewhere I have an example of coroutines being used for
  SR> parsing, very elegant.  Something like one coroutine does
  SR> lexing, and passes tokens one-by-one to the next level, which
  SR> passes parsed expressions to a compiler, or whatever.  Kinda
  SR> like pipes.

This is the first example that's used in Structured Programming (Dahl,
Dijkstra, and Hoare).  I'd be happy to loan a copy to any of the
Python-dev people who sit nearby.

Jeremy



From tismer at appliedbiometrics.com  Tue May 18 17:31:11 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 17:31:11 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <374187BF.36CC65E7@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

(Hüstel) Actually, I inserted the "global" later. It worked as well
with a local variable, but I didn't understand it. Still don't :-)

> Or does brute-force frame-copying cause the continuation to set "a" back to
> 2 each time?

No, it doesn't. Behavior is exactly the same with or without
global. I'm not sure whether this is a bug or a feature.
I *think* 'a' as a local has a slot in the frame, so it's
actually a different 'a' living in both copies. But this
would not have worked.
Can it be that before a function call, the interpreter
turns its locals into a dict, using fast_to_locals?
That would explain it.
This is not what I think it should be! Locals need to be
copied.

> > and needs a much better interface.
> 
> Ya, like screw 'em and use threads <wink>.

Never liked threads. These fibers are so neat since
they don't need threads, no locking, and they are
available on systems without threads.

> > But finally I'm quite happy that it worked so smoothly
> > after just a couple of hours (well, about six :)
> 
> Yup!  Playing with Python internals is a treat.
> 
> to-be-continued-ly y'rs  - tim

throw(42) - chris




From skip at mojam.com  Tue May 18 17:49:42 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 18 May 1999 11:49:42 -0400
Subject: [Python-Dev] Is there another way to solve the continuation problem?
Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>

Okay, from my feeble understanding of the problem it appears that
coroutines/continuations and threads are going to be problematic at best for 
Sam's needs.  Are there other "solutions"?  We know about state machines.
They have the problem that the number of states grows exponentially (?) as
the number of state variables increases.

Can exceptions be coerced into providing the necessary structure without
botching up the application too badly?  Seems that at some point where you
need to do some I/O, you could raise an exception whose second expression
contains the necessary state to get back to where you need to be once the
I/O is ready to go.  The controller that catches the exceptions would use
select or poll to prepare for the I/O then dispatch back to the handlers
using the information from exceptions.

class IOSetup(Exception):
    """raised by a handler that must wait for I/O before continuing"""

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(self, r, w, e):
        pass

    def remember(self, info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
        r, w, e = select([...], [...], [...])
        # using r,w,e, select a waiter to call
        func, place = waiters.choose_one(r, w, e)
        try:
            func(place)
        except IOSetup as info:
            # info.args is (function, target, fd category, selectable object)
            waiters.remember(info.args)


def spam_func(place):
    if place == "spam":
        # whatever I/O we needed to do is ready to go
        data = read(some_fd)
        process(data)
        # need to read some more from some_fd. args are:
        #    function, target, fd category (r, w), selectable object
        raise IOSetup(spam_func, "eggs", "r", some_fd)

    elif place == "eggs":
        # that next chunk is ready - get it and proceed...

    elif yadda, yadda, yadda...


One thread, some craftiness needed to construct things.  Seems like it might
isolate some of the statefulness to smaller functional units than a pure
state machine.  Clearly not as clean as continuations would be.  Totally
bogus?  Totally inadequate?  Maybe Sam already does things this way?
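
For the curious, a runnable sketch of the same idea in present-day
Python.  Everything beyond the IOSetup name is an assumption made for
illustration: the Waiters class, the make_reader handler, and the use
of socket.socketpair() to fake a connection.  Only read-readiness is
handled:

```python
import select
import socket


class IOSetup(Exception):
    """Raised by a handler that must wait for I/O before it can go on."""

    def __init__(self, func, place, mode, sock):
        super().__init__(func, place, mode, sock)
        self.func, self.place, self.mode, self.sock = func, place, mode, sock


class Waiters:
    """Remembers where each blocked handler wants to be resumed."""

    def __init__(self):
        self.readers = {}   # socket -> (func, place)

    def remember(self, info):
        if info.mode == "r":
            self.readers[info.sock] = (info.func, info.place)


def controller(initial):
    """Run handlers; park them on IOSetup and resume them via select()."""
    waiters = Waiters()
    pending = list(initial)
    while pending or waiters.readers:
        if not pending:
            readable, _, _ = select.select(list(waiters.readers), [], [])
            pending = [waiters.readers.pop(sock) for sock in readable]
        func, place = pending.pop(0)
        try:
            func(place)
        except IOSetup as info:
            waiters.remember(info)


def make_reader(sock, received):
    """Build a handler that reads two 5-byte chunks, one per wake-up."""
    def handler(place):
        if place == "start":
            raise IOSetup(handler, "first", "r", sock)
        elif place == "first":
            received.append(sock.recv(5))
            raise IOSetup(handler, "second", "r", sock)
        elif place == "second":
            received.append(sock.recv(5))   # returning normally means done
    return handler


a, b = socket.socketpair()
b.sendall(b"helloworld")
received = []
controller([(make_reader(a, received), "start")])
a.close()
b.close()
```

The 'place' strings are exactly the hand-rolled state that
continuations would make unnecessary: each one names a point the
handler could otherwise simply have been suspended at.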


Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Tue May 18 19:23:08 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this 
all behave correctly. Since I didn't change the interpreter,
the ceval.c incarnations still held pointers to the old frames.
The only effect which I achieved with frame copying was
that the refcounts were increased correctly.

I have to remove the hardware stack copying now.
Will try to create a non-recursive version of the interpreter.

ciao - chris




From MHammond at skippinet.com.au  Wed May 19 01:16:54 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Wed, 19 May 1999 09:16:54 +1000
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>
Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat>

> Sam's needs.  Are there other "solutions"?  We know about
> state machines.
> They have the problem that the number of states grows
> exponentially (?) as
> the number of state variables increases.

Well, I can give you my feeble understanding of "IO Completion Ports", the
technique Win32 provides to "solve" this problem.

My experience is limited to how we used these in a server product designed
to maintain thousands of long-term client connections each spooling large
chunks of data (MSOffice docs - yes, that large :-).  We too could
obviously not afford a thread per connection.  Searching through NT's
documentation, completion ports are the technique they recommend for
high-performance IO, and it appears to deliver.

NT has the concept of a completion port, which in many ways is like an
"inverted semaphore".  You create a completion port with a "max number of
threads" value.  Then, for every IO object you need to use (files, sockets,
pipes etc) you "attach" it to the completion port, along with an integer
key.  This key is (presumably) unique to the file, and usually a pointer to
some structure maintaining the state of the file (i.e., the connection).

The general programming model is that you have a small number of threads
(possibly 1), and a large number of io objects (eg files).  Each of these
threads is executing a state machine.  When IO is "ready" for a particular
file, one of the available threads is woken, and passed the "key"
associated with the file.  This key identifies the file, and more
importantly the state of that file.  The thread uses the state to perform
the next IO operation, then immediately go back to sleep.  When that IO
operation completes, some other thread is woken to handle that state
change.  What makes this work of course is that _all_ IO is asynch - not a
single IO call in this whole model can afford to block.  NT provides asynch
IO natively.
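
A rough simulation of that programming model in Python, for those who
think better in code.  This is not the Win32 API: queue.Queue stands
in for the completion port, a Connection object plays the role of the
key, and "issuing the next async read" is faked by posting the next
chunk straight back to the queue.  All names here are made up:

```python
import queue
import threading


class Connection:
    """The 'key': per-connection state handed to whichever thread wakes."""

    def __init__(self, name, chunks):
        self.name = name
        self.pending = list(chunks)   # data the fake network will deliver
        self.received = []
        self.state = "reading"


def worker(port):
    """Pool thread: sleep on the port, advance one state machine, repeat."""
    while True:
        item = port.get()
        if item is None:              # shutdown sentinel
            return
        conn, data = item
        conn.received.append(data)
        if conn.pending:
            # Issue the next "async read"; its completion is posted at once.
            port.put((conn, conn.pending.pop(0)))
        else:
            conn.state = "done"
        port.task_done()


def serve(connections, n_threads=2):
    """Drive every connection to completion with a small thread pool."""
    port = queue.Queue()
    threads = [threading.Thread(target=worker, args=(port,))
               for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for conn in connections:          # assumes each has at least one chunk
        port.put((conn, conn.pending.pop(0)))
    port.join()                       # wait until all completions are handled
    for _ in threads:
        port.put(None)
    for thread in threads:
        thread.join()


conns = [Connection("a", [b"1", b"2", b"3"]), Connection("b", [b"x"])]
serve(conns)
```

Because each connection has at most one completion outstanding, no two
threads ever touch the same Connection at once, so the per-connection
state needs no lock.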

This sounds very similar to what Medusa does internally, although the NT
model provides a "thread pooling" scheme built-in.  Although our server
performed very well with a single thread and hundreds of high-volume
connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state
machine I described above implemented in C.  Most of the work is
responsible for spooling the client request data (possibly 100s of kbs)
before handing that data off to the real server.  When the C code
transitions the client through the state of "send/get from the real
server", we actually set a different completion port.  This other
completion port wakes a thread written in Python.  So our architecture
consists of a C implemented thread-pool managing client connections, and a
different Python implemented thread pool that does the real work for each
of these client connections. (The Python side of the world is bound by the
server we are talking to, so Python performance doesn't matter as much - C
wouldn't buy enough.)

This means that our state machines are not that complex.  Each "thread
pool" is managing its own, fairly simple state.  NT automatically allows
you to associate state with the IO object, and as we have multiple thread
pools, each one is simple - the one spooling client data is simple, the one
doing the actual server work is simple.  If we had to have a single,
monolithic state machine managing all aspects of the client spooling, _and_
the server work, it would be horrid.

This is all in a shrink-wrapped relatively cheap "Document Management"
product being targeted (successfully, it appears) at huge NT/Exchange
based sites.  Australia's largest Telco are implementing it, and indeed the
company has VC from Intel!  Lots of support from MS, as it helps compete
with Domino.  Not bad for a little startup - now they are wondering what to
do with this Python-thingy they now have in their product that no one else
has ever heard of; but they are planning on keeping it for now :-)
[Funnily, when they started, they didn't think they even _needed_ a server,
so I said "I'll just knock up a little one in Python", and we haven't looked
back :-]

Mark.




From tim_one at email.msn.com  Wed May 19 02:48:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme
and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.  Is Steven (Majewski) on this list?
He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links.  Lexical
closures with indefinite extent are common in Scheme, so much so that name
resolution is (at least conceptually) best viewed as distinct from execution
stacks.

Here's a key:  continuations are entirely about capturing control flow
state, and nothing about capturing binding or data state.  Indeed, mutating
bindings and/or non-local data are the ways distinct invocations of a
continuation communicate with each other, and for this reason true
functional languages generally don't support continuations of the call/cc
flavor.
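
A small sketch of that point, under loud assumptions: Python has no
call/cc, so the continuation is emulated by hand in
continuation-passing style, and the names demo, thing, rest and env
are invented here.  It mirrors the main()/thing()/saved.throw(0)
example from earlier in the thread:

```python
def demo():
    """Re-enter a saved continuation until a shared counter reaches zero."""
    trace = []
    env = {"a": 2}     # a mutable binding shared by every re-entry
    saved = {}

    def thing(n, k):
        saved["k"] = k          # capture "the rest of main after thing()"
        return k(n)

    def main(k):
        def rest(value):        # the continuation of the call to thing()
            trace.append((value, env["a"]))
            env["a"] -= 1
            if env["a"]:
                return saved["k"](0)    # plays the role of saved.throw(0)
            return k("done")
        return thing(5, rest)

    return main(lambda value: value), trace


result, trace = demo()
```

Invoking the saved continuation a second time re-runs the control
flow after thing(), but 'a' is not reset to 2.  The mutated binding is
how the two passes communicate, and it is what lets the loop end.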

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current
frame's continuation object -- continuations are used internally for
"regular calls" too.  When a function returns, it passes control thru its
continuation object.  That process restores-- from the continuation
object --what the caller needs to know (in concept:  a pointer to *its*
continuation object, its PC, its name-resolution chain pointer, and its
local eval stack).

Another key point:  a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.

The point of the above is to get across that for Scheme-calling-Scheme,
creating a continuation object copies just a small, fixed number of pointers
(the current continuation pointer, the current name-resolution chain
pointer, the PC), plus the local eval stack.  This is for a "stackless"
interpreter that heap-allocates name-mapping and execution-frame and
continuation objects.  Half the literature is devoted to optimizing one or
more of those away in special cases (e.g., for continuations provably
"up-level", using a stack + setjmp/longjmp instead).

> I also expect that Scheme's semantic model is different than Python
> here -- e.g. does it matter whether deep or shallow copies are made?
> I.e. are there mutable *objects* in Scheme? (I know there are mutable
> and immutable *name bindings* -- I think.)

Same as Python here; Scheme isn't a functional language; has mutable
bindings and mutable objects; any copies needed should be shallow, since
it's "a feature" that invoking a continuation doesn't restore bindings or
object values (see above re communication).

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.

Right, except "stack" is the wrong mental model in the presence of
continuations; it's a general rooted graph (A calls B, B saves a
continuation pointing back to A, B goes on to call A, A saves a continuation
pointing back to B, etc).  If the explicitly saved continuations are never
*invoked*, control will eventually pop back to the root of the graph, so in
that sense there's *a* stack implicit at any given moment.

> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.
>
> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think;
other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to.  (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with
varying semantics, and a quick tour of the web suggests that cross-language
Schemes generally put severe restrictions on continuations across language
boundaries.  Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for <wink> -- but
fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
and "lambdaness" and syntax-extension facilities, I'm unsure they're going
to work out reasonably in Python practice; it's not enough that they can be
very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim





From tismer at appliedbiometrics.com  Wed May 19 03:10:15 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>


Tim Peters wrote:
...

> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics.  This is where
> > I don't know enough (see questions at #2 above).
> 
> I'd like to go back to examples of what they'd be used for <wink> -- but
> fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
> 
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim

I've put quite a few hours into a non-recursive ceval.c
already. Should I continue? At least this would be a little
improvement, even if the continuation thing is never born?

- chris




From rushing at nightmare.com  Wed May 19 04:52:04 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:
 > Can exceptions be coerced into providing the necessary structure
 > without botching up the application too badly?  Seems that at some
 > point where you need to do some I/O, you could raise an exception
 > whose second expression contains the necessary state to get back to
 > where you need to be once the I/O is ready to go.  The controller
 > that catches the exceptions would use select or poll to prepare for
 > the I/O then dispatch back to the handlers using the information
 > from exceptions.

 > [... code ...]

Well, you just re-invented the 'Reactor' pattern! 8^)

http://www.cs.wustl.edu/~schmidt/patterns-ace.html

 > One thread, some craftiness needed to construct things.  Seems like
 > it might isolate some of the statefulness to smaller functional
 > units than a pure state machine.  Clearly not as clean as
 > continuations would be.  Totally bogus?  Totally inadequate?  Maybe
 > Sam already does things this way?

What you just described is what Medusa does (well, actually, 'Python'
does it now, because the two core libraries that implement this are
now in the library - asyncore.py and asynchat.py).  asyncore doesn't
really use exceptions exactly that way, and asynchat allows you to add 
another layer of processing (basically, dividing the input into
logical 'lines' or 'records' depending on a 'line terminator').
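
The extra layer described here can be sketched without any sockets.  This
is only an illustration of the idea, not asynchat's actual code: the class
name LineCollector and its methods feed and found_line are made up for the
example.

```python
class LineCollector:
    """Toy stand-in for the buffering layer: split a byte stream into records."""

    def __init__(self, terminator=b"\r\n"):
        self.terminator = terminator
        self.buffer = b""
        self.lines = []

    def feed(self, data):
        # Called whenever the event loop has read some bytes; a chunk may
        # end anywhere, including in the middle of a terminator.
        self.buffer += data
        while True:
            head, sep, tail = self.buffer.partition(self.terminator)
            if not sep:
                break              # no complete record yet; keep buffering
            self.buffer = tail
            self.found_line(head)

    def found_line(self, line):
        # A real handler would advance its protocol state machine here.
        self.lines.append(line)


collector = LineCollector()
for chunk in (b"HELO exam", b"ple.com\r", b"\nQUIT\r\nleft"):
    collector.feed(chunk)
```

The handler only ever sees whole records, however the network happened to
chop up the bytes; the incomplete tail stays in the buffer.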

The same technique is at the heart of many well-known network servers,
including INND, BIND, X11, Squid, etc..  It's really just a state
machine underneath (with python functions or methods implementing the
'states'), and that works as long as things don't get too complex.  Python
simplifies things enough to allow one to 'push the difficulty
envelope' a bit further than one could reasonably tolerate in C.  For
example, Squid implements async HTTP (server and client, because it's
a proxy) - but stops short of trying to implement async FTP.  Medusa
implements async FTP, but it's the largest file in the Medusa
distribution, weighing in at a hefty 32KB.

The hard part comes when you want to plug different pieces and
protocols together.  For example, building a simple HTTP or FTP server
is relatively easy, but building an HTTP server *that proxied to an
FTP server* is much more difficult.  I've done these kinds of things,
viewing each as a challenge; but past a certain point it boggles.

The paper I posted about earlier by Matthew Fuchs has a really good
explanation of this, but in the context of GUI event loops... I think
it ties in neatly with this discussion because at the heart of any X11
app is a little guy manipulating a file descriptor.

-Sam




From tim_one at email.msn.com  Wed May 19 07:41:39 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later.  [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations
valid only during let/cc's dynamic extent, so that, sure, you could store
them away, but trying to invoke one later could be an error.  It's in that
sense I meant they weren't "first class".

Other flavors of Scheme appear to call this concept "weak continuation", and
use a different verb to invoke it (like call-with-escaping-continuation, or
call/ec).  Suspect the let/cc oddballs I found were simply confused
implementations (there are a lot of amateur Scheme implementations out
there!).

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be.  But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because
they already understand them -- Jeremy can even reach across the hall and
hand Guido a helpful book <wink>.  So pondering coroutines increases the
number of brain cells willing to think about the implementation.

continuation-examples-leave-people-still-going-"huh?"-after-an-
    hour-of-explanation-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here?  I hope you see the same behavior
>> without the "global a";
[which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had copies to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right!  Now you're closer to the real solution <wink>; i.e., copying
wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
when we entered main originally a set of bindings was created for its
locals, and it is that very same set of bindings to which the continuation
returns.  So the continuation *should* reuse them -- making a copy of the
locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info
(bindings follow a chain of static links, independent of the current "call
stack"), so there's no temptation to run off copying bindings too.
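
The difference between reusing and copying the bindings can be shown with a
toy model of the main() example above.  This is a sketch under loud
assumptions: no real continuations are involved, the locals are modelled as
a plain dict, and run_main, copy_locals and limit are names invented for
the illustration.

```python
def run_main(copy_locals, limit=10):
    """Toy model of main(); returns how often the continuation was thrown."""
    env = {"a": 2}                                  # main's locals after a=2
    captured = dict(env) if copy_locals else env    # taken inside thing(5)
    throws = 0
    while True:
        # Everything after thing(5) returns: a=a-1; if a: saved.throw(0)
        env["a"] = env["a"] - 1
        if not env["a"] or throws == limit:
            return throws
        throws += 1
        # Resuming: either the very same bindings, or a copy of the snapshot.
        env = dict(captured) if copy_locals else captured


shared = run_main(copy_locals=False)
copied = run_main(copy_locals=True)
```

With shared bindings the countdown on "a" survives the throw and the loop
ends after one throw.  With a copied snapshot "a" is reset to 2 on every
resume, so only the artificial limit stops it.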

elegant-and-baffling-for-the-price-of-one<wink>-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 <wink>?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim





From arw at ifu.net  Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (e.g., list.sort calls back to
myclass.__cmp__, which decides to do a call/cc)?  One of the really
big advantages of Python in my book is the relative simplicity of
embedding and extensions, and this is generally one of the failings of
Lisp implementations.  I understand lots of Scheme implementations
purport to be extensible and embeddable, but in practice you can't do it
with *existing* code -- there is always a show stopper involving having to
change the way some Oracle library which you don't have the source for
does memory management or something... I've known several grad students
who have been bitten by this...  I think having to unroll the C stack
safely might be one problem area.

With, eg, a netscape nsapi embedding you can actually get into netscape
code calls my code calls netscape code calls my code... suspends in a
continuation?  How would that work?  [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least
it's better understood, I think.  Just kvetching.  Sorry.
   -- Aaron Watters

ps: Of course there are valid reasons and excellent advantages
  to having continuations, but it's also interesting to consider the
  possible cost.  There ain't no free lunch.





From tismer at appliedbiometrics.com  Wed May 19 21:30:18 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 21:30:18 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
Message-ID: <3743114A.220FFA0B@appliedbiometrics.com>


Tim Peters wrote:
...
> [Christian]
> > Actually, the frame-copying was not enough to make this
> > all behave correctly. Since I didn't change the interpreter,
> > the ceval.c incarnations still had copies to the old frames.
> > The only effect which I achieved with frame copying was
> > that the refcounts were increased correctly.
> 
> All right!  Now you're closer to the real solution <wink>; i.e., copying
> wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
> when we entered main originally a set of bindings was created for its
> locals, and it is that very same set of bindings to which the continuation
> returns.  So the continuation *should* reuse them -- making a copy of the
> locals is semantically hosed.

I tried the most simple thing, and this seemed to be duplicating
the current state of the machine. The frame holds the stack,
and references to all objects.
By chance, the locals are not in a dict, but unpacked into
the frame. (Sometimes I agree with Guido, that optimization
is considered harmful :-)

> This is clearer in Scheme because its "stack" holds *only* control-flow info
> (bindings follow a chain of static links, independent of the current "call
> stack"), so there's no temptation to run off copying bindings too.

The Python stack, besides its intermingledness with the machine
stack, is basically its chain of frames. The value stack pointer
still hides in the machine stack, but that's easy to change.
So the real Scheme-like part is this chain, methinks, with
the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase
the refcount of all objects in a frame. Would it be correct
to undo the effect of fast locals before splitting, and redoing
it on activation?

Or do I need to rethink the whole structure? What should
be natural for Python, if at all?

ciao - chris




From jeremy at cnri.reston.va.us  Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
	<3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  [Tim Peters]
  >> This is clearer in Scheme because its "stack" holds *only*
  >> control-flow info (bindings follow a chain of static links,
  >> independent of the current "call stack"), so there's no
  >> temptation to run off copying bindings too.

  CT> The Python stack, besides its intermingledness with the machine
  CT> stack, is basically its chain of frames. The value stack pointer
  CT> still hides in the machine stack, but that's easy to change.  So
  CT> the real Scheme-like part is this chain, methinks, with the
  CT> current bytecode offset and value stack info.

  CT> Making a copy of this in a restartable way means to increase the
  CT> refcount of all objects in a frame. Would it be correct to undo
  CT> the effect of fast locals before splitting, and redoing it on
  CT> activation?

Wouldn't it be easier to increase the refcount on the frame object?
Then you wouldn't need to worry about the refcounts on all the objects
in the frame, because they would only be decrefed when the frame is
deallocated. 

It seems like the two other things you would need are some way to get
a copy of the current frame and a means to invoke eval_code2 with an
already existing stack frame instead of a new one.

(This sounds too simple, so it's obviously wrong.  I'm just not sure
where.  Is the problem that you really need a separate stack/graph to
hold the frames?  If we leave them on the Python stack, it could be
hard to dis-entangle value objects from control objects.)

Jeremy



From tismer at appliedbiometrics.com  Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
		<3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>


Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are
two incarnations of interpreters working on it: The original one,
and later, when it is thrown, another one (or the same, but, in
principle). 
The frame could have been in any state, with a couple
of objects on the stack. My splitting function can be invoked
in some nested context, so I have a current opcode position,
and a current stack position.
Running this once leaves the stack empty, since all the objects are
decrefed. Running this a second time gives a GPF, since the stack is
empty.
Therefore, I made a copy, which means creating a duplicate frame
with an extra refcount for all the objects. This makes sure
that both can be restarted at any time.

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong.  I'm just not sure
> where.  Is the problem that you really need a separate stack/graph to
> hold the frames?  If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit more clearly?
What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment:
The stack is the linked list of frames. Every frame has a
local Python evaluation stack. Calls of Python functions produce
a new frame, and the old one is put beneath. This is the control
stack. The additional info on the hardware stack happens to be
a parallel friend of this chain, and currently holds extra info,
but this is an artifact. Adding the current Python stack level
to the frame makes the hardware stack totally unnecessary.
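
The "linked list of frames" can be looked at from Python itself.  A minimal
sketch, assuming a present-day CPython where sys._getframe() is available
(it is CPython-specific); the helper names frame_chain, inner and outer are
invented for the example.

```python
import sys


def frame_chain():
    """Return (function name, f_lasti) for each frame, innermost first."""
    frame = sys._getframe()        # the frame of frame_chain itself
    chain = []
    while frame is not None:
        # f_lasti is the bytecode offset saved in the frame object
        chain.append((frame.f_code.co_name, frame.f_lasti))
        frame = frame.f_back       # link to the caller's frame
    return chain


def inner():
    return frame_chain()


def outer():
    return inner()


names = [name for name, _ in outer()]
```

The first three entries are frame_chain, inner and outer; everything the
walk needs lives in the frame objects, not on the C stack.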

There is a possible speed loss, anyway.
Today, the recursive call of eval_code2 is optimized and quite
fast. The non-recursive version will have to copy variables
in and out of the frames instead, so there is of course
a little speed penalty to pay.

ciao - chris




From tismer at appliedbiometrics.com  Wed May 19 23:38:07 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
> 
> Does that mean 6 or 600 <wink>?

6, or 10, or 20, if I count the time from the first
start with Sam's code, maybe.

> 
> > Should I continue? At least this would be a little improvement, also
> > if the continuation thing will not be born. ?
> 
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
> 
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim

Right. Whose faces? :-)

On the stackless thing, what should I do?
I started to insert minimum patches, but it turns out
that I have to change frames a little (extending).

I can make quite small changes to the interpreter to replace
the recursive calls, but this involves extra flags in some cases,
where the interpreter is called the first time and so on.

What has more probability of being included in a future Python:
Tweaking the current thing only minimally, to keep it as similar
as possible to the former?
Or doing as much redesign as I think is needed to do it in
a clean way? This would mean splitting eval_code2 into two functions,
where one is the interpreter kernel, and one is the frame manager.

There are also other places which do quite deep function calls
and finally call eval_code2. I think these should return a frame
object now. I could convince them to call or return frame,
depending on a flag, but it would be clean to rename the functions,
let them always deal with frames, and put the original function
on top of it.

In short, I can make larger changes which clean this all up a bit,
or I can make small changes which are trickier to grasp
but give just small diffs.

How best to touch untouchable code? :-)

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Wed May 19 23:49:38 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
	<37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear in order to
minimize the size of the patch or the diff.  Realistically, it's
unlikely that anything like your original patch is going to make it
into the CVS tree.  Its primary value is as proof of concept and as
code that the rest of us can try out.  If you make large changes, but
they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the
codebase later, after everyone has figured out what's going on and
agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy




From tismer at appliedbiometrics.com  Thu May 20 00:25:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
		<37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> I think it makes sense to avoid being obscure or unclear in order to
> minimize the size of the patch or the diff.  Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree.  Its primary value is as proof of concept and as
> code that the rest of us can try out.  If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice.
I will make absolutely clear what's going on, keep
parts as untouched as possible, cut out parts which must
change, and I will not look into speed too much.

Better to have one more function call and a bit less optimization,
but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase later, after everyone has figured out what's going on and
> agrees that it's worth doing.
> 
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the 
interpreter happens to have the name "continuation".
Maybe I'd better rename it to "activation record"?

Now, there is no longer a recursive call. Instead, a frame
object is returned, which is waiting to be activated
by a dispatcher.

Some more ideas are popping up. Right now, only the recursive
calls can vanish. Callbacks from C code which is called by
the interpreter which is called by... are still a problem.

But it might perhaps vanish completely. We have to see
how much the cost is. But if I can manage to let the interpreter
duck and cover also on every call to a builtin? The interpreter
again returns to the dispatcher which then calls the builtin.
Well, if that builtin happens to call to the interpreter again,
it will be a dispatcher again. The machine stack grows a little,
but since everything is saved in the frames, these stacks are
no longer related. This means, the principle works with existing
extension modules, since interpreter-world and C-stack world
are decoupled.
To avoid stack growth, a number of builtins would of course
better be changed, but that is not a must at first.
execfile for instance is a candidate which needn't call the
interpreter. It could equally parse the file, generate the
code object, build a frame and just return it. This is what
the dispatcher likes: returned frames are put on the chain
and fired.
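
The dispatcher idea can be sketched in plain Python as a trampoline.  This
is a toy under stated assumptions, not the ceval.c change itself: the Frame
class here is a made-up stand-in (a function plus its arguments), and
countdown and dispatch are names invented for the example.

```python
class Frame:
    """Hypothetical stand-in for a frame object: a function and its arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def countdown(n, acc):
    # Instead of calling itself, hand the next "frame" back to the dispatcher.
    if n == 0:
        return acc
    return Frame(countdown, n - 1, acc + n)


def dispatch(frame):
    """Fire returned frames one at a time; the call depth stays constant."""
    result = frame
    while isinstance(result, Frame):
        result = result.func(*result.args)
    return result


total = dispatch(Frame(countdown, 100000, 0))
```

A directly recursive countdown of this depth would exceed the default
recursion limit; here each step returns to the dispatcher before the next
one starts.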

waah, my bus - running - ciao - chris




From tim_one at email.msn.com  Thu May 20 01:56:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 19:56:33 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000701bea253$3a182a00$179e2299@tim>

I'm home sick today, so tortured myself <0.9 wink>.

Sam mentioned using coroutines to compare the fringes of two trees, and I
picked a simpler problem:  given a nested list structure, generate the leaf
elements one at a time, in left-to-right order.  A solution to Sam's problem
can be built on that, by getting a generator for each tree and comparing the
leaves a pair at a time until there's a difference.

Attached are solutions in Icon, Python and Scheme.  I have the least
experience with Scheme, but browsing around didn't find a better Scheme
approach than this.

The Python solution is the least satisfactory, using an explicit stack to
simulate recursion by hand; if you didn't know the routine's purpose in
advance, you'd have a hard time guessing it.

The Icon solution is very short and simple, and I'd guess obvious to an
average Icon programmer.  It uses the subset of Icon ("generators") that
doesn't require any C-stack trickery.  However, alone of the three, it
doesn't create a function that could be explicitly called from several
locations to produce "the next" result; Icon's generators are tied into
Icon's unique control structures to work their magic, and breaking that
connection requires moving to full-blown Icon coroutines.  It doesn't need
to be that way, though.

The Scheme solution was the hardest to write, but is a largely mechanical
transformation of a recursive fringe-lister that constructs the entire
fringe in one shot.  Continuations are used twice:  to enable the recursive
routine to resume itself where it left off, and to get each leaf value back
to the caller.  Getting that to work required rebinding non-local
identifiers in delicate ways.  I doubt the intent would be clear to an
average Scheme programmer.

So what would this look like in Continuation Python?  Note that each place
the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and
up-level references are very common.  Two functions are defined at top
level, but seven more at various levels of nesting; the latter can't be
pulled up to the top because they refer to vrbls local to the top-level
functions.  Another (at least initially) discouraging thing to note is that
Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro
facilities.

may-not-be-as-fun-as-it-sounds<wink>-ly y'rs  - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()

            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break

        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]

for x in Fringe(testcase):
    print x,
print
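
For comparison, here is a sketch that assumes a later Python with generator
functions ("yield" and "yield from"), which the Python of this thread does
not have.  It follows the shape of the Icon version rather than the explicit
stack above; same_fringe is a hypothetical helper for the two-tree
comparison mentioned at the top.

```python
from itertools import zip_longest


def fringe(node):
    """Yield the leaves of a nested list, left to right."""
    if isinstance(node, list):
        for child in node:
            yield from fringe(child)
    else:
        yield node


def same_fringe(tree1, tree2):
    """Compare leaves a pair at a time, stopping at the first difference."""
    missing = object()             # marks a fringe that ran out first
    pairs = zip_longest(fringe(tree1), fringe(tree2), fillvalue=missing)
    return all(a == b for a, b in pairs)


testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
leaves = list(fringe(testcase))
```

Unlike the Icon procedure, the generator object returned by fringe() can be
passed around and advanced from several places.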

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f) ; set to return-to continuation
             (looper
              (lambda (x)
                (cond
                  ((null? x) 'nada) ; ignore null
                  ((list? x)
                   (looper (car x))
                   (looper (cdr x)))
                  (else
                   ; want to produce this non-list fringe elt,
                   ; and also resume here
                   (call/cc
                    (lambda (here)
                      (set! getnext
                            (lambda () (here 'keep-going)))
                      (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}

      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin
                      (display thiselt) (display " ")
                      (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)





From MHammond at skippinet.com.au  Thu May 20 02:14:24 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has
reminded me of one of my biggest gripes about Python.  When I say "biggest
gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic.  However, when I am
debugging Python, it seems to lose this.  There is no way for me to
effectively change a running program.  Now with VC6, I can do this with C.
Although it is slow and a little dumb, I can change the C side of my Python
world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running
under the debugger.  Presumably this would require a way of recompiling the
current block of code, patching this code back into the object, and somehow
tricking the stack frame to use this new block of code; even if a first-cut
had to restart the block or somesuch...
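
The "patching this code back into the object" step can be sketched as
follows, assuming a present-day CPython where a function's code object is
writable as __code__.  The functions greet and _patched are made up for the
example, and this covers only the easy half of the problem.

```python
def greet():
    return "old behaviour"


def _patched():
    return "new behaviour"


before = greet()
# Rebind the code object in place; every existing reference to greet sees it.
greet.__code__ = _patched.__code__
after = greet()
```

Only calls made after the swap pick up the new code.  A frame that is
already executing keeps its reference to the old code object, which is the
part that would need the stack-frame trickery.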

Any thoughts on this?

Mark.




From tim_one at email.msn.com  Thu May 20 04:41:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave
them alone <wink>.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack".  The latter
means the C stack?  Then I don't know which values you have in mind that are
hiding in it (the locals are, as you say, unpacked in the frame, and the
evaluation stack too).  By "evaluation stack" I mean specifically
f->f_valuestack; the current *top* of stack pointer (specifically
stack_pointer) lives in the C stack -- is that what we're talking about?
Whichever, when we're talking about the code, let's use the names the code
uses <wink>.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in
order to support tracing.  So if capturing a continuation is done via a
function call (hard to see any other way it could be done <wink>), a
bytecode offset is already getting saved in the frame object.

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think.
Whichever part the locals live in should not be copied at all, but merely
have its (single) refcount increased.  The other part hinges on details of
your approach I don't know.  The nastiest part seems to be f->f_valuestack,
which conceptually needs to be (shallow) copied in the current frame and in
all other frames reachable from the current frame's continuation (the chain
rooted at f->f_back today); that's the sum total (along with the same
frames' bytecode offsets) of capturing the control flow state.

> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason
for doing anything to the locals.  Their values aren't *supposed* to get
restored upon continuation invocation, so there's no reason to do anything
with their values upon continuation creation either.  Right?  Or are we
talking about different things?

almost-as-good-as-pantomime<wink>-ly y'rs  - tim





From rushing at nightmare.com  Thu May 20 06:04:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
 > The Scheme solution was the hardest to write, but is a largely
 > mechanical transformation of a recursive fringe-lister that
 > constructs the entire fringe in one shot.  Continuations are used
 > twice: to enable the recursive routine to resume itself where it
 > left off, and to get each leaf value back to the caller.  Getting
 > that to work required rebinding non-local identifiers in delicate
 > ways.  I doubt the intent would be clear to an average Scheme
 > programmer.

It's the only way to do it - every example I've seen of using call/cc
looks just like it.

I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
people.  The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))

    (define (looper x)
      (cond ((null? x) 'nada)
	    ((list? x)
	     (looper (car x))
	     (looper (cdr x)))
	    (else
	     (call/cc
	      (lambda (here)
		(set! getnext (lambda () (here 'keep-going)))
		(produce-value x))))))

    (define (getnext)
      (looper x)
      (produce-value #f))

    (lambda ()
      (call/cc
       (lambda (k)
	 (set! produce-value k)
	 (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
	  (begin
             (display elt)
             (display " ")
             (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))
(display-fringe test-case)

 > So what would this look like in Continuation Python?

Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
do this without having the feature to play with.  This presumes a
function "call_cc" that behaves like Scheme's.  I believe the extra
level of indirection is necessary. (i.e., call_cc takes a function as
an argument that takes a continuation function)

class list_generator:

    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables that hold continuations have a 'k_' prefix.  In real life it
might be possible to put the suspend/call/resume machinery in a base
class (Generator?), and override 'walk' as you please.
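
For comparison only: a sketch of the same fringe walk using the native
generators that later Python versions provide (yield, and yield from).  This
is not the call_cc design above, just a way to see the intended behavior.
The names fringe and test_case are made up for the example:

```python
def fringe(x):
    """Yield the leaves of a nested list from left to right."""
    if isinstance(x, list):
        for item in x:
            # delegate to the recursive walk; each leaf is handed straight
            # back to whoever is iterating over the outermost generator
            yield from fringe(item)
    else:
        yield x

# same data as the Scheme test-case above
test_case = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
print(" ".join(str(leaf) for leaf in fringe(test_case)))
```

The suspend/resume bookkeeping that k_suspend and k_produce do by hand is
what the interpreter does for the generator here.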

-Sam




From tim_one at email.msn.com  Thu May 20 09:21:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam!  I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.

Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
> people.  The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, and largely
because the std decreed that *so* many forms were optional.  Your rework is
certainly nicer, but internal defines and named let are two that R4RS
refused to require, so I always avoided them.  BTW, I *am* a compiler, so
that never bothered me <wink>.

>> So what would this look like in Continuation Python?

> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
> do this without having the feature to play with.

Fully understood.  It's also really hard to implement the feature without
knowing how someone who wants it would like it to behave.  But I don't think
anyone is getting graded on this, so let's have fun <wink>.

Ack!  I have to sleep.  Will study the code in detail later, but first
impression was it looked good!  Especially nice that it appears possible to
package up most of the funky call_cc magic in a base class, so that
non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-
    from-scratch-every-time-ly y'rs  - tim





From skip at mojam.com  Thu May 20 15:27:59 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv>
	<14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

    Sam> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
    Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

    >> So what would this look like in Continuation Python?

    Sam> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
    Sam> do this without having the feature to play with.

The thought that it's unlikely one could arrive at a reasonable
approximation of a correct solution for such a small problem without the
ability to "play with" it is sort of scary.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Thu May 20 16:10:32 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 16:10:32 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <008b01bea255$b80cf790$0801a8c0@bobcat>
Message-ID: <374417D8.8DBCB617@appliedbiometrics.com>


Mark Hammond wrote:
> 
> All this talk about stack frames and manipulating them at runtime has
> reminded me of one of my biggest gripes about Python.  When I say "biggest
> gripe", I really mean "biggest surprise" or "biggest shame".
> 
> That is, Python is very interactive and dynamic.  However, when I am
> debugging Python, it seems to lose this.  There is no way for me to
> effectively change a running program.  Now with VC6, I can do this with C.
> Although it is slow and a little dumb, I can change the C side of my Python
> world while my program is running, but not the Python side of the world.
> 
> I'm wondering how feasible it would be to change Python code _while_ running
> under the debugger.  Presumably this would require a way of recompiling the
> current block of code, patching this code back into the object, and somehow
> tricking the stack frame to use this new block of code; even if a first-cut
> had to restart the block or somesuch...
> 
> Any thoughts on this?

I'm writing a prototype of a stackless Python, which means that
you will be able to access the current state of the interpreter
completely.
The inner interpreter loop will be isolated from the frame
dispatcher. It will break whenever the ticker goes zero.
If you set the ticker to one, you will be able to single
step on every opcode, have the value stack, the frame chain,
everything.
I think you can do quite a lot with this.
But tell me if you want a callback hook somewhere.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Thu May 20 18:52:21 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
> 
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
> 
> I don't see that the locals are a problem here -- provided you simply leave
> them alone <wink>.

This depends on whether I have to duplicate frames
or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
> 
> Right.
> 
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
> 
> I'm not sure what "value stack" means here, or "machine stack".  The latter
> means the C stack?  Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).  By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses <wink>.

The evaluation stack pointer is a local variable in the
C stack and must be written to the frame to become independent
from the C stack. Sounds better now?

> 
> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
> 
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing.  So if capturing a continuation is done via a
> function call (hard to see any other way it could be done <wink>), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
> 
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the
current state, I make it two frames to have two current states.
One will be saved, the other will be run. This is what I call
"splitting".
Actually, splitting must occur whenever a frame can be reached twice,
in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased.  The other part hinges on details of
> your approach I don't know.  The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two
incarnations. Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
> 
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals.  Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either.  Right?  Or are we
> talking about different things?

Let me explain. What Python does right now is:
When a function is invoked, all local variables are copied
into fast_locals, well of course just references are copied
and counts increased. These fast locals give a lot of speed
today, we must have them.
You are saying I have to share locals between frames. Besides the
fact that this will be a noticeable slowdown, since an extra structure
must be built and accessed indirectly (right now, it's all fast,
living in the one frame buffer), I cannot say that I'm convinced
that this is what we need.

Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot  # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my
snapshot. But with shared locals, my parameter x of the
snapshot would have changed to x+1, which I don't find useful.
I want to fix a state of the current frame and still think
it should "own" its locals. Globals are borrowed, anyway.
Class instances will anyway do what you want, since
the local "self" is a mutable object.

How do you want to keep computations independent
when locals are shared? For me it's just easier to
implement and also to think with the shallow copy.
Otherwise, where is my private place?
Open for becoming convinced, of course :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Thu May 20 21:26:30 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Thu, 20 May 1999 15:26:30 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
References: <000901bea26a$34526240$179e2299@tim>
	<37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  CT> What I want to achieve is that I can run this again, from my
  CT> snapshot. But with shared locals, my parameter x of the snapshot
  CT> would have changed to x+1, which I don't find useful.  I want to
  CT> fix a state of the current frame and still think it should "own"
  CT> its locals. Globals are borrowed, anyway.  Class instances will
  CT> anyway do what you want, since the local "self" is a mutable
  CT> object.

  CT> How do you want to keep computations independent when locals are
  CT> shared? For me it's just easier to implement and also to think
  CT> with the shallow copy.  Otherwise, where is my private place?
  CT> Open for becoming convinced, of course :-)

I think you're making things a lot more complicated by trying to
instantiate new variable bindings for locals every time you create a
continuation.  Can you give an example of why that would be helpful?
(Ok.  I'm not sure I can offer a good example of why it would be
helpful to share them, but it makes intuitive sense to me.)

The call_cc mechanism is going to let you capture the current
continuation, save it somewhere, and call on it again as often as you
like.  Would you get a fresh locals each time you used it?  or just
the first time?  If only the first time, it doesn't seem that you've
gained a whole lot.

Also, all the locals that are references to mutable objects are
already effectively shared.  So it's only a few oddballs like ints
that are an issue.

Jeremy



From tim_one at email.msn.com  Fri May 21 00:04:04 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:04 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>
Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it.  Most likely wrong.  It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is.  But while the problem is small, it's not easy, and only the Icon
solution wrote itself (not a surprise -- Icon was designed for expressing
this kind of algorithm, and the entire language is actually warped towards
it).  My first stab at the Python stack-fiddling solution had bugs too, but
I conveniently didn't post that <wink>.

After studying Sam's code, I expect it *would* work as written, so it's a
decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using

    Demo/threads/Generator.py

from the source distribution.  To make that a fair comparison, I would have
to post the supporting machinery from Generator.py too -- and we can ask
Guido whether Generator.py worked right the first time he tried it <wink>.

The continuation solution is subtle, requiring real expertise; but the
threads solution doesn't fare any better on that count (building the support
machinery with threads is also a baffler if you don't have thread
expertise).  If we threw Python metaclasses into the pot too, they'd be a
third kind of nightmare for the non-expert.

So, if you're faced with this kind of task, there's simply no easy way to
get it done.  Thread- and (it appears) continuation- based machinery can be
crafted once by an expert, then packaged into an easy-to-use protocol for
non-experts.

All in all, I view continuations as a feature most people should actively
avoid!  I think it has that status in Scheme too (e.g., the famed Schemer's
SICP textbook doesn't even mention call/cc).  Its real value (if any <wink>)
is as a Big Invisible Hammer for certified wizards.  Where call_cc leaks
into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and
hides the call_cc business.  Do enough of this, and we'll rediscover why
Scheme demands that tail calls not push a new stack frame <0.9 wink>.
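
A rough sketch of what such a protocol could look like when the machinery is
built on threads instead of call_cc.  It is not a copy of
Demo/threads/Generator.py; the class names Generator and FringeWalker, and
the methods run, put, and get, are assumptions made for this example:

```python
import queue
import threading

class Generator:
    """Base class: subclasses define run() and call self.put(value)."""
    _DONE = object()                      # private end-of-stream marker

    def __init__(self):
        self._queue = queue.Queue(maxsize=1)
        self._worker = None

    def put(self, value):
        # blocks until the consumer has taken the previous value
        self._queue.put(value)

    def _produce(self):
        self.run()
        self._queue.put(self._DONE)

    def get(self):
        """Return the next value; raise EOFError once run() has finished."""
        if self._worker is None:          # start the producer lazily
            self._worker = threading.Thread(target=self._produce, daemon=True)
            self._worker.start()
        item = self._queue.get()
        if item is self._DONE:
            self._queue.put(self._DONE)   # keep later get() calls from blocking
            raise EOFError("generator exhausted")
        return item

class FringeWalker(Generator):
    def __init__(self, tree):
        Generator.__init__(self)
        self.tree = tree

    def run(self):
        self.walk(self.tree)

    def walk(self, x):
        if isinstance(x, list):
            for item in x:
                self.walk(item)
        else:
            self.put(x)                   # the only protocol call the user sees

walker = FringeWalker([[1, [[2, 3]]], [4], [], [[[5]], 6]])
leaves = []
try:
    while True:
        leaves.append(walker.get())
except EOFError:
    pass
print(leaves)
```

The subclass contains nothing but the walk; all the synchronization lives in
the base class, which is the division of labor argued for above.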

the-tradeoffs-are-murky-ly y'rs  - tim





From tim_one at email.msn.com  Fri May 21 00:04:09 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]
> ...
> If a frame is the current state, I make it two frames to have two
> current states.  One will be saved, the other will be run. This is
> what I call "splitting".  Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute:  if a frame can be reached by more than one path,
its refcount must be at least equal to the number of its immediate
predecessors, and its refcount won't fall to 0 before it becomes
unreachable.  So while you may need to split stuff for *some* reasons, I
can't see how keeping elements alive could be one of those reasons (unless
you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does.  Since they've been doing this forever, I
don't want to toss their semantics on a whim <wink>.  It's at least a
conceptual thing:  why *should* locals follow different rules than globals?
If Python2 grows lexical closures, the only thing special about today's
"locals" is that they happen to be the first guys found on the search path.
Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

(define k #f)

(define (printi i)
  (display "i is ") (display i) (newline))

(define (test n)
  (let ((i n))
    (printi i)
    (set! i (- i 1))
    (printi i)
    (display "saving continuation") (newline)
    (call/cc (lambda (here) (set! k here)))
    (set! i (- i 1))
    (printi i)
    (set! i (- i 1))
    (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops.
Here's some output:

> (test 5)
i is 5
i is 4
saving continuation
i is 3
i is 2
> (k #f)
i is 1
i is 0
> (k #f)
i is -1
i is -2
> (k #f)
i is -3
i is -4
>

So there's no question about what Scheme thinks is proper behavior here.
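
A toy model of the two candidate semantics, in plain Python with no real
continuations.  Frame, rest_of_test, capture_shared, capture_copied, and run
are all hypothetical names; the "continuation" is just a closure over the
code that follows the call/cc in the Scheme example:

```python
class Frame:
    """Toy stand-in for a frame: just a dict of local variables."""
    def __init__(self, **local_vars):
        self.locals = dict(local_vars)

def rest_of_test(frame, out):
    """The part of (test n) that runs after the call/cc."""
    for _ in range(2):
        frame.locals["i"] -= 1
        out.append(frame.locals["i"])

def capture_shared(frame):
    # Scheme-like: the continuation refers to the one and only set of locals.
    return lambda out: rest_of_test(frame, out)

def capture_copied(frame):
    # Snapshot-like: each invocation starts from the values at capture time.
    snapshot = dict(frame.locals)
    return lambda out: rest_of_test(Frame(**snapshot), out)

def run(capture):
    out = []
    frame = Frame(i=5)
    out.append(frame.locals["i"])
    frame.locals["i"] -= 1
    out.append(frame.locals["i"])
    k = capture(frame)            # "saving continuation"
    rest_of_test(frame, out)      # the first pass simply falls through
    k(out)                        # first (k #f)
    k(out)                        # second (k #f)
    return out

shared = run(capture_shared)
copied = run(capture_copied)
print(shared)
print(copied)
```

With shared locals the values keep counting down as in the Scheme transcript;
with copied locals every invocation of k replays the same two values.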

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset
indexing.

> You are saying I have to share locals between frames. Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't
care where that points *to* <wink> -- but, really, it could point into some
other frame and ceval2 wouldn't know the difference.  Maybe a frame entered
due to continuation needs extra setup work?  Scheme saves itself by putting
name-resolution and continuation info into different structures; to mimic
the semantics, Python would need to get the same end effect.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
> def f(x):
>     # do something
>     ...
>     # in some context, wanna have a snapshot
>     global snapshot  # initialized to None
>     if not snapshot:
>         snapshot = callcc.new()
>     # continue computation
>     x = x+1
>     ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here:  the use of
call/cc is subtle, hinging on details, and fragments ignore too much.  If
you do want the same x,

    commonx = x
    if not snapshot:
         # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it.  But if you *do* want to see changes to the
locals (which is one way for those distinct continuation invocations to
*cooperate* in solving a task -- see below), but the implementation doesn't
allow for it, I don't know what you can do to worm around it short of making
x global too.  But then different *top* level invocations of f will stomp on
that shared global, so that's not a solution either.  Maybe forget functions
entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops:
communication among "iterations" is via function arguments or up-level
lexical vrbls.

So recall your uses of Icon generators instead:  like Python, Icon does have
loops, and two-level scoping, and I routinely build loopy Icon generators
that keep state in locals.  Here's a dirt-simple example I emailed to Sam
earlier this week:

procedure main()
    every result := fib(0, 1) \ 10 do
        write(result)
end

procedure fib(i, j)
    local temp
    repeat {
        suspend i
        temp := i + j
        i := j
        j := temp
    }
end

which prints

0
1
1
2
3
5
8
13
21
34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would
generate a zero followed by an infinite sequence of ones(!).
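
The same contrast as a sketch in a later Python that has native generators.
fib keeps its locals across resumptions; fib_restoring is a hypothetical
variant that pretends each resumption puts i and j back to what they were at
the first suspension:

```python
from itertools import islice

def fib(i, j):
    # locals survive each suspension, like the Icon procedure above
    while True:
        yield i
        i, j = j, i + j

def fib_restoring(i, j):
    saved = (i, j)            # values of the locals at the first suspension
    yield i
    while True:
        i, j = saved          # simulate "restore the locals on resumption"
        i, j = j, i + j
        yield i

normal = list(islice(fib(0, 1), 10))
restored = list(islice(fib_restoring(0, 1), 10))
print(normal)
print(restored)
```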

Think of a continuation as a *paused* computation (which it is) rather than
an *independent* one (which it isn't <wink>), and I think it gets darned
hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs  - tim





From MHammond at skippinet.com.au  Fri May 21 01:01:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already
references it.  I don't think the structure of the frames is as important as
the general concept.  But while we were talking frame-fiddling it seemed a
good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (eg, just the
current function or method) and patch it back in such a way that the
current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class
implementation for an existing instance.  I know there have been hacks
around this before but they aren't completely reliable and IMO it would be
nice if the core Python made it easier to change already running code -
whether that code is in an existing stack frame, or just in an already
created instance, it is very difficult to do.

This has come to try and deflect some conversation away from changing
Python as such towards an attempt at enhancing its _environment_.  To
paraphrase many people before me, even if we completely froze the language
now there would still be plenty of work ahead of us :-)

Mark.




From guido at CNRI.Reston.VA.US  Fri May 21 02:06:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 20:06:51 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000."
             <00c001bea314$aefc5b40$0801a8c0@bobcat> 
References: <00c001bea314$aefc5b40$0801a8c0@bobcat> 
Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it.  I don't think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

This topic sounds mostly unrelated to the stackless discussion -- in
either case you need to be able to fiddle the contents of the frame
and the bytecode pointer to reflect the changed function.

Some issues:

  - The slots containing local variables may be renumbered after
    recompilation; fortunately we know the name--number mapping so we can
    move them to their new location.  But it is still tricky.

  - Should you be able to edit functions that are present on the call
    stack below the top?  Suppose we have two functions:

	def f():
	    return 1 + g()

	def g():
	    return 0

    Suppose we set a break in g(), and then edit the source of f().  We can
    do all sorts of evil to f(): e.g. we could change it to

	    return g() + 2

    which affects the contents of the value stack when g() returns
    (originally, the value stack contained the value 1, now it is empty).
    Or we could even change f() to

	    return 3

    thereby eliminating the call to g() altogether!
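
A sketch of the first issue (moving locals to their new slots), using the
co_varnames tuple that code objects expose in present-day CPython.
remap_slots, old_f, new_f, and the slot values are invented for the example;
a real debugger would also have to deal with the value stack and block stack
problems described in the second issue:

```python
def old_f(a):
    x = 1
    y = 2
    return a + x + y

def new_f(a):
    # edited version: locals appear in a different order, plus a new one
    y = 2
    z = 0
    x = 1
    return a + x + y + z

def remap_slots(old_code, old_slots, new_code, missing=None):
    """Rearrange slot values from the old code's layout to the new one."""
    by_name = dict(zip(old_code.co_varnames, old_slots))
    # locals that exist only in the new code start out as `missing`
    return [by_name.get(name, missing) for name in new_code.co_varnames]

# values of a, x, y in a hypothetical paused frame running old_f
old_slots = [10, 1, 2]
new_slots = remap_slots(old_f.__code__, old_slots, new_f.__code__)
print(dict(zip(new_f.__code__.co_varnames, new_slots)))
```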

What kind of limitations do other systems that support modifying a
"live" program being debugged impose?  Only allowing modification of
the function at the top of the stack might eliminate some problems,
although there are still ways to mess up.  The value stack is not 
always empty even when we only stop at statement boundaries -- e.g. it 
contains 'for' loop indices, and there's also the 'block' stack, which 
contains try-except information.  E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

???

(Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.  Function objects now have
mutable func_code attributes (and also func_defaults), I think we can
use this.
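
A small sketch of that idea.  In Python 3 the attribute is spelled __code__
rather than func_code; f_edited and alias are names made up for the example:

```python
def g():
    return 0

def f():
    return 1 + g()

def f_edited():
    # stands in for the recompiled source of f
    return g() + 2

alias = f                        # some other reference to the same function
before = alias()
f.__code__ = f_edited.__code__   # swap the code in place; no rebinding
after = alias()
print(before, after)
```

Every existing reference to the function object sees the new code, which is
what makes this attractive for patching a running program.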

The hard part is to do the analysis needed to decide which functions
to recompile!  Ideally, we would simply edit a file and tell the
programming environment "recompile this".  The programming environment
would compare the changed file with the old version that it had saved
for this purpose, and notice (for example) that we changed two methods
of class C.  It would then recompile those methods only and stuff the
new code objects in the corresponding function objects.
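
One possible shape for such a "recompile this" step, as a sketch under
several assumptions: it handles top-level functions only, it decides "changed"
by comparing a few code-object fields, and it executes the edited source in a
scratch namespace (so any top-level side effects in that source run again).
recompile_changed, code_key, OLD, and NEW are hypothetical:

```python
import types

def code_key(code):
    # a crude notion of "same function body": ignores line numbers
    return (code.co_code, code.co_consts, code.co_names, code.co_varnames)

def recompile_changed(namespace, new_source):
    """Patch functions in `namespace` whose compiled body changed.

    Returns the list of patched names.  Non-function globals are left alone.
    """
    scratch = {}
    exec(compile(new_source, "<edited>", "exec"), scratch)
    patched = []
    for name, new_obj in scratch.items():
        old_obj = namespace.get(name)
        if not (isinstance(old_obj, types.FunctionType)
                and isinstance(new_obj, types.FunctionType)):
            continue
        if code_key(old_obj.__code__) != code_key(new_obj.__code__):
            old_obj.__code__ = new_obj.__code__   # keep the function object
            patched.append(name)
    return patched

OLD = "x = 0\ndef a():\n    return 1\ndef b():\n    return 'same'\n"
NEW = "x = 100\ndef a():\n    return 2\ndef b():\n    return 'same'\n"

module_ns = {}
exec(compile(OLD, "<original>", "exec"), module_ns)
a_ref = module_ns["a"]            # a reference taken before the edit
patched = recompile_changed(module_ns, NEW)
print(patched, a_ref(), module_ns["x"])
```

This sketch sidesteps the global-variable question by never touching x; that
is a choice made for the example, not an answer to the question.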

But what would it do when we changed a global variable?  Say a module
originally contains a statement "x = 0".  Now we change the source
code to say "x = 100".  Should we change the variable x?  Suppose that
x is modified by some of the computations in the module, and that,
after some computations, the actual value of x was 50.  Should the
"recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and
def statements so that they modify an existing class or function
rather than using assignment.  Effectively, this proposal would change
the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...
        
This is somewhat similar to the way the module or package commands in
some other dynamic languages work, I think; and I don't think this
would break too much existing code.
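
The merged-class behavior can be approximated today without changing the
class statement.  The sketch below uses a hypothetical decorator, reopen,
that copies the second body's attributes into the existing class; it is a
simulation of the proposal, not the proposal itself:

```python
def reopen(existing):
    """Merge the decorated class body into `existing` and return `existing`."""
    def merge(new_body):
        for name, value in vars(new_body).items():
            # skip bookkeeping entries such as __dict__, __module__, __doc__
            if name.startswith("__") and name.endswith("__") and not callable(value):
                continue
            setattr(existing, name, value)
        return existing           # the name stays bound to the original class
    return merge

class A:
    def first(self):
        return "some code"

instance = A()                    # created before the class is reopened
original_class = A

@reopen(A)
class A:
    def second(self):
        return "some more code"

print(instance.first(), "/", instance.second())
```

Instances created before the second class statement pick up the new method,
because the class object itself was modified rather than replaced.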

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want
different semantics (I don't want the second f's code to be tacked
onto the end of the first f's code).

If we understand that def f(): ... really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...

    f.func_code = ...other code...

i.e. there is no assignment of a new function object for the second
def.

Of course if there is a variable f but it is not a function, it would
have to be assigned a new function object first.
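
The proposed def semantics can be emulated roughly as follows.  This is
only a sketch: rebind_def is a hypothetical helper standing in for what the
def statement itself would do, and __code__ is the current spelling of
func_code.

```python
import types


def rebind_def(namespace, name, new_func):
    """Emulate the proposed 'def' semantics for one name (hypothetical)."""
    existing = namespace.get(name)
    if isinstance(existing, types.FunctionType):
        existing.__code__ = new_func.__code__   # f.func_code = ...other code...
    else:
        namespace[name] = new_func              # not a function yet: plain assignment
    return namespace[name]


def first():
    return "some code"


def second():
    return "other code"


ns = {}
rebind_def(ns, "f", first)          # first def: creates the binding
alias = ns["f"]                     # someone keeps a reference
rebind_def(ns, "f", second)         # second def: mutates in place
same_object = ns["f"] is alias
```

After the second call, alias() returns "other code" even though alias was
taken before the redefinition, which is the whole point of the proposal.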

But in the case of def, this *does* break existing code.  E.g.

# module A
from B import f
.
.
.
if ...some test...:
    def f(): ...some code...

This idiom conditionally redefines a function that was also imported
from some other module.  The proposed new semantics would change B.f
in place!
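
A small sketch of why this breaks, using a module object built by hand as
a stand-in for B (the names here are illustrative only).  The name bound by
"from B import f" is the very same function object as B.f, so mutating its
code is visible from B as well.

```python
import types

B = types.ModuleType("B")            # stand-in for the imported module
exec("def f():\n    return 'B version'\n", B.__dict__)

f = B.f                              # what 'from B import f' binds in module A


def replacement():
    return "A version"


# Today a 'def f' in A rebinds A's name only, so B.f is untouched.
# Under the proposed semantics the def would instead do this:
f.__code__ = replacement.__code__

b_result = B.f()                     # B's own function has changed too
```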

So perhaps these new semantics should only be invoked when a special
"reload-compile" is asked for...  Or perhaps the programming
environment could do this through source parsing as I proposed
before...

> This has come to try and deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_.  To
> paraphrase many people before me, even if we completely froze the language
> now there would still be plenty of work ahead of us :-)

Please, no more posts about Scheme.  Each new post mentioning call/cc
makes it *less* likely that something like that will ever be part of
Python.  "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Fri May 21 03:13:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
	<199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

    Guido> What kind of limitations do other systems that support modifying
    Guido> a "live" program being debugged impose?  Only allowing
    Guido> modification of the function at the top of the stack might
    Guido> eliminate some problems, although there are still ways to mess
    Guido> up.

Frame objects maintain pointers to the active code objects, locals and
globals, so modifying a function object's code or globals shouldn't have any
effect on currently executing frames, right?  I assume frame objects do the
usual INCREF/DECREF dance, so the old code object won't get deleted before
the frame object is tossed.
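
This can be checked with a short experiment.  The sketch below uses the
current attribute name __code__ (func_code in 1.5) and passes the function
to itself only so the example does not depend on a global name.

```python
def new_body(func):
    return "new code"


def f(func):
    func.__code__ = new_body.__code__   # swap the function's code mid-call
    return "old code"                   # this frame still runs the old code object


first_call = f(f)     # the running frame finishes with the code it started with
second_call = f(f)    # a fresh call picks up the replaced code
```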

    Guido> But what would it do when we changed a global variable?  Say a
    Guido> module originally contains a statement "x = 0".  Now we change
    Guido> the source code to say "x = 100".  Should we change the variable
    Guido> x?  Suppose that x is modified by some of the computations in the
    Guido> module, and that, after some computations, the actual value
    Guido> of x was 50.  Should the "recompile" reset x to 100 or leave it
    Guido> alone?

I think you should note the change for users and give them some way to
easily pick between old initial value, new initial value or current value.
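
As a sketch of that three-way choice (resolve_global and its choice
strings are hypothetical; a real environment would presumably prompt the
user rather than take a string):

```python
def resolve_global(old_initial, new_initial, current, choice):
    """Pick the value a reloaded global should take (hypothetical helper).

    choice is one of "old", "new" or "current".
    """
    if old_initial == new_initial:
        return current                       # source unchanged: nothing to ask
    options = {"old": old_initial, "new": new_initial, "current": current}
    return options[choice]


# The example from the thread: "x = 0" edited to "x = 100" while x is 50.
keep = resolve_global(0, 100, 50, "current")
reset = resolve_global(0, 100, 50, "new")
```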

    Guido> Please, no more posts about Scheme.  Each new post mentioning
    Guido> call/cc makes it *less* likely that something like that will ever
    Guido> be part of Python.  "What if Guido's brain exploded?" :-)

I agree.  I see call/cc or set! and my eyes just glaze over...

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From MHammond at skippinet.com.au  Fri May 21 03:42:14 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it <wink>

> Some issues:
>
>   - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a number
of limitations.  For example, I would find a first cut completely
acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the
code reflected while executing.  This function also must be restarted after
such an edit.  If the function uses global variables or makes calls that
restarting will screw-up, then either a) make the code changes _before_
doing this stuff, or b) live with it for now, and help us remove the
limitation :-)

That may make the locals being renumbered easier to deal with, and also
remove some of the problems you discussed about editing functions below the
top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?  Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted
to find documentation on it.

It accepts most changes while running.  The current line is fine.  If you
create or change the definition of globals (and possibly even the type of
locals?), the "incremental compilation" fails, and you are given the option
of continuing with the old code, or stopping the process and doing a full
build.

When the debug session terminates, some link process (and maybe even
compilation?) is done to bring the .exe on disk up to date with the
changes.

If you do weird stuff like delete the line being executed, it usually gives
you some warning message before either restarting the function or trying to
pick a line somewhere near the line you deleted.  Either way, it can screw
up, moving the "current" line somewhere else - it doesn't crash the
debugger, but may not do exactly what you expected.  It is still a _huge_
win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions.  Although
changing the C code is great, in 99% of the cases I also need to change
some .py code, and as existing instances are affected I need to restart the
app anyway - so I may as well do a normal build at that time.  ie, C now
lets me debug incrementally, but a far more dynamic language prevents this
feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up.  The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better?  Can we reliably reset the
stack to the start of the current function?

> I've been thinking a bit about this.  Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile!  Ideally, we would simply edit a file and tell the
> programming environment "recompile this".  The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C.  It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the
impact be of doing it for _every_ function (changed or not)?  Then the
analysis can drop to the module level which is much easier.  I don't think a
slight performance hit is a problem at all when doing this stuff.

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment.  Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...some code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?)
# .\package\__init__.py
class BigMutha:
  pass

# .\package\something.py
class package.BigMutha:
  def some_category_of_methods():
    ...

# .\package\other.py
class package.BigMutha:
  def other_category_of_methods():
    ...
[Of course, this won't fly as it stands; just a conceptual possibility]

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for...  Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...


From guido at CNRI.Reston.VA.US  Fri May 21 05:02:49 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000."
             <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations.  For example, I would find a first cut completely
> acceptable and a great improvement on today if:
> 
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing.  This function also must be restarted after
> such an edit.  If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would
seem relatively easy to implement.  Not *real* easy though: it turns
out that eval_code2() is called with a code object as argument, and
it's not entirely trivial to figure out the corresponding function
object from which to grab the new code object.  But it could be done
-- give it a try.  (Don't wait for me, I'm ducking for cover until at
least mid June.)
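
One way the lookup from code object back to function object might be done
in today's CPython is to ask the garbage collector which objects refer to
the code object.  This is a sketch that relies on the gc module, which
postdates this thread, and on CPython-specific behaviour; functions_using
is a hypothetical name.

```python
import gc
import types


def functions_using(code):
    """Return function objects whose code object is 'code' (CPython-specific)."""
    return [obj for obj in gc.get_referrers(code)
            if isinstance(obj, types.FunctionType) and obj.__code__ is code]


def target():
    return 1


owners = functions_using(target.__code__)
```

Several functions can share one code object (closures created in a loop,
for example), so a restart mechanism built on this would still have to
decide which owner it means.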

> Ironically, I turn this feature _off_ for Python extensions.  Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.  ie, C now
> lets me debug incrementally, but a far more dynamic language prevents this
> feature being useful ;-)

I hear you.

> If we forced a restart would this be better?  Can we reliably reset the
> stack to the start of the current function?

Yes, no problem.

> If this would work for the few changed functions/methods, what would the
> impact be of doing it for _every_ function (changed or not)?  Then the
> analysis can drop to the module level which is much easier.  I don't think a
> slight performance hit is a problem at all when doing this stuff.

Yes, this would be fine too.

> >"What if Guido's brain exploded?" :-)
> 
> At least on that particular topic I didn't even consider I was the only one
> in fear of that!  But it is good to know that you specifically are too :-)

Have no fear.  I've learned to say no. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Fri May 21 07:36:44 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 21 May 1999 01:36:44 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <000401bea34b$e93fcda0$d89e2299@tim>

[GvR]
> ...
> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?

As an ex-compiler guy, I should have something wise to say about that.
Alas, I've never used a system that allowed more than poking new values into
vrbls, and the thought of any more than that makes me vaguely ill!  Oh,
that's right -- I'm vaguely ill anyway today.  Still-- oooooh -- the
problems.

This later got reduced to restarting the topmost function from scratch.
That has some attraction, especially on the bang-for-buck-o-meter.

> ...
> Please, no more posts about Scheme.  Each new post mentioning call/cc
> makes it *less* likely that something like that will ever be part of
> Python.  "What if Guido's brain exploded?" :-)

What a pussy <wink>.  Really, overall continuations are much less trouble to
understand than threads -- there's only one function in the entire
interface!

OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs  - tim





From tim_one at email.msn.com  Sat May  1 10:32:30 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 1 May 1999 04:32:30 -0400
Subject: [Python-Dev] Speed (was RE: [Python-Dev] More flexible namespaces.)
In-Reply-To: <14121.55659.754846.708467@amarok.cnri.reston.va.us>
Message-ID: <000801be93ad$27772ea0$7a9e2299@tim>

[Andrew M. Kuchling]
> ...
> A performance improvement project would definitely be a good idea
> for 1.6, and a good sub-topic for python-dev.

To the extent that optimization requires uglification, optimization got
pushed beyond Guido's comfort zone back around 1.4 -- little has made it in
since then.

Not griping; I'm just trying to avoid enduring the same discussions for the
third to twelfth times <wink>.

Anywho, on the theory that a sweeping speedup patch has no chance of making
it in regardless, how about focusing on one subsystem?  In my experience,
the speed issue Python gets beat up the most for is the relative slowness of
function calls.  It would be very good if eval_code2 somehow or other could
manage to invoke a Python function without all the hair of a recursive C
call, and I believe Guido intends to move in that direction for Python2
anyway.  This would be a good time to start exploring that seriously.

inspirationally y'rs  - tim





From da at ski.org  Sun May  2 00:15:32 1999
From: da at ski.org (David Ascher)
Date: Sat, 1 May 1999 15:15:32 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <37296856.5875AAAF@lemburg.com>
Message-ID: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>

> Since you put out the objectives, I'd like to propose a little
> different approach...
> 
> 1. Have eval/exec accept any mapping object as input
> 
> 2. Make those two copy the content of the mapping object into real
>    dictionaries
> 
> 3. Provide a hook into the dictionary implementation that can be
>    used to redirect KeyErrors and use that redirection to forward
>    the request to the original mapping objects

Interesting counterproposal.  I'm not sure whether any of the proposals on
the table really do what's needed for e.g. case-insensitive namespace
handling.  I can see how all of the proposals so far allow
case-insensitive reference name handling in the global namespace, but
don't we also need to hook into the local-namespace creation process to
allow case-insensitivity to work throughout? 

--david






From da at ski.org  Sun May  2 17:15:57 1999
From: da at ski.org (David Ascher)
Date: Sun, 2 May 1999 08:15:57 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <00bc01be942a$47d94070$0801a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9905020810270.152-100000@david.ski.org>

On Sun, 2 May 1999, Mark Hammond wrote:

> > I'm not sure whether any of the
> > proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation
> > process to
> > allow case-insensitivity to work throughout?
> 
> Why not?  I pictured case insensitive namespaces working so that they
> retain the case of the first assignment, but all lookups would be
> case-insensitive.
> 
> Ohh - right!  Python itself would need changing to support this.  I suppose
> that faced with code such as:
> 
> def func():
>   if spam:
>     Spam=1
> 
> Python would generate code that refers to "Spam" as a local, and "spam" as
> a global.
> 
> Is this why you feel it won't work?

I hadn't thought of that, to be truthful, but I think it's more generic.
[FWIW, I never much cared for the tag-variables-at-compile-time
optimization in CPython, and wouldn't miss it if it were lost.]

The point is that if I eval or exec code which calls a function specifying
some strange mapping as the namespaces (global and current-local) I
presumably want to also specify how local namespaces work for the
function calls within that code snippet.  That means that somehow Python
has to know what kind of namespace to use for local environments, and not
use the standard dictionary.  Maybe we can simply have it use a
'.clear()'ed .__copy__ of the specified environment.

  exec 'foo()' in globals(), mylocals

would then call foo and within foo, the local env't would be
mylocals.__copy__.clear().  
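
Spelled out, "a cleared copy of the specified environment" might look like
the sketch below.  fresh_locals_like and UpperKeyDict are hypothetical names
used only for illustration; the interpreter does not do this for function
locals.

```python
import copy


def fresh_locals_like(env):
    """Return an empty mapping of the same kind as env (hypothetical helper)."""
    new_env = copy.copy(env)   # keeps the mapping's type and behaviour
    new_env.clear()            # but none of its current bindings
    return new_env


class UpperKeyDict(dict):
    """Toy 'strange mapping': stores every key upper-cased."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


mylocals = UpperKeyDict()
mylocals["spam"] = 1
callee_locals = fresh_locals_like(mylocals)   # what foo() would get as locals
callee_locals["eggs"] = 2
```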

Anyway, something for those-with-the-patches to keep in mind.  

--david





From tismer at appliedbiometrics.com  Sun May  2 15:00:37 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 02 May 1999 15:00:37 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <372C4C75.5B7CCAC8@appliedbiometrics.com>


David Ascher wrote:
[Marc:> 
> > Since you put out the objectives, I'd like to propose a little
> > different approach...
> >
> > 1. Have eval/exec accept any mapping object as input
> >
> > 2. Make those two copy the content of the mapping object into real
> >    dictionaries
> >
> > 3. Provide a hook into the dictionary implementation that can be
> >    used to redirect KeyErrors and use that redirection to forward
> >    the request to the original mapping objects

I don't think that this proposal would give so much new
value. Since a mapping can also be implemented in arbitrary
ways, say by functions, a mapping is not necessarily finite
and might not be changeable into a dict.

[David:>
> Interesting counterproposal.  I'm not sure whether any of the proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation process to
> allow case-insensitivity to work throughout?

Case-independent namespaces seem to be a minor point,
nice to have for interfacing to other products, but then,
in a function, I see no benefit in changing the semantics
of function locals? The lookup of foreign symbols would 
always be through a mapping object. If you take COM for 
instance, your access to a COM wrapper for an arbitrary
object would be through properties of this object. After
assignment to a local function variable, why should we
support case-insensitivity at all?

I would think mapping objects would be a great 
simplification of lazy imports in COM, where
we would like to avoid importing really huge
namespaces in one big slurp. Also the wrapper code
could be made quite a lot easier and faster without
so much getattr/setattr trapping.

Does btw. anybody really want to see case-insensitivity
in Python programs? I'm quite happy with it as it is,
and I would even force the user to always use the same
case style after he has touched an external property
once. Example for Excel: You may write "xl.workbooks"
in lowercase, but then you have to stay with it.
This would keep Python source clean for, say, PyLint.

my 0.02 Euro - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From MHammond at skippinet.com.au  Sun May  2 01:28:11 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Sun, 2 May 1999 09:28:11 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org>
Message-ID: <00bc01be942a$47d94070$0801a8c0@bobcat>

> I'm not sure whether any of the
> proposals on
> the table really do what's needed for e.g. case-insensitive namespace
> handling.  I can see how all of the proposals so far allow
> case-insensitive reference name handling in the global namespace, but
> don't we also need to hook into the local-namespace creation
> process to
> allow case-insensitivity to work throughout?

Why not?  I pictured case insensitive namespaces working so that they
retain the case of the first assignment, but all lookups would be
case-insensitive.

Ohh - right!  Python itself would need changing to support this.  I suppose
that faced with code such as:

def func():
  if spam:
    Spam=1

Python would generate code that refers to "Spam" as a local, and "spam" as
a global.
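
The compile-time classification can be inspected on the function's code
object.  A small sketch using the current attribute name __code__: the
assigned name ends up among the local variable names, while the name that is
only read is compiled as a global lookup.

```python
def func():
    if spam:
        Spam = 1


local_names = func.__code__.co_varnames   # names compiled as locals
global_names = func.__code__.co_names     # names looked up in globals/builtins
```

The function is never called here; the split is decided when the def is
compiled, which is why a case-insensitive namespace alone cannot merge the
two names.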

Is this why you feel it won't work?

Mark.




From mal at lemburg.com  Sun May  2 21:24:54 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 02 May 1999 21:24:54 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <Pine.WNT.4.05.9905011508240.154-100000@david.ski.org> <372C4C75.5B7CCAC8@appliedbiometrics.com>
Message-ID: <372CA686.215D71DF@lemburg.com>

Christian Tismer wrote:
> 
> David Ascher wrote:
> [Marc:>
> > > Since you put out the objectives, I'd like to propose a little
> > > different approach...
> > >
> > > 1. Have eval/exec accept any mapping object as input
> > >
> > > 2. Make those two copy the content of the mapping object into real
> > >    dictionaries
> > >
> > > 3. Provide a hook into the dictionary implementation that can be
> > >    used to redirect KeyErrors and use that redirection to forward
> > >    the request to the original mapping objects
> 
> I don't think that this proposal would give so much new
> value. Since a mapping can also be implemented in arbitrary
> ways, say by functions, a mapping is not necessarily finite
> and might not be changeable into a dict.

[Disclaimer: I'm not really keen on having the possibility of
 letting code execute in arbitrary namespace objects... it would
 make code optimizations even less manageable.]

You can easily support infinite mappings by wrapping the
function into an object which returns an empty list
for .items() and then use the hook mentioned in 3 to
redirect the lookup to that function.

The proposal allows one to use such a proxy to simulate any
kind of mapping -- it works much like the __getattr__ hook
provided for instances.
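
A rough modern analogue of that hook, offered only as a sketch: dict
subclasses may define __missing__, which dict lookup calls when a key is
absent.  That is not the hook proposed in this thread, but it plays a
similar role.  ForwardingDict and lookup are hypothetical names.

```python
class ForwardingDict(dict):
    """Real dictionary that forwards failed lookups to a fallback function."""

    def __init__(self, fallback):
        super().__init__()
        self._fallback = fallback

    def __missing__(self, key):
        # Called by dict lookup when the key is absent: the "KeyError hook".
        return self._fallback(key)


def lookup(name):
    """An 'infinite' mapping implemented as a function: n<digits> -> int."""
    if name.startswith("n") and name[1:].isdigit():
        return int(name[1:])
    raise KeyError(name)


ns = ForwardingDict(lookup)
value = eval("n20 + n22", {}, ns)   # names resolved through the fallback
stored = list(ns.items())           # the dictionary itself stays empty
```

The wrapper is passed as the locals argument here because eval and exec
accept any mapping there.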
 
> [David:>
> > Interesting counterproposal.  I'm not sure whether any of the proposals on
> > the table really do what's needed for e.g. case-insensitive namespace
> > handling.  I can see how all of the proposals so far allow
> > case-insensitive reference name handling in the global namespace, but
> > don't we also need to hook into the local-namespace creation process to
> > allow case-insensitivity to work throughout?
> 
> Case-independent namespaces seem to be a minor point,
> nice to have for interfacing to other products, but then,
> in a function, I see no benefit in changing the semantics
> of function locals? The lookup of foreign symbols would
> always be through a mapping object. If you take COM for
> instance, your access to a COM wrapper for an arbitrary
> object would be through properties of this object. After
> assignment to a local function variable, why should we
> support case-insensitivity at all?
>
> I would think mapping objects would be a great
> simplification of lazy imports in COM, where
> we would like to avoid importing really huge
> namespaces in one big slurp. Also the wrapper code
> could be made quite a lot easier and faster without
> so much getattr/setattr trapping.

What do lazy imports have to do with case [in]sensitive
namespaces ? Anyway, how about a simple lazy import
mechanism in the standard distribution, i.e. why not make
all imports lazy ? Since modules are first class objects
this should be easy to implement...
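
A minimal sketch of such a lazy import, assuming a proxy object is an
acceptable stand-in for the module until first use.  LazyModule is a
hypothetical class, and importlib is a much later addition to the standard
library than anything available in 1999.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


fractions = LazyModule("fractions")             # no import performed yet
loaded_before = fractions._module is not None
half = fractions.Fraction(1, 2)                 # first use triggers the import
loaded_after = fractions._module is not None
```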
 
> Does btw. anybody really want to see case-insensitivity
> in Python programs? I'm quite happy with it as it is,
> and I would even force the user to always use the same
> case style after he has touched an external property
> once. Example for Excel: You may write "xl.workbooks"
> in lowercase, but then you have to stay with it.
> This would keep Python source clean for, say, PyLint.

"No" and "me too" ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 243 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From MHammond at skippinet.com.au  Mon May  3 02:52:41 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Mon, 3 May 1999 10:52:41 +1000
Subject: [Python-Dev] More flexible namespaces.
In-Reply-To: <372CA686.215D71DF@lemburg.com>
Message-ID: <000e01be94ff$4047ef20$0801a8c0@bobcat>

[Marc]
> [Disclaimer: I'm not really keen on having the possibility of
>  letting code execute in arbitrary namespace objects... it would
>  make code optimizations even less manageable.]

Good point - although surely that would simply mean (certain) optimisations
can't be performed for code executing in that environment?  How to detect
this at "optimization time" may be a little difficult :-)

However, this is the primary purpose of this thread - to work out _if_ it is
a good idea, as much as working out _how_ to do it :-)

> The proposal allows one to use such a proxy to simulate any
> kind of mapping -- it works much like the __getattr__ hook
> provided for instances.

My only problem with Marc's proposal is that there already _is_ an
established mapping protocol, and this doesn't use it; instead it invents a
new one with the benefit being potentially less code breakage.

And without attempting to sound flippant, I wonder how many extension
modules will be affected?  Module init code certainly assumes the module
__dict__ is a dictionary, but none of my code assumes anything about other
namespaces.  Marc's extensions may be a special case, as AFAIK they inject
objects into other dictionaries (ie, new builtins?).  Again, not trying to
downplay this too much, but if it is only a problem for Marc's more
esoteric extensions, I don't feel that should hold up an otherwise solid
proposal.

[Chris, I think?]
> > Case-independent namespaces seem to be a minor point,
> > nice to have for interfacing to other products, but then,
> > in a function, I see no benefit in changing the semantics
> > of function locals? The lookup of foreign symbols would

I disagree here.  Consider Alice, and similar projects, where a (arguably
misplaced, but nonetheless) requirement is that the embedded language be
case-insensitive.  Period.  The Alice people are somewhat special in that
they had the resources to change the interpreter's guts.  Most people won't,
and will look for a different language to embed.

Of course, I agree with you for the specific cases you are talking - COM,
Active Scripting etc.  Indeed, everything I would use this for would prefer
to keep the local function semantics identical.

> > Does btw. anybody really want to see case-insensitivity
> > in Python programs? I'm quite happy with it as it is,
> > and I would even force the user to always use the same
> > case style after he has touched an external property
> > once. Example for Excel: You may write "xl.workbooks"
> > in lowercase, but then you have to stay with it.
> > This would keep Python source clean for, say, PyLint.
>
> "No" and "me too" ;-)

I think we are missing the point a little.  If we focus on COM, we may come
up with a different answer.  Indeed, if we are to focus on COM integration
with Python, there are other areas I would prefer to start with :-)

IMO, we should attempt to come up with a more flexible namespace mechanism
that is in the style of Python, and will not noticeably slow down Python.
Then COM etc can take advantage of it - much in the same way that Python's
existing namespace model existed pre-COM, and COM had to take advantage of
what it could!

Of course, a key indicator of the likely success is how well COM _can_ take
advantage of it, and how much Alice could have taken advantage of it - I
can't think of any other yardsticks?

Mark.




From mal at lemburg.com  Mon May  3 09:56:53 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 03 May 1999 09:56:53 +0200
Subject: [Python-Dev] More flexible namespaces.
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <372D56C5.4738DE3D@lemburg.com>

Mark Hammond wrote:
> 
> [Marc]
> > [Disclaimer: I'm not really keen on having the possibility of
> >  letting code execute in arbitrary namespace objects... it would
> >  make code optimizations even less manageable.]
> 
> Good point - although surely that would simply mean (certain) optimisations
> can't be performed for code executing in that environment?  How to detect
> this at "optimization time" may be a little difficult :-)
> 
> However, this is the primary purpose of this thread - to work out _if_ it is
> a good idea, as much as working out _how_ to do it :-)
> 
> > The proposal allows one to use such a proxy to simulate any
> > kind of mapping -- it works much like the __getattr__ hook
> > provided for instances.
> 
> My only problem with Marc's proposal is that there already _is_ an
> established mapping protocol, and this doesn't use it; instead it invents a
> new one with the benefit being potentially less code breakage.

...and that's the key point: you get the intended features and
the core code will not have to be changed in significant ways.
Basically, I think these kinds of core extensions should be done
in generic ways, e.g. by letting the eval/exec machinery accept
subclasses of dictionaries, rather than trying to raise the
abstraction level used and slowing things down in general
just to be able to use the feature on very few occasions.

> And without attempting to sound flippant, I wonder how many extension
> modules will be affected?  Module init code certainly assumes the module
> __dict__ is a dictionary, but none of my code assumes anything about other
> namespaces.  Marc's extensions may be a special case, as AFAIK they inject
> objects into other dictionaries (ie, new builtins?).  Again, not trying to
> downplay this too much, but if it is only a problem for Marc's more
> esoteric extensions, I don't feel that should hold up an otherwise solid
> proposal.

My mxTools extension does the assignment in Python, so it wouldn't
be affected. The others only do the usual modinit() stuff.

Before going any further on this thread we may have to ponder a little
more on the objectives that we have. If it's only case-insensitive
lookups then I guess a simple compile time switch exchanging the
implementations of string hash and compare functions would do the
trick. If we're after doing wild things like lookups across
networks, then a more specific approach is needed.

So what is it that we want in 1.6 ?

> [Chris, I think?]
> > > Case-independent namespaces seem to be a minor point,
> > > nice to have for interfacing to other products, but then,
> > > in a function, I see no benefit in changing the semantics
> > > of function locals? The lookup of foreign symbols would
> 
> I disagree here.  Consider Alice, and similar projects, where a (arguably
> misplaced, but nonetheless) requirement is that the embedded language be
> case-insensitive.  Period.  The Alice people are somewhat special in that
> they had the resources to change the interpreter's guts.  Most people won't,
> and will look for a different language to embed.
> 
> Of course, I agree with you for the specific cases you are talking - COM,
> Active Scripting etc.  Indeed, everything I would use this for would prefer
> to keep the local function semantics identical.

As I understand the needs in COM and AS you are talking about
object attributes, right ? Making these case-insensitive is
a job for a proxy or a __getattr__ hack.
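
A minimal sketch of the proxy approach, assuming a plain Python object
stands in for the external (COM-style) object. CaseInsensitiveProxy and
Workbook are hypothetical names invented for the example.

```python
class CaseInsensitiveProxy:
    """Hypothetical wrapper that resolves attribute names regardless of case."""

    def __init__(self, target):
        # Store through __dict__ so the attribute hook is not involved.
        self.__dict__["_target"] = target

    def __getattr__(self, name):
        # Only called when normal attribute lookup on the proxy fails.
        target = self.__dict__["_target"]
        wanted = name.lower()
        for candidate in dir(target):
            if candidate.lower() == wanted:
                return getattr(target, candidate)
        raise AttributeError(name)


class Workbook:
    """Stand-in for an external object with mixed-case property names."""

    def __init__(self):
        self.ActiveSheet = "Sheet1"


xl = CaseInsensitiveProxy(Workbook())
print(xl.activesheet, xl.ACTIVESHEET, xl.ActiveSheet)
```

Function locals and module namespaces keep their normal case-sensitive
semantics; only attribute access on the wrapped object is relaxed.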
 
> > > Does btw. anybody really want to see case-insensitivity
> > > in Python programs? I'm quite happy with it as it is,
> > > and I would even force the use to always use the same
> > > case style after he has touched an external property
> > > once. Example for Excel: You may write "xl.workbooks"
> > > in lowercase, but then you have to stay with it.
> > > This would keep Python source clean for, say, PyLint.
> >
> > "No" and "me too" ;-)
> 
> I think we are missing the point a little.  If we focus on COM, we may come
> up with a different answer.  Indeed, if we are to focus on COM integration
> with Python, there are other areas I would prefer to start with :-)
> 
> IMO, we should attempt to come up with a more flexible namespace mechanism
> that is in the style of Python, and will not noticeably slowdown Python.
> Then COM etc can take advantage of it - much in the same way that Python's
> existing namespace model existed pre-COM, and COM had to take advantage of
> what it could!
> 
> Of course, a key indicator of the likely success is how well COM _can_ take
> advantage of it, and how much Alice could have taken advantage of it - I
> can't think of any other yardsticks?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 242 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From fredrik at pythonware.com  Mon May  3 16:01:10 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 16:01:10 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
Message-ID: <005b01be956d$66d48450$f29b12c2@pythonware.com>

scriptics is positioning tcl as a perl killer:

    http://www.scriptics.com/scripting/perl.html

afaict, unicode and event handling are the two
main thingies missing from python 1.5.

-- unicode: is on its way.

-- event handling: asyncore/asynchat provides an
awesome framework for event-driven socket pro-
gramming.  however, Python still lacks good cross-
platform support for event-driven access to files
and pipes.  are threads good enough, or would it
be cool to have something similar to Tcl's fileevent
stuff in Python?

-- regexps: has anyone compared the new uni-
code-aware regexp package in Tcl with pcre?

comments?

</F>

btw, the rebol folks have reached 2.0:
    http://www.rebol.com/

maybe 1.6 should be renamed to Python 6.0?




From akuchlin at cnri.reston.va.us  Mon May  3 17:14:15 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:14:15 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <005b01be956d$66d48450$f29b12c2@pythonware.com>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
Message-ID: <14125.47524.196878.583460@amarok.cnri.reston.va.us>

Fredrik Lundh writes:
>-- regexps: has anyone compared the new uni-
>code-aware regexp package in Tcl with pcre?

	I looked at it a bit when Tcl 8.1 was in beta; it derives from
Henry Spencer's 1998-vintage code, which seems to try to do a lot of
optimization and analysis.  It may even compile DFAs instead of NFAs
when possible, though it's hard for me to be sure.  This might give it
a substantial speed advantage over engines that do less analysis, but
I haven't benchmarked it.  The code is easy to read, but difficult to
understand because the theory underlying the analysis isn't explained
in the comments; one feels there should be an accompanying paper to
explain how everything works, and it's why I'm not sure if it really
is producing DFAs for some expressions.

	Tcl seems to represent everything as UTF-8 internally, so
there's only one regex engine; there's .  The code is scattered over
more files:

amarok generic>ls re*.[ch]
regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
amarok generic>wc -l re*.[ch]
     742 regc_color.c
     170 regc_cvec.c
    1010 regc_lex.c
     781 regc_locale.c
    1528 regc_nfa.c
    2124 regcomp.c
      85 regcustom.h
     627 rege_dfa.c
      82 regerror.c
      18 regerrs.h
     308 regex.h
     952 regexec.c
      25 regfree.c
      56 regfronts.c
     388 regguts.h
    8896 total
amarok generic>

	This would be an issue for using it with Python, since all
these files would wind up scattered around the Modules directory.  For
comparison, pypcre.c is around 4700 lines of code.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Things need not have happened to be true. Tales and dreams are the
shadow-truths that will endure when mere facts are dust and ashes, and forgot.
    -- Neil Gaiman, _Sandman_ #19: _A Midsummer Night's Dream_




From guido at CNRI.Reston.VA.US  Mon May  3 17:32:09 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 11:32:09 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:14:15 EDT."
             <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com>  
            <14125.47524.196878.583460@amarok.cnri.reston.va.us> 
Message-ID: <199905031532.LAA05617@eric.cnri.reston.va.us>

> 	I looked at it a bit when Tcl 8.1 was in beta; it derives from
> Henry Spencer's 1998-vintage code, which seems to try to do a lot of
> optimization and analysis.  It may even compile DFAs instead of NFAs
> when possible, though it's hard for me to be sure.  This might give it
> a substantial speed advantage over engines that do less analysis, but
> I haven't benchmarked it.  The code is easy to read, but difficult to
> understand because the theory underlying the analysis isn't explained
> in the comments; one feels there should be an accompanying paper to
> explain how everything works, and it's why I'm not sure if it really
> is producing DFAs for some expressions.
> 
> 	Tcl seems to represent everything as UTF-8 internally, so
> there's only one regex engine; there's .

Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
point the regex engine was compiled twice, once for 8-bit chars and
once for 16-bit chars.  But this may have changed.

I've noticed that Perl is taking the same position (everything is
UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
from 8-bit bytes.  Python is currently in the Java camp.  This might
be a good time to make sure that we're still convinced that this is
the right thing to do!

> The code is scattered over
> more files:
> 
> amarok generic>ls re*.[ch]
> regc_color.c    regc_locale.c   regcustom.h     regerrs.h       regfree.c
> regc_cvec.c     regc_nfa.c      rege_dfa.c      regex.h         regfronts.c
> regc_lex.c      regcomp.c       regerror.c      regexec.c       regguts.h
> amarok generic>wc -l re*.[ch]
>      742 regc_color.c
>      170 regc_cvec.c
>     1010 regc_lex.c
>      781 regc_locale.c
>     1528 regc_nfa.c
>     2124 regcomp.c
>       85 regcustom.h
>      627 rege_dfa.c
>       82 regerror.c
>       18 regerrs.h
>      308 regex.h
>      952 regexec.c
>       25 regfree.c
>       56 regfronts.c
>      388 regguts.h
>     8896 total
> amarok generic>
> 
> 	This would be an issue for using it with Python, since all
> these files would wind up scattered around the Modules directory.  For
> comparison, pypcre.c is around 4700 lines of code.

I'm sure that if it's good code, we'll find a way.  Perhaps a more
interesting question is whether it is Perl5 compatible.  I contacted
Henry Spencer at the time and he was willing to let us use his code.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From akuchlin at cnri.reston.va.us  Mon May  3 17:56:46 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon,  3 May 1999 11:56:46 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
References: <000e01be94ff$4047ef20$0801a8c0@bobcat>
	<005b01be956d$66d48450$f29b12c2@pythonware.com>
	<14125.47524.196878.583460@amarok.cnri.reston.va.us>
	<199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <14125.49911.982236.754340@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Hmm...  I looked when Tcl 8.1 was in alpha, and I *think* that at that 
>point the regex engine was compiled twice, once for 8-bit chars and
>once for 16-bit chars.  But this may have changed.

	It doesn't seem to currently; the code in tclRegexp.c looks
like this:

    /* Remember the UTF-8 string so Tcl_RegExpRange() can convert the
     * matches from character to byte offsets.
     */
    regexpPtr->string = string;
    Tcl_DStringInit(&stringBuffer);
    uniString = Tcl_UtfToUniCharDString(string, -1, &stringBuffer);
    numChars = Tcl_DStringLength(&stringBuffer) / sizeof(Tcl_UniChar);
    /* Perform the regexp match. */
    result = TclRegExpExecUniChar(interp, re, uniString, numChars, -1,
            ((string > start) ? REG_NOTBOL : 0));

	ISTR the Spencer engine does, however, define a small and
large representation for NFAs and have two versions of the engine, one
for each representation.  Perhaps that's what you're thinking of.

>I've noticed that Perl is taking the same position (everything is
>UTF-8 internally).  On the other hand, Java distinguishes 16-bit chars 
>from 8-bit bytes.  Python is currently in the Java camp.  This might
>be a good time to make sure that we're still convinced that this is
>the right thing to do!

	I don't know.  There's certainly the fundamental dichotomy
that strings are sometimes used to represent characters, where
changing encodings on input and output is reasonable, and sometimes
used to hold chunks of binary data, where any changes are incorrect.
Perhaps Paul Prescod is right, and we should try to get some other
data type (array.array()) for holding binary data, as distinct from
strings.
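
A minimal sketch of that dichotomy, written against a modern Python 3
(where text and bytes are already separate types, which was not the case
in 1.5). The sample values are made up for illustration.

```python
import array

# Text: the same characters can travel in different encodings, so
# re-encoding on input and output is harmless.
text = "caf\u00e9"
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")
same_text = as_utf8.decode("utf-8") == as_latin1.decode("latin-1") == text

# Binary data: an array of unsigned bytes is just numbers; treating it
# as encoded text is an error rather than a conversion.
blob = array.array("B", [0xFF, 0xD8, 0xFF, 0xE0])
raw = blob.tobytes()
try:
    raw.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

print(same_text, as_utf8 != as_latin1, decodable)
```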

>I'm sure that if it's good code, we'll find a way.  Perhaps a more
>interesting question is whether it is Perl5 compatible.  I contacted
>Henry Spencer at the time and he was willing to let us use his code.

	Mostly Perl-compatible, though it doesn't look like the 5.005
features are there, and I haven't checked for every single 5.004
feature.  Adding missing features might be problematic, because I
don't really understand what the code is doing at a high level.  Also,
is there a user community for this code?  Do any other projects use
it?  Philip Hazel has been quite helpful with PCRE, an important thing
when making modifications to the code.
 
	Should I make a point of looking at what using the Spencer
engine would entail?  It might not be too difficult (an evening or
two, maybe?) to write a re.py that sat on top of the Spencer code;
that would at least let us do some benchmarking.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In Einstein's theory of relativity the observer is a man who sets out in quest
of truth armed with a measuring-rod. In quantum theory he sets out with a
sieve.
    -- Sir Arthur Eddington





From guido at CNRI.Reston.VA.US  Mon May  3 18:02:22 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Mon, 03 May 1999 12:02:22 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Mon, 03 May 1999 11:56:46 EDT."
             <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>  
            <14125.49911.982236.754340@amarok.cnri.reston.va.us> 
Message-ID: <199905031602.MAA05829@eric.cnri.reston.va.us>

> 	Should I make a point of looking at what using the Spencer
> engine would entail?  It might not be too difficult (an evening or
> two, maybe?) to write a re.py that sat on top of the Spencer code;
> that would at least let us do some benchmarking.

Surely this would be more helpful than weeks of speculative emails --
go for it!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Mon May  3 19:10:55 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:10:55 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat><005b01be956d$66d48450$f29b12c2@pythonware.com><14125.47524.196878.583460@amarok.cnri.reston.va.us><199905031532.LAA05617@eric.cnri.reston.va.us> <14125.49911.982236.754340@amarok.cnri.reston.va.us>
Message-ID: <005801be9588$7ad0fcc0$f29b12c2@pythonware.com>

> Also, is there a user community for this code?

how about comp.lang.tcl ;-)

</F>




From fredrik at pythonware.com  Mon May  3 19:15:00 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 May 1999 19:15:00 +0200
Subject: [Python-Dev] Why Foo is better than Baz
References: <000e01be94ff$4047ef20$0801a8c0@bobcat> <005b01be956d$66d48450$f29b12c2@pythonware.com> <14125.47524.196878.583460@amarok.cnri.reston.va.us> <199905031532.LAA05617@eric.cnri.reston.va.us>             <14125.49911.982236.754340@amarok.cnri.reston.va.us>  <199905031602.MAA05829@eric.cnri.reston.va.us>
Message-ID: <005901be9588$7af59bc0$f29b12c2@pythonware.com>

talking about regexps, here's another thing that
would be quite nice to have in 1.6 (available from
the Python level, that is).  or is it already in there
somewhere?

</F>

...

http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=464362873

Tcl 8.1b3 Request:  Generated by Scriptics' bug entry form at

Submitted by:  Frederic BONNET
OperatingSystem:  Windows 98
CustomShell:  Applied patch to the regexp engine (the exec part)
Synopsis:  regexp improvements

DesiredBehavior:
    As previously requested by Don Libes:
    
    > I see no way for Tcl_RegExpExec to indicate "could match" meaning
    > "could match if more characters arrive that were suitable for a
    > match".  This is required for a class of applications involving
    > matching on a stream required by Expect's interact command.  Henry
    > assured me that this facility would be in the engine (I'm not the only
    > one that needs it).  Note that it is not sufficient to add one more
    > return value to Tcl_RegExpExec (i.e., 2) because one needs to know
    > both if something matches now and can match later.  I recommend
    > another argument (canMatch *int) be added to Tcl_RegExpExec.

/patch info follows/

...




From bwarsaw at cnri.reston.va.us  Tue May  4 00:28:23 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Mon, 3 May 1999 18:28:23 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
Message-ID: <14126.8967.793734.892670@anthem.cnri.reston.va.us>

I've been using Jitterbug for a couple of weeks now as my bug database
for Mailman and JPython.  So it was easy enough for me to set up a
database for Python bug reports.  Guido is in the process of tailoring 
the Jitterbug web interface to his liking and will announce it to the
appropriate forums when he's ready.

In the meantime, I've created YAML that you might be interested in.
All bug reports entered into Jitterbug will be forwarded to
python-bugs-list at python.org.  You are invited to subscribe to the list 
by visiting

    http://www.python.org/mailman/listinfo/python-bugs-list

Enjoy,
-Barry



From jeremy at cnri.reston.va.us  Tue May  4 00:30:10 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon,  3 May 1999 18:30:10 -0400 (EDT)
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
References: <14126.8967.793734.892670@anthem.cnri.reston.va.us>
Message-ID: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>

Pretty low volume list, eh?



From MHammond at skippinet.com.au  Tue May  4 01:28:39 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Tue, 4 May 1999 09:28:39 +1000
Subject: [Python-Dev] New mailing list: python-bugs-list
In-Reply-To: <14126.9061.558631.437892@bitdiddle.cnri.reston.va.us>
Message-ID: <000701be95bc$ad0b45e0$0801a8c0@bobcat>

ha - we wish.  More likely to be full of detailed bug reports about how 1/2
!= 0.5, or that "def foo(baz=[])" is buggy, etc :-)

Mark.

> Pretty low volume list, eh?




From tim_one at email.msn.com  Tue May  4 07:16:17 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:16:17 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905031532.LAA05617@eric.cnri.reston.va.us>
Message-ID: <000701be95ed$3d594180$dca22299@tim>

[Guido & Andrew on Tcl's new regexp code]
> I'm sure that if it's good code, we'll find a way.  Perhaps a more
> interesting question is whether it is Perl5 compatible.  I contacted
> Henry Spencer at the time and he was willing to let us use his code.

Haven't looked at the code, but did read the manpage just now:

    http://www.scriptics.com/man/tcl8.1/TclCmd/regexp.htm

WRT Perl5 compatibility, it sez:

    Incompatibilities of note include `\b', `\B', the lack of special
    treatment for a trailing newline, the addition of complemented
    bracket expressions to the things affected by newline-sensitive
    matching, the restrictions on parentheses and back references in
    lookahead constraints, and the longest/shortest-match (rather than
    first-match) matching semantics.

So some gratuitous differences, and maybe a killer:  Guido hasn't had much
kind to say about "longest" (aka POSIX) matching semantics.  An example from
the page:

    (week|wee)(night|knights)
    matches all ten characters of `weeknights'

which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
'night'.

It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
is correct; indeed, it's a pain to get that behavior any other way!
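
The first-match behaviour can be checked with Python's own re module; a
minimal sketch follows. The second pattern is not the Tcl engine, only
the same alternatives reordered by hand so that a first-match engine
happens to find the ten-character match.

```python
import re

# Python/Perl semantics: take the first alternative that lets the
# overall match succeed, even if a longer overall match exists.
first = re.match(r"(week|wee)(night|knights)", "weeknights")
print(first.group(0), first.groups())      # weeknight ('week', 'night')

# Reordering the alternatives by hand yields the match that the
# longest-match rule would report for the original pattern.
longest = re.match(r"(wee|week)(knights|night)", "weeknights")
print(longest.group(0), longest.groups())  # weeknights ('wee', 'knights')
```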

otoh-it's-potentially-very-much-faster-ly y'rs  - tim





From tim_one at email.msn.com  Tue May  4 07:51:01 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 4 May 1999 01:51:01 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <000701be95ed$3d594180$dca22299@tim>
Message-ID: <000901be95f2$195556c0$dca22299@tim>

[Tim]
> ...
> It's the *natural* semantics if Andrew's suspicion that it's
> compiling a DFA is correct ...

More from the man page:

    AREs report the longest/shortest match for the RE, rather than
    the first found in a specified search order. This may affect some
    RREs which were written in the expectation that the first match
    would be reported. (The careful crafting of RREs to optimize the
    search order for fast matching is obsolete (AREs examine all possible
    matches in parallel, and their performance is largely insensitive to
    their complexity) but cases where the search order was exploited to
    deliberately find a match which was not the longest/shortest will
    need rewriting.)

Nails it, yes?  Now, in 10 seconds, try to remember a regexp where this
really matters <wink>.

Note in passing that IDLE's colorizer regexp *needs* to search for
triple-quoted strings before single-quoted ones, else the P/P semantics
would consider """ to be an empty single-quoted string followed by a double
quote.  This isn't a case where it matters in a bad way, though!  The
"longest" rule picks the correct alternative regardless of the order in
which they're written.
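
A minimal sketch of the ordering issue under Python's first-match
semantics. The two patterns are simplified stand-ins, not IDLE's actual
colorizer regexp.

```python
import re

# Simplified string-literal patterns (assumptions for illustration).
TRIPLE = r'"""(?:.|\n)*?"""'
SINGLE = r'"[^"\n]*"'

triple_first = re.compile(TRIPLE + "|" + SINGLE)
single_first = re.compile(SINGLE + "|" + TRIPLE)

source = '"""doc"""'

# Correct order: the whole triple-quoted string is one token.
print(triple_first.match(source).group(0))   # """doc"""

# Wrong order: the first two quotes are taken as an empty string.
print(single_first.match(source).group(0))   # ""
```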

at-least-in-that-specific-regex<0.1-wink>-ly y'rs  - tim





From guido at CNRI.Reston.VA.US  Tue May  4 14:26:04 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 08:26:04 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 01:16:17 EDT."
             <000701be95ed$3d594180$dca22299@tim> 
References: <000701be95ed$3d594180$dca22299@tim> 
Message-ID: <199905041226.IAA07627@eric.cnri.reston.va.us>

[Tim]
> So some gratuitous differences, and maybe a killer:  Guido hasn't had much
> kind to say about "longest" (aka POSIX) matching semantics.
> 
> An example from the page:
> 
>     (week|wee)(night|knights)
>     matches all ten characters of `weeknights'
> 
> which means it matched 'wee' and 'knights'; Python/Perl match 'week' and
> 'night'.
> 
> It's the *natural* semantics if Andrew's suspicion that it's compiling a DFA
> is correct; indeed, it's a pain to get that behavior any other way!

Possibly contradicting what I once said about DFAs (I have no idea
what I said any more :-): I think we shouldn't be hung up about the
subtleties of DFA vs. NFA; for most people, the Perl-compatibility
simply means that they can use the same metacharacters.  My guess is
that people don't so much translate long Perl regexp's to Python but
simply transport their (always incomplete -- Larry Wall *wants* it
that way :-) knowledge of Perl regexps to Python.  My meta-guess is
that this is also Henry Spencer's and John Ousterhout's guess.  As for
Larry Wall, I guess he really doesn't care :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at cnri.reston.va.us  Tue May  4 18:14:41 1999
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Tue,  4 May 1999 12:14:41 -0400 (EDT)
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
References: <000701be95ed$3d594180$dca22299@tim>
	<199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <14127.6410.646122.342115@amarok.cnri.reston.va.us>

Guido van Rossum writes:
>Possibly contradicting what I once said about DFAs (I have no idea
>what I said any more :-): I think we shouldn't be hung up about the
>subtleties of DFA vs. NFA; for most people, the Perl-compatibility
>simply means that they can use the same metacharacters.  My guess is

	I don't like slipping in such a change to the semantics with
no visible change to the module name or interface.  On the other hand,
if it's not NFA-based, then it can provide POSIX semantics without
danger of taking exponential time to determine the longest match.
BTW, there's an interesting reference, I assume to this code, in
_Mastering Regular Expressions_; Spencer is quoted on page 121 as
saying it's "at worst quadratic in text size.".

	Anyway, we can let it slide until a Python interface gets written.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
In the black shadow of the Baba Yaga babies screamed and mothers miscarried;
milk soured and men went mad.
    -- In SANDMAN #38: "The Hunt"




From guido at CNRI.Reston.VA.US  Tue May  4 18:19:06 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 04 May 1999 12:19:06 -0400
Subject: [Python-Dev] Why Foo is better than Baz
In-Reply-To: Your message of "Tue, 04 May 1999 12:14:41 EDT."
             <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
References: <000701be95ed$3d594180$dca22299@tim> <199905041226.IAA07627@eric.cnri.reston.va.us>  
            <14127.6410.646122.342115@amarok.cnri.reston.va.us> 
Message-ID: <199905041619.MAA08408@eric.cnri.reston.va.us>

> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

Not sure if that was the same code -- this is *new* code, not
Spencer's old code.  I think Friedl's book is older than the current
code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Wed May  5 07:37:02 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 01:37:02 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <199905041226.IAA07627@eric.cnri.reston.va.us>
Message-ID: <000701be96b9$4e434460$799e2299@tim>

I've consistently found that the best way to kill a thread is to rename it
accurately <wink>.

Agree w/ Guido that few people really care about the differing semantics.

Agree w/ Andrew that it's bad to pull a semantic switcheroo at this stage
anyway:  code will definitely break.  Like

    \b(?:
        (?P<keyword>and|if|else|...) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b

The (special)|(general) idiom relies on left-to-right match-and-out
searching of alternatives to do its job correctly.  Not to mention that \b
is not a word-boundary assertion in the new pkg (talk about pointlessly
irritating differences!  at least this one could be easily hidden via
brainless preprocessing).
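
A minimal sketch of how the (special)|(general) idiom depends on
alternative order in Python's re. The keyword list is shortened to three
entries for illustration, and the reversed pattern exists only to show
what goes wrong.

```python
import re

token = re.compile(r"""
    \b(?:
        (?P<keyword>and|if|else) |
        (?P<identifier>[a-zA-Z_]\w*)
    )\b
""", re.VERBOSE)

# Same alternatives, general case first (deliberately wrong order).
reversed_token = re.compile(r"""
    \b(?:
        (?P<identifier>[a-zA-Z_]\w*) |
        (?P<keyword>and|if|else)
    )\b
""", re.VERBOSE)


def classify(pattern, word):
    """Return (group name, matched text) for the token at the start of word."""
    m = pattern.match(word)
    return m.lastgroup, m.group(0)


print(classify(token, "if"))            # ('keyword', 'if')
print(classify(token, "iffy"))          # ('identifier', 'iffy')
print(classify(reversed_token, "if"))   # ('identifier', 'if')
```

For "if" both alternatives match the same two characters, so a rule that
only compares match lengths cannot prefer the keyword; the left-to-right
order is what breaks the tie.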

Over the long run, moving to a DFA locks Python out of the directions Perl
is *moving*, namely embedding all sorts of runtime gimmicks in regexps that
exploit knowing the "state of the match so far".  DFAs don't work that way.
I don't mind losing those possibilities, because I think the regexp
sublanguage is strained beyond its limits already.  But that's a decision
with Big Consequences, so deserves some thought.

I'd definitely like the (sometimes dramatically) increased speed a DFA can
offer (btw, this code appears to use a lazily-generated DFA, to avoid the
exponential *compile*-time a straightforward DFA implementation can
suffer -- the code is very complex and lacks any high-level internal docs,
so we better hope Henry stays in love with it <0.5 wink>).

> ...
> My guess is that people don't so much translate long Perl regexp's
> to Python but simply transport their (always incomplete -- Larry Wall
> *wants* it that way :-) knowledge of Perl regexps to Python.

This is directly proportional to the number of feeble CGI programmers Python
attracts <wink>.  The good news is that they wouldn't know an NFA from a DFA
if Larry bit Henry on the ass ...

> My meta-guess is that this is also Henry Spencer's and John
> Ousterhout's guess.

I think Spencer strongly favors DFA semantics regardless of fashion, and
Ousterhout is a pragmatist.  So I trust JO's judgment more <0.9 wink>.

> As for Larry Wall, I guess he really doesn't care :-)

I expect he cares a lot!  Because a DFA would prevent Perl from going even
more insane in its present direction.


About the age of the code, postings to comp.lang.tcl have Henry saying he
was working on the alpha version intensely as recently as December ('98).
A few complaints about the alpha release trickled in, about regexp compile
speed and regexp matching speed in specific cases.  Perhaps paradoxically,
the latter were about especially simple regexps with long fixed substrings
(where this mountain of sophisticated machinery is likely to get beat cold
by an NFA with some fixed-substring lookahead smarts -- which latter Henry
intended to graft into this pkg too).

[Andrew]
> BTW, there's an interesting reference, I assume to this code, in
> _Mastering Regular Expressions_; Spencer is quoted on page 121 as
> saying it's "at worst quadratic in text size.".

[Guido]
> Not sure if that was the same code -- this is *new* code, not
> Spencer's old code.  I think Friedl's book is older than the current
> code.

I expect this is an invariant, though:  it's not natural for a DFA to know
where subexpression matches begin and end, and there's a pile of xxx_dissect
functions in regexec.c that use what strongly appear to be worst-case
quadratic-time algorithms for figuring that out after it's known that the
overall expression has *a* match.  Expect too, but don't know, that only
pathological cases are actually expensive.


Question:  has this package been released in any other context, or is it
unique to Tcl?  I searched in vain for an announcement (let alone code) from
Henry, or any discussion of this code outside the Tcl world.

whatever-happens-i-vote-we-let-them-debug-it<wink>-ly y'rs  - tim





From gstein at lyra.org  Wed May  5 08:22:20 1999
From: gstein at lyra.org (Greg Stein)
Date: Tue, 4 May 1999 23:22:20 -0700 (PDT)
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <000701be96b9$4e434460$799e2299@tim>
Message-ID: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>

On Wed, 5 May 1999, Tim Peters wrote:
>...
> Question:  has this package been released in any other context, or is it
> unique to Tcl?  I searched in vain for an announcement (let alone code) from
> Henry, or any discussion of this code outside the Tcl world.

Apache uses it.

However, the Apache guys have considered possibly updating the thing. I
gather that they have a pretty old snapshot. Another guy mentioned PCRE
and I pointed out that Python uses it for its regex support. In other
words, if Apache *does* update the code, then it may be that Apache will
drop the HS engine in favor of PCRE.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/





From Ivan.Porres at abo.fi  Wed May  5 10:29:21 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 11:29:21 +0300
Subject: [Python-Dev] Python for Small Systems patch
Message-ID: <37300161.8DFD1D7F@abo.fi>

Python for Small Systems is a minimal version of the python interpreter,
intended to run on small embedded systems with a limited amount of
memory. 

Since there is some interest in the newsgroup, we have decided to release
an alpha version of the patch. You can download the patch from the
following page: 

http://www.abo.fi/~iporres/python

There is no documentation about the changes, but I guess that it is not
so difficult to figure out what Raul has been doing. 

There are some simple examples in the Demo/hitachi directory. The
configure scripts are broken. We plan to modify the configure scripts 
for cross-compilation. We are still testing, cleaning
and trying to reduce the memory requirements of the patched interpreter.
We also plan to write some documentation.

Please send comments to Raul (rparra at abo.fi) or to me (iporres at abo.fi),

Regards,
Ivan


-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033   
Lemminkäinengatan 14A                             
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 13:52:24 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 13:52:24 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi>
Message-ID: <373030F8.21B73451@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Python for Small Systems is a minimal version of the python interpreter,
> intended to run on small embedded systems with a limited amount of
> memory.
> 
> Since there is some interest in the newsgroup, we have decided to release
> an alpha version of the patch. You can download the patch from the
> following page:
> 
> http://www.abo.fi/~iporres/python
> 
> There is no documentation about the changes, but I guess that it is not
> so difficult to figure out what Raul has been doing.

Ivan,
small Python is a very interesting thing,
thanks for the preview.

But, aren't 12600 lines of diff a little too much
to call it "not difficult to figure out"? :-)

The very last line was indeed helpful:

+++ Pss/miniconfigure	Tue Mar 16 16:59:42 1999
@@ -0,0 +1 @@
+./configure --prefix="/home/rparra/python/Python-1.5.1"
--without-complex --without-float --without-long --without-file
--without-libm --without-libc --without-fpectl --without-threads
--without-dec-threads --with-libs=

But I'd be interested in a brief list
of which other features are out, and even more which
structures were changed. Would that be possible?

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From Ivan.Porres at abo.fi  Wed May  5 15:17:17 1999
From: Ivan.Porres at abo.fi (Ivan Porres Paltor)
Date: Wed, 05 May 1999 16:17:17 +0300
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com>
Message-ID: <373044DD.FE4499E@abo.fi>

Christian Tismer wrote:
> Ivan,
> small Python is a very interesting thing,
> thanks for the preview.
> 
> But, aren't 12600 lines of diff a little too much
> to call it "not difficult to figure out"? :-)

Raul Parra (rpb), the author of the patch, got the "source scissors"
(#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
embedded system with some RAM, no keyboard, no screen and no OS. An
example application can be a printer where the print jobs are python
bytecompiled scripts (instead of postscript).

We plan to write some documentation about the patch. Meanwhile, here are
some of the changes:

WITHOUT_PARSER, WITHOUT_COMPILER
Defining WITHOUT_PARSER removes the parser. This has a lot of
implications (no eval() !) but saves a lot of memory. The interpreter
can only execute byte-compiled scripts, that is PyCodeObjects. 

Most embedded processors have poor floating point capabilities. (They
cannot compete with DSPs):

WITHOUT-COMPLEX
Removes support for complex numbers

WITHOUT-LONG
Removes long numbers

WITHOUT-FLOAT
Removes floating point numbers

Dependencies on the OS:

WITHOUT-FILE
Removes file objects. No file, no print, no input, no interactive
prompt. This is not too bad in a device without hard disk, keyboard or
screen...

WITHOUT-GETPATH
Removes dependencies on the OS path. (Probably this change should be
integrated with WITHOUT-FILE)

These changes render most of the standard modules unusable.
There are no fundamental changes to the interpreter, just cut and cut....
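
As an illustration of what "only byte-compiled scripts" means in
practice, here is a minimal sketch written against a present-day
Python 3 rather than the patched 1.5.1 interpreter. The split into a
"host" step and a "target" step, and the names used, are assumptions
made for the example; the patch itself may load code objects
differently.

```python
import marshal

# On the development host: compile source text to a code object and
# serialize it. This is the only step that needs the parser and compiler.
source = "result = sum(n * n for n in range(5))"
code = compile(source, "<job>", "exec")
payload = marshal.dumps(code)      # bytes that could be shipped to a device

# On the target: no source text is involved any more. The code object is
# rebuilt from the bytes and handed straight to the interpreter.
loaded = marshal.loads(payload)
namespace = {}
exec(loaded, namespace)
print(namespace["result"])
```

The marshal format is specific to the Python version that wrote it, so
in a setup like this the host and the target would have to agree on the
interpreter version.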

Ivan
-- 
Ivan Porres Paltor                    Turku Centre for Computer Science
Åbo Akademi, Department of Computer Science  Phone: +358-2-2154033
Lemminkäinengatan 14A
FIN-20520 Turku - Finland                    http://www.abo.fi/~iporres



From tismer at appliedbiometrics.com  Wed May  5 15:31:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 05 May 1999 15:31:05 +0200
Subject: [Python-Dev] Python for Small Systems patch
References: <37300161.8DFD1D7F@abo.fi> <373030F8.21B73451@appliedbiometrics.com> <373044DD.FE4499E@abo.fi>
Message-ID: <37304819.AD636B67@appliedbiometrics.com>


Ivan Porres Paltor wrote:
> 
> Christian Tismer wrote:
> > Ivan,
> > small Python is a very interesting thing,
> > thanks for the preview.
> >
> > But, aren't 12600 lines of diff a little too much
> > to call it "not difficult to figure out"? :-)
> 
> Raul Parra (rpb), the author of the patch, got the "source scissors"
> (#ifndef WITHOUT... #endif) and cut the interpreter until it fitted in an
> embedded system with some RAM, no keyboard, no screen and no OS. An
> example application can be a printer where the print jobs are python
> bytecompiled scripts (instead of postscript).
> 
> We plan to write some documentation about the patch. Meanwhile, here are
> some of the changes:

Many thanks, this is really interesting

> These changes render most of the standard modules unusable.
> There are no fundamental changes to the interpreter, just cut and cut....

I see. A last thing which I'm curious about is the executable
size. If this can be compared to a Windows dll at all. Did you 
compile without the changes for your target as well? 
How is the ratio? The python15.dll file contains everything
of core Python and is about 560 KB large.
If your engine goes down to, say below 200 KB, this could
be a great thing for embedding Python into other apps.

ciao & thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Wed May  5 16:55:40 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Wed, 5 May 1999 10:55:40 -0400 (EDT)
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
References: <199905041226.IAA07627@eric.cnri.reston.va.us>
	<000701be96b9$4e434460$799e2299@tim>
Message-ID: <14128.23532.499380.835737@anthem.cnri.reston.va.us>

>>>>> "TP" == Tim Peters <tim_one at email.msn.com> writes:

    TP> Over the long run, moving to a DFA locks Python out of the
    TP> directions Perl is *moving*, namely embedding all sorts of
    TP> runtime gimmicks in regexps that exploit knowing the "state of
    TP> the match so far".  DFAs don't work that way.  I don't mind
    TP> losing those possibilities, because I think the regexp
    TP> sublanguage is strained beyond its limits already.  But that's
    TP> a decision with Big Consequences, so deserves some thought.

I know zip about the internals of the various regexp packages.  But as
far as the Python level interface, would it be feasible to support
both as underlying regexp engines underneath re.py?  The idea would be 
that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
Then all the rest of the magic happens behind the scenes, with
appropriate exceptions thrown if there are syntax mismatches in the
regexp that can't be worked around by preprocessors, etc.

Or would that be more confusing than yet another different regexp
module?
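
A rough sketch of how such a front end might look, written in
present-day Python. Everything here is hypothetical: the `engine`
keyword stands in for the suggested flag, the registry and
`UnsupportedSyntax` are invented for the example, and the "dfa" engine
is only a stand-in that wraps the standard `re` module and refuses
backreferences (which a pure DFA cannot match).

```python
import re

# Hypothetical registry mapping an engine name to a compile function.
_ENGINES = {"default": re.compile}


class UnsupportedSyntax(ValueError):
    """Raised when a pattern uses syntax the chosen engine cannot handle."""


def register_engine(name, compile_func):
    _ENGINES[name] = compile_func


def compile_pattern(pattern, flags=0, engine="default"):
    """Compile a pattern with the engine registered under the given name."""
    try:
        compile_func = _ENGINES[engine]
    except KeyError:
        raise ValueError("unknown regexp engine: %r" % (engine,))
    return compile_func(pattern, flags)


def _dfa_compile(pattern, flags=0):
    # Stand-in for a DFA engine: a crude check that rejects backreferences
    # such as \1, then falls back on the standard engine to do the matching.
    if re.search(r"\\[1-9]", pattern):
        raise UnsupportedSyntax("backreferences need a backtracking engine")
    return re.compile(pattern, flags)


register_engine("dfa", _dfa_compile)

print(compile_pattern(r"a+b", engine="dfa").match("aab") is not None)
try:
    compile_pattern(r"(a+)b\1", engine="dfa")
except UnsupportedSyntax as exc:
    print("rejected:", exc)
```

Callers that never pass `engine` would keep the current behaviour, which
is the main attraction of a selector over a separate module.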

-Barry



From tim_one at email.msn.com  Wed May  5 17:55:20 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 5 May 1999 11:55:20 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code
In-Reply-To: <Pine.LNX.3.95.990504231846.29915A-100000@ns1.lyra.org>
Message-ID: <000601be970f$adef5740$a59e2299@tim>

[Tim]
> Question:  has this package [Tcl's 8.1 regexp support] been released in
> any other context, or is it unique to Tcl?  I searched in vain for an
> announcement (let alone code) from Henry, or any discussion of this code
> outside the Tcl world.

[Greg Stein]
> Apache uses it.
>
> However, the Apache guys have considered possibly updating the thing. I
> gather that they have a pretty old snapshot. Another guy mentioned PCRE
> and I pointed out that Python uses it for its regex support. In other
> words, if Apache *does* update the code, then it may be that Apache will
> drop the HS engine in favor of PCRE.

Hmm.  I just downloaded the Apache 1.3.4 source to check on this, and it
appears to be using a lightly massaged version of Spencer's old (circa
'92-'94) just-POSIX regexp package.  Henry has been distributing regexp pkgs
for a loooong time <wink>.

The Tcl 8.1 regexp pkg is much hairier.  If the Apache folk want to switch
in order to get the Perl regexp syntax extensions, this Tcl version is worth
looking at too.  If they want to switch for some other reason, it would be
good to know what that is!

The base pkg Apache uses is easily available all over the web; the pkg Tcl
8.1 is using I haven't found anywhere except in the Tcl download (which is
why I'm wondering about it -- so far, it doesn't appear to be distributed by
Spencer himself, in a non-Tcl-customized form).

looks-like-an-entirely-new-pkg-to-me-ly y'rs  - tim





From beazley at cs.uchicago.edu  Wed May  5 18:54:45 1999
From: beazley at cs.uchicago.edu (David Beazley)
Date: Wed, 5 May 1999 11:54:45 -0500 (CDT)
Subject: [Python-Dev] My (possibly delusional) book project
Message-ID: <199905051654.LAA11410@tartarus.cs.uchicago.edu>

Although this is a little off-topic for the developer list, I want to
fill people in on a new Python book project.  A few months ago, 
I was approached about doing a new Python reference book and I've
since decided to proceed with the project (after all, an increased
presence at the bookstore is probably a good thing :-).

In any event, my "vision" for this book is to take the material in the
Python tutorial, language reference, library reference, and extension
guide and squeeze it into a compact book no longer than 300 pages (and
hopefully without having to use a 4-point font).  Actually, what I'm
really trying to do is write something in a style similar to the K&R C
Programming book (very terse, straight to the point, and technically
accurate). The book's target audience is experienced/expert
programmers.

With this said, I would really like to get feedback from the developer
community about this project in a few areas.  First, I want to make
sure the language reference is in sync with the latest version of
Python, that it is as accurate as possible, and that it doesn't leave
out any important topics or recent developments.  Second, I would be
interested in knowing how to emphasize certain topics (for instance,
should I emphasize class-based exceptions over string-based exceptions
even though most books only cover the latter case?).  The other big
area is the library reference.  Given the size of the library, I'm
going to cut a number of modules out.  However, the choice of what to
cut is not entirely clear (for now, it's a judgment call on my part).

All of the work in progress for this project is online at:

   http://rustler.cs.uchicago.edu/~beazley/essential/reference.html

I would love to get constructive feedback about this from other
developers.  Of course, I'll keep people posted in any case.

Cheers,

Dave




From tim_one at email.msn.com  Thu May  6 07:43:16 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 6 May 1999 01:43:16 -0400
Subject: [Python-Dev] Tcl 8.1's regexp code (was RE: [Python-Dev] Why Foo is better than Baz)
In-Reply-To: <14128.23532.499380.835737@anthem.cnri.reston.va.us>
Message-ID: <000d01be9783$57543940$2ca22299@tim>

[Tim notes that moving to a DFA regexp engine would rule out some future
 aping of Perl mistakes <wink>]

[Barry "The Great Compromiser" Warsaw]
> I know zip about the internals of the various regexp packages.  But as
> far as the Python level interface, would it be feasible to support
> both as underlying regexp engines underneath re.py?  The idea would be
> that you'd add an extra flag (re.PERL / re.TCL ?  re.DFA / re.NFA ?
> re.POSIX / re.USEFUL ? :-) that would select the engine and compiler.
> Then all the rest of the magic happens behind the scenes, with
> appropriate exceptions thrown if there are syntax mismatches in the
> regexp that can't be worked around by preprocessors, etc.
>
> Or would that be more confusing than yet another different regexp
> module?

It depends some on what percentage of the Python distribution Guido wants to
devote to regexp code <0.6 wink>; the Tcl pkg would be the largest block of
code in Modules/, where regexp packages already consume more than anything
else.

It's a lot of delicate, difficult code.  Someone would need to step up and
champion each alternative package.  I haven't asked Andrew lately, but I'd
bet half a buck the thrill of supporting pcre has waned.

If there were competing packages, your suggested interface is fine.  I just
doubt the Python developers will support more than one (Andrew may still be
young, but he can't possibly still be naive enough to sign up for two of
these nightmares <wink>).

i'm-so-old-i-never-signed-up-for-one-ly y'rs  - tim





From rushing at nightmare.com  Thu May 13 08:34:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 12 May 1999 23:34:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905070507.BAA22545@python.org>
References: <199905070507.BAA22545@python.org>
Message-ID: <14138.28243.553816.166686@seattle.nightmare.com>

[list has been quiet, thought I'd liven things up a bit. 8^)]

I'm not sure if this has been brought up before in other forums, but
has there been discussion of separating the Python and C invocation
stacks, (i.e., removing recursive calls to the interpreter) to
facilitate coroutines or first-class continuations?

One of the biggest barriers to getting others to use asyncore/medusa
is the need to program in continuation-passing-style (callbacks,
callbacks to callbacks, state machines, etc...).  Usually there has to
be an overriding requirement for speed/scalability before someone will
even look into it.  And even when you do 'get' it, there are limits to
how inside-out your thinking can go. 8^)

If Python had coroutines/continuations, it would be possible to hide
asyncore-style select()/poll() machinery 'behind the scenes'.  I
believe that Concurrent ML does exactly this...

Other advantages might be restartable exceptions, different threading
models, etc...

-Sam
rushing at nightmare.com
rushing at eGroups.net




From mal at lemburg.com  Thu May 13 10:23:13 1999
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 13 May 1999 10:23:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <373A8BF1.AE124BF@lemburg.com>

rushing at nightmare.com wrote:
> 
> [list has been quiet, thought I'd liven things up a bit. 8^)]

Well, there certainly is enough on the todo list... it's probably
the usual "ain't got no time" thing.

> I'm not sure if this has been brought up before in other forums, but
> has there been discussion of separating the Python and C invocation
> stacks, (i.e., removing recursive calls to the interpreter) to
> facilitate coroutines or first-class continuations?

Wouldn't it be possible to move all the C variables passed to
eval_code() via the execution frame ? AFAIK, the frame is
generated on every call to eval_code() and thus could also
be generated *before* calling it.

> One of the biggest barriers to getting others to use asyncore/medusa
> is the need to program in continuation-passing-style (callbacks,
> callbacks to callbacks, state machines, etc...).  Usually there has to
> be an overriding requirement for speed/scalability before someone will
> even look into it.  And even when you do 'get' it, there are limits to
> how inside-out your thinking can go. 8^)
> 
> If Python had coroutines/continuations, it would be possible to hide
> asyncore-style select()/poll() machinery 'behind the scenes'.  I
> believe that Concurrent ML does exactly this...
> 
> Other advantages might be restartable exceptions, different threading
> models, etc...

Don't know if moving the C stack stuff into the frame objects
will get you the desired effect: what about other things having
state (e.g. connections or files), that are not even touched
by this mechanism ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                            Y2000: 232 days left
Business:                                      http://www.lemburg.com/
Python Pages:                 http://starship.python.net/crew/lemburg/





From rushing at nightmare.com  Thu May 13 11:40:19 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 02:40:19 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373A8BF1.AE124BF@lemburg.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<373A8BF1.AE124BF@lemburg.com>
Message-ID: <14138.38550.89759.752058@seattle.nightmare.com>

M.-A. Lemburg writes:

 > Wouldn't it be possible to move all the C variables passed to
 > eval_code() via the execution frame ? AFAIK, the frame is
 > generated on every call to eval_code() and thus could also
 > be generated *before* calling it.

I think this solves half of the problem.  The C stack is both a value
stack and an execution stack (i.e., it holds variables and return
addresses).  Getting rid of arguments (and a return value!) gets rid
of the need for the 'value stack' aspect.

In aiming for an enter-once, exit-once VM, the thorniest part is to
somehow allow python->c->python calls.  The second invocation could
never save a continuation because its execution context includes a C
frame.  This is a general problem, not specific to Python; I probably
should have thought about it a bit before posting...

 > Don't know if moving the C stack stuff into the frame objects
 > will get you the desired effect: what about other things having
 > state (e.g. connections or files), that are not even touched
 > by this mechanism ?

I don't think either of those cause 'real' problems (i.e., nothing
should crash that assumes an open file or socket), but there may be
other stateful things that might.  I don't think that refcounts would
be a problem - a saved continuation wouldn't be all that different
from an exception traceback.

-Sam

p.s. Here's a tiny VM experiment I wrote a while back, to explain
what I mean by 'stackless':

http://www.nightmare.com/stuff/machine.h
http://www.nightmare.com/stuff/machine.c

Note how OP_INVOKE (the PROC_CLOSURE clause) pushes new context
onto heap-allocated data structures rather than calling the VM
recursively.
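
The same idea can be sketched in a few lines of Python. This is not a
transcription of machine.c; the opcodes, the Frame class and the
factorial program are all made up for the illustration. The point is
only that CALL appends a frame to a heap-allocated list instead of
calling run() again, so the host stack stays flat.

```python
class Frame:
    """One activation record, kept on the heap instead of the host stack."""

    def __init__(self, code, args):
        self.code = code            # list of (opcode, operand) pairs
        self.pc = 0                 # index of the next instruction
        self.stack = []             # value stack private to this frame
        self.locals = list(args)


def run(functions, entry, args):
    frames = [Frame(functions[entry], args)]
    while True:
        frame = frames[-1]
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(arg)
        elif op == "LOAD":
            frame.stack.append(frame.locals[arg])
        elif op == "SUB":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a - b)
        elif op == "MUL":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a * b)
        elif op == "JUMP_IF_ZERO":
            if frame.stack.pop() == 0:
                frame.pc = arg
        elif op == "CALL":
            name, argc = arg
            call_args = [frame.stack.pop() for _ in range(argc)][::-1]
            frames.append(Frame(functions[name], call_args))  # no recursion
        elif op == "RETURN":
            value = frame.stack.pop()
            frames.pop()
            if not frames:
                return value
            frames[-1].stack.append(value)
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# fact(n): return 1 if n == 0 else n * fact(n - 1)
PROGRAM = {
    "fact": [
        ("LOAD", 0), ("JUMP_IF_ZERO", 9),
        ("LOAD", 0), ("LOAD", 0), ("PUSH", 1), ("SUB", None),
        ("CALL", ("fact", 1)), ("MUL", None), ("RETURN", None),
        ("PUSH", 1), ("RETURN", None),
    ],
}

print(run(PROGRAM, "fact", [5]))
```

Because the list of frames is ordinary data, a "continuation" in this
toy is just a copy of that list, which is the property the recursive
interpreter lacks.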




From skip at mojam.com  Thu May 13 13:38:39 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 13 May 1999 07:38:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14138.28243.553816.166686@seattle.nightmare.com>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
Message-ID: <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>

    Sam> I'm not sure if this has been brought up before in other forums,
    Sam> but has there been discussion of separating the Python and C
    Sam> invocation stacks, (i.e., removing recursive calls to the
    Sam> interpreter) to facilitate coroutines or first-class continuations?

I thought Guido was working on that for the mobile agent stuff he was
working on at CNRI.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From bwarsaw at cnri.reston.va.us  Thu May 13 17:10:52 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Thu, 13 May 1999 11:10:52 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
Message-ID: <14138.60284.584739.711112@anthem.cnri.reston.va.us>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> I thought Guido was working on that for the mobile agent stuff
    SM> he was working on at CNRI.

Nope, we decided that we could accomplish everything we needed without 
this.  We occasionally revisit this but Guido keeps insisting it's a
lot of work for not enough benefit :-)

-Barry



From guido at CNRI.Reston.VA.US  Thu May 13 17:19:10 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 13 May 1999 11:19:10 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 07:38:39 EDT."
             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>  
            <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> 
Message-ID: <199905131519.LAA01097@eric.cnri.reston.va.us>

Interesting topic!  While I'm on the road, a few short notes.

> I thought Guido was working on that for the mobile agent stuff he was
> working on at CNRI.

Indeed.  At least I planned on working on it.  I ended up abandoning
the idea because I expected it would be a lot of work and I never had
the time (same old story indeed).

Sam also hit the nail on the head: the hardest problem is what to do about
all the places where C calls back into Python.

I've come up with two partial solutions: (1) allow for a way to
arrange for a call to be made immediately after you return to the VM
from C; this would take care of apply() at least and a few other
"tail-recursive" cases; (2) invoke a new VM when C code needs a Python
result, requiring it to return.  The latter clearly breaks certain
uses of coroutines but could probably be made to work most of the
time.  Typical use of the 80-20 rule.

And I've just come up with a third solution: a variation on (1) where
you arrange *two* calls: one to Python and then one to C, with the
result of the first.  (And a bit saying whether you want the C call to 
be made even when an exception happened.)
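
A small sketch of solutions (1) and (3), in Python standing in for
what would really be C. All names here (TailCall, vm_call, apply_) are
hypothetical, and the exception bit mentioned above is left out. The
"C" function does not call back into the interpreter; it returns a
request, and the main loop makes the call after the function has
returned.

```python
class TailCall:
    """Request from 'C' code: after I return, call func(*args).

    If 'then' is given, it plays the part of the second, C-level call
    and receives the result of the first one.
    """

    def __init__(self, func, args, then=None):
        self.func = func
        self.args = args
        self.then = then


def vm_call(func, args):
    """Stand-in for the VM main loop: dispatch until a plain value appears."""
    pending = []                      # 'C' continuations awaiting a result
    result = func(*args)
    while True:
        if isinstance(result, TailCall):
            if result.then is not None:
                pending.append(result.then)
            result = result.func(*result.args)
        elif pending:
            result = pending.pop()(result)
        else:
            return result


def apply_(func, args):
    # Stand-in for a C-implemented apply(): hand the call to the loop.
    return TailCall(func, args)


def countdown(n):
    if n == 0:
        return "done"
    return apply_(countdown, (n - 1,))


def square(x):
    return x * x


def c_builtin(x):
    # Variation (3): first call square(x), then a 'C' step adds one.
    return TailCall(square, (x,), then=lambda r: r + 1)


print(vm_call(countdown, (100000,)))
print(vm_call(c_builtin, (4,)))
```

The countdown never nests more than one call deep, which is what makes
the "tail-recursive" cases cheap; anything that needs the result in the
middle of a C function still falls back on solution (2).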

In general, I still think it's a cool idea, but I also still think
that continuations are too complicated for most programmers.  (This
comes from the realization that they are too complicated for me!)
Corollary: even if we had continuations, I'm not sure if this would
take away the resistance against asyncore/asynchat.  Of course I could 
be wrong.

Different suggestion: it would be cool to work on completely
separating out the VM from the rest of Python, through some kind of
C-level API specification.  Two things should be possible with this
new architecture: (1) small platform ports could cut out the
interactive interpreter, the parser and compiler, and certain data
types such as long, complex and files; (2) there could be alternative
pluggable VMs with certain desirable properties such as
platform-specific optimization (Christian, are you listening? :-).

I think the most challenging part might be defining an API for passing 
in the set of supported object types and operations.  E.g. the
EXEC_STMT opcode needs to be implemented in a way that allows
"exec" to be absent from the language.  Perhaps an __exec__ function
(analogous to __import__) is the way to go.  The set of built-in
functions should also be passed in, so that e.g. one can easily leave
out open(), eval() and compile(), complex(), long(), float(), etc.
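
For what it is worth, a present-day CPython already lets the caller
pass in the set of built-ins for a piece of code, which gives a feel
for the idea. The sketch below uses that existing mechanism, not the
C-level API proposed here; make_builtins and its default exclusion list
are invented for the example, and long() is omitted because Python 3
has no such built-in.

```python
import builtins


def make_builtins(excluded=("open", "eval", "compile", "exec",
                            "complex", "float")):
    """Return a built-ins mapping with the excluded names left out."""
    return {name: getattr(builtins, name)
            for name in dir(builtins)
            if name not in excluded}


small = make_builtins()
namespace = {"__builtins__": small}

# Code run in this namespace sees only the reduced set of built-ins.
exec("total = len([1, 2, 3]) + abs(-4)", namespace)
print(namespace["total"])

try:
    exec("open('somefile')", namespace)
except NameError as exc:
    print("not available:", exc)
```

This only hides names; it is not a security boundary, and the object
types behind the hidden names still exist in the interpreter, which is
exactly the part a real VM/API separation would have to address.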

I think it would be ideal if no #ifdefs were needed to remove features
(at least not in the VM code proper).  Fortunately, the VM doesn't
really know about many object types -- frames, functions, methods,
classes, ints, strings, dictionaries, tuples, tracebacks, that may be
all it knows.  (Lists?)

Gotta run,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Thu May 13 21:50:44 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 13 May 1999 21:50:44 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com>             <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>  <199905131519.LAA01097@eric.cnri.reston.va.us>
Message-ID: <01d501be9d79$e4890060$f29b12c2@pythonware.com>

> In general, I still think it's a cool idea, but I also still think
> that continuations are too complicated for most programmers.  (This
> comes from the realization that they are too complicated for me!)

in an earlier life, I used non-preemptive threads (that is,
explicit yields) and co-routines to do some really cool
stuff with very little code.  looks like a stack-less inter-
preter would make it trivial to implement that.

might just be nostalgia, but I think I would give an arm
or two to get that (not necessarily my own, though ;-)
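
To show the flavour of explicit-yield threads, here is a minimal
round-robin scheduler. It is written with generators, a feature Python
gained after this thread, so it is a sketch of the programming style
and not of how a stackless interpreter would implement it. The worker
and run_round_robin names are made up for the example.

```python
from collections import deque


def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield                 # explicit yield: give the scheduler a turn


def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)        # run the task up to its next yield
        except StopIteration:
            continue          # task finished; do not reschedule it
        ready.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Each task keeps its local state between turns without any callbacks,
which is most of what makes this style pleasant.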

</F>




From rushing at nightmare.com  Fri May 14 04:00:09 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Thu, 13 May 1999 19:00:09 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
Message-ID: <14139.30970.644343.612721@seattle.nightmare.com>

Guido van Rossum writes:
  > I've come up with two partial solutions: (1) allow for a way to
  > arrange for a call to be made immediately after you return to the
  > VM from C; this would take care of apply() at least and a few
  > other "tail-recursive" cases; (2) invoke a new VM when C code
  > needs a Python result, requiring it to return.  The latter clearly
  > breaks certain uses of coroutines but could probably be made to
  > work most of the time.  Typical use of the 80-20 rule.

I know this is disgusting, but could setjmp/longjmp 'automagically'
force a 'recursive call' to jump back into the top-level loop?  This
would put some serious restraint on what C called from Python could
do...

I think just about any Scheme implementation has to solve this same
problem... I'll dig through my collection of them for ideas.

  > In general, I still think it's a cool idea, but I also still think
  > that continuations are too complicated for most programmers.  (This
  > comes from the realization that they are too complicated for me!)
  > Corollary: even if we had continuations, I'm not sure if this would
  > take away the resistance against asyncore/asynchat.  Of course I could 
  > be wrong.

Theoretically, you could have a bit of code that looked just like
'normal' imperative code, that would actually be entering and exiting
the context for non-blocking i/o.  If it were done right, the same
exact code might even run under 'normal' threads.

Recently I've written an async server that needed to talk to several
other RPC servers, and a mysql server.  Pseudo-example, with
possibly-async calls in UPPERCASE:

  auth, archive = db.FETCH_USER_INFO (user)
  if verify_login(user,auth):
    rpc_server = self.archive_servers[archive]
    group_info = rpc_server.FETCH_GROUP_INFO (group)
    if valid (group_info):
      return rpc_server.FETCH_MESSAGE (message_number)
    else:
      ...
  else:
    ...

This code in CPS is a horrible, complicated mess; it takes something
like 8 callback methods, variables and exceptions have to be passed
around in 'continuation' objects.  It's hairy because there are three
levels of callback state.  Ugh.

If Python had closures, then it would be a *little* easier, but would
still make the average Pythoneer swoon.  Closures would let you put
the above logic all in one method, but the code would still be
'inside-out'.
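
To make "inside-out" concrete, here is a sketch of the same logic in
callback style using closures, in present-day Python (which has nested
scopes). The three fetch_* functions are hypothetical stand-ins that
invoke their callback immediately; in a real async server they would
return at once and the callback would fire later from the event loop.
Error handling is reduced to passing None.

```python
# Hypothetical stand-ins for the possibly-async calls. Each one takes a
# callback instead of returning its result.
def fetch_user_info(user, callback):
    callback(("secret", "archive-1"))


def fetch_group_info(group, callback):
    callback({"name": group, "valid": True})


def fetch_message(number, callback):
    callback("message #%d" % number)


def handle_request(user, password, group, message_number, done):
    # The control flow reads bottom-up: the first step is the last line.
    def got_user_info(info):
        auth, archive = info
        if auth != password:
            done(None)
            return

        def got_group_info(group_info):
            if not group_info["valid"]:
                done(None)
                return
            fetch_message(message_number, done)

        fetch_group_info(group, got_group_info)

    fetch_user_info(user, got_user_info)


results = []
handle_request("sam", "secret", "python-dev", 42, results.append)
print(results)
```

Everything fits in one function, but each extra step adds a level of
nesting and the order of execution is the reverse of the reading order.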

  > Different suggestion: it would be cool to work on completely
  > separating out the VM from the rest of Python, through some kind of
  > C-level API specification.

I think this is a great idea.  I've been staring at python bytecodes a
bit lately thinking about how to do something like this, for some
subset of Python.

[...]

Ok, we've all seen the 'stick'.  I guess I should give an example of
the 'carrot': I think that a web server built on such a Python could
have the performance/scalability of thttpd, with the
ease-of-programming of Roxen.  As far as I know, there's nothing like
it out there.  Medusa would be put out to pasture. 8^)

-Sam




From guido at CNRI.Reston.VA.US  Fri May 14 14:03:31 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 08:03:31 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Thu, 13 May 1999 19:00:09 PDT."
             <14139.30970.644343.612721@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
            <14139.30970.644343.612721@seattle.nightmare.com> 
Message-ID: <199905141203.IAA01808@eric.cnri.reston.va.us>

> I know this is disgusting, but could setjmp/longjmp 'automagically'
> force a 'recursive call' to jump back into the top-level loop?  This
> would put some serious restraint on what C called from Python could
> do...

Forget about it.  setjmp/longjmp are invitations to problems.  I also
assume that they would interfere badly with C++.

> I think just about any Scheme implementation has to solve this same
> problem... I'll dig through my collection of them for ideas.

Anything that assumes knowledge about how the C compiler and/or the
CPU and OS lay out the stack is a no-no, because it means that the
first thing one has to do for a port to a new architecture is figure
out how the stack is laid out.  Another thread in this list is porting 
Python to microplatforms like PalmOS.  Typically the Scheme hackers
are not afraid to delve deep into the machine, but I refuse to do that
-- I think it's too risky.

>   > In general, I still think it's a cool idea, but I also still think
>   > that continuations are too complicated for most programmers.  (This
>   > comes from the realization that they are too complicated for me!)
>   > Corollary: even if we had continuations, I'm not sure if this would
>   > take away the resistance against asyncore/asynchat.  Of course I could 
>   > be wrong.
> 
> Theoretically, you could have a bit of code that looked just like
> 'normal' imperative code, that would actually be entering and exiting
> the context for non-blocking i/o.  If it were done right, the same
> exact code might even run under 'normal' threads.

Yes -- I remember in 92 or 93 I worked out a way to emulate coroutines
with regular threads.  (I think in cooperation with Steve Majewski.)
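
One way such an emulation can be done, sketched in present-day Python;
this is an assumption about the general technique, not a reconstruction
of the scheme mentioned above. The coroutine body runs in its own
thread, and two queues make sure that only one side runs at a time.
ThreadCoroutine, resume and suspend are names invented for the example.

```python
import queue
import threading

_DONE = object()       # sentinel: the coroutine body has returned


class ThreadCoroutine:
    """Run func(suspend) in a thread that only runs while resume() waits."""

    def __init__(self, func):
        self._wake = queue.Queue()       # caller -> coroutine
        self._values = queue.Queue()     # coroutine -> caller
        self._thread = threading.Thread(
            target=self._run, args=(func,), daemon=True)
        self._thread.start()

    def _run(self, func):
        self._wake.get()                 # block until the first resume()
        func(self._suspend)
        self._values.put(_DONE)

    def _suspend(self, value):
        self._values.put(value)          # hand a value to the caller...
        self._wake.get()                 # ...and sleep until resumed

    def resume(self):
        self._wake.put(None)
        return self._values.get()


def counter(suspend):
    for i in range(3):
        suspend(i)


co = ThreadCoroutine(counter)
seen = []
while True:
    value = co.resume()
    if value is _DONE:
        break
    seen.append(value)
print(seen)
```

The cost is one OS thread per coroutine, which is why this works as an
emulation but does not give the scalability discussed in this thread.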

> Recently I've written an async server that needed to talk to several
> other RPC servers, and a mysql server.  Pseudo-example, with
> possibly-async calls in UPPERCASE:
> 
>   auth, archive = db.FETCH_USER_INFO (user)
>   if verify_login(user,auth):
>     rpc_server = self.archive_servers[archive]
>     group_info = rpc_server.FETCH_GROUP_INFO (group)
>     if valid (group_info):
>       return rpc_server.FETCH_MESSAGE (message_number)
>     else:
>       ...
>   else:
>     ...
> 
> This code in CPS is a horrible, complicated mess; it takes something
> like 8 callback methods, variables and exceptions have to be passed
> around in 'continuation' objects.  It's hairy because there are three
> levels of callback state.  Ugh.

Agreed.

> If Python had closures, then it would be a *little* easier, but would
> still make the average Pythoneer swoon.  Closures would let you put
> the above logic all in one method, but the code would still be
> 'inside-out'.

I forget how this worked :-(

>   > Different suggestion: it would be cool to work on completely
>   > separating out the VM from the rest of Python, through some kind of
>   > C-level API specification.
> 
> I think this is a great idea.  I've been staring at python bytecodes a
> bit lately thinking about how to do something like this, for some
> subset of Python.
> 
> [...]
> 
> Ok, we've all seen the 'stick'.  I guess I should give an example of
> the 'carrot': I think that a web server built on such a Python could
> have the performance/scalability of thttpd, with the
> ease-of-programming of Roxen.  As far as I know, there's nothing like
> it out there.  Medusa would be put out to pasture. 8^)

I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
they do that Apache doesn't?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at pythonware.com  Fri May 14 15:16:13 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 14 May 1999 15:16:13 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>             <14139.30970.644343.612721@seattle.nightmare.com>  <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <001701be9e0b$f1bc4930$f29b12c2@pythonware.com>

> I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
> they do that Apache doesn't?

http://www.roxen.com/

a lean and mean secure web server written in Pike
(http://pike.idonex.se/), from a company here in
Linköping.

</F>




From tismer at appliedbiometrics.com  Fri May 14 17:15:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 14 May 1999 17:15:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us>  
	            <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <373C3E08.FCCB141B@appliedbiometrics.com>


Guido van Rossum wrote:

[setjmp/longjmp -no-no]

> Forget about it.  setjmp/longjmp are invitations to problems.  I also
> assume that they would interfere badly with C++.
> 
> > I think just about any Scheme implementation has to solve this same
> > problem... I'll dig through my collection of them for ideas.
> 
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the scheme Hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.
...

I agree that this is generally bad, even though it's a cakewalk
to do a stack swap on the few (x86-based :) platforms that
I work with. It costs much less than a thread switch.

But on the general issues:
Can the Python-calls-C and C-calls-Python problem just be solved
by turning the whole VM state into a data structure, including
a Python call stack which is independent? Maybe this has been
mentioned already.

This might give a little slowdown, but opens possibilities
like continuation-passing style, and context switches
between different interpreter states would be under direct
control.

Just a little dreaming: Not using threads, but just tiny
interpreter incarnations with local state, and a special
C call or better a new opcode which activates the next
state in some list (of course a Python list).
This would automagically produce ICON iterators (duck)
and coroutines (cover).
If I guess right, continuation passing could be done
by just shifting tiny tuples around. Well, Tim, help me :-)

[closures]

> > I think this is a great idea.  I've been staring at python bytecodes a
> > bit lately thinking about how to do something like this, for some
> > subset of Python.

Lumberjack? How is it going? [to Sam]

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From bwarsaw at cnri.reston.va.us  Fri May 14 17:32:51 1999
From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw)
Date: Fri, 14 May 1999 11:32:51 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
	<001701be9e0b$f1bc4930$f29b12c2@pythonware.com>
Message-ID: <14140.16931.987089.887772@anthem.cnri.reston.va.us>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

    FL> a lean and mean secure web server written in Pike
    FL> (http://pike.idonex.se/), from a company here in
    FL> Linköping.

Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
originally came on board to add Pike support, which has a syntax similar 
enough to C to be easily integrated.  I think I've had as much success 
convincing him to use Python as he's had convincing me to use Pike :-)

-Barry



From gstein at lyra.org  Fri May 14 23:54:02 1999
From: gstein at lyra.org (Greg Stein)
Date: Fri, 14 May 1999 14:54:02 -0700
Subject: [Python-Dev] Roxen (was Re: [Python-Dev] 'stackless' python?)
References: <199905070507.BAA22545@python.org>
		<14138.28243.553816.166686@seattle.nightmare.com>
		<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
		<14138.60284.584739.711112@anthem.cnri.reston.va.us>
		<14139.30970.644343.612721@seattle.nightmare.com>
		<199905141203.IAA01808@eric.cnri.reston.va.us>
		<001701be9e0b$f1bc4930$f29b12c2@pythonware.com> <14140.16931.987089.887772@anthem.cnri.reston.va.us>
Message-ID: <373C9B7A.3676A910@lyra.org>

Barry A. Warsaw wrote:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:
> 
>     FL> a lean and mean secure web server written in Pike
>     FL> (http://pike.idonex.se/), from a company here in
>     FL> Linköping.
> 
> Interesting off-topic Pike connection.  My co-maintainer for CC-Mode
> originally came on board to add Pike support, which has a syntax similar
> enough to C to be easily integrated.  I think I've had as much success
> convincing him to use Python as he's had convincing me to use Pike :-)

<HistoricalNote>

Heh. Pike is an outgrowth of the MUD world's LPC programming language. A
guy named "Profezzorn" started a project (in '94?) to redevelop an LPC
compiler/interpreter ("driver") from scratch to avoid some licensing
constraints. The project grew into a generalized network handler, since
MUDs' typical designs are excellent for these tasks. From there, you get
the Roxen web server.

</HistoricalNote>

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sat May 15 01:36:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Fri, 14 May 1999 16:36:11 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
References: <199905070507.BAA22545@python.org>
	<14138.28243.553816.166686@seattle.nightmare.com>
	<14138.47464.699243.853550@cm-29-94-2.nycap.rr.com>
	<14138.60284.584739.711112@anthem.cnri.reston.va.us>
	<14139.30970.644343.612721@seattle.nightmare.com>
	<199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <14140.44469.848840.740112@seattle.nightmare.com>

Guido van Rossum writes:
 > > If Python had closures, then it would be a *little* easier, but would
 > > still make the average Pythoneer swoon.  Closures would let you put
 > > the above logic all in one method, but the code would still be
 > > 'inside-out'.
 > 
 > I forget how this worked :-(

[with a faked-up lambda-ish syntax]

def thing (a):
  return do_async_job_1 (a,
    lambda (b):
      if (a>1):
        do_async_job_2a (b,
          lambda (c):
            [...]
          )
      else:
        do_async_job_2b (a,b,
          lambda (d,e,f):
            [...]
          )
     )

The call to do_async_job_1 passes 'a', and a callback, which is
specified 'in-line'.  You can follow the logic of something like this
more easily than if each lambda is spun off into a different
function/method.
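
For comparison, a sketch of the same shape in present-day Python, where
nested functions do close over enclosing variables. The three
do_async_job_* helpers are hypothetical stand-ins that invoke their
callback immediately; a real version would return to an event loop and
call back later. The logic sits in one place, but it is still
"inside-out": each step is the callback of the step before it.

```python
def do_async_job_1(a, callback):
    # Hypothetical stand-in for an asynchronous operation.
    return callback(a * 10)


def do_async_job_2a(b, callback):
    return callback(b + 1)


def do_async_job_2b(a, b, callback):
    return callback(a, b, a + b)


def thing(a):
    def after_job_1(b):
        # 'a' is visible here because the inner function closes over it.
        if a > 1:
            def after_job_2a(c):
                return ('2a', c)
            return do_async_job_2a(b, after_job_2a)
        else:
            def after_job_2b(d, e, f):
                return ('2b', d, e, f)
            return do_async_job_2b(a, b, after_job_2b)
    return do_async_job_1(a, after_job_1)


print(thing(2))
print(thing(1))
```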

 > > I think that a web server built on such a Python could have the
 > > performance/scalability of thttpd, with the ease-of-programming
 > > of Roxen.  As far as I know, there's nothing like it out there.
 > > Medusa would be put out to pasture. 8^)
 > 
 > I'm afraid I haven't kept up -- what are Roxen and thttpd?  What do
 > they do that Apache doesn't?

thttpd (& Zeus, Squid, Xitami) use select()/poll() to gain performance
and scalability, but suffer from the same programmability problem as
Medusa (only worse, 'cause they're in C).

Roxen is written in Pike, a C-like language with gc, threads,
etc... Roxen is I think now the official 'GNU Web Server'.

Here's an interesting web-server comparison chart:

http://www.acme.com/software/thttpd/benchmarks.html

-Sam




From guido at CNRI.Reston.VA.US  Sat May 15 04:23:24 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Fri, 14 May 1999 22:23:24 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Fri, 14 May 1999 16:36:11 PDT."
             <14140.44469.848840.740112@seattle.nightmare.com> 
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
            <14140.44469.848840.740112@seattle.nightmare.com> 
Message-ID: <199905150223.WAA02457@eric.cnri.reston.va.us>

> def thing (a):
>   return do_async_job_1 (a,
>     lambda (b):
>       if (a>1):
>         do_async_job_2a (b,
>           lambda (c):
>             [...]
>           )
>       else:
>         do_async_job_2b (a,b,
>           lambda (d,e,f):
>             [...]
>           )
>      )
> 
> The call to do_async_job_1 passes 'a', and a callback, which is
> specified 'in-line'.  You can follow the logic of something like this
> more easily than if each lambda is spun off into a different
> function/method.

I agree that it is still ugly.

> http://www.acme.com/software/thttpd/benchmarks.html

I see.  Any pointers to a graph of thttpd market share?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat May 15 09:51:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905141203.IAA01808@eric.cnri.reston.va.us>
Message-ID: <000701be9ea7$acab7f40$159e2299@tim>

[GvR]
> ...
> Anything that assumes knowledge about how the C compiler and/or the
> CPU and OS lay out the stack is a no-no, because it means that the
> first thing one has to do for a port to a new architecture is figure
> out how the stack is laid out.  Another thread in this list is porting
> Python to microplatforms like PalmOS.  Typically the scheme Hackers
> are not afraid to delve deep into the machine, but I refuse to do that
> -- I think it's too risky.

The Icon language needs a bit of platform-specific context-switching
assembly code to support its full coroutine features, although its
bread-and-butter generators ("semi coroutines") don't need anything special.

The result is that Icon ports sometimes limp for a year before they support
full coroutines, waiting for someone wizardly enough to write the necessary
code.  This can, in fact, be quite difficult; e.g., on machines with HW
register windows (where "the stack" can be a complicated beast half buried
in hidden machine state, sometimes needing kernel privilege to uncover).

Not attractive.  Generators are, though <wink>.

threads-too-ly y'rs  - tim





From tim_one at email.msn.com  Sat May 15 09:51:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 15 May 1999 03:51:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373C3E08.FCCB141B@appliedbiometrics.com>
Message-ID: <000801be9ea7$ae45f560$159e2299@tim>

[Christian Tismer]
> ...
> But on the general issues:
> Can the Python-calls-C and C-calls-Python problem just be solved
> by turning the whole VM state into a data structure, including
> a Python call stack which is independent? Maybe this has been
> mentioned already.

The problem is that when C calls Python, any notion of continuation has to
include C's state too, else resuming the continuation won't return into C
correctly.  The C code that *implements* Python could be reworked to support
this, but in the general case you've got some external C extension module
calling into Python, and then Python hasn't a clue about its caller's state.

I'm not a fan of continuations myself; coroutines can be implemented
faithfully via threads (I posted a rather complete set of Python classes for
that in the pre-DejaNews days, a bit more flexible than Icon's coroutines);
and:

> This would automagically produce ICON iterators (duck)
> and coroutines (cover).

Icon iterators/generators could be implemented today if anyone bothered
(Majewski essentially implemented them back around '93 already, but seemed
to lose interest when he realized it couldn't be extended to full
continuations, because of C/Python stack intertwingling).

> If I guess right, continuation passing could be done
> by just shifting tiny tuples around. Well, Tim, help me :-)

Python-calling-Python continuations should be easily doable in a "stackless"
Python; the key ideas were already covered in this thread, I think.  The
thing that makes generators so much easier is that they always return
directly to their caller, at the point of call; so no C frame can get stuck
in the middle even under today's implementation; it just requires not
deleting the generator's frame object, and adding an opcode to *resume* the
frame's execution the next time the generator is called.  Unlike in Icon,
it wouldn't even need to be tied to a funky notion of goal-directed
evaluation.
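
The resumable-frame behaviour described above is roughly what the
generators in present-day Python provide. A small sketch under that
assumption (the tuple-based tree layout and the name inorder are made up
for illustration): each yield returns straight to the caller, and the
suspended frame is kept alive until the caller asks for the next value.

```python
def inorder(tree):
    """Yield the values of a nested (left, value, right) tuple tree in order."""
    if tree is None:
        return
    left, value, right = tree
    yield from inorder(left)   # delegate to the left subtree
    yield value                # suspend here; the frame is kept, not deleted
    yield from inorder(right)  # resumed on the caller's next request


tree = ((None, 1, None), 2, ((None, 3, None), 4, None))
walker = inorder(tree)
first = next(walker)   # run the generator up to its first yield
rest = list(walker)    # resume it repeatedly until it is exhausted
print(first, rest)
```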

don't-try-to-traverse-a-tree-without-it-ly y'rs  - tim





From gstein at lyra.org  Sat May 15 10:17:15 1999
From: gstein at lyra.org (Greg Stein)
Date: Sat, 15 May 1999 01:17:15 -0700
Subject: [Python-Dev] 'stackless' python?
References: <199905070507.BAA22545@python.org> <14138.28243.553816.166686@seattle.nightmare.com> <14138.47464.699243.853550@cm-29-94-2.nycap.rr.com> <14138.60284.584739.711112@anthem.cnri.reston.va.us> <14139.30970.644343.612721@seattle.nightmare.com> <199905141203.IAA01808@eric.cnri.reston.va.us>  
	            <14140.44469.848840.740112@seattle.nightmare.com> <199905150223.WAA02457@eric.cnri.reston.va.us>
Message-ID: <373D2D8B.390C523C@lyra.org>

Guido van Rossum wrote:
> ...
> > http://www.acme.com/software/thttpd/benchmarks.html
> 
> I see.  Any pointers to a graph of thttpd market share?

thttpd currently has about 70k sites (of 5.4mil found by Netcraft). That
puts it at #6. However, it is interesting to note that 60k of those
sites are in the .uk domain. I can't figure out who is running it, but I
would guess that a large UK-based ISP is hosting a bunch of domains on
thttpd.

It is somewhat difficult to navigate the various reports (and it never
fails that the one you want is not present), but the data is from
Netcraft's survey at: http://www.netcraft.com/survey/

Cheers,
-g

--
Greg Stein, http://www.lyra.org/



From rushing at nightmare.com  Sun May 16 13:10:18 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Sun, 16 May 1999 04:10:18 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <81365478@toto.iv>
Message-ID: <14142.40867.103424.764346@seattle.nightmare.com>

Tim Peters writes:
 > I'm not a fan of continuations myself; coroutines can be
 > implemented faithfully via threads (I posted a rather complete set
 > of Python classes for that in the pre-DejaNews days, a bit more
 > flexible than Icon's coroutines); and:

Continuations are more powerful than coroutines, though I admit
they're a bit esoteric.  I programmed in Scheme for years without
seeing the need for them.  But when you need 'em, you *really* need
'em.  No way around it.

For my purposes (massively scalable single-process servers and
clients) threads don't cut it... for example I have a mailing-list
exploder that juggles up to 2048 simultaneous SMTP connections.  I
think it can go higher - I've tested select() on FreeBSD with 16,000
file descriptors.

[...]

BTW, I have actually made progress borrowing a bit of code from SCM.
It uses the stack-copying technique, along with setjmp/longjmp.  It's
too ugly and unportable to be a real candidate for inclusion in
Official Python.  [i.e., if it could be made to work it should be
considered a stopgap measure for the desperate].

I haven't tested it thoroughly, but I have successfully saved and
invoked (and reinvoked) a continuation.  Caveat: I have to turn off
Py_DECREF in order to keep it from crashing.

  | >>> import callcc
  | >>> saved = None
  | >>> def thing(n):
  | ...     if n == 2:
  | ...             global saved
  | ...             saved = callcc.new()
  | ...     print 'n==',n
  | ...     if n == 0:
  | ...             print 'Done!'
  | ...     else:
  | ...             thing (n-1)
  | ... 
  | >>> thing (5)
  | n== 5
  | n== 4
  | n== 3
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved
  | <Continuation object at 80d30d0>
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> saved.throw (0)
  | n== 2
  | n== 1
  | n== 0
  | Done!
  | >>> 

I will probably not be able to work on this for a while (baby due any
day now), so anyone is welcome to dive right in.  I don't have much
experience wading through gdb tracking down reference bugs, I'm hoping
a brave soul will pick up where I left off. 8^)

http://www.nightmare.com/stuff/python-callcc.tar.gz
ftp://www.nightmare.com/stuff/python-callcc.tar.gz

-Sam




From tismer at appliedbiometrics.com  Sun May 16 17:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 17:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <373EE4B5.6EE6A678@appliedbiometrics.com>


rushing at nightmare.com wrote:

[...]

> BTW, I have actually made progress borrowing a bit of code from SCM.
> It uses the stack-copying technique, along with setjmp/longjmp.  It's
> too ugly and unportable to be a real candidate for inclusion in
> Official Python.  [i.e., if it could be made to work it should be
> considered a stopgap measure for the desperate].

I tried it and built it as a Win32 .pyd file, and it seems to
work, but...

> I haven't tested it thoroughly, but I have successfully saved and
> invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> Py_DECREF in order to keep it from crashing.

Indeed, and this seems to be a problem too hard to solve
without lots of work.
Since you keep a snapshot of the current machine stack,
it contains a number of object references which were
valid when the snapshot was taken, but many are most
probably invalid when you restart the continuation.
I guess incref-ing all currently alive objects on
the interpreter stack would be the minimum, maybe more.

A tuple of necessary references could be used as an
attribute of a Continuation object. I will look
how difficult this is.

ciao - chris


-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Sun May 16 20:31:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 16 May 1999 20:31:01 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <373EE4B5.6EE6A678@appliedbiometrics.com>
Message-ID: <373F0EE5.A8DE00C5@appliedbiometrics.com>


Christian Tismer wrote:
> 
> rushing at nightmare.com wrote:
[...]

> > I haven't tested it thoroughly, but I have successfully saved and
> > invoked (and reinvoked) a continuation.  Caveat: I have to turn off
> > Py_DECREF in order to keep it from crashing.

It is possible, but a little hard.
To take a working snapshot of the current thread's
stack, one needs not only the stack snapshot which 
continue.c provides, but also a restorable copy of
all frame objects involved so far.
A copy of the current frame chain must be built, with
proper reference counting of all involved elements.
And this is the crux: The current stack pointer of the
VM is not present in the frame objects, but hangs
around somewhere on the machine stack.
Two solutions:

1) modify PyFrameObject by adding a field which holds
   the stack pointer, when a function is called. 
   I don't like to change the VM in any way for this.
2) use the lasti field which holds the last VM instruction
   offset. Then scan the opcodes of the code object
   and calculate the current stack level. This is possible
   since Guido's code generator creates code with the stack
   level lexically bound to the code offset.

Now we can incref all the referenced objects in the frame.
This must be done for the whole chain, which is copied and
relinked during that. This chain is then held as a
property of the continuation object.

To throw the continuation, the current frame chain must
be cleared, and the saved one is inserted, together with
the machine stack operation which Sam has already.

A little hefty, isn't it?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tim_one at email.msn.com  Mon May 17 07:42:59 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Mon, 17 May 1999 01:42:59 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14142.40867.103424.764346@seattle.nightmare.com>
Message-ID: <000f01bea028$1f75c360$fb9e2299@tim>

[Sam]
> Continuations are more powerful than coroutines, though I admit
> they're a bit esoteric.

"More powerful" is a tedious argument you should always avoid <wink>.

> I programmed in Scheme for years without seeing the need for them.
> But when you need 'em, you *really* need 'em.  No way around it.
>
> For my purposes (massively scalable single-process servers and
> clients) threads don't cut it... for example I have a mailing-list
> exploder that juggles up to 2048 simultaneous SMTP connections.  I
> think it can go higher - I've tested select() on FreeBSD with 16,000
> file descriptors.

The other point being that you want to avoid "inside out" logic, though,
right?  Earlier you posted a kind of ideal:

    Recently I've written an async server that needed to talk to several
    other RPC servers, and a mysql server.  Pseudo-example, with
    possibly-async calls in UPPERCASE:

      auth, archive = db.FETCH_USER_INFO (user)
      if verify_login(user,auth):
          rpc_server = self.archive_servers[archive]
          group_info = rpc_server.FETCH_GROUP_INFO (group)
          if valid (group_info):
              return rpc_server.FETCH_MESSAGE (message_number)
          else:
              ...
      else:
          ...

I assume you want to capture a continuation object in the UPPERCASE methods,
store it away somewhere, run off to your select/poll/whatever loop, and have
it invoke the stored continuation objects as the data they're waiting for
arrives.

If so, that's got to be the nicest use for continuations I've seen!  All
invisible to the end user.  I don't know how to fake it pleasantly without
threads, either, and understand that threads aren't appropriate for resource
reasons.  So I don't have a nice alternative.
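
A rough sketch of that control flow, using present-day Python generators
in place of real continuations (a substitution: a generator can only
suspend back to whoever resumed it). All names here are hypothetical, and
fake_backend stands in for data arriving via select/poll. The handler
reads top to bottom while the loop resumes each task with the result it
was waiting for.

```python
from collections import deque


def fetch_user_info(user):
    # Hypothetical async call: yield a request, get the reply when resumed.
    reply = yield ('user_info', user)
    return reply


def handle_request(user, log):
    auth, archive = yield from fetch_user_info(user)
    if auth == 'ok':
        message = yield ('message', archive)
        log.append(message)
    else:
        log.append('denied')


def fake_backend(request):
    # Stand-in for a reply arriving on a socket.
    kind, arg = request
    if kind == 'user_info':
        return ('ok' if arg == 'alice' else 'bad', 'archive-1')
    return 'message from ' + arg


def run(tasks):
    """Toy event loop: resume each suspended task with its reply."""
    pending = deque((task, None) for task in tasks)
    while pending:
        task, value = pending.popleft()
        try:
            request = task.send(value)   # run until the next suspension
        except StopIteration:
            continue                     # this task has finished
        pending.append((task, fake_backend(request)))


log = []
run([handle_request('alice', log), handle_request('mallory', log)])
print(log)
```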

> ...
>   | >>> import callcc
>   | >>> saved = None
>   | >>> def thing(n):
>   | ...     if n == 2:
>   | ...             global saved
>   | ...             saved = callcc.new()
>   | ...     print 'n==',n
>   | ...     if n == 0:
>   | ...             print 'Done!'
>   | ...     else:
>   | ...             thing (n-1)
>   | ...
>   | >>> thing (5)
>   | n== 5
>   | n== 4
>   | n== 3
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved
>   | <Continuation object at 80d30d0>
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>> saved.throw (0)
>   | n== 2
>   | n== 1
>   | n== 0
>   | Done!
>   | >>>

Suppose the driver were in a script instead:

thing(5)           # line 1
print repr(saved)  # line 2
saved.throw(0)     # line 3
saved.throw(0)     # line 4

Then the continuation would (eventually) "return to" the "print repr(saved)"
and we'd get an infinite output tail of:

<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
<Continuation object at 80d30d0>
n== 2
n== 1
n== 0
Done!
...

and never reach line 4.  Right?  That's the part that Guido hates <wink>.

takes-one-to-know-one-ly y'rs  - tim





From tismer at appliedbiometrics.com  Mon May 17 09:07:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 09:07:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <373FC02A.69F2D912@appliedbiometrics.com>


Tim Peters wrote:

[to Sam]

> The other point being that you want to avoid "inside out" logic, though,
> right?  Earlier you posted a kind of ideal:
> 
>     Recently I've written an async server that needed to talk to several
>     other RPC servers, and a mysql server.  Pseudo-example, with
>     possibly-async calls in UPPERCASE:
> 
>       auth, archive = db.FETCH_USER_INFO (user)
>       if verify_login(user,auth):
>           rpc_server = self.archive_servers[archive]
>           group_info = rpc_server.FETCH_GROUP_INFO (group)
>           if valid (group_info):
>               return rpc_server.FETCH_MESSAGE (message_number)
>           else:
>               ...
>       else:
>           ...
> 
> I assume you want to capture a continuation object in the UPPERCASE methods,
> store it away somewhere, run off to your select/poll/whatever loop, and have
> it invoke the stored continuation objects as the data they're waiting for
> arrives.
> 
> If so, that's got to be the nicest use for continuations I've seen!  All
> invisible to the end user.  I don't know how to fake it pleasantly without
> threads, either, and understand that threads aren't appropriate for resource
> reasons.  So I don't have a nice alternative.

It can always be done with threads, but also without. Tried it
last night, with proper refcounting, and it wasn't too easy
since I had to duplicate the Python frame chain.

...

> Suppose the driver were in a script instead:
> 
> thing(5)           # line 1
> print repr(saved)  # line 2
> saved.throw(0)     # line 3
> saved.throw(0)     # line 4
> 
> Then the continuation would (eventually) "return to" the "print repr(saved)"
> and we'd get an infinite output tail of:
> 
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!
> <Continuation object at 80d30d0>
> n== 2
> n== 1
> n== 0
> Done!

This is at the moment exactly what happens, with the difference that
after some repetitions we GPF due to dangling references
to objects that were decref'ed too often. My incref'ing prepares for
just one re-incarnation and should prevent a second call.
But this will be solved soon.

> and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yup. With a little counting, it was easy to survive:

def main():
    global a
    a=2
    thing (5)
    a=a-1
    if a:
        saved.throw (0)

Weird enough and needs a much better interface.
But finally I'm quite happy that it worked so smoothly
after just a couple of hours (well, about six :)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From rushing at nightmare.com  Mon May 17 11:46:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 02:46:29 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000f01bea028$1f75c360$fb9e2299@tim>
References: <14142.40867.103424.764346@seattle.nightmare.com>
	<000f01bea028$1f75c360$fb9e2299@tim>
Message-ID: <14143.56604.21827.891993@seattle.nightmare.com>

Tim Peters writes:
 > [Sam]
 > > Continuations are more powerful than coroutines, though I admit
 > > they're a bit esoteric.
 > 
 > "More powerful" is a tedious argument you should always avoid <wink>.

More powerful in the sense that you can use continuations to build
lots of different control structures (coroutines, backtracking,
exceptions), but not vice versa.

Kinda like a better tool for blowing one's own foot off. 8^)

 > Suppose the driver were in a script instead:
 > 
 > thing(5)           # line 1
 > print repr(saved)  # line 2
 > saved.throw(0)     # line 3
 > saved.throw(0)     # line 4
 > 
 > Then the continuation would (eventually) "return to" the "print repr(saved)"
 > and we'd get an infinite output tail [...]
 > 
 > and never reach line 4.  Right?  That's the part that Guido hates <wink>.

Yes... the continuation object so far isn't very usable.  It needs a
driver of some kind around it.  In the Scheme world, there are two
common ways of using continuations - let/cc and call/cc.  [call/cc is what
is in the standard; its official name is call-with-current-continuation]

let/cc stores the continuation in a variable binding, while
introducing a new scope.  It requires a change to the underlying
language:

(+ 1
  (let/cc escape
    (...)
    (escape 34)))
=> 35

'escape' is a function that when called will 'resume' with whatever
follows the let/cc clause.  In this case it would continue with the
addition...

call/cc is a little trickier, but doesn't require any change to the
language...  instead of making a new binding directly, you pass in
a function that will receive the binding:

(+ 1
   (call/cc
     (lambda (escape)
       (...)
       (escape 34))))
=> 35

In words, it's much more frightening: "call/cc is a function, that
when called with a function as an argument, will pass that function an
argument that is a new function, which when called with a value will
resume the computation with that value as the result of the entire
expression"  Phew.

In Python, an example might look like this:

SAVED = None
def save_continuation (k):
  global SAVED
  SAVED = k

def thing():
  [...]
  value = callcc (lambda k: save_continuation(k))

# or more succinctly:
def thing():
  [...]
  value = callcc (save_continuation)

In order to do useful work like passing values back and forth between
coroutines, we have to have some way of returning a value from the
continuation when it is reinvoked.
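
A minimal sketch of the escape-only half of this in modern Python 3,
assuming nothing beyond the standard exception machinery.  The names
callcc_escape and _Escape are made up for illustration, and this is
not the callcc discussed in this thread: the escape function only
works while the call is still active and cannot be reinvoked later,
which is exactly the limitation a real continuation removes.

```python
class _Escape(Exception):
    """Carries a value up the stack to the matching callcc_escape call."""

    def __init__(self, tag, value):
        super().__init__(tag, value)
        self.tag = tag
        self.value = value


def callcc_escape(func):
    """Call func(escape); calling escape(v) makes this call return v.

    Escape-only: once callcc_escape has returned, escape can no longer
    resume it, unlike a first-class continuation.
    """
    tag = object()  # unique marker so nested calls ignore each other's escapes

    def escape(value):
        raise _Escape(tag, value)

    try:
        return func(escape)
    except _Escape as exc:
        if exc.tag is tag:
            return exc.value
        raise  # belongs to an enclosing callcc_escape


def body(escape):
    escape(34)
    return 0  # never reached


result = 1 + callcc_escape(body)
print(result)  # prints 35, like the Scheme examples above
```

The value handed to escape becomes the value of the whole
callcc_escape(...) expression, which is the "returning a value from
the continuation" part; resuming it a second time is what this sketch
cannot do.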

I should emphasize that most folks will never see call/cc 'in the
raw', it will usually have some nice wrapper around to implement
whatever construct is needed.

-Sam




From arw at ifu.net  Mon May 17 20:06:18 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 14:06:18 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <37405A99.1DBAF399@ifu.net>

The illustrious Sam Rushing avers:
>Continuations are more powerful than coroutines, though I admit
>they're a bit esoteric.  I programmed in Scheme for years without
>seeing the need for them.  But when you need 'em, you *really* need
>'em.  No way around it.

Frankly, I think I thought I understood this once but now I know I
don't.
How're continuations more powerful than coroutines?
And why can't they be implemented using threads (and semaphores etc)?

...I'm not promising I'll understand the answer...
    -- Aaron Watters

===
I taught I taw a putty-cat!





From gmcm at hypernet.com  Mon May 17 21:18:43 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 14:18:43 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
Message-ID: <1285153546-166193857@hypernet.com>

The estimable Aaron Watters queries:
> The illustrious Sam Rushing avers:
> >Continuations are more powerful than coroutines, though I admit
> >they're a bit esoteric.  I programmed in Scheme for years without
> >seeing the need for them.  But when you need 'em, you *really* need
> >'em.  No way around it.
> 
> Frankly, I think I thought I understood this once but now I know I
> don't. How're continuations more powerful than coroutines? And why
> can't they be implemented using threads (and semaphores etc)?

I think Sam's (immediate <wink>) problem is that he can't afford 
threads - he may have hundreds to thousands of these suckers.

As a fuddy-duddy old imperative programmer, I'm inclined to think 
"state machine". But I'd guess that functional-ophiles probably see 
that as inelegant. (Safe guess - they see _anything_ that isn't 
functional as inelegant!).

crude-but-not-rude-ly y'rs

- Gordon



From jeremy at cnri.reston.va.us  Mon May 17 20:43:34 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Mon, 17 May 1999 14:43:34 -0400 (EDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37405A99.1DBAF399@ifu.net>
References: <37405A99.1DBAF399@ifu.net>
Message-ID: <14144.24242.128959.726878@bitdiddle.cnri.reston.va.us>

>>>>> "AW" == Aaron Watters <arw at ifu.net> writes:

  AW> The illustrious Sam Rushing avers:
  >> Continuations are more powerful than coroutines, though I admit
  >> they're a bit esoteric.  I programmed in Scheme for years without
  >> seeing the need for them.  But when you need 'em, you *really*
  >> need 'em.  No way around it.

  AW> Frankly, I think I thought I understood this once but now I know
  AW> I don't.  How're continuations more powerful than coroutines?
  AW> And why can't they be implemented using threads (and semaphores
  AW> etc)?

I think I understood, too.  I'm hoping that someone will debug my
answer and enlighten us both.

A continuation is a mechanism for making control flow explicit.  A
continuation is a means of naming and manipulating "the rest of the
program."   In Scheme terms, the continuation is the function that the 
value of the current expression should be passed to.  The call/cc
mechanism lets you capture the current continuation and explicitly
call on it.  The most typical use of call/cc is non-local exits, but
it gives you incredible flexibility for implementing your control
flow.

I'm fuzzy on coroutines, as I've only seen them in "Structured
Programming" (which is as old as I am :-) and never actually used
them.  The basic idea is that when a coroutine calls another
coroutine, control is transferred to the second coroutine at the point
at which it last left off (by itself calling another coroutine or by
detaching, which returns control to the lexically enclosing scope).

It seems to me that coroutines are an example of the kind of control
structure that you could build with continuations.  It's not clear
that the reverse is true.
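
A rough illustration of "resume at the point where it last left off",
using generators from later Python versions (they did not exist when
this thread was written).  This is only a sketch of the resumption
idea, not full coroutines; the names counter and log are made up.

```python
def counter(name, log):
    """Record three steps, suspending after each one."""
    for i in range(3):
        log.append((name, i))
        yield  # suspend here; the next resume continues right after this line


log = []
a = counter("a", log)
b = counter("b", log)

# Alternate between the two; each one picks up where it stopped.
for _ in range(3):
    next(a)
    next(b)

print(log)
```

Each generator keeps its own loop variable and position between
resumes, so the log interleaves the two sequences step by step.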

I have to admit that I'm a bit unclear on the motivation for all
this.  As Gordon said, the state machine approach seems like it would
be a good approach.

Jeremy



From klm at digicool.com  Mon May 17 21:08:57 1999
From: klm at digicool.com (Ken Manheimer)
Date: Mon, 17 May 1999 15:08:57 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
Message-ID: <613145F79272D211914B0020AFF640190BEEDE@gandalf.digicool.com>

Jeremy Hylton:

> I have to admit that I'm a bit unclear on the motivation for all
> this.  As Gordon said, the state machine approach seems like it would
> be a good approach.

If i understand what you mean by state machine programming, it's pretty
inherently uncompartmented: all the combinations of state variables need
to be accounted for, so the number of states grows exponentially in the
number of state vars.  In general it's awkward.  The advantage of going
with what functional folks come up with, like continuations, is that it
tends to be well compartmented - functional.  (Come to think of it, i
suppose that compartmentalization as opposed to state is their mania.)

As abstract as i can be (because i hardly know what i'm talking about)
(but i have done some specifically finite state machine programming, and
did not enjoy it),

Ken
klm at digicool.com



From arw at ifu.net  Mon May 17 21:20:13 1999
From: arw at ifu.net (Aaron Watters)
Date: Mon, 17 May 1999 15:20:13 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com>
Message-ID: <37406BED.95AEB896@ifu.net>

The ineffable Gordon McMillan retorts:

> As a fuddy-duddy old imperative programmer, I'm inclined to think
> "state machine". But I'd guess that functional-ophiles probably see
> that as inelegant. (Safe guess - they see _anything_ that isn't
> functional as inelegant!).

As a fellow fuddy-duddy I'd agree except that if you write properly layered
software you have to unroll and reroll all those layers for every
transition of the multi-level state machine, and even though with proper
discipline it can be implemented without becoming hideous, it still adds
significant overhead compared to "stop right here and come back later"
which could be implemented using threads/coroutines(?)/continuations.
I think this is particularly true in Python with the relatively high
function call overhead.  Or maybe I'm out in left field doing cartwheels...
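
To make the contrast concrete, here is a small sketch in modern
Python 3, assuming a made-up protocol (a count followed by that many
numbers, repeated).  SumMachine and sum_coroutine are hypothetical
names; the generator stands in for "stop right here and come back
later" and is not the mechanism proposed in this thread.

```python
class SumMachine:
    """Explicit state machine: expects a count, then that many numbers."""

    def __init__(self):
        self.state = "count"
        self.remaining = 0
        self.total = 0
        self.results = []

    def feed(self, token):
        if self.state == "count":
            self.remaining = token
            self.total = 0
            if token == 0:
                self.results.append(0)  # empty group: stay in "count"
            else:
                self.state = "items"
        else:  # self.state == "items"
            self.total += token
            self.remaining -= 1
            if self.remaining == 0:
                self.results.append(self.total)
                self.state = "count"


def sum_coroutine(results):
    """Same protocol; the position in the code is the state."""
    while True:
        count = yield
        total = 0
        for _ in range(count):
            total += (yield)
        results.append(total)


tokens = [2, 10, 20, 3, 1, 2, 3]

machine = SumMachine()
for t in tokens:
    machine.feed(t)

coro_results = []
coro = sum_coroutine(coro_results)
next(coro)  # prime the generator: run it to its first yield
for t in tokens:
    coro.send(t)

print(machine.results, coro_results)  # prints [30, 6] [30, 6]
```

Both versions are driven one token at a time by the caller; the
difference is only in where the "where was I?" bookkeeping lives.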

I guess the question of interest is why are threads insufficient?  I guess
they have system limitations on the number of threads or other limitations
that wouldn't be a problem with continuations?  If there aren't a *lot* of
situations where coroutines are vital, I'd be hesitant to do major
surgery.  But I'm a fuddy-duddy.

   -- Aaron Watters

===
I did! I did!





From tismer at appliedbiometrics.com  Mon May 17 22:03:01 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Mon, 17 May 1999 22:03:01 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <1285153546-166193857@hypernet.com> <37406BED.95AEB896@ifu.net>
Message-ID: <374075F5.F29B4EAB@appliedbiometrics.com>


Aaron Watters wrote:
> 
> The ineffable Gordon McMillan retorts:
> 
> > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > "state machine". But I'd guess that functional-ophiles probably see
> > that as inelegant. (Safe guess - they see _anything_ that isn't
> > functional as inelegant!).
> 
> As a fellow fuddy-duddy I'd agree except that if you write properly layered
> software you have to unroll and reroll all those layers for every
> transition of the multi-level state machine, and even though with proper
> discipline it can be implemented without becoming hideous, it still adds
> significant overhead compared to "stop right here and come back later"
> which could be implemented using threads/coroutines(?)/continuations.

Coroutines are most elegant here, since (for a simple example)
they are a symmetric pair of functions which call each other.
There is neither the one-pulls, the other pushes asymmetry, nor
the need to maintain state and be controlled by a supervisor
function.

> I think this is particularly true in Python with the relatively high
> function call overhead.  Or maybe I'm out in left field doing cartwheels...
> I guess the question of interest is why are threads insufficient?  I guess
> they have system limitations on the number of threads or other limitations
> that wouldn't be a problem with continuations?  If there aren't a *lot* of
> situations where coroutines are vital, I'd be hesitant to do major
> surgery.

For me (as always) most interesting is the possible speed of
coroutines. They involve no threads overhead, no locking,
no nothing. Python supports it better than expected. If the
stack level of two code objects is the same at a switching point,
the whole switch is nothing more than swapping two frame objects,
and we're done. This might be even cheaper than general call/cc,
like a function call. Sam's prototype works already, with no change to
the interpreter (but knowledge of Python frames, and a .dll of course).

I think we'll continue a while.

continuously - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From gmcm at hypernet.com  Tue May 18 00:17:25 1999
From: gmcm at hypernet.com (Gordon McMillan)
Date: Mon, 17 May 1999 17:17:25 -0500
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <374075F5.F29B4EAB@appliedbiometrics.com>
Message-ID: <1285142823-166838954@hypernet.com>

Co-Christian-routines Tismer continues:

> Aaron Watters wrote:
> > 
> > The ineffable Gordon McMillan retorts:
> > 
> > > As a fuddy-duddy old imperative programmer, I'm inclined to think
> > > "state machine". But I'd guess that functional-ophiles probably see
> > > that as inelegant. (Safe guess - they see _anything_ that isn't
> > > functional as inelegant!).
> > 
> > As a fellow fuddy-duddy I'd agree except that if you write properly layered
> > software you have to unroll and reroll all those layers for every
> > transition of the multi-level state machine, and even though with proper
> > discipline it can be implemented without becoming hideous, it still adds
> > significant overhead compared to "stop right here and come back later"
> > which could be implemented using threads/coroutines(?)/continuations.
> 
> Coroutines are most elegant here, since (for a simple example)
> they are a symmetric pair of functions which call each other.
> There is neither the one-pulls, the other pushes asymmetry, nor the
> need to maintain state and be controlled by a supervisor function.

Well, the state maintains you, instead of the other way 'round. (Any 
other ex-Big-Blue-ers out there that used to play these games with 
checkpoint and SyncSort?).

I won't argue elegance. Just a couple points:

- there's an art to writing state machines which is largely 
unrecognized (most of them are unnecessarily horrid).

- a multiplexed solution (vs a threaded solution) requires that 
something be inside out. In one case it's your code, in the other, 
your understanding of the problem. Neither is trivial.

Not to be discouraging - as long as your solution doesn't involve 
using regexps on bytecode <wink>, I say go for it!

- Gordon



From guido at CNRI.Reston.VA.US  Tue May 18 06:03:34 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 18 May 1999 00:03:34 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: Your message of "Mon, 17 May 1999 02:46:29 PDT."
             <14143.56604.21827.891993@seattle.nightmare.com> 
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
            <14143.56604.21827.891993@seattle.nightmare.com> 
Message-ID: <199905180403.AAA04772@eric.cnri.reston.va.us>

Sam (& others),

I thought I understood what continuations were, but the examples of
what you can do with them so far don't clarify the matter at all.

Perhaps it would help to explain what a continuation actually does
with the run-time environment, instead of giving examples of how to
use them and what the result is?

Here's a start of my own understanding (brief because I'm on a 28.8k
connection which makes my ordinary typing habits in Emacs very
painful).

1. All program state is somehow contained in a single execution stack.
This includes globals (which are simply name bindings in the bottom
stack frame).  It also includes a code pointer for each stack frame
indicating where the function corresponding to that stack frame is
executing (this is the return address if there is a newer stack frame, 
or the current instruction for the newest frame).

2. A continuation does something equivalent to making a copy of the
entire execution stack.  This can probably be done lazily.  There are
probably lots of details.  I also expect that Scheme's semantic model
is different than Python here -- e.g. does it matter whether deep or
shallow copies are made?  I.e. are there mutable *objects* in Scheme?
(I know there are mutable and immutable *name bindings* -- I think.)

3. Calling a continuation probably makes the saved copy of the
execution stack the current execution state; I presume there's also a
way to pass an extra argument.

4. Coroutines (which I *do* understand) are probably done by swapping
between two (or more) continuations.

5. Other control constructs can be done by various manipulations of
continuations.  I presume that in many situations the saved
continuation becomes the main control locus permanently, and the
(previously) current stack is simply garbage-collected.  Of course the 
lazy copy makes this efficient.



If this all is close enough to the truth, I think that continuations
involving C stack frames are definitely out -- as Tim Peters
mentioned, you don't know what the stuff on the C stack of extensions
refers to.  (My guess would be that Scheme implementations assume that
any pointers on the C stack point to Scheme objects, so that C stack
frames can be copied and conservative GC can be used -- this will
never happen in Python.)

Continuations involving only Python stack frames might be supported,
if we can agree on the sharing / copying semantics.  This is where
I don't know enough (see questions at #2 above).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Tue May 18 06:46:12 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:46:12 -0400
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <37406BED.95AEB896@ifu.net>
Message-ID: <000901bea0e9$5aa2dec0$829e2299@tim>

[Aaron Watters]
> ...
> I guess the question of interest is why are threads insufficient?  I
> guess they have system limitations on the number of threads or other
> limitations that wouldn't be a problem with continuations?

Sam is mucking with thousands of simultaneous I/O-bound socket connections,
and makes a good case that threads simply don't fly here (each one consumes
a stack, kernel resources, etc).  It's unclear (to me) that thousands of
continuations would be *much* better, though, by the time Christian gets
done making thousands of copies of the Python stack chain.

> If there aren't a *lot* of situations where coroutines are vital, I'd
> be hesitant to do major surgery.  But I'm a fuddy-duddy.

Go to Sam's site (http://www.nightmare.com/), download Medusa, and read the
docs.  They're very well written and describe the problem space exquisitely.
I don't have any problems like that I need to solve, but it's interesting to
ponder!

alas-no-time-for-it-now-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:52 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:52 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <373FC02A.69F2D912@appliedbiometrics.com>
Message-ID: <000301bea0e9$4fd473a0$829e2299@tim>

[Christian Tismer]
> ...
> Yup. With a little counting, it was easy to survive:
>
> def main():
>     global a
>     a=2
>     thing (5)
>     a=a-1
>     if a:
>         saved.throw (0)

Did "a" really need to be global here?  I hope you see the same behavior
without the "global a"; e.g., this Scheme:

(define -cont- #f)

(define thing
  (lambda (n)
    (if (= n 2) (call/cc (lambda (k) (set! -cont- k))))
    (display "n == ") (display n) (newline)
    (if (= n 0)
	(begin (display "Done!") (newline))
	(thing (- n 1)))))

(define main
  (lambda ()
    (let ((a 2))
      (thing 5)
      (display "a is ") (display a) (newline)
      (set! a (- a 1))
      (if (> a 0)
	  (-cont- #f)))))

(main)

prints:

n == 5
n == 4
n == 3
n == 2
n == 1
n == 0
Done!
a is 2
n == 2
n == 1
n == 0
Done!
a is 1

Or does brute-force frame-copying cause the continuation to set "a" back to
2 each time?

> Weird enough

Par for the continuation course!  They're nasty when eaten raw.

> and needs a much better interface.

Ya, like screw 'em and use threads <wink>.

> But finally I'm quite happy that it worked so smoothly
> after just a couple of hours (well, about six :)

Yup!  Playing with Python internals is a treat.

to-be-continued-ly y'rs  - tim





From tim_one at email.msn.com  Tue May 18 06:45:57 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 00:45:57 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14143.56604.21827.891993@seattle.nightmare.com>
Message-ID: <000401bea0e9$51e467e0$829e2299@tim>

[Sam]
>>> Continuations are more powerful than coroutines, though I admit
>>> they're a bit esoteric.

[Tim]
>> "More powerful" is a tedious argument you should always avoid <wink>.

[Sam]
> More powerful in the sense that you can use continuations to build
> lots of different control structures (coroutines, backtracking,
> exceptions), but not vice versa.

"More powerful" is a tedious argument you should always avoid <frown -- I'm
not touching this, but you can fight it out now with Aaron et alia <wink>>.

>> Then the continuation would (eventually) "return to" the
>> "print repr(saved)" and we'd get an infinite output tail [...]
>> and never reach line 4.  Right?

> Yes... the continuation object so far isn't very usable.

But it's proper behavior for a continuation all the same!  So this aspect
shouldn't be "fixed".

> ...
> let/cc stores the continuation in a variable binding, while
> introducing a new scope.  It requires a change to the underlying
> language:

Isn't this often implemented via a macro, though, so that

   (let/cc name code)

"acts like"

    (call/cc (lambda (name) code))

?  I haven't used a Scheme with native let/cc, but poking around it appears
that the real intent is to support exception-style function exits with a
mechanism cheaper than 1st-class continuations:  twice saw the let/cc object
(the thingie bound to "name") defined as being invalid the instant after
"code" returns, so it's an "up the call stack" gimmick.  That doesn't sound
powerful enough for what you're after.

> [nice let/cc call/cc tutorialette]
> ...
> In order to do useful work like passing values back and forth between
> coroutines, we have to have some way of returning a value from the
> continuation when it is reinvoked.

Somehow, I suspect that's the least of our problems <0.5 wink>.  If
continuations are in Python's future, though, I agree with the need as
stated.

> I should emphasize that most folks will never see call/cc 'in the
> raw', it will usually have some nice wrapper around to implement
> whatever construct is needed.

Python already has well-developed exception and thread facilities, so it's
hard to make a case for continuations as a catch-all implementation
mechanism.  That may be the rub here:  while any number of things *can* be
implemented via continuations, I think very few *need* to be implemented
that way, and full-blown continuations aren't easy to implement efficiently
& portably.

The Icon language was particularly concerned with backtracking searches, and
came up with generators as another clearer/cheaper implementation technique.
When it went on to full-blown coroutines, it's hard to say whether
continuations would have been a better approach.  But the coroutine
implementation it has is sluggish and buggy and hard to port, so I doubt
they could have done noticeably worse.
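
As a hedged illustration of generators as a backtracking device, here
is a sketch in modern Python 3 (generators arrived in Python well
after this thread, and this is not Icon's implementation).  The queens
function is a made-up example: each recursive level yields every way
of completing a partial placement, and abandoning a dead end is simply
the loop moving on to the next column.

```python
def queens(n, placed=()):
    """Yield each n-queens solution as a tuple of column indexes, one per row."""
    row = len(placed)
    if row == n:
        yield placed
        return
    for col in range(n):
        # Keep col only if no earlier queen shares its column or a diagonal.
        if all(col != c and abs(col - c) != row - r
               for r, c in enumerate(placed)):
            yield from queens(n, placed + (col,))


solutions = list(queens(4))
print(solutions)  # prints [(1, 3, 0, 2), (2, 0, 3, 1)]
```

The caller can also stop after the first answer with next(queens(4)),
in which case the rest of the search is never run.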

Would full-blown coroutines be powerful enough for your needs?

assuming-the-practical-defn-of-"powerful-enough"-ly y'rs  - tim





From rushing at nightmare.com  Tue May 18 07:18:06 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:18:06 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <000401bea0e9$51e467e0$829e2299@tim>
References: <14143.56604.21827.891993@seattle.nightmare.com>
	<000401bea0e9$51e467e0$829e2299@tim>
Message-ID: <14144.61765.308962.101884@seattle.nightmare.com>

Tim Peters writes:
 > Isn't this often implemented via a macro, though, so that
 > 
 >    (let/cc name code)
 > 
 > "acts like"
 > 
 >     (call/cc (lambda (name) code))

Yup, they're equivalent, in the sense that given one you can make a
macro to do the other.  call/cc is preferred because it doesn't
require a new binding construct.

 > ?  I haven't used a Scheme with native let/cc, but poking around it
 > appears that the real intent is to support exception-style function
 > exits with a mechanism cheaper than 1st-class continuations: twice
 > saw the let/cc object (the thingie bound to "name") defined as
 > being invalid the instant after "code" returns, so it's an "up the
 > call stack" gimmick.  That doesn't sound powerful enough for what
 > you're after.

Except that since the escape procedure is 'first-class' it can be
stored away and invoked (and reinvoked) later.  [that's all that
'first-class' means: a thing that can be stored in a variable,
returned from a function, used as an argument, etc..]

I've never seen a let/cc that wasn't full-blown, but it wouldn't
surprise me.

 > The Icon language was particularly concerned with backtracking
 > searches, and came up with generators as another clearer/cheaper
 > implementation technique.  When it went on to full-blown
 > coroutines, it's hard to say whether continuations would have been
 > a better approach.  But the coroutine implementation it has is
 > sluggish and buggy and hard to port, so I doubt they could have
 > done noticeably worse.

Many Scheme implementors either skip it, or only support non-escaping
call/cc (i.e., exceptions in Python).

 > Would full-blown coroutines be powerful enough for your needs?

Yes, I think they would be.  But I think with Python it's going to
be just about as hard, either way.

-Sam




From rushing at nightmare.com  Tue May 18 07:48:29 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:48:29 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <51325225@toto.iv>
Message-ID: <14144.63787.502454.111804@seattle.nightmare.com>

Aaron Watters writes:
 > Frankly, I think I thought I understood this once but now I know I
 > don't.

8^)  That's what I said when I backed into the idea via medusa a
couple of years ago.

 > How're continuations more powerful than coroutines?  And why can't
 > they be implemented using threads (and semaphores etc)?

My understanding of the original 'coroutine' (from Pascal?) was that
it allows two procedures to 'resume' each other.  The classic
coroutine example is the 'samefringe' problem: given two trees of
differing structure, are they equal in the sense that a traversal of
the leaves results in the same list?  Coroutines let you do this
efficiently, comparing leaf-by-leaf without storing the whole tree.
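
A sketch of samefringe in modern Python 3, assuming trees are
represented as nested tuples with any non-tuple value as a leaf (that
representation and the names fringe and same_fringe are choices made
for this example).  It uses generators rather than true coroutines or
continuations, but shows the leaf-by-leaf comparison without
flattening either tree into a list first.

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-tuple tree, left to right."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from fringe(child)
    else:
        yield tree


def same_fringe(t1, t2):
    """True if both trees have the same leaves in the same order."""
    missing = object()  # pads the shorter fringe so lengths are compared too
    pairs = zip_longest(fringe(t1), fringe(t2), fillvalue=missing)
    return all(a == b for a, b in pairs)  # stops at the first mismatch


print(same_fringe((1, (2, 3)), ((1, 2), 3)))  # prints True
print(same_fringe((1, (2, 3)), ((1, 2), 4)))  # prints False
```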

continuations can do coroutines, but can also be used to implement
backtracking, exceptions, threads... probably other stuff I've never
heard of or needed.

The reason that Scheme and ML are such big fans of continuations is
because they can be used to implement all these other features.  Look
at how much try/except and threads complicate other language
implementations.  It's like a super-tool-widget - if you make sure
it's in your toolbox, you can use it to build your circular saw and
lathe from scratch.

Unfortunately there aren't many good sites on the web with good
explanatory material.  The best reference I have is "Essentials of
Programming Languages".  For those that want to play with some of
these ideas using little VM's written in Python:

  http://www.nightmare.com/software.html#EOPL

-Sam




From rushing at nightmare.com  Tue May 18 07:56:37 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Mon, 17 May 1999 22:56:37 -0700 (PDT)
Subject: [Python-Dev] coroutines vs. continuations vs. threads
In-Reply-To: <13631823@toto.iv>
Message-ID: <14144.65355.400281.123856@seattle.nightmare.com>

Jeremy Hylton writes:
 > I have to admit that I'm a bit unclear on the motivation for all
 > this.  As Gordon said, the state machine approach seems like it would
 > be a good approach.

For simple problems, state machines are ideal.  Medusa uses state
machines that are built out of Python methods.  But past a certain
level of complexity, they get too hairy to understand.  A really good
example can be found in /usr/src/linux/net/ipv4.  8^)

-Sam




From rushing at nightmare.com  Tue May 18 09:05:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:05:20 -0700 (PDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <60057226@toto.iv>
Message-ID: <14145.927.588572.113256@seattle.nightmare.com>

Guido van Rossum writes:
 > Perhaps it would help to explain what a continuation actually does
 > with the run-time environment, instead of giving examples of how to
 > use them and what the result is?

This helped me a lot, and is the angle used in "Essentials of
Programming Languages":

Usually when folks refer to a 'stack', they're referring to an
*implementation* of the stack data type: really an optimization that
assumes an upper bound on stack size, and that things will only be
pushed and popped in order.

If you were to implement a language's variable and execution stacks
with actual data structures (linked lists), then it's easy to see
what's needed: the head of the list represents the current state.  As
functions exit, they pop things off the list.

The reason I brought this up (during a lull!) was that Python is
already paying all of the cost of heap-allocated frames, and it didn't
seem to me too much of a leap from there.
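
A toy model of that linked-list view, in modern Python 3.  Frame here
is a made-up stand-in and not CPython's real frame object; call, ret
and chain are hypothetical helpers.  The point of the sketch is only
that when frames live on the heap and point back at their callers,
"saving the stack" can be as cheap as keeping a reference to its head.

```python
class Frame:
    """Toy stand-in for an interpreter frame."""

    def __init__(self, name, back=None):
        self.name = name
        self.back = back    # the caller's frame, or None at the bottom
        self.locals = {}


def call(current, name):
    """'Push': the new frame simply points back at its caller."""
    return Frame(name, back=current)


def ret(current):
    """'Pop': drop the head of the list and continue in the caller."""
    return current.back


def chain(frame):
    """List frame names from the newest frame down to the bottom."""
    names = []
    while frame is not None:
        names.append(frame.name)
        frame = frame.back
    return names


top = call(call(call(None, "main"), "thing"), "helper")
saved = top              # "capture": keep a reference to the head
top = ret(ret(top))      # the live computation unwinds back to main

top.locals["a"] = 2
print(chain(top))                  # prints ['main']
print(chain(saved))                # prints ['helper', 'thing', 'main']
print(saved.back.back.locals)      # prints {'a': 2}
```

Because nothing is copied, the saved chain and the live one share the
"main" frame, so a later change to its locals is visible through both.
Whether that sharing is wanted is the shallow-versus-deep question
raised in point 2.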

 > 1. All program state is somehow contained in a single execution stack.
Yup.

 > 2. A continuation does something equivalent to making a copy of the
 > entire execution stack.
Yup.
 > I.e. are there mutable *objects* in Scheme?
 > (I know there are mutable and immutable *name bindings* -- I think.)

Yes, Scheme is pro-functional... but it has arrays, i/o, and set-cdr!,
all the things that make it 'impure'.

I think shallow copies are what's expected.  In the examples I have,
the continuation is kept in a 'register', and call/cc merely packages
it up with a little function wrapper.  You are allowed to stomp all
over lexical variables with "set!".

 > 3. Calling a continuation probably makes the saved copy of the
 > execution stack the current execution state; I presume there's also a
 > way to pass an extra argument.
Yup.
 > 4. Coroutines (which I *do* understand) are probably done by swapping
 > between two (or more) continuations.
Yup.  Here's an example in Scheme:

http://www.nightmare.com/stuff/samefringe.scm

Somewhere I have an example of coroutines being used for parsing, very
elegant.  Something like one coroutine does lexing, and passes tokens
one-by-one to the next level, which passes parsed expressions to a
compiler, or whatever.  Kinda like pipes.
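
A small sketch of that pipeline shape in modern Python 3, using
generators in place of coroutines.  The grammar (integers joined by
"+", each sum ended by ";") and the names lex and parse_sums are
invented for this example and are not the parser mentioned above.

```python
def lex(text):
    """Yield tokens one at a time: ints for digit runs, strings otherwise."""
    for word in text.split():
        yield int(word) if word.isdigit() else word


def parse_sums(tokens):
    """Pull tokens lazily and yield the value of each ';'-terminated sum."""
    total = 0
    for tok in tokens:
        if tok == ";":
            yield total
            total = 0
        elif tok != "+":
            total += tok


results = list(parse_sums(lex("1 + 2 ; 10 + 20 + 30 ;")))
print(results)  # prints [3, 60]
```

Each stage runs only when the next stage asks for a value, so no
intermediate token list is ever built.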

 > 5. Other control constructs can be done by various manipulations of
 > continuations.  I presume that in many situations the saved
 > continuation becomes the main control locus permanently, and the
 > (previously) current stack is simply garbage-collected.  Of course
 > the lazy copy makes this efficient.

Yes... I think backtracking would be an example of this.  You're doing
a search on a large space (say a chess game).  After a certain point
you want to try a previous fork, to see if it's promising, but you
don't want to throw away your current work.  Save it, then unwind back
to the previous fork, try that option out... if it turns out to be
better then toss the original.

 > If this all is close enough to the truth, I think that
 > continuations involving C stack frames are definitely out -- as Tim
 > Peters mentioned, you don't know what the stuff on the C stack of
 > extensions refers to.  (My guess would be that Scheme
 > implementations assume that any pointers on the C stack point to
 > Scheme objects, so that C stack frames can be copied and
 > conservative GC can be used -- this will never happen in Python.)

I think you're probably right here - usually there are heavy
restrictions on what kind of data can pass through the C interface.
But I know of at least one Scheme (mzscheme/PLT) that uses
conservative gc and has c/c++ interfaces. [... dig dig ...]


From rushing at nightmare.com  Tue May 18 09:17:11 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 00:17:11 -0700 (PDT)
Subject: [Python-Dev] another good motivation
Message-ID: <14145.4917.164756.300678@seattle.nightmare.com>

"Escaping the event loop: an alternative control structure for multi-threaded GUIs"

http://cs.nyu.edu/phd_students/fuchs/
http://cs.nyu.edu/phd_students/fuchs/gui.ps

-Sam




From tismer at appliedbiometrics.com  Tue May 18 15:46:53 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 15:46:53 +0200
Subject: [Python-Dev] coroutines vs. continuations vs. threads
References: <000901bea0e9$5aa2dec0$829e2299@tim>
Message-ID: <37416F4D.8E95D71A@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Aaron Watters]
> > ...
> > I guess the question of interest is why are threads insufficient?  I
> > guess they have system limitations on the number of threads or other
> > limitations that wouldn't be a problem with continuations?
> 
> Sam is mucking with thousands of simultaneous I/O-bound socket connections,
> and makes a good case that threads simply don't fly here (each one consumes
> a stack, kernel resources, etc).  It's unclear (to me) that thousands of
> continuations would be *much* better, though, by the time Christian gets
> done making thousands of copies of the Python stack chain.

Well, what he needs here are coroutines and just a single frame
object for every minithread (I think this is a "fiber"?).
If these fibers later do deep function calls before they switch,
there will of course be more frames then.
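
As a rough illustration of "one frame per minithread", here is a sketch that assumes modern Python generators as the fibers (each suspended generator keeps exactly one frame alive). The fiber and run names are hypothetical, and the scheduler is a plain round-robin, not anything from the patch under discussion:

```python
from collections import deque

def fiber(name, steps, log):
    """A minithread: does a little work, then yields to the scheduler."""
    for i in range(steps):
        log.append((name, i))
        yield  # switch point: hand control back to the scheduler

def run(fibers):
    """Round-robin scheduler: resume each fiber in turn until all finish."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue          # this fiber is done; drop it
        ready.append(current)

log = []
run([fiber("a", 2, log), fiber("b", 3, log)])
```

No OS threads and no locks are involved; a switch is just resuming another suspended frame.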

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Tue May 18 16:35:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 16:35:30 +0200
Subject: [Python-Dev] 'stackless' python?
References: <14142.40867.103424.764346@seattle.nightmare.com> <000f01bea028$1f75c360$fb9e2299@tim>  
	            <14143.56604.21827.891993@seattle.nightmare.com> <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <37417AB2.80920595@appliedbiometrics.com>


Guido van Rossum wrote:
> 
> Sam (& others),
> 
> I thought I understood what continuations were, but the examples of
> what you can do with them so far don't clarify the matter at all.
> 
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?
> 
> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
> 
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).  It also includes a code pointer for each stack frame
> indicating where the function corresponding to that stack frame is
> executing (this is the return address if there is a newer stack frame,
> or the current instruction for the newest frame).

Right. For now, this information is on the C stack for each called
function, although almost completely available in the frame chain.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.  I also expect that Scheme's semantic model
> is different than Python here -- e.g. does it matter whether deep or
> shallow copies are made?  I.e. are there mutable *objects* in Scheme?
> (I know there are mutable and immutable *name bindings* -- I think.)

To make it lazy, a gatekeeper must be put on top of the two
split frames, which catches the event that one of them
returns. It appears to me that this is the same callcc.new()
object which catches this, splitting frames when hit by a return.

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.
> 
> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.

Right, which is just two or three assignments.

> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

Yes, great. It looks like switching continuations
is no more expensive than a single Python function call.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

This would mean avoiding the creation of incompatible continuations.
A continuation may not switch to a frame chain which was created
by a different VM incarnation, since this would later on
corrupt the machine stack. One way to ensure that would be
a thread-safe function in sys, similar to sys.exc_info(),
which gives an id for the current interpreter. Continuations
living somewhere in globals would be marked by the interpreter
which created them, and would refuse to be thrown if the ids don't match.

The necessary interpreter support appears to be small:

Extend the PyFrame structure by two fields:
  - interpreter ID  (addr of some local variable would do)
  - stack pointer at current instruction.

Change the CALL_FUNCTION opcode to avoid calling eval recursively
in the case of a Python function/method; instead, save the current
frame, build the new one and start over.
RETURN will pop a frame and reload its local variables instead
of returning, as long as there is a frame to pop.
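
The CALL_FUNCTION/RETURN idea can be sketched with a toy interpreter. Everything here is hypothetical (the Frame class, the four-opcode instruction set, the run function); it is not ceval.c, only an illustration of a single loop that links frames instead of recursing:

```python
class Frame:
    def __init__(self, code, back=None):
        self.code = code      # list of (opcode, argument) pairs
        self.pc = 0           # index of the next instruction
        self.stack = []       # this frame's evaluation stack
        self.back = back      # calling frame, or None for the outermost

def run(functions, entry):
    """Execute `entry` with one loop and no recursive call to run()."""
    frame = Frame(functions[entry])
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1         # the saved pc doubles as the return address
        if op == "push":
            frame.stack.append(arg)
        elif op == "add":
            right = frame.stack.pop()
            frame.stack.append(frame.stack.pop() + right)
        elif op == "call":
            # Keep the current frame, build the new one and start over.
            frame = Frame(functions[arg], back=frame)
        elif op == "ret":
            value = frame.stack.pop()
            if frame.back is None:
                return value  # no frame left to pop: really return
            frame = frame.back            # pop the frame, reload caller state
            frame.stack.append(value)

functions = {
    "main": [("push", 1), ("call", "two"), ("add", None), ("ret", None)],
    "two": [("push", 2), ("ret", None)],
}
result = run(functions, "main")
```

Because all state lives in heap-allocated Frame objects, the host language's stack depth stays constant no matter how deep the interpreted calls go.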

I'm unclear how exceptions should be handled. Are they currently
propagated up across different C calls other than eval_code2
recursions?

ciao - chris




From jeremy at cnri.reston.va.us  Tue May 18 17:05:39 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Tue, 18 May 1999 11:05:39 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14145.927.588572.113256@seattle.nightmare.com>
References: <60057226@toto.iv>
	<14145.927.588572.113256@seattle.nightmare.com>
Message-ID: <14145.33150.767551.472591@bitdiddle.cnri.reston.va.us>

>>>>> "SR" == rushing  <rushing at nightmare.com> writes:

  SR> Somewhere I have an example of coroutines being used for
  SR> parsing, very elegant.  Something like one coroutine does
  SR> lexing, and passes tokens one-by-one to the next level, which
  SR> passes parsed expressions to a compiler, or whatever.  Kinda
  SR> like pipes.

This is the first example that's used in Structured Programming (Dahl,
Dijkstra, and Hoare).  I'd be happy to loan a copy to any of the
Python-dev people who sit nearby.

Jeremy



From tismer at appliedbiometrics.com  Tue May 18 17:31:11 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 17:31:11 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <374187BF.36CC65E7@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

(Hüstel) Actually, I inserted the "global" later. It worked as well
with a local variable, but I didn't understand it. Still don't :-)

> Or does brute-force frame-copying cause the continuation to set "a" back to
> 2 each time?

No, it doesn't. Behavior is exactly the same with or without
global. I'm not sure whether this is a bug or a feature.
I *think* 'a' as a local has a slot in the frame, so it's
actually a different 'a' living in both copies. But this
would not have worked.
Can it be that before a function call, the interpreter
turns its locals into a dict, using fast_to_locals?
That would explain it.
This is not what I think it should be! Locals need to be
copied.

> > and needs a much better interface.
> 
> Ya, like screw 'em and use threads <wink>.

Never liked threads. These fibers are so neat since
they don't need threads, no locking, and they are
available on systems without threads.

> > But finally I'm quite happy that it worked so smoothly
> > after just a couple of hours (well, about six :)
> 
> Yup!  Playing with Python internals is a treat.
> 
> to-be-continued-ly y'rs  - tim

throw(42) - chris




From skip at mojam.com  Tue May 18 17:49:42 1999
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 18 May 1999 11:49:42 -0400
Subject: [Python-Dev] Is there another way to solve the continuation problem?
Message-ID: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>

Okay, from my feeble understanding of the problem it appears that
coroutines/continuations and threads are going to be problematic at best for 
Sam's needs.  Are there other "solutions"?  We know about state machines.
They have the problem that the number of states grows exponentially (?) as
the number of state variables increases.

Can exceptions be coerced into providing the necessary structure without
botching up the application too badly?  Seems that at some point where you
need to do some I/O, you could raise an exception whose second expression
contains the necessary state to get back to where you need to be once the
I/O is ready to go.  The controller that catches the exceptions would use
select or poll to prepare for the I/O then dispatch back to the handlers
using the information from exceptions.

class IOSetup:
    pass

class WaveHands:
    """maintains exception raise info and selects one to go to next"""
    def choose_one(self, r, w, e):
        pass

    def remember(self, info):
        pass

def controller(...):
    waiters = WaveHands()
    while 1:
	r, w, e = select([...], [...], [...])
	# using r,w,e, select a waiter to call
	func, place = waiters.choose_one(r,w,e)
	try:
	    func(place)
	except IOSetup, info:
	    waiters.remember(info)


def spam_func(place):
    if place == "spam":
	# whatever I/O we needed to do is ready to go
	bytes = read(some_fd)
	process(bytes)
	# need to read some more from some_fd. args are:
	#    function, target, fd category (r, w), selectable object, 
	raise IOSetup, (spam_func, "eggs" , "r", some_fd)

    elif place == "eggs":
	# that next chunk is ready - get it and proceed...

    elif yadda, yadda, yadda...


One thread, some craftiness needed to construct things.  Seems like it might
isolate some of the statefulness to smaller functional units than a pure
state machine.  Clearly not as clean as continuations would be.  Totally
bogus?  Totally inadequate?  Maybe Sam already does things this way?
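
For comparison, here is a small runnable sketch of the same exception-as-resume-point idea in present-day Python 3. It is an assumption-laden illustration, not the code above made complete: the IOSetup fields, the log argument, and the use of socket.socketpair() as the I/O source are all invented for the example.

```python
import select
import socket

class IOSetup(Exception):
    """Raised by a handler that wants to resume at `place` once `sock` is readable."""
    def __init__(self, func, place, sock):
        super().__init__(func, place, sock)
        self.func, self.place, self.sock = func, place, sock

def controller(first_request, log):
    """Wait with select(), then dispatch back using the info from exceptions."""
    waiting = {first_request.sock: first_request}
    while waiting:
        readable, _, _ = select.select(list(waiting), [], [])
        for sock in readable:
            request = waiting.pop(sock)
            try:
                request.func(request.place, sock, log)
            except IOSetup as more:
                waiting[more.sock] = more   # remember where to resume

def spam_func(place, sock, log):
    data = sock.recv(4)                     # the I/O we waited for is ready
    log.append((place, data))
    if place == "spam":
        # Need another chunk: say where to come back to, then bail out.
        raise IOSetup(spam_func, "eggs", sock)

left, right = socket.socketpair()
left.sendall(b"12345678")
log = []
controller(IOSetup(spam_func, "spam", right), log)
left.close()
right.close()
```

The "place" string is still a hand-written state label, which is exactly the bookkeeping that continuations would make implicit.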


Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Tue May 18 19:23:08 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Tue, 18 May 1999 19:23:08 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000301bea0e9$4fd473a0$829e2299@tim>
Message-ID: <3741A1FC.E84DC926@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > ...
> > Yup. With a little counting, it was easy to survive:
> >
> > def main():
> >     global a
> >     a=2
> >     thing (5)
> >     a=a-1
> >     if a:
> >         saved.throw (0)
> 
> Did "a" really need to be global here?  I hope you see the same behavior
> without the "global a"; e.g., this Scheme:

Actually, the frame-copying was not enough to make this 
all behave correctly. Since I didn't change the interpreter,
the ceval.c incarnations still had copies to the old frames.
The only effect which I achieved with frame copying was
that the refcounts were increased correctly.

I have to remove the hardware stack copying now.
Will try to create a non-recursive version of the interpreter.

ciao - chris




From MHammond at skippinet.com.au  Wed May 19 01:16:54 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Wed, 19 May 1999 09:16:54 +1000
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <199905181549.LAA03206@cm-29-94-2.nycap.rr.com>
Message-ID: <006d01bea184$869f1480$0801a8c0@bobcat>

> Sam's needs.  Are there other "solutions"?  We know about state machines.
> They have the problem that the number of states grows exponentially (?) as
> the number of state variables increases.

Well, I can give you my feeble understanding of "IO Completion Ports", the
technique Win32 provides to "solve" this problem.

My experience is limited to how we used these in a server product designed
to maintain thousands of long-term client connections each spooling large
chunks of data (MSOffice docs - yes, that large :-).  We too could
obviously not afford a thread per connection.  Searching through NT's
documentation, completion ports are the technique they recommend for
high-performance IO, and it appears to deliver.

NT has the concept of a completion port, which in many ways is like an
"inverted semaphore".  You create a completion port with a "max number of
threads" value.  Then, for every IO object you need to use (files, sockets,
pipes etc) you "attach" it to the completion port, along with an integer
key.  This key is (presumably) unique to the file, and usually a pointer to
some structure maintaining the state of the file (i.e., connection).

The general programming model is that you have a small number of threads
(possibly 1), and a large number of io objects (eg files).  Each of these
threads is executing a state machine.  When IO is "ready" for a particular
file, one of the available threads is woken, and passed the "key"
associated with the file.  This key identifies the file, and more
importantly the state of that file.  The thread uses the state to perform
the next IO operation, then immediately goes back to sleep.  When that IO
operation completes, some other thread is woken to handle that state
change.  What makes this work of course is that _all_ IO is asynch - not a
single IO call in this whole model can afford to block.  NT provides asynch
IO natively.
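
A simulation of that programming model may make it concrete. This sketch does not call the Win32 API at all; it assumes a queue.Queue can play the role of the completion port, and the worker function, the state dictionaries and the "conn-a" style keys are hypothetical:

```python
import queue
import threading

def worker(port, states, lock):
    """Sleep on the port; when woken with a key, advance that key's state."""
    while True:
        key = port.get()              # blocks until some I/O "completes"
        if key is None:               # sentinel: shut this worker down
            port.task_done()
            return
        with lock:
            state = states[key]       # the key identifies the connection state
            state["done"] += 1
            finished = state["done"] == state["total"]
        if not finished:
            port.put(key)             # pretend the next async operation completed
        port.task_done()

# Three connections, each needing three I/O steps, served by two threads.
states = {key: {"done": 0, "total": 3} for key in ("conn-a", "conn-b", "conn-c")}
port = queue.Queue()
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(port, states, lock))
           for _ in range(2)]
for t in threads:
    t.start()
for key in states:
    port.put(key)
port.join()                           # wait until every step has been handled
for _ in threads:
    port.put(None)
for t in threads:
    t.join()
```

Any thread may pick up any connection's next step, because everything needed to continue is stored under the key rather than on a thread's stack.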

This sounds very similar to what Medusa does internally, although the NT
model provides a "thread pooling" scheme built-in.  Although our server
performed very well with a single thread and hundreds of high-volume
connections, we chose to run with a default of 5 threads here.

For those still interested, our project has the multi-threaded state
machine I described above implemented in C.  Most of that code is
responsible for spooling the client request data (possibly 100s of KB)
before handing that data off to the real server.  When the C code
transitions the client through the state of "send/get from the real
server", we actually set a different completion port.  This other
completion port wakes a thread written in Python.  So our architecture
consists of a C implemented thread-pool managing client connections, and a
different Python implemented thread pool that does the real work for each
of these client connections. (The Python side of the world is bound by the
server we are talking to, so Python performance doesnt matter as much - C
wouldnt buy enough)

This means that our state machines are not that complex.  Each "thread
pool" is managing its own, fairly simple state.  NT automatically allows
you to associate state with the IO object, and as we have multiple thread
pools, each one is simple - the one spooling client data is simple, the one
doing the actual server work is simple.  If we had to have a single,
monolithic state machine managing all aspects of the client spooling, _and_
the server work, it would be horrid.

This is all in a shrink-wrapped relatively cheap "Document Management"
product being targeted (successfully, it appears) at huge NT/Exchange
based sites.  Australia's largest Telco is implementing it, and indeed the
company has VC from Intel!  Lots of support from MS, as it helps compete
with Domino.  Not bad for a little startup - now they are wondering what to
do with this Python-thingy they now have in their product that no one else
has ever heard of; but they are planning on keeping it for now :-)
[Funnily, when they started, they didn't think they even _needed_ a server,
so I said "I'll just knock up a little one in Python", and we haven't looked
back :-]

Mark.




From tim_one at email.msn.com  Wed May 19 02:48:00 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 18 May 1999 20:48:00 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <199905180403.AAA04772@eric.cnri.reston.va.us>
Message-ID: <000701bea191$3f4d1a20$2e9e2299@tim>

[GvR]
> ...
> Perhaps it would help to explain what a continuation actually does
> with the run-time environment, instead of giving examples of how to
> use them and what the result is?

Paul Wilson (the GC guy) has a very nice-- but incomplete --intro to Scheme
and its implementation:

ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html

You can pick up a lot from that fast.  Is Steven (Majewski) on this list?
He doped most of this out years ago.

> Here's a start of my own understanding (brief because I'm on a 28.8k
> connection which makes my ordinary typing habits in Emacs very
> painful).
>
> 1. All program state is somehow contained in a single execution stack.
> This includes globals (which are simply name bindings in the bottom
> stack frame).

Better to think of name resolution following lexical links.  Lexical
closures with indefinite extent are common in Scheme, so much so that name
resolution is (at least conceptually) best viewed as distinct from execution
stacks.

Here's a key:  continuations are entirely about capturing control flow
state, and nothing about capturing binding or data state.  Indeed, mutating
bindings and/or non-local data are the ways distinct invocations of a
continuation communicate with each other, and for this reason true
functional languages generally don't support continuations of the call/cc
flavor.
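
A loose analogy in present-day Python, under the assumption that a suspended generator can stand in for a saved control point (it is one-shot, so it is much weaker than a call/cc continuation). The task function and the shared dictionary are hypothetical; the point is only that resuming restores where the code was, not what the data was:

```python
def task(shared):
    seen = []
    seen.append(shared["a"])   # first observation
    yield seen                 # control state is saved here...
    seen.append(shared["a"])   # ...but `shared` is read afresh on resume
    yield seen

shared = {"a": 2}
gen = task(shared)
first = list(next(gen))        # snapshot of what the task saw before suspending
shared["a"] = 1                # mutate non-local data while the task is suspended
second = next(gen)             # resume: same control point, new data
```

The mutation made while the task was suspended is visible after the resume, which is the channel through which separate resumptions communicate.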

> It also includes a code pointer for each stack frame indicating where
> the function corresponding to that stack frame is executing (this is
> the return address if there is a newer stack frame, or the current
> instruction for the newest frame).

Yes, although the return address is one piece of information in the current
frame's continuation object -- continuations are used internally for
"regular calls" too.  When a function returns, it passes control thru its
continuation object.  That process restores-- from the continuation
object --what the caller needs to know (in concept:  a pointer to *its*
continuation object, its PC, its name-resolution chain pointer, and its
local eval stack).

Another key point:  a continuation object is immutable.

> 2. A continuation does something equivalent to making a copy of the
> entire execution stack.  This can probably be done lazily.  There are
> probably lots of details.

The point of the above is to get across that for Scheme-calling-Scheme,
creating a continuation object copies just a small, fixed number of pointers
(the current continuation pointer, the current name-resolution chain
pointer, the PC), plus the local eval stack.  This is for a "stackless"
interpreter that heap-allocates name-mapping and execution-frame and
continuation objects.  Half the literature is devoted to optimizing one or
more of those away in special cases (e.g., for continuations provably
"up-level", using a stack + setjmp/longjmp instead).

> I also expect that Scheme's semantic model is different than Python
> here -- e.g. does it matter whether deep or shallow copies are made?
> I.e. are there mutable *objects* in Scheme? (I know there are mutable
> and immutable *name bindings* -- I think.)

Same as Python here; Scheme isn't a functional language; has mutable
bindings and mutable objects; any copies needed should be shallow, since
it's "a feature" that invoking a continuation doesn't restore bindings or
object values (see above re communication).

> 3. Calling a continuation probably makes the saved copy of the
> execution stack the current execution state; I presume there's also a
> way to pass an extra argument.

Right, except "stack" is the wrong mental model in the presence of
continuations; it's a general rooted graph (A calls B, B saves a
continuation pointing back to A, B goes on to call A, A saves a continuation
pointing back to B, etc).  If the explicitly saved continuations are never
*invoked*, control will eventually pop back to the root of the graph, so in
that sense there's *a* stack implicit at any given moment.

> 4. Coroutines (which I *do* understand) are probably done by swapping
> between two (or more) continuations.
>
> 5. Other control constructs can be done by various manipulations of
> continuations.  I presume that in many situations the saved
> continuation becomes the main control locus permanently, and the
> (previously) current stack is simply garbage-collected.  Of course the
> lazy copy makes this efficient.

There's much less copying going on in Scheme-to-Scheme than you might think;
other than that, right on.

> If this all is close enough to the truth, I think that continuations
> involving C stack frames are definitely out -- as Tim Peters
> mentioned, you don't know what the stuff on the C stack of extensions
> refers to.  (My guess would be that Scheme implementations assume that
> any pointers on the C stack point to Scheme objects, so that C stack
> frames can be copied and conservative GC can be used -- this will
> never happen in Python.)

"Scheme" has become a generic term covering dozens of implementations with
varying semantics, and a quick tour of the web suggests that cross-language
Schemes generally put severe restrictions on continuations across language
boundaries.  Most popular seems to be to outlaw them by decree.

> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

I'd like to go back to examples of what they'd be used for <wink> -- but
fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
and "lambdaness" and syntax-extension facilities, I'm unsure they're going
to work out reasonably in Python practice; it's not enough that they can be
very useful in Scheme, and Sam is highly motivated to go to extremes here.

give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim





From tismer at appliedbiometrics.com  Wed May 19 03:10:15 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 03:10:15 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea191$3f4d1a20$2e9e2299@tim>
Message-ID: <37420F77.48E9940F@appliedbiometrics.com>


Tim Peters wrote:
...

> > Continuations involving only Python stack frames might be supported,
> > if we can agree on the sharing / copying semantics.  This is where
> > I don't know enough (see questions at #2 above).
> 
> I'd like to go back to examples of what they'd be used for <wink> -- but
> fully fleshed out.  In the absence of Scheme's ubiquitous lexical closures
> and "lambdaness" and syntax-extension facilities, I'm unsure they're going
> to work out reasonably in Python practice; it's not enough that they can be
> very useful in Scheme, and Sam is highly motivated to go to extremes here.
> 
> give-me-a-womb-and-i-still-won't-give-birth-ly y'rs  - tim

I've put quite a few hours into a non-recursive ceval.c
already. Should I continue? At least this would be a little
improvement, even if the continuation thing is never born.

- chris




From rushing at nightmare.com  Wed May 19 04:52:04 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Tue, 18 May 1999 19:52:04 -0700 (PDT)
Subject: [Python-Dev] Is there another way to solve the continuation problem?
In-Reply-To: <101382377@toto.iv>
Message-ID: <14146.8395.754509.591141@seattle.nightmare.com>

Skip Montanaro writes:
 > Can exceptions be coerced into providing the necessary structure
 > without botching up the application too badly?  Seems that at some
 > point where you need to do some I/O, you could raise an exception
 > whose second expression contains the necessary state to get back to
 > where you need to be once the I/O is ready to go.  The controller
 > that catches the exceptions would use select or poll to prepare for
 > the I/O then dispatch back to the handlers using the information
 > from exceptions.

 > [... code ...]

Well, you just re-invented the 'Reactor' pattern! 8^)

http://www.cs.wustl.edu/~schmidt/patterns-ace.html

 > One thread, some craftiness needed to construct things.  Seems like
 > it might isolate some of the statefulness to smaller functional
 > units than a pure state machine.  Clearly not as clean as
 > continuations would be.  Totally bogus?  Totally inadequate?  Maybe
 > Sam already does things this way?

What you just described is what Medusa does (well, actually, 'Python'
does it now, because the two core libraries that implement this are
now in the library - asyncore.py and asynchat.py).  asyncore doesn't
really use exceptions exactly that way, and asynchat allows you to add 
another layer of processing (basically, dividing the input into
logical 'lines' or 'records' depending on a 'line terminator').
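
The "divide the input by a line terminator" layer can be sketched without asynchat itself. The LineCollector class below is hypothetical and only mimics the idea; it assumes input arrives in arbitrary chunks, as it would from a non-blocking socket:

```python
class LineCollector:
    """Accumulate raw chunks and split them into terminator-delimited records."""
    def __init__(self, terminator=b"\r\n"):
        self.terminator = terminator
        self.buffer = b""      # bytes received but not yet terminated
        self.records = []      # complete records, terminator stripped

    def feed(self, data):
        self.buffer += data
        while True:
            head, sep, tail = self.buffer.partition(self.terminator)
            if not sep:        # no complete record yet; wait for more data
                break
            self.records.append(head)
            self.buffer = tail

collector = LineCollector()
# Chunk boundaries deliberately fall mid-line and mid-terminator.
for chunk in (b"GET / HT", b"TP/1.0\r", b"\nHost: x\r\n\r\npar"):
    collector.feed(chunk)
```

The protocol code above this layer then only ever sees whole records, regardless of how the network fragmented them.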

The same technique is at the heart of many well-known network servers,
including INND, BIND, X11, Squid, etc..  It's really just a state
machine underneath (with python functions or methods implementing the
'states').  As long as things don't get too complex.  Python
simplifies things enough to allow one to 'push the difficulty
envelope' a bit further than one could reasonably tolerate in C.  For
example, Squid implements async HTTP (server and client, because it's
a proxy) - but stops short of trying to implement async FTP.  Medusa
implements async FTP, but it's the largest file in the Medusa
distribution, weighing in at a hefty 32KB.

The hard part comes when you want to plug different pieces and
protocols together.  For example, building a simple HTTP or FTP server
is relatively easy, but building an HTTP server *that proxied to an
FTP server* is much more difficult.  I've done these kinds of things,
viewing each as a challenge; but past a certain point it boggles.

The paper I posted about earlier by Matthew Fuchs has a really good
explanation of this, but in the context of GUI event loops... I think
it ties in neatly with this discussion because at the heart of any X11
app is a little guy manipulating a file descriptor.

-Sam




From tim_one at email.msn.com  Wed May 19 07:41:39 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:39 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <14144.61765.308962.101884@seattle.nightmare.com>
Message-ID: <000b01bea1ba$443a1a00$2e9e2299@tim>

[Sam]
> ...
> Except that since the escape procedure is 'first-class' it can be
> stored away and invoked (and reinvoked) later.  [that's all that
> 'first-class' means: a thing that can be stored in a variable,
> returned from a function, used as an argument, etc..]
>
> I've never seen a let/cc that wasn't full-blown, but it wouldn't
> surprise me.

The let/cc's in question were specifically defined to create continuations
valid only during let/cc's dynamic extent, so that, sure, you could store
them away, but trying to invoke one later could be an error.  It's in that
sense I meant they weren't "first class".

Other flavors of Scheme appear to call this concept "weak continuation", and
use a different verb to invoke it (like call-with-escaping-continuation, or
call/ec).  Suspect the let/cc oddballs I found were simply confused
implementations (there are a lot of amateur Scheme implementations out
there!).

>> Would full-blown coroutines be powerful enough for your needs?

> Yes, I think they would be.  But I think with Python it's going to
> be just about as hard, either way.

Most people on this list are comfortable with coroutines already because
they already understand them -- Jeremy can even reach across the hall and
hand Guido a helpful book <wink>.  So pondering coroutines increases the
number of brain cells willing to think about the implementation.

continuation-examples-leave-people-still-going-"huh?"-after-an-
    hour-of-explanation-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:45 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3741A1FC.E84DC926@appliedbiometrics.com>
Message-ID: <000e01bea1ba$47fe7500$2e9e2299@tim>

[Christian Tismer]
>>> ...
>>> Yup. With a little counting, it was easy to survive:
>>>
>>> def main():
>>>     global a
>>>     a=2
>>>     thing (5)
>>>     a=a-1
>>>     if a:
>>>         saved.throw (0)

[Tim]
>> Did "a" really need to be global here?  I hope you see the same behavior
>> without the "global a";
[which he does, but for mysterious reasons]

[Christian]
> Actually, the frame-copying was not enough to make this
> all behave correctly. Since I didn't change the interpreter,
> the ceval.c incarnations still had copies to the old frames.
> The only effect which I achieved with frame copying was
> that the refcounts were increased correctly.

All right!  Now you're closer to the real solution <wink>; i.e., copying
wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
when we entered main originally a set of bindings was created for its
locals, and it is that very same set of bindings to which the continuation
returns.  So the continuation *should* reuse them -- making a copy of the
locals is semantically hosed.

This is clearer in Scheme because its "stack" holds *only* control-flow info
(bindings follow a chain of static links, independent of the current "call
stack"), so there's no temptation to run off copying bindings too.

elegant-and-baffling-for-the-price-of-one<wink>-ly y'rs  - tim





From tim_one at email.msn.com  Wed May 19 07:41:56 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 01:41:56 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37420F77.48E9940F@appliedbiometrics.com>
Message-ID: <001301bea1ba$4eb498c0$2e9e2299@tim>

[Christian Tismer]
> I've put quite many hours into a non-recursive ceval.c
> already.

Does that mean 6 or 600 <wink>?

> Should I continue? At least this would be a little improvement, also
> if the continuation thing will not be born. ?

Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
so my belief is it's worth pursuing.

but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim





From arw at ifu.net  Wed May 19 15:04:53 1999
From: arw at ifu.net (Aaron Watters)
Date: Wed, 19 May 1999 09:04:53 -0400
Subject: [Python-Dev] continuations and C extensions?
Message-ID: <3742B6F5.C6CB7313@ifu.net>

the immutable GvR intones:
> Continuations involving only Python stack frames might be supported,
> if we can agree on the sharing / copying semantics.  This is where
> I don't know enough (see questions at #2 above).

What if there are native C calls mixed in (eg, list.sort calls back to
myclass.__cmp__ which decides to do a call/cc).  One of the really
big advantages of Python in my book is the relative simplicity of
embedding and extensions, and this is generally one of the failings of
lisp implementations.  I understand lots of scheme implementations purport
to be extendible and embeddable, but in practice you can't do it with
*existing* code -- there is always a show stopper involving having to
change the way some Oracle library which you don't have the source for
does memory management or something... I've known several grad students
who have been bitten by this...  I think having to unroll the C stack
safely might be one problem area.

With, eg, a netscape nsapi embedding you can actually get into netscape
code calls my code calls netscape code calls my code... suspends in a
continuation?  How would that work?  [my ignorance is torment!]

Threading and extensions are probably also problematic, but at least
it's better understood, I think.  Just kvetching.  Sorry.
   -- Aaron Watters

ps: Of course there are valid reasons and excellent advantages
  to having continuations, but it's also interesting to consider the
  possible cost.  There ain't no free lunch.





From tismer at appliedbiometrics.com  Wed May 19 21:30:18 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 21:30:18 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
Message-ID: <3743114A.220FFA0B@appliedbiometrics.com>


Tim Peters wrote:
...
> [Christian]
> > Actually, the frame-copying was not enough to make this
> > all behave correctly. Since I didn't change the interpreter,
> > the ceval.c incarnations still had copies to the old frames.
> > The only effect which I achieved with frame copying was
> > that the refcounts were increased correctly.
> 
> All right!  Now you're closer to the real solution <wink>; i.e., copying
> wasn't really needed here, but keeping stuff alive was.  In Scheme terms,
> when we entered main originally a set of bindings was created for its
> locals, and it is that very same set of bindings to which the continuation
> returns.  So the continuation *should* reuse them -- making a copy of the
> locals is semantically hosed.

I tried the most simple thing, and this seemed to be duplicating
the current state of the machine. The frame holds the stack,
and references to all objects.
By chance, the locals are not in a dict, but unpacked into
the frame. (Sometimes I agree with Guido, that optimization
is considered harmful :-)

> This is clearer in Scheme because its "stack" holds *only* control-flow info
> (bindings follow a chain of static links, independent of the current "call
> stack"), so there's no temptation to run off copying bindings too.

The Python stack, besides its intermingledness with the machine
stack, is basically its chain of frames. The value stack pointer
still hides in the machine stack, but that's easy to change.
So the real Scheme-like part is this chain, methinks, with
the current bytecode offset and value stack info.

Making a copy of this in a restartable way means to increase
the refcount of all objects in a frame. Would it be correct
to undo the effect of fast locals before splitting, and redoing
it on activation?

Or do I need to rethink the whole structure? What should
be natural for Python, if at all?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Wed May 19 21:46:49 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 15:46:49 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
	<3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  [Tim Peters]
  >> This is clearer in Scheme because its "stack" holds *only*
  >> control-flow info (bindings follow a chain of static links,
  >> independent of the current "call stack"), so there's no
  >> temptation to run off copying bindings too.

  CT> The Python stack, besides its intermingledness with the machine
  CT> stack, is basically its chain of frames. The value stack pointer
  CT> still hides in the machine stack, but that's easy to change.  So
  CT> the real Scheme-like part is this chain, methinks, with the
  CT> current bytecode offset and value stack info.

  CT> Making a copy of this in a restartable way means to increase the
  CT> refcount of all objects in a frame. Would it be correct to undo
  CT> the effect of fast locals before splitting, and redoing it on
  CT> activation?

Wouldn't it be easier to increase the refcount on the frame object?
Then you wouldn't need to worry about the refcounts on all the objects
in the frame, because they would only be decrefed when the frame is
deallocated. 

It seems like the two other things you would need are some way to get
a copy of the current frame and a means to invoke eval_code2 with an
already existing stack frame instead of a new one.
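
A minimal sketch of the "hold a reference to the frame" idea, assuming a
modern CPython 3 (sys._getframe is a CPython-specific helper, and make_frame
and payload are names invented here).  It only shows that one reference to
the frame keeps everything the frame refers to alive; it says nothing about
resuming the frame.

```python
import sys

def make_frame():
    payload = ["kept", "alive"]
    return sys._getframe()      # hand back this call's own frame object

frame = make_frame()            # the call has returned, but we hold its frame
kept = frame.f_locals["payload"]
print(frame.f_code.co_name, kept)
```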

(This sounds too simple, so it's obviously wrong.  I'm just not sure
where.  Is the problem that you really need a separate stack/graph to
hold the frames?  If we leave them on the Python stack, it could be
hard to dis-entangle value objects from control objects.)

Jeremy



From tismer at appliedbiometrics.com  Wed May 19 22:10:16 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 22:10:16 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000e01bea1ba$47fe7500$2e9e2299@tim>
		<3743114A.220FFA0B@appliedbiometrics.com> <14147.4976.608139.212336@bitdiddle.cnri.reston.va.us>
Message-ID: <37431AA8.BC77C615@appliedbiometrics.com>


Jeremy Hylton wrote:

[TP+CT about frame copies et al]

> Wouldn't it be easier to increase the refcount on the frame object?
> Then you wouldn't need to worry about the refcounts on all the objects
> in the frame, because they would only be decrefed when the frame is
> deallocated.

Well, the frame is supposed to be run twice, since there are
two incarnations of interpreters working on it: The original one,
and later, when it is thrown, another one (or the same, but, in
principle). 
The frame could have been in any state, with a couple
of objects on the stack. My splitting function can be invoked
in some nested context, so I have a current opcode position,
and a current stack position.
Running this once leaves the stack empty, since all the objects are
decrefed. Running this a second time gives a GPF, since the stack is
empty.
Therefore, I made a copy which means to create a duplicate frame
with an extra refcount for all the objects. This makes sure
that both can be restarted at any time.

> It seems like the two other things you would need are some way to get
> a copy of the current frame and a means to invoke eval_code2 with an
> already existing stack frame instead of a new one.

Well, that's exactly what I'm working on.

> (This sounds too simple, so it's obviously wrong.  I'm just not sure
> where.  Is the problem that you really need a separate stack/graph to
> hold the frames?  If we leave them on the Python stack, it could be
> hard to dis-entangle value objects from control objects.)

Oh, perhaps I should explain it a bit clearer?
What did you mean by the Python stack? The hardware machine stack?

What do we have at the moment:
The stack is the linked list of frames. Every frame has a
local Python evaluation stack. Calls of Python functions produce
a new frame, and the old one is put beneath. This is the control
stack. The additional info on the hardware stack happens to be
a parallel friend of this chain, and currently holds extra info,
but this is an artifact. Adding the current Python stack level
to the frame makes the hardware stack totally unnecessary.
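
The "linked list of frames" can be inspected from Python itself.  The sketch
below assumes a modern CPython 3 and uses the CPython-specific
sys._getframe; chain_names, inner and outer are names made up for the
illustration.

```python
import sys

def chain_names():
    # Follow f_back from the current frame towards the outermost caller.
    frame = sys._getframe()
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def inner():
    return chain_names()

def outer():
    return inner()

names = outer()
print(names[:3])
```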

There is a possible speed loss, anyway.
Today, the recursive call of eval_code2 is optimized and quite
fast. The non-recursive version will have to copy variables
in and out from the frames, instead, so there is of course
a little speed penalty to pay.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Wed May 19 23:38:07 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 May 1999 23:38:07 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
Message-ID: <37432F3F.2694DA0E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian Tismer]
> > I've put quite many hours into a non-recursive ceval.c
> > already.
> 
> Does that mean 6 or 600 <wink>?

6, or 10, or 20, if I count the time from the first
start with Sam's code, maybe.

> 
> > Should I continue? At least this would be a little improvement, also
> > if the continuation thing will not be born. ?
> 
> Guido wanted to move in the "flat interpreter" direction for Python2 anyway,
> so my belief is it's worth pursuing.
> 
> but-then-i-flipped-a-coin-with-two-heads-ly y'rs  - tim

Right. Whose faces? :-)

On the stackless thing, what should I do?
I started to insert minimum patches, but it turns out
that I have to change frames a little (extending).

I can make quite small changes to the interpreter to replace
the recursive calls, but this involves extra flags in some cases,
where the interpreter is called the first time and so on.

What has more probability to be included into a future Python:
Tweaking the current thing only minimally, to make it as similar
as possible to the former?
Or do as much redesign as I think is needed to do it in
a clean way. This would mean to split eval_code2 into two functions,
where one is the interpreter kernel, and one is the frame manager.

There are also other places which do quite deep function calls
and finally call eval_code2. I think these should return a frame
object now. I could convince them to call or return frame,
depending on a flag, but it would be clean to rename the functions,
let them always deal with frames, and put the original function
on top of it.

In short, I can do larger changes which clean this all a bit up,
or I can make small changes which are more tricky to grasp,
but give just small diffs.

How to touch untouchable code the best? :-)

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Wed May 19 23:49:38 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Wed, 19 May 1999 17:49:38 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37432F3F.2694DA0E@appliedbiometrics.com>
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
	<37432F3F.2694DA0E@appliedbiometrics.com>
Message-ID: <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>

I think it makes sense to avoid being obscure or unclear in order to
minimize the size of the patch or the diff.  Realistically, it's
unlikely that anything like your original patch is going to make it
into the CVS tree.  Its primary value is as proof of concept and as
code that the rest of us can try out.  If you make large changes, but
they are clearer, you'll help us out a lot.

We can worry about minimizing the impact of the changes on the
codebase after everyone has figured out what's going on and
agrees that it's worth doing.

feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
Jeremy




From tismer at appliedbiometrics.com  Thu May 20 00:25:20 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 00:25:20 +0200
Subject: [Python-Dev] 'stackless' python?
References: <001301bea1ba$4eb498c0$2e9e2299@tim>
		<37432F3F.2694DA0E@appliedbiometrics.com> <14147.12613.88669.456608@bitdiddle.cnri.reston.va.us>
Message-ID: <37433A50.31E66CB1@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> I think it makes sense to avoid being obscure or unclear in order to
> minimize the size of the patch or the diff.  Realistically, it's
> unlikely that anything like your original patch is going to make it
> into the CVS tree.  Its primary value is as proof of concept and as
> code that the rest of us can try out.  If you make large changes, but
> they are clearer, you'll help us out a lot.

Many many thanks. This is good advice.
I will make absolutely clear what's going on, keep
parts untouched as possible, cut out parts which must
change, and I will not look into speed too much.

Better have a function call more and a bit less optimization,
but a clear and rock-solid introduction of a concept.

> We can worry about minimizing the impact of the changes on the
> codebase after everyone has figured out what's going on and
> agrees that it's worth doing.
> 
> feeling-much-more-confident-because-I-didn't-say-continuation-ly yr's,
> Jeremy

Hihi - the new little slot with local variables of the 
interpreter happens to have the name "continuation".
Maybe I'd better rename it to "activation record"?

Now, there is no longer a recursive call. Instead, a frame
object is returned, which is waiting to be activated
by a dispatcher.

Some more ideas are popping up. Right now, only the recursive
calls can vanish. Callbacks from C code which is called by
the interpreter which is called by... are still a problem.

But it might perhaps vanish completely. We have to see
how much the cost is. But if I can manage to let the interpreter
duck and cover also on every call to a builtin? The interpreter
again returns to the dispatcher which then calls the builtin.
Well, if that builtin happens to call to the interpreter again,
it will be a dispatcher again. The machine stack grows a little,
but since everything is saved in the frames, these stacks are
no longer related. This means, the principle works with existing
extension modules, since interpreter-world and C-stack world
are decoupled.
To avoid stack growth, of course a number of builtins would
be better changed, but it is no must in the first place.
execfile for instance is a candidate which needn't call the
interpreter. It could equally parse the file, generate the
code object, build a frame and just return it. This is what
the dispatcher likes: returned frames are put on the chain
and fired.
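
A toy model of the dispatcher idea, in modern Python 3 and not taken from
any actual patch: instead of calling itself, a function returns a
description of the next call, and a flat loop "fires" it.  Plain tuples
stand in for frames here, and countdown and dispatch are invented names.
It only covers the case where nothing remains to be done after the call.

```python
def countdown(n, acc=0):
    # Instead of recursing, return a description of the next call.
    if n == 0:
        return ("done", acc)
    return ("call", countdown, (n - 1, acc + n))

def dispatch(func, args):
    # The dispatcher loop: fire whatever the last step handed back.
    state = ("call", func, args)
    while state[0] == "call":
        _, func, args = state
        state = func(*args)
    return state[1]

# 100000 nested calls would blow the default recursion limit;
# here the C stack never grows beyond one call to countdown.
total = dispatch(countdown, (100000,))
print(total)
```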

waah, my bus - running - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tim_one at email.msn.com  Thu May 20 01:56:33 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 19:56:33 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000701bea253$3a182a00$179e2299@tim>

I'm home sick today, so tortured myself <0.9 wink>.

Sam mentioned using coroutines to compare the fringes of two trees, and I
picked a simpler problem:  given a nested list structure, generate the leaf
elements one at a time, in left-to-right order.  A solution to Sam's problem
can be built on that, by getting a generator for each tree and comparing the
leaves a pair at a time until there's a difference.
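
For readers on a much later Python (3.3 or newer, which has generators and
"yield from", neither of which existed when this was written), the
"generator per tree, compare a pair at a time" plan might be sketched as
follows.  The names fringe, same_fringe and _MISSING are invented for the
sketch.

```python
from itertools import zip_longest

def fringe(node):
    # Yield the leaves of a nested list, left to right, one at a time.
    if isinstance(node, list):
        for child in node:
            yield from fringe(child)
    else:
        yield node

_MISSING = object()   # marks "this tree ran out of leaves first"

def same_fringe(a, b):
    # Pull one leaf from each tree per step and stop at the first mismatch.
    for x, y in zip_longest(fringe(a), fringe(b), fillvalue=_MISSING):
        if x is _MISSING or y is _MISSING or x != y:
            return False
    return True

tree = [[1, [[2, 3]]], [4], [], [[[5]], 6]]
print(list(fringe(tree)))
print(same_fringe(tree, [1, 2, [3, 4], [[5, 6]]]))
```

Because both generators are lazy, neither fringe is built in full when the
trees differ early.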

Attached are solutions in Icon, Python and Scheme.  I have the least
experience with Scheme, but browsing around didn't find a better Scheme
approach than this.

The Python solution is the least satisfactory, using an explicit stack to
simulate recursion by hand; if you didn't know the routine's purpose in
advance, you'd have a hard time guessing it.

The Icon solution is very short and simple, and I'd guess obvious to an
average Icon programmer.  It uses the subset of Icon ("generators") that
doesn't require any C-stack trickery.  However, alone of the three, it
doesn't create a function that could be explicitly called from several
locations to produce "the next" result; Icon's generators are tied into
Icon's unique control structures to work their magic, and breaking that
connection requires moving to full-blown Icon coroutines.  It doesn't need
to be that way, though.

The Scheme solution was the hardest to write, but is a largely mechanical
transformation of a recursive fringe-lister that constructs the entire
fringe in one shot.  Continuations are used twice:  to enable the recursive
routine to resume itself where it left off, and to get each leaf value back
to the caller.  Getting that to work required rebinding non-local
identifiers in delicate ways.  I doubt the intent would be clear to an
average Scheme programmer.

So what would this look like in Continuation Python?  Note that each place
the Scheme says "lambda" or "letrec", it's creating a new lexical scope, and
up-level references are very common.  Two functions are defined at top
level, but seven more at various levels of nesting; the latter can't be
pulled up to the top because they refer to vrbls local to the top-level
functions.  Another (at least initially) discouraging thing to note is that
Scheme schemes for hiding the pain of raw call/cc often use Scheme's macro
facilities.

may-not-be-as-fun-as-it-sounds<wink>-ly y'rs  - tim

Here's the Icon:

procedure main()
    x := [[1, [[2, 3]]], [4], [], [[[5]], 6]]
    every writes(fringe(x), " ")
    write()
end

procedure fringe(node)
    if type(node) == "list" then
        suspend fringe(!node)
    else
        suspend node
end

Here's the Python:

from types import ListType

class Fringe:
    def __init__(self, value):
        self.stack = [(value, 0)]

    def __getitem__(self, ignored):
        while 1:
            # find topmost pending list with something to do
            while 1:
                if not self.stack:
                    raise IndexError
                v, i = self.stack[-1]
                if i < len(v):
                    break
                self.stack.pop()

            this = v[i]
            self.stack[-1] = (v, i+1)
            if type(this) is ListType:
                self.stack.append((this, 0))
            else:
                break

        return this

testcase = [[1, [[2, 3]]], [4], [], [[[5]], 6]]

for x in Fringe(testcase):
    print x,
print

Here's the Scheme:

(define list->generator
  ; Takes a list as argument.
  ; Returns a generator g such that each call to g returns
  ; the next element in the list's symmetric-order fringe.
  (lambda (x)
    (letrec {(produce-value #f) ; set to return-to continuation
             (looper
              (lambda (x)
                (cond
                  ((null? x) 'nada) ; ignore null
                  ((list? x)
                   (looper (car x))
                   (looper (cdr x)))
                  (else
                   ; want to produce this non-list fringe elt,
                   ; and also resume here
                   (call/cc
                    (lambda (here)
                      (set! getnext
                            (lambda () (here 'keep-going)))
                      (produce-value x)))))))
             (getnext
              (lambda ()
                (looper x)
                ; have to signal end of sequence somehow;
                ; assume false isn't a legitimate fringe elt
                (produce-value #f)))}

      ; return niladic function that returns next value
      (lambda ()
        (call/cc
         (lambda (k)
           (set! produce-value k)
           (getnext)))))))

(define display-fringe
  (lambda (x)
    (letrec ((g (list->generator x))
             (thiselt #f)
             (looper
              (lambda ()
                (set! thiselt (g))
                (if thiselt
                    (begin
                      (display thiselt) (display " ")
                      (looper))))))
      (looper))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))

(display-fringe test-case)





From MHammond at skippinet.com.au  Thu May 20 02:14:24 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Thu, 20 May 1999 10:14:24 +1000
Subject: [Python-Dev] Interactive Debugging of Python
Message-ID: <008b01bea255$b80cf790$0801a8c0@bobcat>

All this talk about stack frames and manipulating them at runtime has
reminded me of one of my biggest gripes about Python.  When I say "biggest
gripe", I really mean "biggest surprise" or "biggest shame".

That is, Python is very interactive and dynamic.  However, when I am
debugging Python, it seems to lose this.  There is no way for me to
effectively change a running program.  Now with VC6, I can do this with C.
Although it is slow and a little dumb, I can change the C side of my Python
world while my program is running, but not the Python side of the world.

I'm wondering how feasible it would be to change Python code _while_ running
under the debugger.  Presumably this would require a way of recompiling the
current block of code, patching this code back into the object, and somehow
tricking the stack frame to use this new block of code; even if a first-cut
had to restart the block or somesuch...
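
As a rough sketch of the "patch the code back into the object" step,
assuming a modern Python 3 where a function's __code__ attribute can be
assigned (report and report_v2 are names made up here):

```python
def report():
    return "old behaviour"

def report_v2():
    return "new behaviour"

before = report()
report.__code__ = report_v2.__code__   # patch new code into the existing object
after = report()
print(before, "->", after)
```

This only affects calls made after the swap.  A frame that is already
executing the old code keeps running it, which is the harder half of the
problem described above.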

Any thoughts on this?

Mark.




From tim_one at email.msn.com  Thu May 20 04:41:03 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Wed, 19 May 1999 22:41:03 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <3743114A.220FFA0B@appliedbiometrics.com>
Message-ID: <000901bea26a$34526240$179e2299@tim>

[Christian Tismer]
> I tried the most simple thing, and this seemed to be duplicating
> the current state of the machine. The frame holds the stack,
> and references to all objects.
> By chance, the locals are not in a dict, but unpacked into
> the frame. (Sometimes I agree with Guido, that optimization
> is considered harmful :-)

I don't see that the locals are a problem here -- provided you simply leave
them alone <wink>.

> The Python stack, besides its intermingledness with the machine
> stack, is basically its chain of frames.

Right.

> The value stack pointer still hides in the machine stack, but
> that's easy to change.

I'm not sure what "value stack" means here, or "machine stack".  The latter
means the C stack?  Then I don't know which values you have in mind that are
hiding in it (the locals are, as you say, unpacked in the frame, and the
evaluation stack too).  By "evaluation stack" I mean specifically
f->f_valuestack; the current *top* of stack pointer (specifically
stack_pointer) lives in the C stack -- is that what we're talking about?
Whichever, when we're talking about the code, let's use the names the code
uses <wink>.

> So the real Scheme-like part is this chain, methinks, with
> the current bytecode offset and value stack info.

Curiously, f->f_lasti is already materialized every time we make a call, in
order to support tracing.  So if capturing a continuation is done via a
function call (hard to see any other way it could be done <wink>), a
bytecode offset is already getting saved in the frame object.
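
The saved offset is visible from Python.  A small sketch, assuming a modern
CPython 3 with the CPython-specific sys._getframe (where_am_i and caller
are invented names):

```python
import sys

def where_am_i():
    # f_lasti of the *caller's* frame: the offset of the bytecode
    # instruction that is making this very call.
    return sys._getframe(1).f_lasti

def caller():
    first = where_am_i()
    second = where_am_i()
    return first, second

first, second = caller()
print(first, second)
```

The exact numbers depend on the interpreter version, but the second call
sits later in caller's bytecode, so it reports a larger offset.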

> Making a copy of this in a restartable way means to increase
> the refcount of all objects in a frame.

You later had a vision of splitting the frame into two objects -- I think.
Whichever part the locals live in should not be copied at all, but merely
have its (single) refcount increased.  The other part hinges on details of
your approach I don't know.  The nastiest part seems to be f->f_valuestack,
which conceptually needs to be (shallow) copied in the current frame and in
all other frames reachable from the current frame's continuation (the chain
rooted at f->f_back today); that's the sum total (along with the same
frames' bytecode offsets) of capturing the control flow state.
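
A toy model of that capture rule, in modern Python 3.  ToyFrame and capture
are invented for the sketch; only the attribute names are borrowed from the
real frame object, and nothing here touches real interpreter frames.

```python
class ToyFrame:
    def __init__(self, name, back=None):
        self.name = name
        self.f_locals = {}        # bindings: shared, never copied
        self.f_valuestack = []    # pending operands: shallow-copied on capture
        self.f_lasti = 0          # bytecode offset to resume at
        self.f_back = back        # caller's frame

def capture(frame):
    """Snapshot control state for frame and every frame it returns to."""
    if frame is None:
        return None
    snap = ToyFrame(frame.name, capture(frame.f_back))
    snap.f_locals = frame.f_locals                  # same object on purpose
    snap.f_valuestack = list(frame.f_valuestack)    # new list, same elements
    snap.f_lasti = frame.f_lasti
    return snap

outer = ToyFrame("main")
outer.f_locals["a"] = 2
inner = ToyFrame("thing", back=outer)
inner.f_valuestack.append("operand")
inner.f_lasti = 12

k = capture(inner)
inner.f_valuestack.pop()          # the original runs on and empties its stack
outer.f_locals["a"] = 1           # ... and rebinds a local
print(k.f_valuestack, k.f_back.f_locals["a"])
```

The snapshot still has its operand, so it can be resumed, yet it sees the
rebinding of "a", which is the behavior the "a=a-1" example earlier in the
thread relies on.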

> Would it be correct to undo the effect of fast locals before
> splitting, and redoing it on activation?

Unsure what splitting means, but in any case I can't conceive of a reason
for doing anything to the locals.  Their values aren't *supposed* to get
restored upon continuation invocation, so there's no reason to do anything
with their values upon continuation creation either.  Right?  Or are we
talking about different things?

almost-as-good-as-pantomimem<wink>-ly y'rs  - tim





From rushing at nightmare.com  Thu May 20 06:04:20 1999
From: rushing at nightmare.com (rushing at nightmare.com)
Date: Wed, 19 May 1999 21:04:20 -0700 (PDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <50692631@toto.iv>
Message-ID: <14147.34175.950743.79464@seattle.nightmare.com>

Tim Peters writes:
 > The Scheme solution was the hardest to write, but is a largely
 > mechanical transformation of a recursive fringe-lister that
 > constructs the entire fringe in one shot.  Continuations are used
 > twice: to enable the recursive routine to resume itself where it
 > left off, and to get each leaf value back to the caller.  Getting
 > that to work required rebinding non-local identifiers in delicate
 > ways.  I doubt the intent would be clear to an average Scheme
 > programmer.

It's the only way to do it - every example I've seen of using call/cc
looks just like it.

I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
people.  The following should be equivalent:

(define (list->generator x)
  (let ((produce-value #f))

    (define (looper x)
      (cond ((null? x) 'nada)
	    ((list? x)
	     (looper (car x))
	     (looper (cdr x)))
	    (else
	     (call/cc
	      (lambda (here)
		(set! getnext (lambda () (here 'keep-going)))
		(produce-value x))))))

    (define (getnext)
      (looper x)
      (produce-value #f))

    (lambda ()
      (call/cc
       (lambda (k)
	 (set! produce-value k)
	 (getnext))))))

(define (display-fringe x)
  (let ((g (list->generator x)))
    (let loop ((elt (g)))
      (if elt
	  (begin
             (display elt)
             (display " ")
             (loop (g)))))))

(define test-case '((1 ((2 3))) (4) () (((5)) 6)))
(display-fringe test-case)

 > So what would this look like in Continuation Python?

Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
do this without having the feature to play with.  This presumes a
function "call_cc" that behaves like Scheme's.  I believe the extra
level of indirection is necessary. (i.e., call_cc takes a function as
an argument that takes a continuation function)

class list_generator:

    def __init__ (self, x):
        self.x = x
        self.k_suspend = None
        self.k_produce = None

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

    def __call__ (self):
        # call self.resume() with a continuation
        # that will return the next fringe element
        return call_cc (self.resume)

    def resume (self, k_produce):
        self.k_produce = k_produce
        if self.k_suspend:
            # resume the suspended walk
            self.k_suspend (None)
        else:
            self.walk (self.x)

    def suspend (self, k_suspend):
        self.k_suspend = k_suspend
        # return a value for __call__
        self.k_produce (self.item)

Variables holding continuations have a 'k_' prefix.  In real life it
might be possible to put the suspend/call/resume machinery in a base
class (Generator?), and override 'walk' as you please.
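
Without call_cc to play with, the base-class protocol can at least be
approximated in modern Python 3 by parking the walk in a helper thread.
This is a stand-in, not continuations: the Generator class below, its
suspend() signature and the use of None as the end-of-sequence marker are
all assumptions made for the sketch.

```python
import queue
import threading

class Generator:
    """Base class: subclasses override walk() and call self.suspend(value)."""
    _DONE = object()

    def __init__(self, x):
        self.x = x
        self._values = queue.Queue()
        self._resume = threading.Semaphore(0)
        self._thread = None
        self._finished = False

    def walk(self, x):
        raise NotImplementedError

    def suspend(self, value):
        self._values.put(value)     # hand the value to the caller ...
        self._resume.acquire()      # ... and sleep until asked for more

    def _run(self):
        self.walk(self.x)
        self._values.put(self._DONE)

    def __call__(self):
        if self._finished:
            return None
        if self._thread is None:    # first call: start the walk
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()
        else:                       # later calls: wake the suspended walk
            self._resume.release()
        value = self._values.get()
        if value is self._DONE:
            self._finished = True
            return None
        return value

class ListFringe(Generator):
    def walk(self, x):
        if isinstance(x, list):
            for item in x:
                self.walk(item)
        else:
            self.suspend(x)

g = ListFringe([[1, [[2, 3]]], [4], [], [[[5]], 6]])
leaves = []
while True:
    leaf = g()
    if leaf is None:
        break
    leaves.append(leaf)
print(leaves)
```

Only walk() is specific to the problem; the rest is the reusable part that
a real call_cc version would replace.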

-Sam




From tim_one at email.msn.com  Thu May 20 09:21:45 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 03:21:45 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <001d01bea291$6b3efbc0$179e2299@tim>

[Sam, takes up the Continuation Python Challenge]

Thanks, Sam!  I think this is very helpful.

> ...
> It's the only way to do it - every example I've seen of using call/cc
> looks just like it.

Same here -- alas <0.5 wink>.

> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
> people.  The following should be equivalent:

I confess I stopped paying attention to Scheme after R4RS, and largely
because the std decreed that *so* many forms were optional.  Your rework is
certainly nicer, but internal defines and named let are two that R4RS
refused to require, so I always avoided them.  BTW, I *am* a compiler, so
that never bothered me <wink>.

>> So what would this look like in Continuation Python?

> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
> do this without having the feature to play with.

Fully understood.  It's also really hard to implement the feature without
knowing how someone who wants it would like it to behave.  But I don't think
anyone is getting graded on this, so let's have fun <wink>.

Ack!  I have to sleep.  Will study the code in detail later, but first
impression was it looked good!  Especially nice that it appears possible to
package up most of the funky call_cc magic in a base class, so that
non-wizards could reuse it by following a simple protocol.

great-fun-to-come-up-with-one-of-these-but-i'd-hate-to-have-to-redo-
    from-scratch-every-time-ly y'rs  - tim





From skip at mojam.com  Thu May 20 15:27:59 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 09:27:59 -0400 (EDT)
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14147.34175.950743.79464@seattle.nightmare.com>
References: <50692631@toto.iv>
	<14147.34175.950743.79464@seattle.nightmare.com>
Message-ID: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>

    Sam> I reworked your Scheme a bit.  IMHO letrec is for compilers, not for
    Sam> people.

Sam, you are aware of course that the timbot *is* a compiler, right? ;-)

    >> So what would this look like in Continuation Python?

    Sam> Here's my first hack at it.  Most likely wrong.  It is REALLY HARD to
    Sam> do this without having the feature to play with.

The thought that it's unlikely one could arrive at a reasonable
approximation of a correct solution for such a small problem without the
ability to "play with" it is sort of scary.

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From tismer at appliedbiometrics.com  Thu May 20 16:10:32 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 16:10:32 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <008b01bea255$b80cf790$0801a8c0@bobcat>
Message-ID: <374417D8.8DBCB617@appliedbiometrics.com>


Mark Hammond wrote:
> 
> All this talk about stack frames and manipulating them at runtime has
> reminded me of one of my biggest gripes about Python.  When I say "biggest
> gripe", I really mean "biggest surprise" or "biggest shame".
> 
> That is, Python is very interactive and dynamic.  However, when I am
> debugging Python, it seems to lose this.  There is no way for me to
> effectively change a running program.  Now with VC6, I can do this with C.
> Although it is slow and a little dumb, I can change the C side of my Python
> world while my program is running, but not the Python side of the world.
> 
> I'm wondering how feasible it would be to change Python code _while_ running
> under the debugger.  Presumably this would require a way of recompiling the
> current block of code, patching this code back into the object, and somehow
> tricking the stack frame to use this new block of code; even if a first-cut
> had to restart the block or somesuch...
> 
> Any thoughts on this?

I'm writing a prototype of a stackless Python, which means that
you will be able to access the current state of the interpreter
completely.
The inner interpreter loop will be isolated from the frame
dispatcher. It will break whenever the ticker goes to zero.
If you set the ticker to one, you will be able to single
step on every opcode, have the value stack, the frame chain,
everything.
I think you can do a great deal with this.
But tell me if you want a callback hook somewhere.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Thu May 20 18:52:21 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Thu, 20 May 1999 18:52:21 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
Message-ID: <37443DC5.1330EAC6@appliedbiometrics.com>

Cleaning up, clarifying, trying to understand...

Tim Peters wrote:
> 
> [Christian Tismer]
> > I tried the most simple thing, and this seemed to be duplicating
> > the current state of the machine. The frame holds the stack,
> > and references to all objects.
> > By chance, the locals are not in a dict, but unpacked into
> > the frame. (Sometimes I agree with Guido, that optimization
> > is considered harmful :-)
> 
> I don't see that the locals are a problem here -- provided you simply leave
> them alone <wink>.

This depends on whether I have to duplicate frames
or not. Below...

> > The Python stack, besides its intermingledness with the machine
> > stack, is basically its chain of frames.
> 
> Right.
> 
> > The value stack pointer still hides in the machine stack, but
> > that's easy to change.
> 
> I'm not sure what "value stack" means here, or "machine stack".  The latter
> means the C stack?  Then I don't know which values you have in mind that are
> hiding in it (the locals are, as you say, unpacked in the frame, and the
> evaluation stack too).  By "evaluation stack" I mean specifically
> f->f_valuestack; the current *top* of stack pointer (specifically
> stack_pointer) lives in the C stack -- is that what we're talking about?

Exactly!

> Whichever, when we're talking about the code, let's use the names the code
> uses <wink>.

The evaluation stack pointer is a local variable in the
C stack and must be written to the frame to become independent
from the C stack. Sounds better now?

> 
> > So the real Scheme-like part is this chain, methinks, with
> > the current bytecode offset and value stack info.
> 
> Curiously, f->f_lasti is already materialized every time we make a call, in
> order to support tracing.  So if capturing a continuation is done via a
> function call (hard to see any other way it could be done <wink>), a
> bytecode offset is already getting saved in the frame object.

You got me. I'm just completing what is partially there.
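
The pieces named above can be inspected from Python itself.  A minimal
sketch, assuming present-day CPython, where sys._getframe() (a
CPython-specific helper, not something proposed in this thread) returns the
running frame; frame_chain, inner and outer are made-up names:

```python
import sys

def frame_chain():
    """Walk the chain of frames via f_back, innermost frame first."""
    chain = []
    frame = sys._getframe()          # the frame executing this function
    while frame is not None:
        # f_lasti is the bytecode offset recorded in the frame object
        chain.append((frame.f_code.co_name, frame.f_lasti))
        frame = frame.f_back         # step to the caller's frame
    return chain

def inner():
    return frame_chain()

def outer():
    return inner()

chain = outer()
names = [name for name, _ in chain]  # starts with frame_chain, inner, outer
```

The chain and the saved offsets are already there; what is not visible this
way is the top-of-stack pointer, which lives in a C local.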

> > Making a copy of this in a restartable way means to increase
> > the refcount of all objects in a frame.
> 
> You later had a vision of splitting the frame into two objects -- I think.

My wrong wording. Not splitting, but duplicating. If a frame is the
current state, I make it two frames to have two current states.
One will be saved, the other will be run. This is what I call
"splitting".
Actually, splitting must occur whenever a frame can be reached twice,
in order to keep elements alive.

> Whichever part the locals live in should not be copied at all, but merely
> have its (single) refcount increased.  The other part hinges on details of
> your approach I don't know.  The nastiest part seems to be f->f_valuestack,
> which conceptually needs to be (shallow) copied in the current frame and in
> all other frames reachable from the current frame's continuation (the chain
> rooted at f->f_back today); that's the sum total (along with the same
> frames' bytecode offsets) of capturing the control flow state.

Well, I see. You want one locals and one globals, shared by two
incarnations. Gets me into trouble.

> > Would it be correct to undo the effect of fast locals before
> > splitting, and redoing it on activation?
> 
> Unsure what splitting means, but in any case I can't conceive of a reason
> for doing anything to the locals.  Their values aren't *supposed* to get
> restored upon continuation invocation, so there's no reason to do anything
> with their values upon continuation creation either.  Right?  Or are we
> talking about different things?

Let me explain. What Python does right now is:
When a function is invoked, all local variables are copied
into fast_locals, well of course just references are copied
and counts increased. These fast locals give a lot of speed
today, we must have them.
You are saying I have to share locals between frames. Besides
that will be a reasonable slowdown, since an extra structure
must be built and accessed indirectly (right now, it's all fast,
living in the one frame buffer), I cannot say that I'm convinced
that this is what we need.

Suppose you have a function

def f(x):
    # do something
    ...
    # in some context, wanna have a snapshot
    global snapshot  # initialized to None
    if not snapshot:
        snapshot = callcc.new()
    # continue computation
    x = x+1
    ...

What I want to achieve is that I can run this again, from my
snapshot. But with shared locals, my parameter x of the
snapshot would have changed to x+1, which I don't find useful.
I want to fix a state of the current frame and still think
it should "own" its locals. Globals are borrowed, anyway.
Class instances will anyway do what you want, since
the local "self" is a mutable object.

How do you want to keep computations independent
when locals are shared? For me it's just easier to
implement and also to think with the shallow copy.
Otherwise, where is my private place?
Open for becoming convinced, of course :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From jeremy at cnri.reston.va.us  Thu May 20 21:26:30 1999
From: jeremy at cnri.reston.va.us (Jeremy Hylton)
Date: Thu, 20 May 1999 15:26:30 -0400 (EDT)
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
References: <000901bea26a$34526240$179e2299@tim>
	<37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:

  CT> What I want to achieve is that I can run this again, from my
  CT> snapshot. But with shared locals, my parameter x of the snapshot
  CT> would have changed to x+1, which I don't find useful.  I want to
  CT> fix a state of the current frame and still think it should "own"
  CT> its locals. Globals are borrowed, anyway.  Class instances will
  CT> anyway do what you want, since the local "self" is a mutable
  CT> object.

  CT> How do you want to keep computations independent when locals are
  CT> shared? For me it's just easier to implement and also to think
  CT> with the shallow copy.  Otherwise, where is my private place?
  CT> Open for becoming convinced, of course :-)

I think you're making things a lot more complicated by trying to
instantiate new variable bindings for locals every time you create a
continuation.  Can you give an example of why that would be helpful?
(Ok.  I'm not sure I can offer a good example of why it would be
helpful to share them, but it makes intuitive sense to me.)

The call_cc mechanism is going to let you capture the current
continuation, save it somewhere, and call on it again as often as you
like.  Would you get a fresh locals each time you used it?  or just
the first time?  If only the first time, it doesn't seem that you've
gained a whole lot.

Also, all the locals that are references to mutable objects are
already effectively shared.  So it's only a few oddballs like ints
that are an issue.
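
That point can be mimicked with a plain dict standing in for a frame's
locals.  A sketch, assuming a shallow copy is what duplicating a frame would
do to its locals; frame_locals and snapshot are illustrative names only:

```python
import copy

frame_locals = {"x": 1, "items": [1, 2]}    # stand-in for one frame's locals
snapshot = copy.copy(frame_locals)          # shallow copy, as for a duplicated frame

frame_locals["x"] = frame_locals["x"] + 1   # rebinding an int: private to each copy
frame_locals["items"].append(3)             # mutating a list: seen through both copies

x_in_snapshot = snapshot["x"]               # still the old int
items_in_snapshot = snapshot["items"]       # the same list object, now longer
```

Only rebinding a name distinguishes the two models; anything reached through
a mutable object behaves as shared either way.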

Jeremy



From tim_one at email.msn.com  Fri May 21 00:04:04 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:04 -0400
Subject: [Python-Dev] A "real" continuation example
In-Reply-To: <14148.3389.962368.221063@cm-29-94-2.nycap.rr.com>
Message-ID: <000601bea30c$ad51b220$9d9e2299@tim>

[Tim]
> So what would this look like in Continuation Python?

[Sam]
> Here's my first hack at it.  Most likely wrong.  It is
> REALLY HARD to do this without having the feature to play with.

[Skip]
> The thought that it's unlikely one could arrive at a reasonable
> approximation of a correct solution for such a small problem without the
> ability to "play with" it is sort of scary.

Yes it is.  But while the problem is small, it's not easy, and only the Icon
solution wrote itself (not a surprise -- Icon was designed for expressing
this kind of algorithm, and the entire language is actually warped towards
it).  My first stab at the Python stack-fiddling solution had bugs too, but
I conveniently didn't post that <wink>.

After studying Sam's code, I expect it *would* work as written, so it's a
decent bet that it's a reasonable approximation to a correct solution as-is.

A different Python approach using threads can be built using

    Demo/threads/Generator.py

from the source distribution.  To make that a fair comparison, I would have
to post the supporting machinery from Generator.py too -- and we can ask
Guido whether Generator.py worked right the first time he tried it <wink>.

The continuation solution is subtle, requiring real expertise; but the
threads solution doesn't fare any better on that count (building the support
machinery with threads is also a baffler if you don't have thread
expertise).  If we threw Python metaclasses into the pot too, they'd be a
third kind of nightmare for the non-expert.

So, if you're faced with this kind of task, there's simply no easy way to
get it done.  Thread- and (it appears) continuation-based machinery can be
crafted once by an expert, then packaged into an easy-to-use protocol for
non-experts.

All in all, I view continuations as a feature most people should actively
avoid!  I think it has that status in Scheme too (e.g., the famed Schemer's
SICP textbook doesn't even mention call/cc).  Its real value (if any <wink>)
is as a Big Invisible Hammer for certified wizards.  Where call_cc leaks
into the user's view of the world I'd try to hide it; e.g., where Sam has

    def walk (self, x):
        if type(x) == type([]):
            for item in x:
                self.walk (item)
        else:
            self.item = x
            # call self.suspend() with a continuation
            # that will continue walking the tree
            call_cc (self.suspend)

I'd do

    def walk(self, x):
        if type(x) == type([]):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

where "put" is inherited from the base class (part of the protocol) and
hides the call_cc business.  Do enough of this, and we'll rediscover why
Scheme demands that tail calls not push a new stack frame <0.9 wink>.
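
A minimal sketch of such a protocol, assuming a thread-and-queue stand-in
for the call_cc machinery (this is not Sam's code and not Generator.py; the
names Walker, ListWalker and _DONE are hypothetical):

```python
import queue
import threading

class Walker:
    """Base class: hides the suspend/resume machinery behind put()."""
    _DONE = object()                     # sentinel marking the end of the walk

    def __init__(self, tree):
        self._queue = queue.Queue(maxsize=1)
        worker = threading.Thread(target=self._run, args=(tree,), daemon=True)
        worker.start()

    def _run(self, tree):
        self.walk(tree)                  # supplied by the subclass
        self._queue.put(self._DONE)

    def put(self, value):
        # Blocks until the consumer has taken the previous value,
        # which is what "suspends" the walk.
        self._queue.put(value)

    def __iter__(self):
        while True:
            value = self._queue.get()
            if value is self._DONE:
                return
            yield value

class ListWalker(Walker):
    def walk(self, x):
        if isinstance(x, list):
            for item in x:
                self.walk(item)
        else:
            self.put(x)

leaves = list(ListWalker([1, [2, [3, 4]], 5]))
```

The subclass contains nothing but the walk; everything an expert had to get
right sits in the base class.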

the-tradeoffs-are-murky-ly y'rs  - tim





From tim_one at email.msn.com  Fri May 21 00:04:09 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Thu, 20 May 1999 18:04:09 -0400
Subject: [Python-Dev] 'stackless' python?
In-Reply-To: <37443DC5.1330EAC6@appliedbiometrics.com>
Message-ID: <000701bea30c$af7a1060$9d9e2299@tim>

[Christian]
[... clarified stuff ... thanks! ... much clearer ...]
> ...
> If a frame is the current state, I make it two frames to have two
> current states.  One will be saved, the other will be run. This is
> what I call "splitting".  Actually, splitting must occur whenever
> a frame can be reached twice, in order to keep elements alive.

That part doesn't compute:  if a frame can be reached by more than one path,
its refcount must be at least equal to the number of its immediate
predecessors, and its refcount won't fall to 0 before it becomes
unreachable.  So while you may need to split stuff for *some* reasons, I
can't see how keeping elements alive could be one of those reasons (unless
you're zapping frame contents *before* the frame itself is garbage?).

> ...
> Well, I see. You want one locals and one globals, shared by two
> incarnations. Gets me into trouble.

Just clarifying what Scheme does.  Since they've been doing this forever, I
don't want to toss their semantics on a whim <wink>.  It's at least a
conceptual thing:  why *should* locals follow different rules than globals?
If Python2 grows lexical closures, the only thing special about today's
"locals" is that they happen to be the first guys found on the search path.
Conceptually, that's really all they are today too.

Here's the clearest Scheme example I can dream up:

(define k #f)

(define (printi i)
  (display "i is ") (display i) (newline))

(define (test n)
  (let ((i n))
    (printi i)
    (set! i (- i 1))
    (printi i)
    (display "saving continuation") (newline)
    (call/cc (lambda (here) (set! k here)))
    (set! i (- i 1))
    (printi i)
    (set! i (- i 1))
    (printi i)))

No loops, no recursive calls, just a straight chain of fiddle-a-local ops.
Here's some output:

> (test 5)
i is 5
i is 4
saving continuation
i is 3
i is 2
> (k #f)
i is 1
i is 0
> (k #f)
i is -1
i is -2
> (k #f)
i is -3
i is -4
>

So there's no question about what Scheme thinks is proper behavior here.

> ...
> Let me explain. What Python does right now is:
> When a function is invoked, all local variables are copied
> into fast_locals, well of course just references are copied
> and counts increased. These fast locals give a lot of speed
> today, we must have them.

Scheme (most of 'em, anyway) also resolves locals via straight base + offset
indexing.

> You are saying I have to share locals between frames. Besides
> that will be a reasonable slowdown, since an extra structure
> must be built and accessed indirectly (right now, it's all fast,
> living in the one frame buffer),

GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't
care where that points *to* <wink> (but, really, it could point into some
other frame and ceval2 wouldn't know the difference).  Maybe a frame entered
due to continuation needs extra setup work?  Scheme saves itself by putting
name-resolution and continuation info into different structures; to mimic
the semantics, Python would need to get the same end effect.

> I cannot say that I'm convinced that this is what we need.
>
> Suppose you have a function
>
> def f(x):
>     # do something
>     ...
>     # in some context, wanna have a snapshot
>     global snapshot  # initialized to None
>     if not snapshot:
>         snapshot = callcc.new()
>     # continue computation
>     x = x+1
>     ...
>
> What I want to achieve is that I can run this again, from my
> snapshot. But with shared locals, my parameter x of the
> snapshot would have changed to x+1, which I don't find useful.

You need a completely fleshed-out example to score points here:  the use of
call/cc is subtle, hinging on details, and fragments ignore too much.  If
you do want the same x,

    commonx = x
    if not snapshot:
         # get the continuation
    # continue computation
    x = commonx
    x = x+1
    ...

That is, it's easy to get it.  But if you *do* want to see changes to the
locals (which is one way for those distinct continuation invocations to
*cooperate* in solving a task -- see below), but the implementation doesn't
allow for it, I don't know what you can do to worm around it short of making
x global too.  But then different *top* level invocations of f will stomp on
that shared global, so that's not a solution either.  Maybe forget functions
entirely and make everything a class method.

> I want to fix a state of the current frame and still think
> it should "own" its locals. Globals are borrowed, anyway.
> Class instances will anyway do what you want, since
> the local "self" is a mutable object.
>
> How do you want to keep computations independent
> when locals are shared? For me it's just easier to
> implement and also to think with the shallow copy.
> Otherwise, where is my private place?
> Open for becoming convinced, of course :-)

I imagine it comes up less often in Scheme because it has no loops:
communication among "iterations" is via function arguments or up-level
lexical vrbls.

So recall your uses of Icon generators instead:  like Python, Icon does have
loops, and two-level scoping, and I routinely build loopy Icon generators
that keep state in locals.  Here's a dirt-simple example I emailed to Sam
earlier this week:

procedure main()
    every result := fib(0, 1) \ 10 do
        write(result)
end

procedure fib(i, j)
    local temp
    repeat {
        suspend i
        temp := i + j
        i := j
        j := temp
    }
end

which prints

0
1
1
2
3
5
8
13
21
34

If Icon restored the locals (i, j, temp) upon each fib resumption, it would
generate a zero followed by an infinite sequence of ones(!).
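
For comparison, a sketch of the same generator in present-day Python, whose
generator functions (a later addition to the language, assumed here) keep
their locals alive across each suspension just as Icon does:

```python
from itertools import islice

def fib(i, j):
    # i and j live in the suspended frame and are NOT reset on resumption
    while True:
        yield i
        i, j = j, i + j

first_ten = list(islice(fib(0, 1), 10))
# first_ten == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```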

Think of a continuation as a *paused* computation (which it is) rather than
an *independent* one (which it isn't <wink>), and I think it gets darned
hard to argue.

theory-and-practice-agree-here-in-my-experience-ly y'rs  - tim





From MHammond at skippinet.com.au  Fri May 21 01:01:22 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 09:01:22 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <374417D8.8DBCB617@appliedbiometrics.com>
Message-ID: <00c001bea314$aefc5b40$0801a8c0@bobcat>

> I'm writing a prototype of a stackless Python, which means that
> you will be able to access the current state of the interpreter
> completely.
> The inner interpreter loop will be isolated from the frame
> dispatcher. It will break whenever the ticker goes to zero.
> If you set the ticker to one, you will be able to single
> step on every opcode, have the value stack, the frame chain,
> everything.

I think the main point is how to change code when a Python frame already
references it.  I don't think the structure of the frames is as important as
the general concept.  But while we were talking frame-fiddling it seemed a
good point to try and hijack it a little :-)

Would it be possible to recompile just a block of code (eg, just the
current function or method) and patch it back in such a way that the
current frame continues execution of the new code?

I feel this is somewhat related to the inability to change class
implementation for an existing instance.  I know there have been hacks
around this before but they aren't completely reliable and IMO it would be
nice if the core Python made it easier to change already running code -
whether that code is in an existing stack frame, or just in an already
created instance, it is very difficult to do.

This has come to try and deflect some conversation away from changing
Python as such towards an attempt at enhancing its _environment_.  To
paraphrase many people before me, even if we completely froze the language
now there would still be plenty of work ahead of us :-)

Mark.




From guido at CNRI.Reston.VA.US  Fri May 21 02:06:51 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 20:06:51 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 09:01:22 +1000."
             <00c001bea314$aefc5b40$0801a8c0@bobcat> 
References: <00c001bea314$aefc5b40$0801a8c0@bobcat> 
Message-ID: <199905210006.UAA07900@eric.cnri.reston.va.us>

> I think the main point is how to change code when a Python frame already
> references it.  I don't think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

This topic sounds mostly unrelated to the stackless discussion -- in
either case you need to be able to fiddle the contents of the frame
and the bytecode pointer to reflect the changed function.

Some issues:

  - The slots containing local variables may be renumbered after
    recompilation; fortunately we know the name--number mapping so we can
    move them to their new location.  But it is still tricky.

  - Should you be able to edit functions that are present on the call
    stack below the top?  Suppose we have two functions:

	def f():
	    return 1 + g()

	def g():
	    return 0

    Suppose we set a break in g(), and then edit the source of f().  We can
    do all sorts of evil to f(): e.g. we could change it to

	    return g() + 2

    which affects the contents of the value stack when g() returns
    (originally, the value stack contained the value 1, now it is empty).
    Or we could even change f() to

	    return 3

    thereby eliminating the call to g() altogether!
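
The name--number mapping from the first issue can be read off the code
objects.  A rough sketch, assuming present-day CPython, where the slot order
is exposed as __code__.co_varnames; before, after and remap are illustrative
names:

```python
def before(a, b):
    total = a + b
    return total

def after(a, b):
    scale = 2                    # a new local pushes 'total' to a later slot
    total = (a + b) * scale
    return total

def slots(func):
    """Map local variable name -> slot number for one function."""
    return {name: index for index, name in enumerate(func.__code__.co_varnames)}

old_slots, new_slots = slots(before), slots(after)

# For each old slot, where does the same name live after recompilation?
remap = {old_slots[name]: new_slots[name]
         for name in old_slots if name in new_slots}
```

Here 'total' moves from slot 2 to slot 3; a local that vanished from the new
version simply has no entry in remap, which is part of why this stays tricky.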

What kind of limitations do other systems that support modifying a
"live" program being debugged impose?  Only allowing modification of
the function at the top of the stack might eliminate some problems,
although there are still ways to mess up.  The value stack is not 
always empty even when we only stop at statement boundaries -- e.g. it 
contains 'for' loop indices, and there's also the 'block' stack, which 
contains try-except information.  E.g. what should happen if we change

    def f():
        for i in range(10):
            print 1

stopped at the 'print 1' into

    def f():
        print 1

???

(Ditto for removing or adding a try/except block.)

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I've been thinking a bit about this.  Function objects now have
mutable func_code attributes (and also func_defaults), I think we can
use this.
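
A small sketch of what that mutability buys, assuming present-day Python,
where the attribute is spelled __code__ rather than func_code; f, replacement
and alias are illustrative names:

```python
def f():
    return "old"

def replacement():
    return "new"

alias = f                            # some other reference to the same function
f.__code__ = replacement.__code__    # swap the code, keep the function object

result = alias()                     # existing references see the new behavior
```

Because the function object itself is unchanged, nothing that already holds
a reference to it needs to be found and patched.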

The hard part is to do the analysis needed to decide which functions
to recompile!  Ideally, we would simply edit a file and tell the
programming environment "recompile this".  The programming environment
would compare the changed file with the old version that it had saved
for this purpose, and notice (for example) that we changed two methods
of class C.  It would then recompile those methods only and stuff the
new code objects in the corresponding function objects.
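
The comparison step might look roughly like this.  A sketch under stated
assumptions: it uses the present-day ast module, looks only at top-level
functions and at methods one class deep, and the names function_table and
changed_functions are hypothetical:

```python
import ast

def function_table(source):
    """Map qualified name -> structural dump for functions and methods."""
    table = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            table[node.name] = ast.dump(node)
        elif isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    table[node.name + "." + item.name] = ast.dump(item)
    return table

def changed_functions(old_source, new_source):
    """Names of functions that are new or whose definition differs."""
    old, new = function_table(old_source), function_table(new_source)
    return sorted(name for name in new if old.get(name) != new[name])

OLD = """
class C:
    def a(self):
        return 1
    def b(self):
        return 2
    def c(self):
        return 3
"""

NEW = """
class C:
    def a(self):
        return 10
    def b(self):
        return 2
    def c(self):
        return 30
"""

changed = changed_functions(OLD, NEW)
```

ast.dump leaves out line numbers by default, so a method that merely moved
within the file is not reported as changed.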

But what would it do when we changed a global variable?  Say a module
originally contains a statement "x = 0".  Now we change the source
code to say "x = 100".  Should we change the variable x?  Suppose that
x is modified by some of the computations in the module, and that,
after some computations, the actual value of x was 50.  Should the
"recompile" reset x to 100 or leave it alone?

One option would be to actually change the semantics of the class and
def statements so that they modify an existing class or function
rather than using assignment.  Effectively, this proposal would change
the semantics of

    class A:
        ...some code...

    class A:
        ...some more code...

to be the same as

    class A:
        ...some code...
        ...some more code...
        
This is somewhat similar to the way the module or package commands in
some other dynamic languages work, I think; and I don't think this
would break too much existing code.

The proposal would also change

    def f():
        ...some code...

    def f():
        ...other code...

but here the equivalence is not so easy to express, since I want
different semantics (I don't want the second f's code to be tacked
onto the end of the first f's code).

If we understand that def f(): ... really does the following:

    f = NewFunctionObject()
    f.func_code = ...code object...

then the construct above (def f():... def f(): ...) would do this:

    f = NewFunctionObject()
    f.func_code = ...some code...

    f.func_code = ...other code...

i.e. there is no assignment of a new function object for the second
def.

Of course if there is a variable f but it is not a function, it would
have to be assigned a new function object first.

But in the case of def, this *does* break existing code.  E.g.

# module A
from B import f
.
.
.
if ...some test...:
    def f(): ...some code...

This idiom conditionally redefines a function that was also imported
from some other module.  The proposed new semantics would change B.f
in place!
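
The hazard can be reproduced today by doing the in-place update by hand.  A
sketch, assuming present-day Python (__code__ for func_code) and a module
object built on the fly in place of a real B.py; _new is a made-up name:

```python
import types

# Stand-in for "module B" defining f
B = types.ModuleType("B")
exec("def f():\n    return 'B.f'\n", B.__dict__)

f = B.f                          # what "from B import f" does in module A

def _new():
    return "A's f"

# Current semantics: "def f" in module A rebinds A's name only.
f_rebound = _new
unaffected = B.f()               # B still has its own f

# Proposed semantics: "def f" updates the existing function object in place.
f.__code__ = _new.__code__
clobbered = B.f()                # module B's function has changed too
```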

So perhaps these new semantics should only be invoked when a special
"reload-compile" is asked for...  Or perhaps the programming
environment could do this through source parsing as I proposed
before...

> This has come to try and deflect some conversation away from changing
> Python as such towards an attempt at enhancing its _environment_.  To
> paraphrase many people before me, even if we completely froze the language
> now there would still be plenty of work ahead of us :-)

Please, no more posts about Scheme.  Each new post mentioning call/cc
makes it *less* likely that something like that will ever be part of
Python.  "What if Guido's brain exploded?" :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Fri May 21 03:13:28 1999
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 20 May 1999 21:13:28 -0400 (EDT)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
	<199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <14148.45321.204380.19130@cm-29-94-2.nycap.rr.com>

    Guido> What kind of limitations do other systems that support modifying
    Guido> a "live" program being debugged impose?  Only allowing
    Guido> modification of the function at the top of the stack might
    Guido> eliminate some problems, although there are still ways to mess
    Guido> up.

Frame objects maintain pointers to the active code objects, locals and
globals, so modifying a function object's code or globals shouldn't have any
effect on currently executing frames, right?  I assume frame objects do the
usual INCREF/DECREF dance, so the old code object won't get deleted before
the frame object is tossed.
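
A quick way to check that assumption, sketched for present-day CPython
(__code__ in place of func_code; f and g are illustrative names):

```python
def f():
    # Swap f's code while f's own frame is still executing.
    f.__code__ = g.__code__
    return "old frame finished"

def g():
    return "new code"

first = f()     # the running frame keeps its reference to the old code object
second = f()    # a fresh call picks up the replaced code
```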

    Guido> But what would it do when we changed a global variable?  Say a
    Guido> module originally contains a statement "x = 0".  Now we change
    Guido> the source code to say "x = 100".  Should we change the variable
    Guido> x?  Suppose that x is modified by some of the computations in the
    Guido> module, and that, after some computations, the actual value
    Guido> of x was 50.  Should the "recompile" reset x to 100 or leave it
    Guido> alone?

I think you should note the change for users and give them some way to
easily pick between old initial value, new initial value or current value.

    Guido> Please, no more posts about Scheme.  Each new post mentioning
    Guido> call/cc makes it *less* likely that something like that will ever
    Guido> be part of Python.  "What if Guido's brain exploded?" :-)

I agree.  I see call/cc or set! and my eyes just glaze over...

Skip Montanaro	| Mojam: "Uniting the World of Music" http://www.mojam.com/
skip at mojam.com  | Musi-Cal: http://www.musi-cal.com/
518-372-5583



From MHammond at skippinet.com.au  Fri May 21 03:42:14 1999
From: MHammond at skippinet.com.au (Mark Hammond)
Date: Fri, 21 May 1999 11:42:14 +1000
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <00c501bea32b$277ce3d0$0801a8c0@bobcat>

[Guido writes...]
> This topic sounds mostly unrelated to the stackless discussion -- in

Sure is - I just saw that as an excuse to try and hijack it <wink>

> Some issues:
>
>   - The slots containing local variables may be renumbered after

Generally, I think we could make something very useful even with a number
of limitations.  For example, I would find a first cut completely
acceptable and a great improvement on today if:

* Only the function at the top of the stack can be recompiled and have the
code reflected while executing.  This function also must be restarted after
such an edit.  If the function uses global variables or makes calls that
restarting will screw-up, then either a) make the code changes _before_
doing this stuff, or b) live with it for now, and help us remove the
limitation :-)

That may make the locals being renumbered easier to deal with, and also
remove some of the problems you discussed about editing functions below the
top.

> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?  Only allowing modification of

I can only speak for VC, and from experience at that - I haven't attempted
to find documentation on it.

It accepts most changes while running.  The current line is fine.  If you
create or change the definition of globals (and possibly even the type of
locals?), the "incremental compilation" fails, and you are given the option
of continuing with the old code, or stopping the process and doing a full
build.

When the debug session terminates, some link process (and maybe even
compilation?) is done to bring the .exe on disk up to date with the
changes.

If you do weird stuff like delete the line being executed, it usually gives
you some warning message before either restarting the function or trying to
pick a line somewhere near the line you deleted.  Either way, it can screw
up, moving the "current" line somewhere else - it doesn't crash the
debugger, but may not do exactly what you expected.  It is still a _huge_
win, and a great feature!

Ironically, I turn this feature _off_ for Python extensions.  Although
changing the C code is great, in 99% of the cases I also need to change
some .py code, and as existing instances are affected I need to restart the
app anyway - so I may as well do a normal build at that time.  ie, C now
lets me debug incrementally, but a far more dynamic language prevents this
feature being useful ;-)

> the function at the top of the stack might eliminate some problems,
> although there are still ways to mess up.  The value stack is not
> always empty even when we only stop at statement boundaries

If we forced a restart would this be better?  Can we reliably reset the
stack to the start of the current function?

> I've been thinking a bit about this.  Function objects now have
> mutable func_code attributes (and also func_defaults), I think we can
> use this.
>
> The hard part is to do the analysis needed to decide which functions
> to recompile!  Ideally, we would simply edit a file and tell the
> programming environment "recompile this".  The programming environment
> would compare the changed file with the old version that it had saved
> for this purpose, and notice (for example) that we changed two methods
> of class C.  It would then recompile those methods only and stuff the
> new code objects in the corresponding function objects.

If this would work for the few changed functions/methods, what would the
impact be of doing it for _every_ function (changed or not)?  Then the
analysis can drop to the module level which is much easier.  I don't think a
slight performance hit is a problem at all when doing this stuff.
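
A minimal sketch of the "recompile the whole module and stuff the new code
objects into the existing function objects" idea, written for a current
Python where func_code and func_defaults are spelled __code__ and
__defaults__.  The helper name reload_functions and the dict standing in for
a module namespace are assumptions made for illustration; closures, classes
and methods are deliberately ignored.

```python
import types

def reload_functions(namespace, new_source, filename="<edited>"):
    """Hypothetical helper: compile new_source and copy each new function's
    code into the same-named function already living in namespace."""
    scratch = {}
    exec(compile(new_source, filename, "exec"), scratch)
    updated = []
    for name, new_obj in scratch.items():
        old_obj = namespace.get(name)
        if (isinstance(new_obj, types.FunctionType)
                and isinstance(old_obj, types.FunctionType)):
            old_obj.__code__ = new_obj.__code__          # func_code in 1.5.2
            old_obj.__defaults__ = new_obj.__defaults__  # func_defaults in 1.5.2
            updated.append(name)
    return sorted(updated)

def greet(name):
    return "hello " + name

saved_reference = greet                 # e.g. a callback someone already holds
module_namespace = {"greet": greet}     # stand-in for a module's __dict__
names = reload_functions(
    module_namespace,
    "def greet(name):\n    return 'goodbye ' + name\n",
)
```

Because the function object itself is kept and only its code is swapped,
every existing reference (saved_reference here) picks up the new behaviour on
its next call.  A call that is already running keeps executing the old code
object.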

> One option would be to actually change the semantics of the class and
> def statements so that they modify an existing class or function
> rather than using assignment.  Effectively, this proposal would change
> the semantics of
>
>     class A:
>         ...some code...
>
>     class A:
>         ...some more code...
>
> to be the same as
>
>     class A:
>         ...more code...
>         ...some more code...

Or extending this (didn't this come up at the latest IPC?)
# .\package\__init__.py
class BigMutha:
  pass

# .\package\something.py
class package.BigMutha:
  def some_category_of_methods():
    ...

# .\package\other.py
class package.BigMutha:
  def other_category_of_methods():
    ...
[Of course, this won't fly as it stands; just a conceptual possibility]
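
Without any new syntax, much the same effect can be approximated by attaching
plain functions to the existing class object.  This is only a sketch: the
helper add_methods is a made-up name, and everything is shown in one file
where the proposal above would spread it over several modules of a package.

```python
class BigMutha:
    pass

def add_methods(cls, *functions):
    """Hypothetical helper: bind each function to cls under its own name."""
    for func in functions:
        setattr(cls, func.__name__, func)
    return cls

# These would live in package/something.py in the layout sketched above.
def some_category_of_methods(self):
    return "some"

# ... and these in package/other.py.
def other_category_of_methods(self):
    return "other"

add_methods(BigMutha, some_category_of_methods, other_category_of_methods)
instance = BigMutha()
```

Instances created before add_methods runs see the new methods too, since
method lookup goes through the class at call time.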

> So perhaps these new semantics should only be invoked when a special
> "reload-compile" is asked for...  Or perhaps the programming
> environment could do this through source parsing as I proposed
> before...


From guido at CNRI.Reston.VA.US  Fri May 21 05:02:49 1999
From: guido at CNRI.Reston.VA.US (Guido van Rossum)
Date: Thu, 20 May 1999 23:02:49 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: Your message of "Fri, 21 May 1999 11:42:14 +1000."
             <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
References: <00c501bea32b$277ce3d0$0801a8c0@bobcat> 
Message-ID: <199905210302.XAA08129@eric.cnri.reston.va.us>

> Generally, I think we could make something very useful even with a number
> of limitations.  For example, I would find a first cut completely
> acceptable and a great improvement on today if:
> 
> * Only the function at the top of the stack can be recompiled and have the
> code reflected while executing.  This function also must be restarted after
> such an edit.  If the function uses global variables or makes calls that
> restarting will screw-up, then either a) make the code changes _before_
> doing this stuff, or b) live with it for now, and help us remove the
> limitation :-)

OK, restarting the function seems a reasonable compromise and would
seem relatively easy to implement.  Not *real* easy though: it turns
out that eval_code2() is called with a code object as argument, and
it's not entirely trivial to figure out the corresponding function
object from which to grab the new code object.  But it could be done
-- give it a try.  (Don't wait for me, I'm ducking for cover until at
least mid June.)

> Ironically, I turn this feature _off_ for Python extensions.  Although
> changing the C code is great, in 99% of the cases I also need to change
> some .py code, and as existing instances are affected I need to restart the
> app anyway - so I may as well do a normal build at that time.  ie, C now
> lets me debug incrementally, but a far more dynamic language prevents this
> feature being useful ;-)

I hear you.

> If we forced a restart would this be better?  Can we reliably reset the
> stack to the start of the current function?

Yes, no problem.

> If this would work for the few changed functions/methods, what would the
> impact be of doing it for _every_ function (changed or not)?  Then the
> analysis can drop to the module level which is much easier.  I don't think a
> slight performance hit is a problem at all when doing this stuff.

Yes, this would be fine too.

> >"What if Guido's brain exploded?" :-)
> 
> At least on that particular topic I didn't even consider I was the only one
> in fear of that!  But it is good to know that you specifically are too :-)

Have no fear.  I've learned to say no. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Fri May 21 07:36:44 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Fri, 21 May 1999 01:36:44 -0400
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <199905210006.UAA07900@eric.cnri.reston.va.us>
Message-ID: <000401bea34b$e93fcda0$d89e2299@tim>

[GvR]
> ...
> What kind of limitations do other systems that support modifying a
> "live" program being debugged impose?

As an ex-compiler guy, I should have something wise to say about that.
Alas, I've never used a system that allowed more than poking new values into
vrbls, and the thought of any more than that makes me vaguely ill!  Oh,
that's right -- I'm vaguely ill anyway today.  Still-- oooooh -- the
problems.

This later got reduced to restarting the topmost function from scratch.
That has some attraction, especially on the bang-for-buck-o-meter.

> ...
> Please, no more posts about Scheme.  Each new post mentioning call/cc
> makes it *less* likely that something like that will ever be part of
> Python.  "What if Guido's brain exploded?" :-)

What a pussy <wink>.  Really, overall continuations are much less trouble to
understand than threads -- there's only one function in the entire
interface!

OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
way to get pseudo-parallel semantics regardless of OS.

changing-code-on-the-fly-==-mutating-the-current-continuation-ly y'rs  - tim





From tismer at appliedbiometrics.com  Fri May 21 09:12:05 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:12:05 +0200
Subject: [Python-Dev] Interactive Debugging of Python
References: <00c001bea314$aefc5b40$0801a8c0@bobcat>
Message-ID: <37450745.21D63A5@appliedbiometrics.com>


Mark Hammond wrote:
> 
> > I'm writing a prototype of a stackless Python, which means that
> > you will be able to access the current state of the interpreter
> > completely.
> > The inner interpreter loop will be isolated from the frame
> > dispatcher. It will break whenever the ticker goes zero.
> > If you set the ticker to one, you will be able to single
> > step on every opcode, have the value stack, the frame chain,
> > everything.
> 
> I think the main point is how to change code when a Python frame already
> references it.  I don't think the structure of the frames is as important as
> the general concept.  But while we were talking frame-fiddling it seemed a
> good point to try and hijack it a little :-)
> 
> Would it be possible to recompile just a block of code (eg, just the
> current function or method) and patch it back in such a way that the
> current frame continues execution of the new code?

Sure. Since the frame holds a pointer to the code, and the current
IP and SP, your code can easily change it (with care, or GPF:) .
It could even create a fresh code object and let it run only
for the running instance. By instance, I mean a frame which is
running a code object.
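
For orientation, here is what a frame exposes to Python code in a current
CPython.  This is a sketch using sys._getframe, a CPython-specific function
that did not exist in 1.5.2; the helper describe_frame is a made-up name.
The code object and the instruction offset are readable, while the value
stack is not visible from Python at all.

```python
import sys

def describe_frame(frame):
    """Return the pieces of interpreter state a frame object exposes."""
    return {
        "function": frame.f_code.co_name,   # the code object being run
        "offset": frame.f_lasti,            # position of the last instruction
        "line": frame.f_lineno,
        "caller": frame.f_back.f_code.co_name if frame.f_back else None,
    }

def inner():
    return describe_frame(sys._getframe())

def outer():
    return inner()

info = outer()
```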

> I feel this is somewhat related to the inability to change class
> implementation for an existing instance.  I know there have been hacks
> around this before but they aren't completely reliable and IMO it would be
> nice if the core Python made it easier to change already running code -
> whether that code is in an existing stack frame, or just in an already
> created instance, it is very difficult to do.

I think this has been difficult only because information was hidden
in the inner interpreter loop. That is going to change now.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Fri May 21 09:21:22 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 09:21:22 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000901bea26a$34526240$179e2299@tim>
		<37443DC5.1330EAC6@appliedbiometrics.com> <14148.21750.738559.424456@bitdiddle.cnri.reston.va.us>
Message-ID: <37450972.D19E160@appliedbiometrics.com>


Jeremy Hylton wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer at appliedbiometrics.com> writes:
> 
>   CT> What I want to achieve is that I can run this again, from my
>   CT> snapshot. But with shared locals, my parameter x of the snapshot
>   CT> would have changed to x+1, which I don't find useful.  I want to
>   CT> fix a state of the current frame and still think it should "own"
>   CT> its locals. Globals are borrowed, anyway.  Class instances will
>   CT> anyway do what you want, since the local "self" is a mutable
>   CT> object.
> 
>   CT> How do you want to keep computations independent when locals are
>   CT> shared? For me it's just easier to implement and also to think
>   CT> with the shallow copy.  Otherwise, where is my private place?
>   CT> Open for becoming convinced, of course :-)
> 
> I think you're making things a lot more complicated by trying to
> instantiate new variable bindings for locals every time you create a
> continuation.  Can you give an example of why that would be helpful?

I'm not sure whether you all understand me, and vice versa.
There is no copying at all, but for the frame.
I copy the frame, which means I also incref all the
objects which it holds. Done. This is the bare minimum
which I must do.

> (Ok.  I'm not sure I can offer a good example of why it would be
> helpful to share them, but it makes intuitive sense to me.)
> 
> The call_cc mechanism is going to let you capture the current
> continuation, save it somewhere, and call on it again as often as you
> like.  Would you get a fresh locals each time you used it?  or just
> the first time?  If only the first time, it doesn't seem that you've
> gained a whole lot.

call_cc does a copy of the state which is the frame. This is
stored away until it is revived. Nothing else happens.
As Guido pointed out, virtually the whole frame chain is
duplicated, but only on demand.

> Also, all the locals that are references to mutable objects are
> already effectively shared.  So it's only a few oddballs like ints
> that are an issue.

Simply look at a frame, what it is. What do you need to do to
run it again with a given state. You have to preserve the stack
variables. And you have to preserve the current locals, since
some of them might even have a copy on the stack, and we want
to stay consistent.
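
The difference between the two positions can be shown with an ordinary
shallow copy.  This is only an analogy: a dict stands in for a frame's
locals, and copy.copy stands in for "copy the frame and incref what it
holds".

```python
import copy

running_locals = {"x": 1, "items": ["a"]}
snapshot_locals = copy.copy(running_locals)    # new dict, same objects inside

running_locals["x"] = running_locals["x"] + 1  # rebinding: snapshot unaffected
running_locals["items"].append("b")            # mutation: visible through both
```

After this, the snapshot still has x == 1, but both dicts refer to the same
list, so only rebinding of names (the "oddballs like ints" case) behaves
differently between a copied and a shared set of locals.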

I believe it would become obvious if you tried to implement it.
Maybe I should close my ears and get something ready to show?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Fri May 21 11:00:26 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Fri, 21 May 1999 11:00:26 +0200
Subject: [Python-Dev] 'stackless' python?
References: <000701bea30c$af7a1060$9d9e2299@tim>
Message-ID: <374520AA.2ADEA687@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Christian]
> [... clarified stuff ... thanks! ... much clearer ...]

But still not clear enough, I fear.

> > ...
> > If a frame is the current state, I make it two frames to have two
> > current states.  One will be saved, the other will be run. This is
> > what I call "splitting".  Actually, splitting must occur whenever
> > a frame can be reached twice, in order to keep elements alive.
> 
> That part doesn't compute:  if a frame can be reached by more than one path,
> its refcount must be at least equal to the number of its immediate
> predecessors, and its refcount won't fall to 0 before it becomes
> unreachable.  So while you may need to split stuff for *some* reasons, I
> can't see how keeping elements alive could be one of those reasons (unless
> you're zapping frame contents *before* the frame itself is garbage?).

I was saying that under the side condition that I don't want to
change frames as they are now. Maybe that's a misconception, but
this is what I did:

If a frame as we have it today shall be resumed twice, then
it has to be copied, since:
The stack is in it and has some state which will change
after resuming.

That was the whole problem with my first prototype, which
was done hoping that I don't need to change the interpreter
at all. Wrong, bad, however.

What I actually did was more than seems to be needed:
I made a copy of the whole current frame chain. Later on,
Guido said this can be done on demand. He's right.

[Scheme sample - understood]

> GETLOCAL and SETLOCAL simply index off of the fastlocals pointer; it doesn't
> care where that points *to* (<wink> -- but, really, it could point into some
> other frame and ceval2 wouldn't know the difference).  Maybe a frame entered
> due to continuation needs extra setup work?  Scheme saves itself by putting
> name-resolution and continuation info into different structures; to mimic
> the semantics, Python would need to get the same end effect.

Point taken. The pointer doesn't save time of access, it just
saves allocating another structure.
So we can use something else without speed loss.

[have to cut a little]

> So recall your uses of Icon generators instead:  like Python, Icon does have
> loops, and two-level scoping, and I routinely build loopy Icon generators
> that keep state in locals.  Here's a dirt-simple example I emailed to Sam
> earlier this week:
> 
> procedure main()
>     every result := fib(0, 1) \ 10 do
>         write(result)
> end
> 
> procedure fib(i, j)
>     local temp
>     repeat {
>         suspend i
>         temp := i + j
>         i := j
>         j := temp
>     }
> end

[prints fib series]

> If Icon restored the locals (i, j, temp) upon each fib resumption, it would
> generate a zero followed by an infinite sequence of ones(!).

Now I'm completely missing the point. Why should I want
to restore anything? At a suspend, which when done by continuations
will be done by temporarily having two identical states, one
is saved and another is continued. The continued one in your example
just returns the current value and immediately forgets about
the locals. The other one is continued later, and of course with
the same locals which were active when going asleep.
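
For comparison, the same generator written with the yield statement that
later Python versions provide (it did not exist when this thread was
written).  itertools.islice plays the role of Icon's "\ 10" limit.

```python
from itertools import islice

def fib(i, j):
    while True:
        yield i          # plays the role of Icon's "suspend i"
        i, j = j, i + j  # locals keep their values across the suspension

first_ten = list(islice(fib(0, 1), 10))
```

The locals i and j are neither copied nor restored on resumption; the one
paused frame simply carries on with whatever values it had when it
suspended.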

> Think of a continuation as a *paused* computation (which it is) rather than
> an *independent* one (which it isn't <wink>), and I think it gets darned
> hard to argue.

No, you get me wrong. I understand what you mean. It is just
the decision whether a frame, which will be reactivated later
as a continuation, should use a reference to locals like
the reference which it has for the globals. This causes me
a major frame redesign.

Current design:
A frame is: back chain, state, code, unpacked locals, globals, stack.

Code and globals are shared. 
State, unpacked locals and stack are private.

Possible new design:
A frame is: back chain, state, code, variables, globals, stack.

variables is: unpacked locals.

This makes the variables into an extra structure which is shared.
Probably a list would be the thing, or abusing a tuple as
a mutable object.
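
A toy model of the possible new design, to make the sharing explicit.  The
Frame class and its split method are purely illustrative names, not the
interpreter's real structures: the variables list is shared between the two
frames produced by a split, while each frame gets its own stack.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    code: str                                   # shared, never modified
    variables: list                             # shared between split frames
    stack: list = field(default_factory=list)   # private to each frame
    back: "Frame | None" = None                 # back chain

    def split(self):
        """Return a second frame for the same paused computation."""
        return Frame(self.code, self.variables, list(self.stack), self.back)

running = Frame(code="fib", variables=[0, 1], stack=["tmp"])
saved = running.split()
running.variables[0] = 99      # seen by both frames
running.stack.append("more")   # seen only by the running frame
```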

Hmm. I think I should get something ready, and we should
keep this thread short, or we will lose the rest of 
Guido's goodwill (if not already).

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From da at ski.org  Fri May 21 18:27:42 1999
From: da at ski.org (David Ascher)
Date: Fri, 21 May 1999 09:27:42 -0700 (Pacific Daylight Time)
Subject: [Python-Dev] Interactive Debugging of Python
In-Reply-To: <000401bea34b$e93fcda0$d89e2299@tim>
Message-ID: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>

On Fri, 21 May 1999, Tim Peters wrote:

> OK.  So how do you feel about coroutines?  Would sure be nice to have *some*
> way to get pseudo-parallel semantics regardless of OS.

I read about coroutines years ago on c.l.py, but I admit I forgot it all.
Can you explain them briefly in pseudo-python? 

--david




From tim_one at email.msn.com  Sat May 22 06:22:50 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:50 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000401bea40a$c1d2d2c0$659e2299@tim>

[Tim]
> OK.  So how do you feel about coroutines?  Would sure be nice
> to have *some* way to get pseudo-parallel semantics regardless of OS.

[David Ascher]
> I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> Can you explain them briefly in pseudo-python?

How about real Python?  http://www.python.org/tim_one/000169.html contains a
complete coroutine implementation using threads under the covers (& exactly
5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
different object interface (making coroutines objects in their own right
instead of funneling everything through a "coroutine controller" object),
but the ideas are the same in every coroutine language.  The post contains
several executable examples, from simple to "literature standard".

I had forgotten all about this:  it contains solutions to the same "compare
tree fringes" problem Sam mentioned, *and* the generator-based building
block I posted three other solutions for in this thread.  That last looks
like:

# fringe visits a nested list in inorder, and detaches for each non-list
# element; raises EarlyExit after the list is exhausted
def fringe( co, list ):
    for x in list:
        if type(x) is type([]):
            fringe(co, x)
        else:
            co.detach(x)

def printinorder( list ):
    co = Coroutine()
    f = co.create(fringe, co, list)
    try:
        while 1:
            print co.tran(f),
    except EarlyExit:
        pass
    print

printinorder([1,2,3])  # 1 2 3
printinorder([[[[1,[2]]],3]]) # ditto
x = [0, 1, [2, [3]], [4,5], [[[6]]] ]
printinorder(x) # 0 1 2 3 4 5 6

Generators are really "half a coroutine", so this doesn't show the full
power (other examples in the post do).  co.detach is a special way to deal
with this asymmetry.  In the general case you use co.tran all the time,
where (see the post for more info)

    v = co.tran(c [, w])

means "resume coroutine c from the place it last did a co.tran, optionally
passing it the value w, and when somebody does a co.tran back to *me*,
resume me right here, binding v to the value *they* pass to co.tran".
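
As a point of comparison, the fringe example above needs no coroutine
controller in a Python that has generators (added to the language well after
this post).  A sketch, with yield standing in for co.detach and "yield from"
handling the recursion:

```python
def fringe(nested):
    """Yield the non-list elements of a nested list, left to right."""
    for x in nested:
        if isinstance(x, list):
            yield from fringe(x)   # recurse into sublists
        else:
            yield x                # the counterpart of co.detach(x)

def inorder(nested):
    return list(fringe(nested))
```

This covers only the asymmetric, generator-like half of the interface; the
symmetric co.tran transfer between equal partners has no direct equivalent
here.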

Knuth complains several times that it's very hard to come up with a
coroutine example that's both simple and clear <0.5 wink>.  In a nutshell,
coroutines don't have a "caller/callee" relationship, they have "we're all
equal partners" relationship, where any coroutine is free to resume any
other one where it left off.  It's no coincidence that making coroutines
easy to use was pioneered by simulation languages!  Just try simulating a
marriage where one partner is the master and the other a slave <wink>.

i-may-be-a-bachelor-but-i-have-eyes-ly y'rs  - tim





From tim_one at email.msn.com  Sat May 22 06:22:55 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 00:22:55 -0400
Subject: [Python-Dev] Re: Coroutines
In-Reply-To: <Pine.WNT.4.04.9905210927060.289-100000@rigoletto.ski.org>
Message-ID: <000501bea40a$c3d1fe20$659e2299@tim>

Thoughts o' the day:

+ Generators ("semi-coroutines") are wonderful tools and easy to implement
without major changes to the PVM.  Icon calls 'em generators, Sather calls
'em iterators, and they're exactly what you need to implement "for thing in
object:" when object represents a collection that's tricky to materialize.
Python needs something like that.  OTOH, generators are pretty much limited
to that.

+ Coroutines are more general but much harder to implement, because each
coroutine needs its own stack (a generator only has one stack *frame*-- its
own --to worry about), and C-calling-Python can get into the act.  As Sam
said, they're probably no easier to implement than call/cc (but trivial to
implement given call/cc).

+ What may be most *natural* is to forget all that and think about a
variation of Python threads implemented directly via the interpreter,
without using OS threads.  The PVM already knows how to handle thread-state
swapping.  Given Christian's stackless interpreter, and barring C->Python
cases, I suspect Python can fake threads all by itself, in the sense of
interleaving their executions within a single "real" (OS) thread.  Given the
global interpreter lock, Python effectively does only-one-at-a-time anyway.
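
A rough illustration of interleaving several "fake threads" inside one OS
thread.  This sketch uses generators and a round-robin queue, so the
switching is cooperative (each task gives up control with yield); the
interpreter-level scheme described above would instead switch between
ordinary code without its cooperation.  The names worker and run_round_robin
are invented for the example.

```python
from collections import deque

def worker(name, steps, log):
    for n in range(steps):
        log.append((name, n))
        yield                # give up control voluntarily

def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)       # run the task up to its next yield
        except StopIteration:
            continue         # finished: drop it from the ready queue
        ready.append(task)

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```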

Threads are harder than generators or coroutines to learn, but

A) Many more people know how to use them already.

B) Generators and coroutines can be implemented using (real or fake)
threads.

C) Python has offered threads since the beginning.

D) Threads offer a powerful mode of control transfer coroutines don't,
namely "*anyone* else who can make progress now, feel encouraged to do so at
my expense".

E) For whatever reasons, in my experience people find threads much easier to
learn than call/cc -- perhaps because threads are *obviously* useful upon
first sight, while it takes a real Zen Experience before call/cc begins to
make sense.

F) Simulated threads could presumably produce much more informative error
msgs (about deadlocks and such) than OS threads, so even people using real
threads could find excellent debugging use for them.

Sam doesn't want to use "real threads" because they're pigs; fake threads
don't have to be.  Perhaps

x = y.SOME_ASYNC_CALL(r, s, t)

could map to e.g.

import config
if config.USE_REAL_THREADS:
    import threading
else:
    from simulated_threading import threading

from config.shared import msg_queue

class Y:
    def __init__(self, ...):
        self.ready = threading.Event()
        ...

    def SOME_ASYNC_CALL(self, r, s, t):
        result = [None]  # mutable container to hold the result
        msg_queue.put((server_of_the_day, r, s, t, self.ready, result))
        self.ready.wait()
        self.ready.clear()
        return result[0]

where some other simulated thread polls the msg_queue and does ready.set()
when it's done processing the msg enqueued by SOME_ASYNC_CALL.  For this to
scale nicely, it's probably necessary for the PVM to cooperate with the
simulated_threading implementation (e.g., a simulated thread that blocks
(like on self.ready.wait()) should be taken out of the collection of
simulated threads the PVM may attempt to resume -- else in Sam's case the
PVM would repeatedly attempt to wake up thousands of blocked threads, and
things would slow to a crawl).
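
A runnable version of the pattern above, using real threads from the
standard library.  Several things here are assumptions made to keep it
self-contained: the server loop, the None shutdown sentinel, and the
addition that stands in for the real work are invented, and the
server_of_the_day field is dropped.

```python
import queue
import threading

msg_queue = queue.Queue()

def server_loop():
    """Hypothetical server thread: take a request, store a result, signal."""
    while True:
        msg = msg_queue.get()
        if msg is None:           # sentinel: shut down
            break
        r, s, t, ready, result = msg
        result[0] = r + s + t     # stand-in for the real work
        ready.set()

class Y:
    def __init__(self):
        self.ready = threading.Event()

    def some_async_call(self, r, s, t):
        result = [None]           # mutable container to hold the result
        msg_queue.put((r, s, t, self.ready, result))
        self.ready.wait()         # block until the server signals
        self.ready.clear()
        return result[0]

server = threading.Thread(target=server_loop)
server.start()
answer = Y().some_async_call(1, 2, 3)
msg_queue.put(None)
server.join()
```

The caller sees an ordinary blocking call; only the import of the threading
module would need to change to run the same code on simulated threads.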

Of course, simulated_threading could be built on top of call/cc or
coroutines too.  The point to making threads the core concept is keeping
Guido's brain from exploding.  Plus, as above, you can switch to "real
threads" by changing an import statement.

making-sure-the-global-lock-support-hair-stays-around-even-if-greg-
    renders-it-moot-for-real-threads<wink>-ly y'rs  - tim





From tismer at appliedbiometrics.com  Sat May 22 18:20:30 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 18:20:30 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim>
Message-ID: <3746D94E.239D0B8E@appliedbiometrics.com>


Tim Peters wrote:
> 
> [Tim]
> > OK.  So how do you feel about coroutines?  Would sure be nice
> > to have *some* way to get pseudo-parallel semantics regardless of OS.
> 
> [David Ascher]
> > I read about coroutines years ago on c.l.py, but I admit I forgot it all.
> > Can you explain them briefly in pseudo-python?
> 
> How about real Python?  http://www.python.org/tim_one/000169.html contains a
> complete coroutine implementation using threads under the covers (& exactly
> 5 years old tomorrow <wink>).  If I were to do it over again, I'd use a
> different object interface (making coroutines objects in their own right
> instead of funneling everything through a "coroutine controller" object),
> but the ideas are the same in every coroutine language.  The post contains
> several executable examples, from simple to "literature standard".

What an interesting thread! Unfortunately, all the examples are messed
up since some HTML formatter didn't take care of the python code,
rendering it unreadable. Is there a different version available?

Also, I'd like to read the rest of the threads in 
http://www.python.org/tim_one/ but it seems that only your messages
are archived?
Anyway, the citations in http://www.python.org/tim_one/000146.html
show me that you have been through all of this five years
ago, with a five years younger Guido who sounds a bit
different than today.
I would have understood him better if I had known that this
is a re-iteration of a somehow dropped or entombed idea.

(If someone has the original archives from that epoch,
I'd be happy to get a copy. Actually, I'm missing everything up to
the end of 1996.)

A short snapshot:
Stackless Python is meanwhile nearly alive, with recursion
avoided in ceval. Of course, some modules are left which
still need work, but enough for a prototype. Frames now
contain all necessary state and are prepared for execution
and thrown back to the evaluator (elevator?). 

The key idea was to change the deeply nested functions in such a
way that their last eval_code call happens to be tail recursive.
In ceval.c (and in other not yet changed places), functions
do a lot of preparation, build some parameter, call eval_code
and release the parameter. This was the crux, which I solved
by a new field in the frame object, where such references
can be stored. The routine can now return with the ready packaged
frame, instead of calling it.
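
The "return the packaged frame instead of calling it" idea is what is
usually called a trampoline.  A small Python-level sketch of the technique,
not of the actual ceval.c change; the Call class and the run loop are names
invented for the example:

```python
class Call:
    """A packaged 'frame': a function plus the arguments to run it with."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

def run(step):
    """Dispatcher: keep running packaged calls until a plain value appears."""
    while isinstance(step, Call):
        step = step.func(*step.args)
    return step

def count_down(n, acc=0):
    if n == 0:
        return acc
    return Call(count_down, n - 1, acc + n)   # hand the call back, don't make it

total = run(Call(count_down, 100000))
```

Since count_down never calls itself directly, the depth of the underlying
stack stays constant however many steps are chained.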

As a minimum facility for future co-anythings,
I provided a hook function for resuming frames, which causes no
overhead in the usual case but allows overriding what a frame
does when someone returns control to it. Implementing
this is up to some extension module; whether this is
coroutines or your nice nano-threads, it's possible.

threadedly yours - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tismer at appliedbiometrics.com  Sat May 22 21:04:43 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sat, 22 May 1999 21:04:43 +0200
Subject: [Python-Dev] How stackless can Python be?
Message-ID: <3746FFCB.CD506BE4@appliedbiometrics.com>

Hi,

to make the core interpreter stackless is one thing.
Turning functions which call the interpreter
from some deep nesting level into versions,
which return a frame object instead which is
to be called, is possible in many cases.

Internals like apply are rather uncomplicated to convert.
CallObjectWithKeywords is done.

What I have *no* good solution for is map.
Map does an iteration over evaluations and keeps
state while it is running. The same applies to reduce,
but it seems to be not used so much. Map is.

I don't see at the moment if map could be a killer
for Tim's nice mini-thread idea. How must map work,
if, for instance, a map is done with a function
which then begins to switch between threads,
before map is done? Can one imagine a problem?

Maybe it is no issue, but I'd really like to
know whether we need a stateless map.
(without replacing it by a for loop :-)
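
One way to picture a map that keeps no state on the C stack is to put the
iteration state into an object that can be advanced one element at a time.
This is a speculative sketch only; the class name MapState and its step
method are made up, and nothing here reflects how the builtin is written.

```python
class MapState:
    """Hypothetical resumable map: all iteration state lives in this object."""
    def __init__(self, func, seq):
        self.func = func
        self.seq = seq
        self.index = 0
        self.results = []

    def step(self):
        """Apply func to one more element; return True when finished."""
        if self.index < len(self.seq):
            self.results.append(self.func(self.seq[self.index]))
            self.index += 1
        return self.index >= len(self.seq)

squares = MapState(lambda v: v * v, [1, 2, 3])
negs = MapState(lambda v: -v, [4, 5])
pending = [squares, negs]
while pending:
    # Advance every unfinished map by one element, interleaving the two.
    pending = [m for m in pending if not m.step()]
```

Because each step returns to the caller, a scheduler could switch to another
task between any two elements without the map losing its place.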

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.python.net
10553 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



From tim_one at email.msn.com  Sat May 22 21:35:58 1999
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 22 May 1999 15:35:58 -0400
Subject: [Python-Dev] Coroutines
In-Reply-To: <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <000501bea48a$51563980$119e2299@tim>

>> http://www.python.org/tim_one/000169.html

[Christian]
> What an interesting thread! Unfortunately, all the examples are messed
> up since some HTML formatter didn't take care of the python code,
> rendering it unreadable. Is there a different version available?
>
> Also, I'd like to read the rest of the threads in
> http://www.python.org/tim_one/ but it seems that only your messages
> are archived?

Yes, that link is the old Pythonic Award Shrine erected in my memory -- it's
all me, all the time, no mercy, no escape <wink>.

It predates the DejaNews archive, but the context can still be found in

http://www.python.org/search/hypermail/python-1994q2/index.html

There's a lot in that quarter about continuations & coroutines, most from
Steven Majewski, who took a serious shot at implementing all this.

Don't have the code in a more usable form; when my then-employer died, most
of my files went with it.

You can save the file as text, though!  The structure of the code is intact,
it's simply that your browser squashes out the spaces when displaying it.
Nuke the <P> at the start of each code line and what remains is very close
to what was originally posted.

> Anyway, the citations in http://www.python.org/tim_one/000146.html
> show me that you have been through all of this five years
> ago, with a five years younger Guido who sounds a bit
> different than today.
> I would have understood him better if I had known that this
> is a re-iteration of a somehow dropped or entombed idea.

You *used* to know that <wink>!  Thought you even got StevenM's old code
from him a year or so ago.  He went most of the way, up until hitting the
C<->Python stack intertwingling barrier, and then dropped it.  Plus Guido
wrote generator.py to shut me up, which works, but is about 3x clumsier to
use and runs about 50x slower than a generator should <wink>.

> ...
> Stackless Python is meanwhile nearly alive, with recursion
> avoided in ceval. Of course, some modules are left which
> still need work, but enough for a prototype. Frames contain
> now all necessary state and are now prepared for execution
> and thrown back to the evaluator (elevator?).
> ...

Excellent!  Running off to a movie & dinner now, but will give a more
careful reading tonight.

co-dependent-ly y'rs  - tim





From tismer at appliedbiometrics.com  Sun May 23 15:07:44 1999
From: tismer at appliedbiometrics.com (Christian Tismer)
Date: Sun, 23 May 1999 15:07:44 +0200
Subject: [Python-Dev] How stackless can Python be?
References: <3746FFCB.CD506BE4@appliedbiometrics.com>
Message-ID: <3747FDA0.AD3E7095@appliedbiometrics.com>

After a good sleep, I can answer this one by myself.

I wrote:
> to make the core interpreter stackless is one thing.
...
> Internals like apply are rather uncomplicated to convert.
> CallObjectWithKeywords is done.
> 
> What I have *no* good solution for is map.
> Map does an iteration over evaluations and keeps
> state while it is running. The same applies to reduce,
> but it seems to be not used so much. Map is.
...

About stackless map,
and this applies to every extension module
which *wants* to be stackless. We don't have to force
everybody to be stackless, but there are a couple of
modules which would benefit from it.

The problem with map is that it needs to keep state
while repeatedly calling objects which might call
the interpreter. Even if we kept local variables
in the caller's frame, this would still not be
stateless. The info that a map is running is sitting
on the hardware stack, and that's wrong.

Now a solution. In my last post, I argued that I don't
want to replace map by a slower Python function. But
that gave me the key idea to solve this:

C functions which cannot be tail-recursively unwound to
return an executable frame object must instead return
themselves as a frame object. That's it! Frames need
to be extended a little again: they have to name their
interpreter, which normally is the old eval_code loop.

Anatomy of a standard frame invocation:
A new frame is created, parameters are inserted,
the frame is returned to the frame dispatcher,
which runs the inner eval_code loop until it bails out.
On return, special cases of control flow are handled,
namely exceptions, returning, and now also calling.
This is an eval_code frame, since eval_code is its
execution handler.

Anatomy of a map frame invocation:
Map has several phases. The first phases do
argument checking and basic setup.
The last phase is iteration over function calls
and building the result. This phase must be split
off as a second function, eval_map.
A new frame is created, with all temporary variables
placed there. eval_map is inserted as the execution
handler.
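
The two map phases above can be modelled roughly in Python. This is
only a sketch for illustration, not the C code: MapFrame and map_setup
are made-up names, and only eval_map comes from the description above.
The point it tries to show is that the loop index and the partial
result live in the frame object, so the iteration can stop after any
step and be picked up again later.

```python
class MapFrame:
    """Hypothetical frame holding every temporary that map needs."""

    def __init__(self, func, seq):
        self.func = func
        self.seq = seq
        self.index = 0           # position of the next item to process
        self.result = []         # partial result built so far
        self.execute = eval_map  # the execution handler of this frame


def map_setup(func, seq):
    # First phases: argument checking and basic setup.
    if not callable(func):
        raise TypeError("first argument must be callable")
    return MapFrame(func, list(seq))


def eval_map(frame):
    # Last phase: one iteration step per invocation.
    # Returns True when the map is finished, False otherwise.
    if frame.index >= len(frame.seq):
        return True
    frame.result.append(frame.func(frame.seq[frame.index]))
    frame.index += 1
    return False


# Usage: drive the frame step by step; it could be parked between steps.
frame = map_setup(lambda x: x * 2, [1, 2, 3])
while not frame.execute(frame):
    pass
print(frame.result)
```

In this simplified model the mapped function is still called directly.
In the scheme described here that call would itself be handed back to
the frame dispatcher.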

Now, I think the analogy is obvious.
By building proper frames, it should be possible
to turn any extension function into a stackless function.

The overall protocol is:
A C function which does a simple computation which cannot
cause an interpreter invocation, may simply evaluate
and return a value.
A C function which might cause an interpreter invocation,
should return a freshly created frame as return value.
- This can be done either in a tail-recursive fashion,
  if the last action of the C function would basically 
  be calling the frame.
- If no tail-recursion is possible, the function must
  return a new frame for itself, with an executor
  for its purpose.
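
A rough Python model of this protocol, as a sketch under assumptions:
Frame, dispatch, stackless_map and the CALL/RETURN tags are
hypothetical names invented for the example, and the real thing would
be written in C. A function either returns a plain value or a frame; a
handler reports "calling" or "returning" to the dispatcher, which keeps
the chain of callers in the frames rather than on the hardware stack.

```python
CALL, RETURN = "call", "return"  # hypothetical control-flow tags


class Frame:
    """Hypothetical frame: state plus the handler that executes it."""

    def __init__(self, execute, **state):
        self.execute = execute  # execution handler for this frame
        self.state = state      # all temporaries live here
        self.back = None        # calling frame, set by the dispatcher
        self.retval = None      # value delivered by the last callee


def dispatch(frame):
    """Run frames in a flat loop until the outermost one returns."""
    while True:
        action, payload = frame.execute(frame)
        if action == CALL:
            payload.back = frame  # remember the caller in the frame
            frame = payload
        elif frame.back is None:
            return payload        # outermost frame has finished
        else:
            frame.back.retval = payload  # hand the value to the caller
            frame = frame.back


def eval_map(frame):
    """Execution handler for map; resumes where it left off."""
    s = frame.state
    if s["waiting"]:              # a callee frame has just returned
        s["result"].append(frame.retval)
        s["waiting"] = False
    while s["index"] < len(s["seq"]):
        item = s["seq"][s["index"]]
        s["index"] += 1
        out = s["func"](item)     # plain value or a fresh frame
        if isinstance(out, Frame):
            s["waiting"] = True
            return CALL, out      # let the dispatcher run the callee
        s["result"].append(out)
    return RETURN, s["result"]


def stackless_map(func, seq):
    # No tail call is possible, so return a frame for ourselves.
    return Frame(eval_map, func=func, seq=list(seq), index=0,
                 result=[], waiting=False)


# Usage: a map whose function starts another map on every item.
def inner(row):
    return stackless_map(lambda x: x + 1, row)

print(dispatch(stackless_map(inner, [[1, 2], [3]])))
```

The nested maps never recurse through dispatch; each one is just
another frame linked to its caller through the back field.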

A good stackless candidate is Fredrik's xmlop, which
calls back into the interpreter. If that worked
without the hardware stack, then we could build
ultra-fast XML processors with co-routines!

As a side note: 
The frame structure which I sketched
so far is still made for eval_code in the first place,
but it has all necessary flexibility for pluggable
interpreters. An extension module can now create
its own frame, with its own execution handler, and
throw it back to the frame dispatcher.
In other words: People can create extensions and
test their own VMs if they want.
This was not my primary intent, but comes for free
as a consequence of having a stackless map.

ciao - chris




From fredrik at pythonware.com  Sun May 23 15:53:19 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 23 May 1999 15:53:19 +0200
Subject: [Python-Dev] Coroutines
References: <000401bea40a$c1d2d2c0$659e2299@tim> <3746D94E.239D0B8E@appliedbiometrics.com>
Message-ID: <031e01bea524$8db41e70$f29b12c2@pythonware.com>

Christian Tismer <tismer at appliedbiometrics.com> wrote:
> (If someone has the original archives from that epoch,
> I'd be happy to get a copy. Actually, I'm missing all up to
> end of 1996.)

http://www.egroups.com/group/python-list/info.html
has it all (almost), starting in 1991.

</F>