Teaching : Python, Scheme, Java...

Bjorn Pettersen BPettersen at NAREX.com
Mon Apr 21 00:33:48 EDT 2003


[I sent this while my PC was crashing, and it shows as sent in my
mailbox, but I can't see it on e.g. Google, so my apologies if this is a
resend...]

> From: Alex Martelli [mailto:aleax at aleax.it] 
> 
> On Friday 18 April 2003 04:52 pm, Bjorn Pettersen wrote:
> >
> > [..tail call to 'assembly' translation of zero argument
> > functions...]
> 
> You're not dealing with the question of the stack frame.

Yes, I was hoping to gloss over that -- talking about tail call
optimizations in general, just using examples from a specific and common
implementation area as far as the tail-call optimization was concerned,
but since you asked... <wink>. The concept of a _stack_ frame is rather
curious if you went directly from theory to implementation without
knowing anything about C (Algol, et al.). The fact that certain
languages don't even have such a basic concept as a continuation
(arguably as simple as goto [call/cc being a particularly
incomprehensible introduction to continuations :-]) shouldn't mean that
we must use these crippled languages' "lowest-upper-bound"
implementation technique for normal languages <wink>.

Theoretically, the straightforward way of executing f(args), i.e.
without any arbitrary limitations imposed by e.g. implementation
decisions like data structures (but assuming you can call functions only
with values [not variables/terms/expressions...], so I don't have to
create an entire grammar...), can go something like this. First, let's
define an Environment to be a mapping from visible variables to their
values at a given point during program execution (i.e. lexical scoping,
nested calls). Then: "executing a function invocation,
f(actual_argument), happens by executing the code of the function
[details about how to get the code depend on the definition of executing
a function _definition_] in an Environment (Env' [i.e. Env-prime]) which
we create from the current Environment (Env) extended with a mapping
from f.formal_argument to actual_argument [multiple arguments add
nothing of theoretical interest, just complexity, so they are frequently
omitted, sometimes by assuming multiple arguments are passed as a single
tuple etc. etc.], with mappings for defined variables, DV(f) = vars
declared in the function (which should be the set Env setminus FV(f),
the free variables, i.e. variables defined lexically outside the
function), and with f.name -> f.code [the name of the function when
defined, as opposed to called, for recursion]."

Using formal notation, with the term currently executing [f(actual)] at
the bottom, and the steps defining the meaning of saying f(actual) in
this language listed in order from top to bottom above the line (where +
is sequential update, <B> is bottom [an upside-down T] indicating
undefined, and execution of a program [happening in the current
Environment, Env] results in a value [potentially <B>] and a new
Environment, Env'' -- because we defined Environments as mappings with
"time", they're immutable and we must create new objects, and since this
is theory, not implementation, we mostly care about what is easier to
reason about :-):

   fn = Env[f]
   Env' = Env + {fn.formal -> actual} + {DV(fn) -> <B>} + {fn.name -> fn.code}
   Env' |- fn.code -> v, Env''
   -----------------------------------
   Env |- f(actual) -> v, Env''

Note the CPython implementation of locals seems to calculate DV(fn), but
you could get the same effect (semantically) by extending the
Environment on a local assignment. The only difference is that with the
above you can say "UnboundLocalError", with the other only "NameError"
:-)
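
As a minimal illustration of that difference (this is just CPython's
observable behavior, not part of the formalization above):

  x = 1
  def f():
      print(x)  # UnboundLocalError: 'x' is in DV(f), so it maps to <B> here
      x = 2     # ...because this assignment makes 'x' local to all of f
  f()

Had locals been added to the Environment only at the point of
assignment, the print would simply have found the global x (or raised a
plain NameError if none existed).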

a Python version of the formal rule above might be (Env implemented by a
dict; Env' spelled EnvPrime since "'" isn't legal in Python
identifiers):

  import copy

  def executeFunctionCall(Env, f, actual):
      fn = Env[f]                 # function object, as opposed to function name
      formals = fn.formals        # list of formal arguments
      argmapping = dict(zip(formals, actual))
      fnLocals = dict.fromkeys(fn.vars)  # DV(fn), all mapped to None (our <B>)
      EnvPrime = copy.copy(Env)   # dict.update returns None, so update stepwise
      EnvPrime.update(argmapping)
      EnvPrime.update(fnLocals)
      v, EnvPrime2 = executeCode(EnvPrime, fn.code)  # Env' |- fn.code -> v, Env''
      return v, EnvPrime2

and assuming there exists a function "fn, Env' =
executeFunctionDefinition(Env, name, formals, body)" which creates the
function object fn with attributes formals, vars, and code computed by
executing the definition as opposed to the code (just like Python). It
also places fn in Env' under the key "name", and through successive
modifications of the Environment passed to and returned from the various
executeTermX functions, we find this object in the Environment passed to
us.
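
Such a definition-executor might look like the following sketch (Fn and
declaredVars are hypothetical helpers introduced purely for
illustration, reusing copy from the block above; nothing here is claimed
about CPython):

  class Fn(object):
      def __init__(self, formals, vars, code):
          self.formals, self.vars, self.code = formals, vars, code

  def executeFunctionDefinition(Env, name, formals, body):
      fn = Fn(formals, declaredVars(body), body)  # declaredVars computes DV
      EnvPrime = copy.copy(Env)
      EnvPrime[name] = fn         # makes the name visible, e.g. for recursion
      return fn, EnvPrime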

[which is how you normally create an interpreter for a language -- well,
at least it used to be, before all these fancy things that compile
language1 to a language2, requiring two interpreters instead of one --
strange :-)]

I haven't defined State (i.e. the memory layout of the program at any
point in time), but doing so changes nothing substantial in the
definition above; it does add another complexity. (Defining it is easy;
carrying it through all the formulas is tedious :-) E.g. redefine
Environment to be the mapping of visible variables to locations in the
Program State, S (a.k.a. memory), at a point in program execution, and
State as a mapping from locations to values at a point in program
execution, and you can get C style variables by defining the semantics
to be:

   loc = Env[x]
   S' = S + {loc -> v}
   ---------------------------------------
   Env * S |- x := v -> v * Env * S'
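
Read as a toy model in Python (the dicts and names here merely
illustrate the rule; they're not part of any real implementation):

  Env = {'x': 0}     # variable -> location
  S   = {0: 10}      # location -> value

  def assignC(Env, S, x, v):
      loc = Env[x]            # loc = Env[x]
      S2 = dict(S)
      S2[loc] = v             # S' = S + {loc -> v}; Env itself is unchanged
      return v, Env, S2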

and Python style semantics by keeping the old definition of Environment
and using the following formula, which defines reference counting to be
part of the semantics of the language as opposed to a property of the
implementation. (Similarly, to define the semantics of C you need to
define two types of state/memory, stack and heap mappings, with
appropriate properties):

   curval = Env[x]
   S' = S + {loc(curval)->decref(curval)}
   Env' = Env + {x->v}
   incref(v)
   ---------------------------------
   Env * S |- x := v -> v * Env' * S'
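
Again as a toy model, this time keeping refcounts in S keyed by object
identity, with incref/decref folded in (all names illustrative):

  def assignPy(Env, S, x, v):
      S2 = dict(S)
      old = Env.get(x)
      if old is not None:                      # curval = Env[x]
          S2[id(old)] = S2.get(id(old), 1) - 1 # decref(curval)
      Env2 = dict(Env)
      Env2[x] = v                              # Env' = Env + {x -> v}
      S2[id(v)] = S2.get(id(v), 0) + 1         # incref(v)
      return v, Env2, S2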

The semantics of a language is defined by the combination of the
operational formulas for all "parts" (terms of the grammar) of the
language -- or more practically, given the interpreter, I, defined by
OS(L) [the Operational Semantics of the language L], the semantics of a
program P is the value v = I(P). Type systems, optimizations, and other
program transformations can be defined in the same way, and given a
formal semantics you can prove different properties of the language...
[and by creating formal definitions of assembly language, you can prove
correctness of e.g. peephole optimizers and even compilers...]
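
Concretely, such an interpreter I has one clause per operational
formula. A minimal sketch with made-up term forms (nothing here comes
from a real interpreter):

  def I(term, Env):
      kind = term[0]
      if kind == 'lit':                # literals evaluate to themselves
          return term[1], Env
      if kind == 'var':                # Env |- x -> Env[x], Env
          return Env[term[1]], Env
      if kind == 'assign':             # Env |- x := e -> v, Env + {x -> v}
          v, Env1 = I(term[2], Env)
          Env2 = dict(Env1)
          Env2[term[1]] = v
          return v, Env2
      raise ValueError("no formula for %r" % (kind,))

  print(I(('assign', 'x', ('lit', 42)), {}))   # (42, {'x': 42})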

Worth mentioning: the Environment is defined by the language (it's just
a regular object/value containing the current values of all visible
variables [thinking of possible implementations would be premature]).
There is nothing magical about this object/value, i.e. why wouldn't you
be able to assign it to a variable, pass/return it to/from functions,
etc.? Of course, if you think of semantics (i.e. the meaning of the
program, what happens when it executes) in terms of what any specific
language implementation (or CPU architecture) is limited to, you're
putting the cart before the horse. You're mapping an algorithm
[semantics] onto a language [implementation]; the other way around would
be like implementing assembly in C (possible, but pointless).

Note, C does have an Environment, but it is extrinsic to the language,
mostly only visible through a debugger...

Indeed, there is nothing preventing you from defining the "PC" (current
Program Counter) inside the language so you can operate on it too. Why?
Well, if you combine the PC with the current Environment (PC, Env), you
just created a continuation (defining the PC as some kind of function
would be a "good idea" :-). 

[[If you tilt your head and think of PC as indicating the next term,
which if you tilt further can be defined as a function taking the value
of executing the previous term as an argument and the rest of the
program as the body, lambda x: rest_of_program(x). As an example, if the
program has executed to "a = [PC] f(b)",

  # opAssign defined as a "fundamental" operation, e.g. inlined..
  PC = lambda x, k: k(opAssign(a, x))

x being the result of f(b), and k being the rest_of_program (passed as
an input argument), you're at CPS -- Continuation Passing Style/only
forward program and data flow (just imagine how simple program and data
analysis ought to become, not to mention garbage collection <wink>; as a
bonus, call/cc looks a tad more reasonable)]].
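
A runnable version of that picture might look like this (the helper
names are mine, chosen only for illustration):

  env = {}

  def opAssign(name, x, k):
      env[name] = x            # the "fundamental" assignment operation
      return k(x)              # ...then hand control forward to the rest

  def f(b, k):
      return k(b * 2)          # f in CPS: never returns "up", only forward to k

  # a = f(b), with the whole rest of the program packed into continuations:
  f(21, lambda x: opAssign('a', x, lambda v: v))
  print(env['a'])              # 42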

Going from mappings and set operations to requiring that the data
layout is physically contiguous seems like a rabid overspecification in
general, and rather unintuitive for this algorithm (the argument seems
to be: (a) to implement local "scope" we need a stack, (b) we have a
stack, so (c) a stack is necessary for implementing local "scope"
[http://www.wikipedia.org/wiki/Affirming_the_consequent]). And why would
anyone assume it's ok to just remove a random set of values (i.e. in C I
can store a local int variable in a global and it will survive, but an
int* doesn't if what it is pointing to was created by this function,
directly or indirectly, as a local..?) -- in particular, why would you
use time of creation (instead of when you're done with an object
[defined however you want, either directly or e.g. by reachability,
reference count, or lifetime analysis]) as a criterion for deletion? It
may have been convenient when we were using assembly, but only a
convenience. Considering that a function can easily store the
Environment object, which, to belabor the point, contains the
variable->value mappings of all nested function calls up to the current
function call (i.e. what C has on the stack), there doesn't seem to be a
correct stack based implementation that doesn't involve a lot of
copying... Can it be that it is the wrong data structure for locals?
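
Python itself gives a small demonstration of locals outliving their
invocation, which is exactly what a pure stack can't accommodate:

  def make_counter():
      count = [0]            # a "local" of make_counter...
      def step():
          count[0] += 1
          return count[0]
      return step            # ...that must survive make_counter's return

  c = make_counter()
  print(c(), c())            # 1 and 2 -- the frame data is still alive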

> [.. what about implementation issues: calling conventions, memory
>    layout, memory (de)alloc, how to optimize tail-call in this
>    scheme..]

You need to think more about the inherent beauty of simple, logically
defined systems of formulas with a theory that has direct implications
for what you're doing. Pure (sometimes advanced) science, logic, and
math that can be directly applied -- pretty special. Why waste your time
on unsatisfying problems like how to limit expressiveness purely for the
sake of the convenience of an application developer <chuckle>. Seriously
though, C semantics is barely more than a formalization of the most
effective way to map the terms to a specific implementation algorithm --
one that places great, arbitrary limits on the semantics purely because
otherwise the abstract data layout would be different from the most
effective hardware layout for assembly language calling conventions...

Note: I'm saying converting C to assembly should be relatively simple
because (a) they're very similar, and (b) possible higher order concepts
are specifically excluded. I'm explicitly not stating any of: (1) C
isn't an important language [it's just not terribly interesting in its
semantic theory]; (2) C optimization techniques aren't advanced, i.e. C
is very fast [less time translating gives you more time optimizing...];
(3) it's hard to find a more efficient mapping of algorithm to machine
run-time than the implementation of the algorithm written, within C's
limitations, by an expert programmer. (Which is why I'm using Python:
how long something takes me is much more important than how long it
takes the computer.)

You seem to think that compilers are somehow special because they have
requirements (i.e. must compile to byte-code, or MIPS, etc.). If you
think of a language, L, as the requirements document for a regular
application translating a text file with a certain formal structure to a
simpler language defined by another requirements document, it really
isn't so special. I'm betting you could come up with at least half a
dozen implementations and optimizations off the top of your head. It all
boils down to translating an abstract algorithm into a concrete alg.
written in a particular language. It just so happens that C isn't a
particularly complex abstraction, and even the language sketched above
would be easier than most of the stuff I'm sure you do on a daily basis
-- e.g. metaclasses are doing similar transformations at run-time, while
this is merely defining the constraints on an implementation. (IMHO the
reason it's frequently considered special is mostly due to the way it's
taught, confusion between compilers/interpreters and optimizers [if an
optimization isn't hard, it's already been done :-], and its being
inherently recursive -- which a lot of people have a really hard time
with [at least according to my limited teaching experience].)

To address the concrete issues (which would be formalized by defining
State with the properties you care about and adding it to all the
formulas, potentially changing the definition of Environment based on
whether you want memory locations to be accessible from variables or
from values/objects):

The only thing changed was removing the arbitrary limitation that
exiting a function must deallocate its data.

 - calling conventions: same
 - tail-call optimization: same (see the trampoline sketch after this
   list for one way to realize it without a growing stack)
 - memory layout: space for local variables will be allocated on
   function invocation, but can't be deleted until they're no longer
   used by the program (which isn't limited by the scope of the
   invocation). For simplicity allocate it all on the heap, profile, and
   find a better way if necessary. (Most implementations have special
   allocators for both arguments and DV based on statistical analysis,
   i.e. a pure optimization.)
 - memory allocation/deallocation: any garbage collection system will
   work (and manual solutions don't even work in C, so that should be
   your first choice anyway :-)
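
As promised above, here is a minimal sketch of tail calls that never
grow the stack, via a trampoline (a standard technique, offered here as
my own illustration rather than anything from the quoted discussion):

  def trampoline(thunk):
      while callable(thunk):   # keep bouncing while there is a next step
          thunk = thunk()
      return thunk             # a non-callable value is the final result

  def fact(n, acc=1):
      if n <= 1:
          return acc
      return lambda: fact(n - 1, n * acc)   # the tail call, deferred as a thunk

  print(trampoline(lambda: fact(100000)))   # no stack overflow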

> > [[Now what if you came up with an algorithm that converted any
> > function to a function with tail call? Your code for return would be
> > really fast, it would never "return" to a previous state, you
> > wouldn't need a stack anymore, implementing continuations would be
> > trivial... They did it for
> 
> Of course, such a design would ensure uniformity of stack frames,
> having chosen to focus on this optimization.  BTW, for those who may
> have dabbled with Forth and friends in the past, this has some
> similarities to "threaded interpreters" (even though there need not be
> an interpreter involved, and it looks more like direct-threading than
> indirect) [...]

Yes, I believe the equivalence between lambda calculus and stack based
languages was shown a while ago. Re interpreters vs. CPUs, there are
very few interesting distinguishing properties between them
theoretically, which would probably imply philosophically that only
interesting things are worthy of consideration (which isn't an entirely
unfamiliar pattern <grin>).

> BTW, you may claim you don't need a stack, but that's in some sense
> word-trickery;-).  If a calls b calls c calls d calls e, at this point
> the "nonstack" frames for 4 ongoing calls need to be "somewhere" --
> this somewhere may e.g. be a singly linked list, but that's just
> because such a list is a rather good implementation of... ahem, a
> last-in-first-out queue, yes, that's the ticket!  [must not say the
> S-word...]

Well, as I've hopefully shown, it's only LIFO if you explicitly define
(i.e. restrict) it to be so, since completely rational access patterns
require it to be Random Access. I know you're not arguing that a Stack
is the best algorithm for random access ;-) Are you arguing that the
frame pointer or frame length indicator for some reason doesn't make
what you call the stack a linked list? It really seems like only a
regular (linked) list of data, with an embedded list of continuations
(i.e. the address to jump to when the function is done). That the list
elements are allocated sequentially is a property of the list allocator
rather than of the elements of the list (the program state). It should
be trivial to re-implement C style invocation-local data with a regular
linked list allocator <smile>.
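
To make that concrete, a sketch of frames as a plain linked list (the
class and field names are mine, purely for illustration):

  class Frame(object):
      def __init__(self, caller, return_k, locals_):
          self.caller = caller       # link to the calling frame
          self.return_k = return_k   # the embedded continuation: where to resume
          self.locals = locals_      # the invocation-local data

  def push_frame(current, return_k, locals_):
      # heap-allocated: nothing forces frames to be contiguous, and a frame
      # that escapes (e.g. captured by a closure) simply stays reachable
      return Frame(current, return_k, locals_)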

> > SML, and the various steps, algorithms, etc. are nicely 
> > described in:
> >
> >    Compiling with Continuations, Andrew Appel
> >    Cambridge University Press 1992, ISBN 0-521-41695-7
> >
> > which is one of the more interesting books I read in 
> > college both for language and compiler theory (couldn't 
> > find an online-ref).]]
> 
> Nolo contendere (I didn't study it deeply, barely browsed it, 10 years
> ago or so -- my work was Fortran, C and some C++ then...:-).

I'm sorry :-) Seems like you caught the languages that were interesting
more for what they could do than for how they did it (which is of course
also incredibly interesting, but in a [completely?] different way...).
Purely out of curiosity, was the set of languages based on practicality
or preference? [Currently studying C# for purely practical reasons;
language theory wise it might as well have been created in the 60's --
talk about rehashing old mistakes without any originality <g>]

-- bjorn




