[Python-ideas] A Continuations Compromise in Python

Sat May 2 22:27:47 CEST 2009

Hi all

It was suggested I post this idea to the python-ideas list.  I
apologize in advance for its length, I tried to be thorough :)

Recently there's been some debate over the Tail-Call-Optimization and
its use in Python.  Some advocates coming from languages that support
this optimization say that it makes recursive code far more efficient,
and indeed allows some things to happen that could not otherwise occur
without running out of stack space.

The argument against, made most faithfully by the BDFL himself, is
that looping behavior is in general easier to understand than
recursion, and that implicitly optimizing code behind the scenes may
confuse newcomers to the language.  This would especially affect stack
traces.  Furthermore, Python is already an expressive language and
there appear to be few use cases that truly demand TCO.

Many of these arguments are strong, however, their strength can be
used against them.  Python is a teaching language, and proudly so,
which is one of the reasons to keep it clean and simple.  At the same
time, recursion is an important part of Computer Science, software
development and programming.  Indeed, many people's first recursive
functions nowadays are written in Python!  Python is the perfect
language to introduce these different kinds of constructs as the
'functional' paradigm is further absorbed by other mainstream
languages in industry and academia.

Other than it's potential to teach, Python's first class support of
recursion, through an explicit mechanism to indicate 'continuation'
style functions, is also undervalued in the use cases it more simply
carries out compared to iterative implementations (never mind the
performance benefit).  The Actor model and message passing, state
machine logic, and green thread schedulers are all potential uses.
Similarly, many multi-threaded/multi-processor programs simplify their
logic when passing continuations.

This proposal is to extend the use of the keyword 'continue' as a way
to explicitly invoke continuation style function calls, and through
that mechanism, tail-call optimized recursion.  The Zen of Python
states that explicit is better than implicit, and this is the most
major hang up with many proposed 'optimizations' that change system
behavior.  In many TCO proposals, the compiler is responsible for
identifying recursive calls that can be optimized.  In this proposal,
continuations are achieved explicitly, and it is up to the programmer
to ensure that they are proper tail calls, keeping all of the choice
and power in the programmer's hands.  This is best shown with a few
examples.

Currently, a 'continuation' style function might be written like this:

def func(continuationFunc, *args):
	return continuationFunc(*args)

The function is passed in as a first class object, then applied to
some arguments and immidately returned.  This is an important point
about true tail-calls – their values must immediately be returned.
The proposed syntax extension would change the above function to the
following similar function:

def func(continuationFunc, *args):
	continue continuationFunc(*args)

In this case, instead of the 'return' keyword, the 'continue' keyword
is used to indicate the programmer wishes to transfer full control to
the continuationFunction.  Tail-call purity would be enforced, meaning
that non-tail calls

def func(continuationFunc, *args):
	continue 1 + continuationFunc(*args)

would either cause a syntax error or throw a runtime exception.

This serves as a learning opportunity for those new to programming,
just as many of Python's uniquely elegant implementations of
programming constructs have, yet also provide a powerful and
expressive way to solve certain kinds of recursive problems.

The double use of the 'continue' keyword is no accident.  The current
use of the 'continue' keyword makes a prime candidate for extension
for a few reasons.  First there are theoretical reasons.  For one, the
use of the English word 'continue' in the above constructs is
intuitive and would not take much explaining to a newcomer as to what
it means versus 'return'.

More importantly, since it has been pointed out that many recursive
structures can be implemented as imperative loops, it seems
appropriate that the same keyword would mean equivalent/parallel
things in the two styles.  Using continue inside a loop – bounce back
up to the top of the loop and start with the next iteration, is
synonymous with the proposed recursive use, to bounce back up the
stack and immediately call the 'next' function.  Indeed, trampoline
style schedulers would use the continue keyword in its current form to
implement continuations!  These parallels between recursive and
imperative loops help understanding the new proposed use.

There are also some practical concerns.  'continue' as currently used
is not that popular of a construct.  When one needs to 'continue', one
is glad it's there, but due to Python's expressive generator
expressions, for-each style loops and inner functions, many use cases
for 'continue' become more elegantly expressed in other ways.  This
means that, either way, a newcomers introduction to the uses of
'continue' aren't going to happen until he or she has a firm grasp of
other parts of the language.  Moreover, there's no rule saying that a
student has to learn about all the uses of a keyword upon being first
introduced to the keyword.  A student would only inadvertently need to
learn about continuation passing via 'continue' if they misunderstood
the 'continue' statement while learning about loops, and somehow
triggered the interpreter's wraith that they didn't define their
continuation properly.  There's also no hard and fast rule that a
student would learn about complex loop control before they'd be
interested in continuation passing, so it works both ways.  'continue'
also currently has a very well defined, strict syntax.  This is very
important for any potential extensions, as it guarantees that this
proposal is completely backwards compatible.  The uses of the keyword
in the examples are currently syntax errors, meaning they can't be in
any current running code.  The reuse of a keyword already in the
language also guarantees there will be no naming collisions with older
code.

Using 'continue' in this way seems very 'Pythonic'.  It flows
naturally from the meaning of the keywords, and the extended syntax is
still rather simple – it must be a single callable.  There are no
unique rules or corner cases.  It also mimics other return-control
flow additions to the language, most notably the 'yield' construct,
used for generators and co-routines.  Yield is overloaded, in a sense,
to have two different yet similar purposes.  Overloading 'continue' in
this way seems to naturally flow from the evolution of the language.
Likewise, users are still free to intermix yields, returns and
continues in functions and still have a very well defined, easy to
understand behavior.

Some have voiced concerns.  One instance is the appearance of a
'continue' in a try/catch/finally block.

	try:
		continue func(*args)
	catch:
		#error handle

This appears to be an implementation problem, but only until you were
to expand the same block into an equivalent, old-style 'return code'
error handling scheme.

	err = continue func(*args)
	if(err):
		#error handle
	else:
		return err

In this form, it becomes apparent that this is an illegal use of
continue, as 'func' is not a tail-call!  Checking the result of a
tail-call and controlling logic via that doesn't make any sense, and
as such, we can without remorse or regret say 'no tail calls in
try/catch/finally' blocks.  Tail-calls are functional calls where
there is absolutely no more processing to be done in the current stack
frame.  'Clean up' logic or error handling in catch and finally blocks
indicates there is more processing to be done, and as such, there
simply is no well defined 'tail-call' in these circumstances.  They
ought to remain normal stack returns.

'Continue' can certainly propagate exceptions as well as return
values, but those would be caught at whichever stack level initiated
the tail-call loop.

Another concern is the fact that stack traces disappear with these
types of constructs.  The counter argument is that the construct that
they replace (in some instances) are loops, which themselves only have
limited 'stack' trace information.  If you crash in a loop, you don't
necessarily get the iteration of the loop you were on, or any
information from previous iterations.  That is all lost.  Likewise, a
naive implementation of the proposed 'continue' function would also
lose stack trace information – however, it will do so in a way one can
expect.  Just as if one were to implement a catch block that swallowed
every exception and reported nothing, one should be familiar with the
behavior changes in the stack that 'continue' would produce.
Thankfully, unlike some behind the scenes optimization, a 'continue'
keyword clearly shows where tail-calls are made in this proposal.  In
a more advanced implementation, the clear use of 'continue' would
allow the interpreter to annotate the stack trace, such that some
information could easily be provided about current and historical
program flow.

There are some alternative designs for this problems that have been
proposed.  SJBrown has proposed a similar syntax, using the two
keywords 'continue as' instead of simply 'continue'.  This would
further make clear to users unfamiliar with continuations that
behavior is expected to be different here, and the English still flows
elegantly.  It does, however, reuse the 'as' keyword in an unfamiliar
way.  Other proposals followed the same logic, but introduced new
keywords such as 'recurse' or 'tail'.  These avoid any potential
ambiguity with the 'continue' keyword, but are not backwards
compatible.  Furthermore, this proposal sees the actual English of the
word 'continue' to be a strength in this case.

Continuations and tail-calls are different, sometimes clearer, ways to
implement advanced control flow.  They allow the programmer a lot of
power in a small expressive and elegant package.  They are too
frequently hidden behind the scenes, though, as the programmer relies
on compiler identified 'optimization' spots.  Introducing explicit
continuations to Python would allow students learning programming and
advanced users both an expressive, yet very easy to learn and
familiar, construct.  Extending the 'continue' keyword is the best
candidate for this change, as its backwards compatible, parallels
other similar extensions like yield, mimics the current use of the
keyword, and is, in this author's opinion, 'Pythonic' ;)