[Python-Dev] Proper tail recursion

Christopher T King squirrel at WPI.EDU
Wed Jul 14 20:36:51 CEST 2004

JanC recommended I post this on python-dev to get feedback.  To sum up the 
previous posts in my thread on comp.lang.python, I've created a patch that 
optimizes tail calls in the CPython interpreter, so that the stack is not 
used for functions called in a tail context.

The current patch is only a partial implementation, and might have some
memory leaks with regards to exceptions (I haven't compared this to
vanilla 2.4a1's behaviour yet), but works for simple test cases.  If such
behaviour is deemed desirable, I'll finish the patch, clean up the code,
and work out any bugs.

The relevant portion of the original message follows.


I have a preliminary implementation ready, the patch is against 2.4a1: 

This patch only works for simple functions (i.e. those that can be 
handled by fast_function), but extension to other function types should 
be trivial. I haven't fully tested this with regards to exception 
handling and whatnot, so I want to stick with a simple case for now.

The patch works roughly as follows:
1. When a CALL_FUNCTION opcode is encountered, check if the following 
opcode is RETURN_VALUE. If so, execute the tail call code.
2. The tail call code calls a modified version of call_function that 
sets up and returns a frame object for the function to be called (if 
possible), rather than recursively calling PyEval_EvalFrame. This frame 
is stored in a temporary variable, and a RETURN_VALUE is simulated to 
exit the loop.
3. After cleaning up the current frame, PyEval_EvalFrame loops back up 
to the top, now using the temporarily stored frame as the current frame.

Of course, instead of a loop, gcc's tail-call optimization feature could 
be used, but this would be non-portable to other compilers.

An example of the patch in action:

# Note default arguments aren't supported in the current patch
def fact2(n,v):
     if n:
         return fact2(n-1,v*n)
         return v

def fact(n):
     return fact2(n,1)

Without the patch:
 >>> fact(10000)
RuntimeError: maximum recursion depth exceeded

With the patch:
 >>> fact(10000)
<really really huge number>

Any feedback would be greatly appreciated!

More information about the Python-Dev mailing list