
The recursion test in pybench is testing the performance of the nested scopes changes, which must do some extra bookkeeping to reference the recursive function in a nested scope. To some extent, a performance hit is a necessary consequence for nested functions with free variables. Nonetheless, there are two interesting things to say about this situation. First, there is a bug in the current implementation of nested scopes that the benchmark tickles. The problem is with code like this: def outer(): global f def f(x): if x > 0: return f(x - 1) The compiler determines that f is free in f. (It's recursive.) If f is free in f, in the absence of the global decl, the body of outer must allocate fresh storage (a cell) for f each time outer is called and add a reference to that cell to f's closure. If f is declared global in outer, then it ought to be treated as a global in nested scopes, too. In general terms, a free variable should use the binding found in the nearest enclosing scope. If the nearest enclosing scope has a global binding, then the reference is global. If I fix this problem, the recursion benchmark shouldn't be any slower than a normal function call. The second interesting thing to say is that frame allocation and dealloc is probably more expensive than it needs to be in the current implementation. The frame object has a new f_closure slot that holds a tuple that is freshly allocated every time the frame is allocated. (Unless the closure is empty, then f_closure is just NULL.) The extra tuple allocation can probably be done away with by using the same allocation strategy as locals & stack. If the f_localsplus array holds cells + frees + locals + stack, then a new frame will never require more than a single malloc (and often not even that). Jeremy