On 11/10/12 7:50 AM, Devin Jeanpierre wrote:
On Sat, Nov 10, 2012 at 6:57 AM, Laurens Van Houtven <_@lvh.cc> wrote:
The README suggests that doing this optimization on a bytecode level may be
better than doing it on a source/AST level. Can you explain why?
[...]

In particular, there's a directly analogous CPS form for bytecode: every
instruction in a bytecode string is considered equivalent to a CPS
function, continuations represent returns from functions rather than
expression evaluation, and the CPS functions navigate through the
bytecode stream using tail calls while keeping track of the expression
evaluation stack, the call stack, and the globals dict.
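To make that concrete, here is a minimal sketch of the idea (the opcode names and machine shape are made up for illustration, not CPython's): each instruction is a function that takes the machine state and returns a thunk for the next instruction. A trampoline stands in for real tail calls, since CPython doesn't optimize them.

```python
# Sketch: bytecode viewed as CPS.  Each op returns the "continuation"
# (a thunk for the next instruction) instead of falling through.

def op_load_const(value):
    def step(pc, stack, code):
        stack.append(value)
        return lambda: run(pc + 1, stack, code)   # "tail call" to next op
    return step

def op_binary_eq():
    def step(pc, stack, code):
        b, a = stack.pop(), stack.pop()
        stack.append(a == b)
        return lambda: run(pc + 1, stack, code)
    return step

def op_halt():
    def step(pc, stack, code):
        return None  # no continuation: stop
    return step

def run(pc, stack, code):
    return code[pc](pc, stack, code)

def trampoline(code):
    stack = []
    thunk = lambda: run(0, stack, code)
    while thunk is not None:
        thunk = thunk()
    return stack

# Evaluate "1 == 1" as a tiny bytecode program.
result = trampoline([op_load_const(1), op_load_const(1),
                     op_binary_eq(), op_halt()])
# result[-1] is True
```

A real version would also thread a call stack and a globals dict through `step`, as described above.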

Right... looking at an example output from the transform:

    v13 = n
    v14 = 1
    v4 = v13 == v14
    if v4:
       ... 

Viewed in CPS this might look like:

(VAR, [v13, (INT, [v14, ((EQ, v13, v14), TEST, ...)])])

Where each node is (EXP, CONT).  In this case the result of each operation is put into a variable/register (e.g., 'v13'), but Python's bytecodes actually operate on the frame's value stack.  So if there were some way to change this to

(VAR, [PUSH, (INT, [PUSH, ((EQ 0 1), TEST, ...)])])

Where (EQ 0 1) means 'apply EQ to the top two items on the stack'.

The code above puts each value into a local variable, which gets pushed onto the stack anyway by the compiled bytecode.
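For illustration, here is a toy evaluator for the stack-style shape sketched above (the node tags and nesting are made up for this example, not the transform's actual output): each node is (OP, CONT), operands are pushed onto a stack, and EQ consumes the top two entries.

```python
# Toy interpreter for a stack-shaped CPS tree.  Node shapes (invented
# here): ('VAR', name) / ('INT', n) push a value; 'EQ' pops two and
# pushes a comparison; ('RESULT',) terminates and yields the top value.

def eval_cps(node, env, stack):
    op = node[0]
    if op == 'RESULT':
        return stack.pop()
    if op[0] == 'VAR':              # push a variable's value
        stack.append(env[op[1]])
    elif op[0] == 'INT':            # push a literal
        stack.append(op[1])
    elif op == 'EQ':                # pop two, push the comparison
        b, a = stack.pop(), stack.pop()
        stack.append(a == b)
    return eval_cps(node[1], env, stack)

# The "n == 1" test from the example output, with n bound to 1.
tree = (('VAR', 'n'), (('INT', 1), ('EQ', ('RESULT',))))
print(eval_cps(tree, {'n': 1}, []))
```

The point is that no intermediate v13/v14 registers appear anywhere: the values live only on the stack, matching what the compiled bytecode does anyway.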

Another advantage to generating bytecode directly would be support for Python 2, since I think 'nonlocal' can be done at the bytecode level.
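For context on that last point: Python 2 has no 'nonlocal' keyword, but the closure-cell opcodes (LOAD_DEREF/STORE_DEREF) do exist in its bytecode, so generated bytecode could write to an enclosing scope directly. At the source level the usual stand-in is a mutable cell, as in this sketch (which runs on both 2 and 3):

```python
# Emulating 'nonlocal' without the keyword: a one-element list acts as
# a writable cell shared between the enclosing and inner function.

def make_counter():
    count = [0]          # mutable cell in the enclosing scope
    def tick():
        count[0] += 1    # mutates the cell's contents; no rebinding,
        return count[0]  # so no 'nonlocal' declaration is needed
    return tick

tick = make_counter()
tick(); tick()
# the shared cell now holds 2
```

Bytecode generated directly wouldn't need this workaround, since it could emit the cell-store instructions the Python 2 compiler refuses to produce from source.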

-Sam