[Python-Dev] Can someone explain the fast_block_end manipulation?

Thu Jan 8 05:30:11 CET 2009

Everybody seems to be doing stuff with the virtual machine all of a sudden.
I thought I would get in on the fun.  I am generating C functions from the
bytecode, pretty much just by inlining the C code that implements each
opcode.  The idea is to generate a C function that looks like a small
version of PyEval_EvalFrameEx but without the for loop and switch statement.
Instead it just contains the C code implementing the opcodes used in that
function.  For example, this function

    >>> def f(a):
    ...   for i in range(a):
    ...     x = i*i

disassembles to this:

  2           0 SETUP_LOOP              30 (to 33) 
              3 LOAD_GLOBAL              0 (range) 
              6 LOAD_FAST                0 (a) 
              9 CALL_FUNCTION            1 
             12 GET_ITER             
        >>   13 FOR_ITER                16 (to 32) 
             16 STORE_FAST               1 (i) 

  3          19 LOAD_FAST                1 (i) 
             22 LOAD_FAST                1 (i) 
             25 BINARY_MULTIPLY      
             26 STORE_FAST               2 (x) 
             29 JUMP_ABSOLUTE           13 
        >>   32 POP_BLOCK            
        >>   33 LOAD_CONST               0 (None) 
             36 RETURN_VALUE         

and compiles to this

    #include "opcode_mini.h"

    PyObject *
    _PyEval_EvalMiniFrameEx(PyFrameObject *f, int throwflag)
    {

            static int minime = 1;
            static int jitting = 1;

            /* most of the stuff at the start of PyEval_EvalFrameEx */
            PyEval_EvalFrameEx_PROLOG();

            /* code length=37 */
            /* nlabels=3, offsets: 13, 32, 33, */

            oparg = 30;
            SETUP_LOOP_IMPL(oparg); /* 0 */
            oparg = 0;
            LOAD_GLOBAL_IMPL(oparg, 0); /* 3 */
            oparg = 0;
            LOAD_FAST_IMPL(oparg); /* 6 */
            oparg = 1;
            CALL_FUNCTION_IMPL(oparg); /* 9 */
            GET_ITER_IMPL(); /* 12 */
    __L13:
            FOR_ITER_IMPL(__L32);
            oparg = 1;
            STORE_FAST_IMPL(oparg); /* 16 */
            oparg = 1;
            LOAD_FAST_IMPL(oparg); /* 19 */
            oparg = 1;
            LOAD_FAST_IMPL(oparg); /* 22 */
            BINARY_MULTIPLY_IMPL(); /* 25 */
            oparg = 2;
            STORE_FAST_IMPL(oparg); /* 26 */
            goto __L13;
    __L32:
            POP_BLOCK_IMPL(); /* 32 */
    __L33:
            oparg = 0;
            LOAD_CONST_IMPL(oparg); /* 33 */
            RETURN_VALUE_IMPL(); /* 36 */

            /* most of the stuff at the end of PyEval_EvalFrameEx */
            PyEval_EvalFrameEx_EPILOG();
    }

Besides eliminating opcode decoding, I figure it might give the compiler lots
of optimization opportunities.  Time will tell, though.
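
A minimal sketch of the translation step, assuming the generator walks the
bytecode with Python's dis module (an assumption; the real generator is not
shown in this message).  The emit_c_body helper and the exact layout of the
macro calls are hypothetical, and opcode names and offsets depend on the
CPython version, so a newer interpreter will not reproduce the listing above
exactly.

```python
import dis


def f(a):
    for i in range(a):
        x = i * i


def emit_c_body(func):
    """Return C-like source lines, one *_IMPL macro call per instruction.

    Hypothetical helper: it only mirrors the layout of the generated
    function shown above and does not produce compilable C by itself.
    """
    jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)
    lines = []
    for ins in dis.get_instructions(func):
        if ins.is_jump_target:
            # Every jump target becomes a C label named after its offset.
            lines.append("__L%d:" % ins.offset)
        if ins.opcode in jump_opcodes:
            # For jump opcodes, argval is the resolved target offset.
            lines.append("        %s_IMPL(__L%d); /* %d */"
                         % (ins.opname, ins.argval, ins.offset))
        elif ins.arg is not None:
            lines.append("        oparg = %d;" % ins.arg)
            lines.append("        %s_IMPL(oparg); /* %d */"
                         % (ins.opname, ins.offset))
        else:
            lines.append("        %s_IMPL(); /* %d */"
                         % (ins.opname, ins.offset))
    return lines


if __name__ == "__main__":
    print("\n".join(emit_c_body(f)))
```

Unlike the listing above, this sketch keeps unconditional jumps as macro
calls that take a label rather than emitting a bare goto.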

I have just about everything implemented but I'm a bit stuck trying to
figure out how to deal with the block manipulation code in
PyEval_EvalFrameEx after the fast_block_end label.  JUMP* opcodes in the
interpreter turn into gotos in the generated code.  It seems I will have to
replace any JUMP instructions in the epilog with computed gotos.  In
particular, I am a little confused by this construct:

                        if (b->b_type == SETUP_LOOP && why == WHY_CONTINUE) {
                                /* For a continue inside a try block,
                                   don't pop the block for the loop. */
                                PyFrame_BlockSetup(f, b->b_type,
                                                   b->b_handler,
                                                   b->b_level);
                                why = WHY_NOT;
                                JUMPTO(PyLong_AS_LONG(retval));
                                Py_DECREF(retval);
                                break;
                        }

The top of the stack has been popped into retval.  I think that value may
have been pushed here:

                        if (b->b_type == SETUP_FINALLY) {
                                if (why & (WHY_RETURN | WHY_CONTINUE))
                                        PUSH(retval);
                                PUSH(PyLong_FromLong((long)why));
                                why = WHY_NOT;
                                JUMPTO(b->b_handler);
                                break;
                        }

but I'm confused.  I don't see anywhere obvious where a value resembling a
jump offset or jump target was pushed onto the stack.  What's with that
first JUMPTO in the SETUP_LOOP/WHY_CONTINUE code?  Is the stack/block
cleanup code documented anywhere?  Wiki?  Pointers to python-dev threads?
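
One way to trace the question is to model the two fragments in a few lines
of Python.  The sketch rests on two assumptions that should be checked
against ceval.c: that CONTINUE_LOOP stores its target offset in retval and
sets why to WHY_CONTINUE before jumping to fast_block_end, and that
END_FINALLY pops the saved why (and, for WHY_RETURN or WHY_CONTINUE, the
saved retval) and then runs the same unwinding code.  The Frame class, the
function names and the offsets 13, 30 and 33 are hypothetical.

```python
WHY_NOT, WHY_RETURN, WHY_CONTINUE = 0x01, 0x08, 0x20
SETUP_LOOP, SETUP_FINALLY = "SETUP_LOOP", "SETUP_FINALLY"


class Frame:
    """Toy stand-in for the parts of a frame the unwinding code touches."""

    def __init__(self):
        self.stack = []       # value stack
        self.blocks = []      # block stack of (b_type, b_handler, b_level)
        self.next_instr = 0   # what JUMPTO() would set


def unwind(frame, why, retval):
    """Model of the loop after fast_block_end, for the two quoted cases."""
    while why != WHY_NOT and frame.blocks:
        b_type, b_handler, b_level = frame.blocks.pop()
        if b_type == SETUP_LOOP and why == WHY_CONTINUE:
            # Keep the loop block and jump to the offset held in retval.
            frame.blocks.append((b_type, b_handler, b_level))
            why = WHY_NOT
            frame.next_instr = retval
            break
        # Assumption: values above the block's level are discarded here.
        del frame.stack[b_level:]
        if b_type == SETUP_FINALLY:
            if why & (WHY_RETURN | WHY_CONTINUE):
                frame.stack.append(retval)
            frame.stack.append(why)
            why = WHY_NOT
            frame.next_instr = b_handler
            break
    return why


def continue_loop(frame, target):
    # Assumed CONTINUE_LOOP behaviour: retval = target, why = WHY_CONTINUE.
    return unwind(frame, WHY_CONTINUE, target)


def end_finally(frame):
    # Assumed END_FINALLY behaviour: restore why (and retval), unwind again.
    why = frame.stack.pop()
    retval = None
    if why & (WHY_RETURN | WHY_CONTINUE):
        retval = frame.stack.pop()
    return unwind(frame, why, retval)


if __name__ == "__main__":
    frame = Frame()
    frame.stack = ["iterator"]
    frame.blocks = [(SETUP_LOOP, 33, 0), (SETUP_FINALLY, 30, 1)]
    continue_loop(frame, 13)
    print(frame.next_instr, frame.stack)   # now in the finally handler
    end_finally(frame)
    print(frame.next_instr, frame.stack)   # back at the loop header
```

If those assumptions hold, the offset consumed by the first JUMPTO is the
continue target, which travelled through the finally handler on the value
stack underneath the saved why code.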

I found this brief thread from last July:

    http://mail.python.org/pipermail/python-dev/2008-July/thread.html#81480

An svn annotate suggests that much of the fun in this code began with a
checkin by Jeremy Hylton (r19260).  It references an old SF patch (102989)
but I can't locate that in the current issue tracker to read the
discussion.  Is there some way I can retrieve that?  The obvious

    http://bugs.python.org/issue102989

didn't work for me.

Thx,

Skip