[Python-Dev] Python VM

Jakob Sievers cadr4u at gmail.com
Mon Jul 21 17:58:20 CEST 2008


Hi,
I've been reading the Python VM sources over the last few afternoons and
I took some notes, which I thought I'd share (and if anyone familiar
with the VM internals could have a quick look at them, I'd really
appreciate it).

Cheers,
-jakob


Unless otherwise noted, the source file in question is Python/ceval.c.

Control Flow
============
The calling sequence is:
main() (in python.c) -> Py_Main() (main.c) -> PyRun_FooFlags() (pythonrun.c) ->
run_bar() (pythonrun.c) -> PyEval_EvalCode() (ceval.c) -> PyEval_EvalCodeEx()
(ceval.c) -> PyEval_EvalFrameEx() (ceval.c).

EvalCodeEx() does some initialization (creating a new execution frame,
argument processing, and some generator-specific stuff I haven't looked at yet)
before calling EvalFrameEx(), which contains the main interpreter loop.
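
As an illustration of how one enters this chain without going through
main()/Py_Main(), here is a minimal embedding program (a hedged sketch, not
part of the notes above); PyRun_SimpleString() goes through the
PyRun_*Flags()/run_mod() path in pythonrun.c and eventually reaches
PyEval_EvalCode():

#include <Python.h>

int
main(void)
{
    Py_Initialize();
    /* Compiles and runs the string in __main__; internally this goes
       through run_mod() -> PyEval_EvalCode() -> PyEval_EvalCodeEx(). */
    PyRun_SimpleString("print 'hello from the eval loop'");
    Py_Finalize();
    return 0;
}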


Threads
=======
PyEval_InitThreads() initializes the GIL (interpreter_lock) and sets
main_thread to the (threading-package-dependent) ID of the current thread.
Thread switching is done with PyThreadState_Swap(), which sets
_PyThreadState_Current (both are defined in pystate.c); PyThreadState_GET()
(pystate.h) is essentially an alias for _PyThreadState_Current.
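
For illustration, this is how a C extension typically releases and re-acquires
the GIL around blocking work (a minimal sketch using the public macros;
do_blocking_io() is a made-up placeholder):

#include <Python.h>

extern void do_blocking_io(void);   /* hypothetical blocking call */

static PyObject *
example(PyObject *self, PyObject *args)
{
    Py_BEGIN_ALLOW_THREADS   /* PyEval_SaveThread(): swap out the thread state
                                and release the GIL so other threads can run */
    do_blocking_io();
    Py_END_ALLOW_THREADS     /* PyEval_RestoreThread(): re-acquire the GIL and
                                swap the thread state back in */
    Py_RETURN_NONE;
}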


Async Callbacks
===============
Asynchronous callbacks can be registered by adding the function to be called
to pendingcalls[] (see Py_AddPendingCall()). The state of this queue is
communicated to the main loop via things_to_do.
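
As a hedged illustration (not part of the notes above), registering such a
callback from C looks like this; my_pending() is a made-up callback:

#include <Python.h>

/* Made-up callback: runs later, inside the main loop, with the GIL held.
   Must return 0 on success or -1 (with an exception set) on failure. */
static int
my_pending(void *arg)
{
    /* ... do something brief ... */
    return 0;
}

/* Registration, e.g. from a signal handler or another thread: */
static void
register_callback(void)
{
    Py_AddPendingCall(my_pending, NULL);
}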


State
=====
The global state is recorded in a per-interpreter PyInterpreterState struct
(normally there is just one interpreter per process) and a per-thread
PyThreadState struct.
Each execution frame's state is contained in that frame's PyFrameObject
(which includes the instruction stream, the environment (globals, locals,
builtins, etc.), the value stack and so forth).
EvalFrameEx()'s local variables are initialized from this frame object.
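
For reference, the frame fields alluded to here are roughly the following
(an abridged excerpt of PyFrameObject from Include/frameobject.h; the
comments are mine):

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;   /* previous frame, or NULL */
    PyCodeObject *f_code;    /* code object: instruction stream, consts, names */
    PyObject *f_builtins;    /* builtin symbol table */
    PyObject *f_globals;     /* global symbol table */
    PyObject *f_locals;      /* local symbol table (may be NULL) */
    PyObject **f_valuestack; /* base of the value stack */
    PyObject **f_stacktop;   /* next free slot on the value stack */
    /* ... trace function, block stack, fast locals, etc. ... */
} PyFrameObject;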


Instruction Stream
==================
The instruction stream looks as follows (cf. assemble_emit() in compile.c):
A byte stream where each instruction consists of either
1) a single byte opcode: OP
2) a single byte opcode plus a two-byte immediate argument (low byte first):
   OP LO HI
3) for arguments that don't fit in 16 bits, an EXTENDED_ARG prefix whose own
   two-byte argument holds the high 16 bits, followed by the real opcode with
   the low 16 bits: EXTENDED_ARG HI_LO HI_HI OP LO HI
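
A self-contained decoder sketch for this format (the opcode constants match
Include/opcode.h for 2.x; the function itself is just an illustration, not
interpreter code):

#include <stdio.h>

#define HAVE_ARGUMENT 90    /* opcodes >= this take a two-byte argument */
#define EXTENDED_ARG  143   /* prefix carrying the high 16 bits of an argument */
#define HAS_ARG(op)   ((op) >= HAVE_ARGUMENT)

static void
walk_bytecode(const unsigned char *code, int len)
{
    int i = 0;
    while (i < len) {
        int offset = i;
        int op = code[i++];
        long arg = -1;
        if (HAS_ARG(op)) {
            arg = code[i] | (code[i + 1] << 8);        /* LO HI */
            i += 2;
            if (op == EXTENDED_ARG) {                  /* arg = high 16 bits */
                op = code[i++];
                arg = (arg << 16) | code[i] | (code[i + 1] << 8);
                i += 2;
            }
        }
        printf("%4d: opcode %3d  arg %ld\n", offset, op, arg);
    }
}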


Opcode Prediction
=================
One nice trick used to speed up opcode dispatch is the following:
Using the macros PREDICT() and PREDICTED() it is sometimes possible
to jump directly to the code implementing the next instruction
rather than having to go through the whole loop preamble, e.g.

case FOO:
      // ...
      PREDICT(BAR);
      continue;

PREDICTED(BAR);
case BAR:
     // ...

expands to

case FOO:
      // ...
      if (*next_instr == BAR) goto PRED_BAR;
      continue;

PRED_BAR: next_instr++;
case BAR:
      // ...


Main Loop
=========

Variables and macros used in EvalFrameEx()
------------------------------------------
The value stack:
  PyObject **stack_pointer;
The instruction stream:
  unsigned char *next_instr;
NEXTOP(), NEXTARG(), PEEKARG(), JUMPTO(), and JUMPBY() simply fiddle with
next_instr; likewise, TOP(), SET_SECOND(), PUSH(), POP(), etc. fiddle with
stack_pointer (a sketch of these macros follows after this list).

Current opcode plus argument:
  int opcode;
  int oparg;

Error status:
  enum why_code why; // no, exn, exn re-raised, return, break, continue, yield
  int err;           // non-zero is error

The environment:
  PyObject *names;
  PyObject *consts;
and
  PyObject **fastlocals;
which is accessed via GETLOCAL() and SETLOCAL().

Finally, there are some more PyObject *'s (v, w, u, and so forth, used
as temporary variables) as well as
  PyObject *retval;
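
For concreteness, the stack and instruction-stream macros are roughly the
following (a simplified sketch; the real definitions in ceval.c add optional
tracing/statistics instrumentation):

#define NEXTOP()      (*next_instr++)
#define NEXTARG()     (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])
#define PEEKARG()     ((next_instr[2]<<8) + next_instr[1])
#define JUMPTO(x)     (next_instr = first_instr + (x))
#define JUMPBY(x)     (next_instr += (x))

#define TOP()         (stack_pointer[-1])
#define SECOND()      (stack_pointer[-2])
#define SET_TOP(v)    (stack_pointer[-1] = (v))
#define SET_SECOND(v) (stack_pointer[-2] = (v))
#define PUSH(v)       (*stack_pointer++ = (v))
#define POP()         (*--stack_pointer)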


Basic structure
---------------
EvalFrameEx() {
    why = WHY_NOT;
    err = 0;

    for (;;) {    <------------------+---+
        // do periodic tasks         |   |
                                     |   |
    fast_next_opcode:                |   |
        opcode = NEXTOP();           |   |
        if (HAS_ARG(opcode))         |   |
            oparg = NEXTARG();       |   |
                                     |   |
    dispatch_opcode:                 |   |
        switch(opcode) {             |   |
                                     |   |
        continue; -------------------+   |
                                         |
        break; ----------------------+   |
                                     |   |
        // Also, opcode prediction   |   |
        // jumps around inside the   |   |
        // switch statement          |   |
                                     |   |
        }    <-----------------------+   |
                                         |
    on_error:                            |
        // no error: continue -----------+
        // otherwise why == WHY_EXCEPTION after this

    fast_block_end:
        // unwind stacks if there was an error
    }

    // more unwinding

fast_yield:
    // reset current thread's exception info
exit_eval_frame:
    // set thread's execution frame to previous execution frame
    return retval;
}

Periodic Tasks
--------------
By checking and decrementing _Py_Ticker, the main loop executes certain tasks
once every _Py_CheckInterval iterations (in fact, Py_AddPendingCall() sets
_Py_Ticker to zero, ensuring that pending calls are executed right after the
next instruction that doesn't jump to fast_next_opcode):
  - If there are things_to_do, Py_MakePendingCalls() is called.
  - The GIL is released and re-acquired, giving other threads a chance to run.
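
Roughly, the check at the top of the loop looks like this (a simplified
sketch of the 2.x code; the SETUP_FINALLY special case and some error
handling are omitted):

if (--_Py_Ticker < 0) {
    _Py_Ticker = _Py_CheckInterval;
    if (things_to_do) {
        if (Py_MakePendingCalls() < 0) {
            why = WHY_EXCEPTION;
            goto on_error;
        }
    }
    if (interpreter_lock) {
        /* Give another thread a chance to run. */
        PyThreadState_Swap(NULL);
        PyEval_ReleaseLock();
        /* Other threads may run now. */
        PyEval_AcquireLock();
        if (PyThreadState_Swap(tstate) != NULL)
            Py_FatalError("ceval: orphan tstate");
    }
}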

Instruction implementation
--------------------------
Some general notes:
  - Successful instructions either goto fast_next_opcode or continue.
  - Unsuccessful instructions break out of the switch.
  - The value stack holds only PyObject *'s.
  - Instructions must take care to Py_INCREF() and Py_DECREF() the reference
    counts of PyObject's whose addresses have been pushed onto/popped off the
    value stack.
  - Objects are transferred onto the value stack by GETITEM()'ing them from
    consts or names, or by GETLOCAL()'ing them using oparg as an offset into
    fastlocals (cf. the LOAD_* instructions; see the sketch after this list).
  - err is used as a general `error occurred' flag, both inside the code
    implementing an opcode and `globally' for the entire loop.
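
To make this concrete, the LOAD_CONST case looks roughly like this (a lightly
simplified sketch of the 2.x code):

case LOAD_CONST:
    x = GETITEM(consts, oparg);   /* borrowed reference from co_consts */
    Py_INCREF(x);                 /* the value stack now owns a reference */
    PUSH(x);
    goto fast_next_opcode;        /* success: skip the loop preamble */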

Nested blocks
-------------
Nested loop and try blocks are handled as follows:
Each frame maintains a block stack; when entering a nested block, a SETUP_*
instruction adds a PyTryBlock to the PyFrameObject's f_blockstack, recording
that block's type (the instruction that created the block), handler (an
offset into the instruction stream) and level (the value stack level before
the nested block was entered).
When such blocks are exited normally (e.g. last iteration of a loop), the final
POP_BLOCK instruction restores the value stack to the state it was in before
the block.
If a block is exited abnormally (e.g. a break instruction), the code following
fast_block_end unwinds the value stack and jumps to the block's handler.
Certain instructions, e.g. RETURN_VALUE, cause the entire block stack to be
unwound (leading to multiple unwinds of the value stack).
See also the comments in compile.c (compiler_try_finally()) which include nice
ASCII art diagrams (incidentally, there's a typo on line 2382: s/en/an/).
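
For reference, the block-stack entry is the small struct below (roughly as in
Include/frameobject.h; it is pushed by PyFrame_BlockSetup() and popped by
PyFrame_BlockPop()):

typedef struct {
    int b_type;     /* the SETUP_* opcode that created the block */
    int b_handler;  /* bytecode offset of the handler to jump to */
    int b_level;    /* value stack level to pop back to */
} PyTryBlock;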

Error handling
--------------
Internal errors (a bad oparg, say) generally result in why being set to
WHY_EXCEPTION and a break out of the switch (if the code implementing the
instruction doesn't set why, the code after on_error will).
The code following fast_block_end then jumps to the appropriate exception
handler and sets the global exception-related variables (exception
information is stored in the current execution frame and in the thread
state; see set_exc_info() and reset_exc_info()).
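
The why codes mentioned above are (roughly, from ceval.c; the explicit
numeric values are elided here):

enum why_code {
    WHY_NOT,        /* No error */
    WHY_EXCEPTION,  /* Exception occurred */
    WHY_RERAISE,    /* Exception re-raised by 'finally' */
    WHY_RETURN,     /* 'return' statement */
    WHY_BREAK,      /* 'break' statement */
    WHY_CONTINUE,   /* 'continue' statement */
    WHY_YIELD       /* 'yield' operator */
};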

// eof

