[pypy-svn] r68414 - pypy/trunk/pypy/doc/jit

benjamin at codespeak.net benjamin at codespeak.net
Wed Oct 14 04:57:03 CEST 2009


Author: benjamin
Date: Wed Oct 14 04:57:02 2009
New Revision: 68414

Modified:
   pypy/trunk/pypy/doc/jit/pyjitpl5.txt
Log:
start writing some actual documentation on JIT internals

Modified: pypy/trunk/pypy/doc/jit/pyjitpl5.txt
==============================================================================
--- pypy/trunk/pypy/doc/jit/pyjitpl5.txt	(original)
+++ pypy/trunk/pypy/doc/jit/pyjitpl5.txt	Wed Oct 14 04:57:02 2009
@@ -1,9 +1,119 @@
-========================================================================
-                              PyJitPl5
-========================================================================
+==========
+ PyJitPl5
+==========
 
-The documentation about the current JIT is available as a
-first published article:
+This document describes the fifth generation of PyPy's JIT.
+
+
+Implementation of the JIT
+=========================
+
+The JIT's `theory`_ is great in principle, but actual code is a different
+story. This section tries to give a high level overview of how PyPy's JIT is
+implemented.  It's helpful to have a basic understanding of the PyPy
+`translation tool chain`_ before digging into the sources.
+
+Almost all JIT-specific code is found in two pypy/jit subdirectories:
+metainterp and backend.  The metainterp directory holds platform-independent
+code including the translation generator, the tracer, and the optimizer.  Code
+in the backend directory is responsible for generating machine code.
+
+.. _`theory`: overview.html
+.. _`translation tool chain`: ../translation.html
+
+
+JIT hints
+---------
+
+To add a JIT to an interpreter, PyPy only requires that two hints be added to
+the interpreter source.  These are jit_merge_point and can_enter_jit.
+jit_merge_point goes at the start of opcode dispatch.  It allows the JIT to
+bail back to the interpreter in case assembler code fails at some point.
+can_enter_jit goes at the close of an application-level loop.  In the
+Python interpreter, this is the JUMP_ABSOLUTE bytecode.  The Python interpreter
+defines its hints in pypy/module/pypyjit/interp_jit.py.
+
+An interpreter wishing to use PyPy's JIT must define a list of *green*
+variables and a list of *red* variables.  The *green* variables are loop
+constants.  They are used to identify the current loop.  Red variables are for
+everything else used in the execution loop.  For example, the Python interpreter
+passes the code object and the instruction pointer as greens and the frame
+object and execution context as reds.  These objects are passed to the JIT at
+the location of the JIT hints.
+
+
+JIT Generation
+--------------
+
+After the RTyping phase of translation, where high level Python operations are
+turned into low-level ones for the backend, the translator calls apply_jit() in
+metainterp/warmspot.py to add a JIT compiler to the currently translating
+interpreter.  apply_jit() decides what assembler backend to use, then delegates
+the rest of the work to the WarmRunnerDesc class.  WarmRunnerDesc finds the two
+JIT hints in the function graphs.  It rewrites the graph containing the
+jit_merge_point hint, called the portal graph, to be able to handle special JIT
+exceptions, which indicate special operations to the interpreter.  The location
+of the can_enter_jit hint is changed to check if the current loop is "hot" and
+should be compiled.
+
+Next, starting with the portal graph, metainterp/codewriter.py converts the
+graphs of the interpreter into JIT bytecode.  Since this bytecode is stored in
+the final binary, it's designed to be concise rather than fast.  The
+codewriter doesn't "see" every part of the interpreter; what it sees is defined
+by the JIT's policy.  For code it doesn't see, it simply inserts an opaque call.
+
+Finally, translation finishes, including the interpreter's bytecode in the
+final binary, and the interpreter is ready to use the JIT's runtime component!
+
+
+Tracing
+-------
+
+Application code running on the JIT-enabled interpreter starts normally; it is
+interpreted on top of the usual evaluation loop.  When an application loop is
+closed (where the can_enter_jit hint was), the interpreter calls the
+maybe_compile_and_run() method of WarmEnterState.  This method increments a
+counter associated with the current green variables.  When this counter reaches
+a certain level, usually indicating the application loop has been run many
+times, the JIT enters tracing mode.
+
+*Tracing* is where the JIT interprets the bytecode, generated at translation
+time, of the interpreter interpreting the application level code.  This allows
+it to see the exact operations that make up the application level loop.  Tracing
+is performed by the MetaInterp and MIFrame classes in metainterp/pyjitpl.py.
+maybe_compile_and_run() creates a MetaInterp and calls the
+compile_and_run_once() method.  This initializes the MIFrame for the input
+arguments of the loop, the red and green variables passed from the
+jit_merge_point hint, and sets it to start interpreting the bytecode of the
+portal graph.
+
+Before starting the interpretation, the loop input arguments are wrapped in a
+*box*.  Boxes (defined in metainterp/history.py) wrap the value and type of a
+variable in the program the JIT is interpreting.  There are two main varieties
+of boxes: constant boxes and normal boxes.  Constant boxes are used for values
+assumed to be known during tracing.  These are not necessarily compile time
+constants.  All values which are "promoted", assumed to be constant by the JIT
+for optimization purposes, are also stored in constant boxes.  Normal boxes
+contain values that may change during the running of a loop.  There are three
+kinds of normal boxes: BoxInt, BoxPtr, and BoxFloat, and four kinds of constant
+boxes: ConstInt, ConstPtr, ConstFloat, and ConstAddr.
+
+The meta-interpreter starts interpreting the JIT bytecode.  Each operation is
+executed and then recorded in a list of operations and arguments called the
+trace.  All possible operations generated by tracing are listed in
+metainterp/resoperation.py.  When an (interpreter-level) call to a function the
+JIT has bytecode for occurs in the bytecode, another frame is added to the stack
+and the tracing continues with the same list.  This flattens the list of
+operations.  Interpretation continues until the can_enter_jit hint is seen.  At
+this point, a whole iteration of the application level loop has been seen and
+recorded.
+
+
+More resources
+==============
+
+More documentation about the current JIT is available as a first published
+article:
 
 * `Tracing the Meta-Level: PyPy's Tracing JIT Compiler`__
 
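Editorial sketch (not part of the commit above): the "Tracing" section of the
patch says that each closed application loop increments a counter associated
with the current green variables, and that tracing starts once the counter
reaches a certain level. The toy model below illustrates only that idea. The
class name, the threshold value, and the reset-after-trigger behaviour are
assumptions made for illustration; they are not taken from
metainterp/warmspot.py.

```python
from collections import defaultdict


class HotLoopCounter:
    """Toy model of per-loop hotness counting (hypothetical, not PyPy code)."""

    def __init__(self, threshold=3):
        # Assumed threshold; the patch only says "a certain level".
        self.threshold = threshold
        # One counter per tuple of green values, since greens identify a loop.
        self.counters = defaultdict(int)

    def can_enter_jit(self, greens):
        """Record one closed loop; return True when tracing should start."""
        self.counters[greens] += 1
        if self.counters[greens] >= self.threshold:
            # Assumption: reset so the same loop does not retrigger at once.
            self.counters[greens] = 0
            return True
        return False
```

Using a tuple such as (code object, instruction pointer) as the key mirrors the
patch's statement that the green variables identify the current loop, so two
loops in the same code object are counted separately.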


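Editorial sketch (not part of the commit above): the patch describes boxes that
wrap values seen by the meta-interpreter, and a trace built by executing each
operation and then recording it. The minimal model below shows that
execute-then-record pattern. The classes, the method name execute_and_record,
and the operation names are hypothetical simplifications; the real definitions
live in metainterp/history.py and metainterp/resoperation.py and are richer
(for example, boxes there also carry a type).

```python
import operator


class Const:
    """Value assumed to be known during tracing (simplified, hypothetical)."""

    def __init__(self, value):
        self.value = value


class Box:
    """Value that may change while the loop runs (simplified, hypothetical)."""

    def __init__(self, value):
        self.value = value


class Tracer:
    """Executes operations on boxed values and records them in one flat list."""

    def __init__(self):
        self.trace = []

    def execute_and_record(self, opname, func, *args):
        # Execute the operation on the concrete values first...
        result = Box(func(*[arg.value for arg in args]))
        # ...then record it, so the trace reflects what actually happened.
        self.trace.append((opname, args, result))
        return result


tracer = Tracer()
i = Box(5)
tmp = tracer.execute_and_record("int_add", operator.add, i, Const(1))
res = tracer.execute_and_record("int_mul", operator.mul, tmp, Const(2))
```

Because every operation is appended to the same list regardless of which
interpreter-level function it came from, calls that the tracer follows end up
inlined, which is the flattening the patch mentions.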
