[pypy-svn] r68414 - pypy/trunk/pypy/doc/jit
benjamin at codespeak.net
Wed Oct 14 04:57:03 CEST 2009
Author: benjamin
Date: Wed Oct 14 04:57:02 2009
New Revision: 68414
Modified:
pypy/trunk/pypy/doc/jit/pyjitpl5.txt
Log:
start writing some actual documentation on JIT internals
Modified: pypy/trunk/pypy/doc/jit/pyjitpl5.txt
==============================================================================
--- pypy/trunk/pypy/doc/jit/pyjitpl5.txt (original)
+++ pypy/trunk/pypy/doc/jit/pyjitpl5.txt Wed Oct 14 04:57:02 2009
@@ -1,9 +1,119 @@
-========================================================================
- PyJitPl5
-========================================================================
+==========
+ PyJitPl5
+==========
-The documentation about the current JIT is available as a
-first published article:
+This document describes the fifth generation of PyPy's JIT.
+
+
+Implementation of the JIT
+=========================
+
+The JIT's `theory`_ is great in principle, but the actual code is a different
+story. This section gives a high-level overview of how PyPy's JIT is
+implemented. It's helpful to have a basic understanding of the PyPy
+`translation tool chain`_ before digging into the sources.
+
+Almost all JIT-specific code is found in two pypy/jit subdirectories:
+metainterp and backend. The metainterp directory holds platform-independent
+code including the translation generator, the tracer, and the optimizer. Code
+in the backend directory is responsible for generating machine code.
+
+.. _`theory`: overview.html
+.. _`translation tool chain`: ../translation.html
+
+
+JIT hints
+---------
+
+To add a JIT to an interpreter, PyPy only requires that two hints be added to
+the interpreter source. These are jit_merge_point and can_enter_jit.
+jit_merge_point is supposed to go at the start of opcode dispatch. It allows
+the JIT to bail back to the interpreter in case assembler code fails at some
+point. can_enter_jit goes at the close of an application-level loop. In the
+Python interpreter, this is the JUMP_ABSOLUTE bytecode. The Python interpreter
+defines its hints in pypy/module/pypyjit/interp_jit.py.
+
+An interpreter wishing to use PyPy's JIT must define a list of *green*
+variables and a list of *red* variables. The *green* variables are loop
+constants. They are used to identify the current loop. Red variables are for
+everything else used in the execution loop. For example, the Python interpreter
+passes the code object and the instruction pointer as greens and the frame
+object and execution context as reds. These objects are passed to the JIT at
+the location of the JIT hints.
+
+
+JIT Generation
+--------------
+
+After the RTyping phase of translation, where high-level Python operations are
+turned into low-level ones for the backend, the translator calls apply_jit() in
+metainterp/warmspot.py to add a JIT compiler to the currently translating
+interpreter. apply_jit() decides which assembler backend to use, then delegates
+the rest of the work to the WarmRunnerDesc class. WarmRunnerDesc finds the two
+JIT hints in the function graphs. It rewrites the graph containing the
+jit_merge_point hint, called the portal graph, to be able to handle special JIT
+exceptions, which indicate special operations to the interpreter. The location
+of the can_enter_jit hint is changed to check if the current loop is "hot" and
+should be compiled.
+
+Next, starting with the portal graph, metainterp/codewriter.py converts the
+graphs of the interpreter into JIT bytecode. Since this bytecode is stored in
+the final binary, it's designed to be concise rather than fast. The codewriter
+doesn't "see" every part of the interpreter; what it sees is defined by the
+JIT's policy. For the parts it cannot see, it simply inserts an opaque call.
+
+Finally, translation finishes, including the bytecode of the interpreter in the
+final binary, and the interpreter is ready to use the runtime component of the JIT!
+
+
+Tracing
+-------
+
+Application code running on the JIT-enabled interpreter starts normally; it is
+interpreted on top of the usual evaluation loop. When an application loop is
+closed (where the can_enter_jit hint was), the interpreter calls the
+maybe_compile_and_run() method of WarmEnterState. This method increments a
+counter associated with the current green variables. When this counter reaches
+a certain level, usually indicating the application loop has been run many
+times, the JIT enters tracing mode.
+
+*Tracing* is where the JIT interprets the bytecode, generated at translation
+time, of the interpreter interpreting the application-level code. This allows
+it to see the exact operations that make up the application-level loop. Tracing
+is performed by the MetaInterp and MIFrame classes in metainterp/pyjitpl.py.
+maybe_compile_and_run() creates a MetaInterp and calls the
+compile_and_run_once() method. This initializes the MIFrame for the input
+arguments of the loop, the red and green variables passed from the
+jit_merge_point hint, and sets it to start interpreting the bytecode of the
+portal graph.
+
+Before starting the interpretation, the loop input arguments are wrapped in a
+*box*. Boxes (defined in metainterp/history.py) wrap the value and type of a
+variable in the program the JIT is interpreting. There are two main varieties
+of boxes: constant boxes and normal boxes. Constant boxes are used for values
+assumed to be known during tracing. These are not necessarily compile time
+constants. All values which are "promoted", assumed to be constant by the JIT
+for optimization purposes, are also stored in constant boxes. Normal boxes
+contain values that may change during the running of a loop. There are three
+kinds of normal boxes: BoxInt, BoxPtr, and BoxFloat, and four kinds of constant
+boxes: ConstInt, ConstPtr, ConstFloat, and ConstAddr.
+
+The meta-interpreter starts interpreting the JIT bytecode. Each operation is
+executed and then recorded in a list of operations and arguments called the
+trace. All possible operations generated by tracing are listed in
+metainterp/resoperation.py. When an (interpreter-level) call to a function the
+JIT has bytecode for occurs in the bytecode, another frame is added to the stack
+and the tracing continues with the same list. This flattens the list of
+operations. Interpretation continues until the can_enter_jit hint is seen. At
+this point, a whole iteration of the application level loop has been seen and
+recorded.
+
+
+More resources
+==============
+
+More documentation about the current JIT is available as a first published
+article:
* `Tracing the Meta-Level: PyPy's Tracing JIT Compiler`__
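
As a rough illustration of the hot-loop counter and the trace recording
described in the Tracing section of the diff above, here is a toy Python
sketch. It is not PyPy's code: ToyWarmState, Box, Const, record_add, the
"int_add" operation name, and the threshold of 3 are all invented for this
example. The only ideas taken from the text are that a counter is kept per set
of green variables and that executed operations are appended to a flat list.

```python
# Toy sketch only: every name here is hypothetical, not PyPy's real API.
from collections import defaultdict

HOT_THRESHOLD = 3  # assumed value, chosen only for illustration


class ToyWarmState:
    """Counts how often each loop (identified by its greens) is closed."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counters = defaultdict(int)

    def maybe_trace(self, greens):
        """Bump the counter for `greens`; return True once the loop is hot."""
        self.counters[greens] += 1
        return self.counters[greens] >= self.threshold


class Box:
    """Wraps a value that may change while the loop runs."""

    def __init__(self, value):
        self.value = value


class Const(Box):
    """Wraps a value assumed to be known during tracing."""


def record_add(trace, a, b):
    """Execute an addition, then record it in the trace."""
    result = Box(a.value + b.value)
    trace.append(("int_add", a, b, result))
    return result


state = ToyWarmState()
greens = ("code_object_1", 12)  # stand-ins for (code object, instruction pointer)
hot = [state.maybe_trace(greens) for _ in range(4)]

trace = []
i = Box(5)
j = record_add(trace, i, Const(1))
```

Keying the counter on the green variables means two different application
loops never share a counter, which matches the statement in the text that the
greens identify the current loop.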