antocuni at codespeak.net
Tue Mar 31 16:36:21 CEST 2009

Author: antocuni
Date: Tue Mar 31 16:36:21 2009
New Revision: 63450

Modified:
Log:

==============================================================================
+++ pypy/extradoc/talk/icooolps2009/paper.tex	Tue Mar 31 16:36:21 2009
@@ -23,6 +23,7 @@
\newcommand\arigo[1]{\nb{AR}{#1}}
\newcommand{\commentout}[1]{}

+\normalem

\let\oldcite=\cite

@@ -308,13 +309,25 @@
(from the point of view of a VM author, the "user" is a programmer using the
VM).

-A tracing JIT compiler finds the hot loops of the program it is compiling. In
+\anto{I find this paragraph a bit confusing, but I don't really know how to improve
+  it. I have attempted a rewrite; feel free to revert.}
+
+\sout{A tracing JIT compiler finds the hot loops of the program it is compiling. In
our case, this program is the language interpreter. The hot loop of the language
interpreter is the bytecode dispatch loop. Usually it is also the only hot loop.
-Tracing one iteration of this loop means that the execution of one bytecode was
+Tracing one iteration of this loop means that the execution of one opcode was
seen. This means that the resulting machine code will correspond to a loop, that
-assumes that this particular bytecode will be executed many times in a row,
-which is clearly very unlikely.
+assumes that this particular opcode will be executed many times in a row,
+which is clearly very unlikely.}
+
+\anto{Similarly, we need to distinguish two kinds of loops: \emph{interpreter
+    loops} are loops \textbf{inside} the language interpreter (e.g., the
+  bytecode dispatch loop). \emph{User loops} are loops in the user program.}
+
+\anto{To improve the performance of user programs, we need to find and
+  compile traces that correspond to user loops.  The strategy described in
+  Section \ref{sect:tracing} does not work well, as it only detects
+  interpreter loops, and in particular the bytecode dispatch loop.}

\begin{figure}
\input{code/tlr-paper.py}
@@ -351,13 +364,20 @@
\texttt{bytecode} argument is a string of bytes and all registers and the
accumulator are integers. A simple program for this interpreter that computes
the square of the accumulator is shown in Figure \ref{fig:square}. If the
-tracing interpreter traces the execution of the \texttt{DECR\_A} bytecode, the
+tracing interpreter traces the execution of the \texttt{DECR\_A} opcode (whose integer value is 7), the
trace would look as follows:
\input{code/normal-tracing.txt}

+\anto{Because of the guard on \texttt{opcode0}, the code compiled from this
+  trace will be useful only when executing a \texttt{DECR\_A}.  For all
+  other operations the guard will fail, which probably makes the whole system
+  slower than it would be without the JIT.}
+
+% YYY: (anto) I reviewed until here
+
To improve this situation, the tracing JIT could trace the execution of several
-bytecodes, thus effectively unrolling the bytecode dispatch loop. Ideally, the
-bytecode loop should be unrolled exactly so much, that the unrolled version
+opcodes, thus effectively unrolling the bytecode dispatch loop. Ideally, the
+bytecode dispatch loop should be unrolled exactly far enough that the unrolled version
corresponds to a loop on the level of the user program. A loop in the user
program occurs when the program counter of the language interpreter has the
same value many times. This program counter is typically one or several
@@ -460,7 +480,7 @@

\subsection{Improving the Result}

-The critical problem of tracing the execution of just one bytecode has been
+The critical problem of tracing the execution of just one opcode has been
solved: the loop corresponds exactly to the loop in the square function.
However, the resulting trace is a bit too long. Most of its operations are not
actually doing any computation that is part of the square function. Instead,
@@ -493,7 +513,7 @@
\begin{figure}
\input{code/full.txt}
\caption{Trace when executing the Square function of Figure \ref{fig:square},
-with the corresponding bytecodes as comments. The constant-folding of operations
+with the corresponding opcodes as comments. The constant-folding of operations
on green variables is enabled.}
\label{fig:trace-full}
\end{figure}
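
The distinction this change draws between interpreter loops and user loops can be illustrated with a minimal sketch. It is not the interpreter from \texttt{code/tlr-paper.py}: the three-opcode machine, the function names \texttt{interpret} and \texttt{longest\_run}, and the opcode numbers for \texttt{JUMP\_IF\_A} and \texttt{RETURN\_A} are assumptions made for illustration (only \texttt{DECR\_A} = 7 comes from the text). The sketch logs which opcode each iteration of the dispatch loop executes and how often each program counter value is reached.

```python
from collections import Counter

DECR_A = 7     # value stated in the text
JUMP_IF_A = 1  # hypothetical value, chosen for this sketch
RETURN_A = 8   # hypothetical value, chosen for this sketch


def interpret(bytecode, a):
    """Run a toy accumulator machine, recording dispatch statistics."""
    pc = 0
    opcode_log = []        # opcode seen by each dispatch-loop iteration
    pc_visits = Counter()  # how often each program counter value is reached
    while True:            # interpreter loop: the bytecode dispatch loop
        pc_visits[pc] += 1
        opcode = bytecode[pc]
        opcode_log.append(opcode)
        pc += 1
        if opcode == DECR_A:
            a -= 1
        elif opcode == JUMP_IF_A:
            target = bytecode[pc]
            pc += 1
            if a:
                pc = target  # backward jump: this closes a user loop
        elif opcode == RETURN_A:
            return a, opcode_log, pc_visits
        else:
            raise ValueError("unknown opcode %d" % opcode)


def longest_run(items):
    """Length of the longest run of equal consecutive items."""
    best = run = 0
    prev = object()  # sentinel that compares unequal to every opcode
    for item in items:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


# User program: count the accumulator down to zero, then return it.
program = bytes([DECR_A, JUMP_IF_A, 0, RETURN_A])
result, opcode_log, pc_visits = interpret(program, 5)
print(longest_run(opcode_log), pc_visits[0])
```

In this sketch no opcode ever runs twice in a row, so a trace of a single dispatch iteration would fail its opcode guard on the very next iteration. The program counter value 0, by contrast, is reached once per iteration of the user loop, which is the signal the text proposes for finding user loops.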