antocuni at codespeak.net antocuni at codespeak.net
Sat Dec 20 19:37:47 CET 2008

Author: antocuni
Date: Sat Dec 20 19:37:46 2008
New Revision: 60657

Modified:
Log:
fix another XXX

==============================================================================
+++ pypy/extradoc/talk/ecoop2009/benchmarks.tex	Sat Dec 20 19:37:46 2008
@@ -1,9 +1,6 @@
\section{Benchmarks}
\label{sec:benchmarks}

-\anto{We should say somewhere that flexswitches are slow but benchmarks are so
-  good because they are not involved in the inner loops}
-
In Section~\ref{sec:tlc-properties}, we saw that TLC provides most of the
features that usually make dynamically typed languages so slow, such as a
\emph{stack-based interpreter}, \emph{boxed arithmetic} and \emph{dynamic lookup} of
@@ -39,21 +36,6 @@
multiplication, one subtraction, and one comparison to check if we have
finished the job.

-\commentout{
-\cfbolz{I think we can kill this for space reasons}
-\anto{If we decide to keep them, we must remember to explain the python-like
-  syntax, as it is no longer in tlc.tex}
-
-\begin{lstlisting}
-def main(n):
-    result = 1
-    while n > 1:
-        result = result * n
-        n = n - 1
-    return n
-\end{lstlisting}
-}
-
When doing plain interpretation, we need to create and destroy three temporary
objects at each iteration.  By contrast, the code generated by the JIT does
much better.  At the first iteration, the classes of the two operands of the
@@ -72,21 +54,6 @@
Similarly, we wrote a program to calculate the $n$-th Fibonacci number, for
which we can do the same reasoning as above.

-\commentout{
-\cfbolz{I think we can kill this for space reasons}
-\begin{lstlisting}
-def main(n):
-    a = 0
-    b = 1
-    while n > 1:
-        sum = a + b
-        a = b
-        b = sum
-        n = n - 1
-    return b
-\end{lstlisting}
-}
-
\begin{table}[ht]
\begin{center}

@@ -143,6 +110,15 @@
CLI backend emits slightly non-optimal code and that the underlying .NET JIT
compiler is highly optimized to handle bytecode generated by C\# compilers.

+As we saw in Section~\ref{sec:flexswitches-cli}, implementing
+flexswitches on top of the CLI is hard and inefficient.  However, our benchmarks
+show that this inefficiency does not affect the overall performance of the
+generated code.  This is because in most programs the vast majority of the
+time is spent in the inner loop: the graphs are built in such a way that all
+the blocks belonging to the inner loop reside in the same method, so that
+all the links between them are internal (and hence fast).
+
+
\subsection{Object-oriented features}

To measure how the JIT handles object-oriented features, we wrote a very

==============================================================================
+++ pypy/extradoc/talk/ecoop2009/clibackend.tex	Sat Dec 20 19:37:46 2008
@@ -66,6 +66,7 @@

\subsection{Implementing flexswitches in CLI}
+\label{sec:flexswitches-cli}

Implementing flexswitches for backends generating machine code is
not too complex: basically, a new jump has to be inserted in the