[pypy-commit] extradoc extradoc: improve the description of the sqrt(Fix16) benchmark
cfbolz
noreply at buildbot.pypy.org
Fri Aug 17 16:41:23 CEST 2012
Author: Carl Friedrich Bolz <cfbolz at gmx.de>
Branch: extradoc
Changeset: r4687:0f317439cbe1
Date: 2012-08-17 16:40 +0200
http://bitbucket.org/pypy/extradoc/changeset/0f317439cbe1/
Log: improve the description of the sqrt(Fix16) benchmark
diff --git a/talk/dls2012/paper.tex b/talk/dls2012/paper.tex
--- a/talk/dls2012/paper.tex
+++ b/talk/dls2012/paper.tex
@@ -984,18 +984,21 @@
approximation using $x_i = \left( x_{i-1} + y/x_{i-1} \right) / 2$ for $1\leq i < 10^8$.
Only the latest calculated value $x_i$ is kept alive as a local variable within the loop.
There are three different versions of this benchmark where $x_i$
- is represented with different type of objects, $T$,: int's, float's and
+ is represented with a different type $T$ of object: int's, float's and
Fix16's. The latter, Fix16, is a custom class that implements
- fixpoint arithmetic with 16 bits precision. In Python there is only
+ fixpoint arithmetic with 16 bits precision. In Python and Lua there is only
a single implementation of the benchmark that gets specialized
- depending on the class of it's input argument, $y$, while in C,
- there are three different implementations. In Lua there is no support for
- integers so only the floating point number is provided.
-
- \XXXfixme: mikepall fijal, cfbolz: Also, sqrt(Fix16) is now a
- meaningful result, but the text describing the benchmarks hasn't
- changed.
-
+ depending on the class of its input argument, $y$. In C,
+ there are three different implementations.
+
+The Fix16 type is a custom class with operator overloading in Lua and Python.
+The C version uses a C++ class. The goal of this variant of the benchmark is to
+check how large the overhead of a custom arithmetic class is, compared to
+builtin data types.
+
+In Lua there is no direct support for
+integers so the int version is not provided.
+
\item {\bf conv3}$\left(n\right)$: one-dimensional convolution with fixed kernel-size $3$. A single loop
is used to calculate a vector ${\bf b} = \left(b_1, \cdots, b_{n-2}\right)$ from a vector
${\bf a} = \left(a_1, \cdots, a_n\right)$ and a kernel ${\bf k} = \left(k_1, k_2, k_3\right)$ using
@@ -1131,6 +1134,16 @@
\texttt{http://wiki.luajit.org/Optimizations}} and produces much better
machine code than PyPy.
+The slowdown of sqrt(Fix16) compared to sqrt(int) or sqrt(float) shows the
+overhead of using a custom class with operator overloading for arithmetic. For
+C/C++, this overhead is very low; for CPython, the code becomes 30 times slower.
+In LuaJIT, the overhead is a slowdown of 70\%. For PyPy, sqrt(Fix16) is only
+slightly slower than sqrt(int), which is itself three times slower than
+sqrt(float). This is probably due to the additional overflow checking necessary
+for integer arithmetic in Python. The fact that LuaJIT and PyPy do so well on
+sqrt(Fix16) shows that the allocation removal/sinking optimizations work well
+in both JITs.
+
\section{Related Work}
\label{sec:related}
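
For readers of this changeset, the sqrt(Fix16) benchmark described above can be
illustrated with a minimal Python sketch. This is not the benchmark source from
the repository: the class layout, the names raw, ONE, _coerce and to_float, and
the default iteration count are assumptions made for illustration. It only shows
the idea of a single generic Newton iteration that is specialized by the class
of its argument y, with Fix16 supplying arithmetic through operator overloading.

```python
class Fix16:
    """Illustrative fixed-point number with 16 fractional bits (hypothetical layout)."""

    FRAC_BITS = 16
    ONE = 1 << FRAC_BITS

    def __init__(self, value, scaled=False):
        # `raw` stores the value multiplied by 2**16 as a plain int.
        self.raw = value if scaled else int(round(value * self.ONE))

    @staticmethod
    def _coerce(other):
        # Allow mixing Fix16 with plain numbers such as the literal 2.
        return other if isinstance(other, Fix16) else Fix16(other)

    def __add__(self, other):
        other = self._coerce(other)
        return Fix16(self.raw + other.raw, scaled=True)

    def __truediv__(self, other):
        other = self._coerce(other)
        # Pre-shift the numerator so the quotient keeps 16 fractional bits.
        return Fix16((self.raw << self.FRAC_BITS) // other.raw, scaled=True)

    def to_float(self):
        return self.raw / self.ONE


def sqrt(y, n=10000):
    # Newton iteration x_i = (x_{i-1} + y / x_{i-1}) / 2.
    # Only the latest x is kept alive, as in the benchmark description.
    x = y / 2
    for _ in range(n):
        x = (x + y / x) / 2
    return x
```

The same sqrt function accepts a float (sqrt(16.0)) or a Fix16 instance
(sqrt(Fix16(16.0)).to_float()). Every arithmetic step on Fix16 allocates a new
object, which is the overhead that the allocation removal/sinking optimizations
mentioned in the diff are meant to eliminate. The int variant is left out here
because / on Python 3 ints yields a float.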