[Python-Dev] Pie-thon benchmarks

Tim Peters tim.one at comcast.net
Sun Dec 14 13:59:32 EST 2003


[Dennis Allison]
> ...
> A simple benchmark like pystone doesn't tell much.  Large systems
> used for real work are more interesting and a more realistic measure
> of the language+implementation than small synthetic benchmarks.  For
> example, Zope might be a good choice, but measuring its performance
> is an interesting (and difficult) problem in itself.

Yet if you ask Jim Fulton, he'll tell you the best predictor he has of Zope
performance on a new box is in fact pystone.  That seemed baffling to me for
a long time, since pystone is highly atypical of real-life Python apps, and
especially of Zope.  For example, it makes no real use of the class
machinery, or of the builtin dict and list operations ubiquitous in
real-life Python apps.

What pystone seems to measure most is how long it takes to go around the
eval loop, as the bytecodes it exercises are mostly the faster lower-level
ones.  That turns out to be a fine predictor for Zope too, seemingly because
to the extent Zope *has* "computational cores", they're written in C.
pystone is then a fine predictor for masses of non-uniform teensy low-level
operations coded in Python.
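
To make that concrete, here's a rough illustration (not actual pystone
code, just an editor's sketch of the same flavor of work): disassembling a
small function built from local arithmetic, compares, and branches shows it
compiles down to simple, cheap opcodes, so timing it mostly times trips
around the eval loop.

    import dis

    def teensy(n):
        # local arithmetic, compares and branches -- the flavor of work
        # pystone spends most of its time on
        total = 0
        for i in range(n):
            if i % 3:
                total = total + i
            else:
                total = total - 1
        return total

    # Mostly simple loads, arithmetic, compares and jumps -- opcodes whose
    # cost is dominated by eval-loop overhead.
    dis.dis(teensy)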

If you want a benchmark to make CPython look good, do a few hundred thousand
very-long int multiplications, stick 'em in a list, and sort it <wink>.
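
For what it's worth, a minimal sketch of that gag benchmark (the count and
digit size below are invented, just to make it concrete):

    import random
    import time

    def longint_bench(count=300000, digits=200):
        # A few hundred thousand very-long int multiplications, stuck in
        # a list and sorted.  Nearly all the work happens in C (long-int
        # arithmetic and list.sort), so the eval loop barely matters.
        lo, hi = 10 ** (digits - 1), 10 ** digits
        products = [random.randrange(lo, hi) * random.randrange(lo, hi)
                    for _ in range(count)]
        products.sort()
        return products

    if __name__ == "__main__":
        start = time.time()
        longint_bench()
        print("elapsed: %.2f seconds" % (time.time() - start))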



