[Python-Dev] A new JIT compiler for a faster CPython?

Mark Shannon mark at hotpy.org
Wed Jul 18 11:45:25 CEST 2012


Some of my (reasonably well informed) opinions on this subject...

The theory
----------

Don't think in terms of speeding up your program.
Think in terms of reducing the time spent executing your program.

Performance is improved by removing aspects of the execution overhead.
In a talk I gave at EuroPython 2010(1), I divided the overhead into 5 parts:

Interpretive
Imprecise type information
Parameter Handling & Call overhead
Lookups (globals/builtins/attributes)
Memory management (garbage collection)

For optimising CPython, we cannot change the GC from ref-counting,
but the other 4 apply, plus boxing and unboxing of floats and ints.

Compilation (by which I assume people mean converting bytecodes to machine code)
addresses only the first point (by definition).

I worry that Victor is proposing to make the same mistake made by Unladen Swallow,
which is to attack the interpretive overhead first, then attack the other overheads.
This is the wrong way around. If you want good performance, JIT compilation should
come last, not first.

Results from my PhD thesis(2) show that the original HotPy without any JIT
compilation outperformed Unladen Swallow using JIT compilation.
In other words, an optimising interpreter for Python will be faster than a
naive JIT compiler. The optimised bytecode traces in an optimising interpreter
are much better input for a JIT compiler than the original bytecodes.
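
To illustrate what "optimised bytecode traces" buy a JIT, here is a toy sketch
of trace specialisation; the opcode and guard names are invented for
illustration and are not HotPy's actual instruction set:

    # A recorded trace of generic bytecodes is rewritten, using the types
    # observed while tracing, into cheap guards plus type-specific
    # operations. A JIT back end compiling the specialised trace has far
    # less work to do than one handed the original, fully generic bytecodes.

    def specialise(trace, observed_types):
        out = []
        for index, (op, arg) in enumerate(trace):
            if op == "BINARY_ADD" and observed_types.get(index) == (int, int):
                # guard on the operand types once, then use an unboxed add
                out.append(("GUARD_INT_INT", None))
                out.append(("INT_ADD", None))
            elif op == "LOAD_GLOBAL":
                # assume the globals/builtins dicts are unchanged and load
                # the cached value directly
                out.append(("GUARD_GLOBALS_UNCHANGED", arg))
                out.append(("LOAD_CACHED_VALUE", arg))
            else:
                out.append((op, arg))
        return out

    trace = [("LOAD_FAST", "x"), ("LOAD_GLOBAL", "offset"), ("BINARY_ADD", None)]
    types = {2: (int, int)}   # the add at index 2 always saw two ints
    print(specialise(trace, types))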

The practice
------------

If you want a modest speedup for modest effort, look at Cesare's WPython.
Also take a look at Stefan Brunthaler's work on inline caching in an interpreter.
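
For anyone unfamiliar with the technique, here is the inline-caching idea in
toy form (my sketch, not Brunthaler's actual scheme for CPython): each call
site remembers the type it saw last time and skips the full lookup while that
type stays the same.

    class CallSiteCache:
        # One cache per call site. A real implementation stores the cache
        # inline in (or alongside) the bytecode, hence the name.
        def __init__(self, name):
            self.name = name
            self.cached_type = None
            self.cached_func = None

        def call(self, obj, *args):
            tp = type(obj)
            if tp is not self.cached_type:
                # cache miss: do the full lookup and remember the result
                self.cached_type = tp
                self.cached_func = getattr(tp, self.name)
            # common, monomorphic case: call the cached function directly
            return self.cached_func(obj, *args)

    site = CallSiteCache("upper")
    print(site.call("spam"))   # miss: full lookup, result cached
    print(site.call("eggs"))   # hit: the receiver is still a str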

If you want a larger speedup then you need to tackle most or all of the causes of
execution overhead listed above.
HotPy (version 2, a fork of CPython) aims to tackle all of these causes except
the GC overhead. As far as I am aware, it is the only project that does so.

Please take a look at www.hotpy.org for more information on HotPy.
You can see my talk from EuroPython 2011(3) on the ideas behind it and
from EuroPython 2012(4) on the current implementation.



Finally, a defence of LLVM.

LLVM is a quality piece of software. It may have some bugs, but so does all software.
The code-generation components are designed with static compilation in mind,
so they use a lot of memory and run slowly by JIT-compiler standards,
but they produce excellent-quality code.
And don't forget the old saying about blaming your tools ;)

If HotPy (version 2) were to have an (optional) JIT, I would expect it to be LLVM-based.
The JIT can run in a separate thread while the optimised code continues to run
in the interpreter; the machine code is patched in once compilation is complete.
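
A minimal sketch of that division of labour (my illustration, not HotPy's
code; a real back end such as LLVM would emit machine code where the fake
compiler below merely wraps the function):

    import threading

    class HotFunction:
        # An executable whose implementation can be swapped while running.
        def __init__(self, interpreted):
            self.impl = interpreted            # start in the interpreter
            self._lock = threading.Lock()

        def __call__(self, *args):
            return self.impl(*args)            # call whatever is current

        def compile_in_background(self, compiler):
            def work():
                compiled = compiler(self.impl)   # slow work, off the hot path
                with self._lock:
                    self.impl = compiled         # "patch in" the compiled code
            threading.Thread(target=work, daemon=True).start()

    def fake_compiler(func):
        # stand-in for a real JIT back end
        def compiled(*args):
            return func(*args)
        return compiled

    f = HotFunction(lambda x: x * 2)
    f.compile_in_background(fake_compiler)
    print(f(21))    # keeps working before, during and after the patch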

Cheers,
Mark.

1) Talk at EuroPython 2010
Slides: www.dcs.gla.ac.uk/~marks/comparison.pdf
Video: http://blip.tv/europythonvideos/mark_shannon-_hotpy_a_comparison-3999872
The information in the talk is a bit out of date; PyPy now includes out-of-line guards.

2) PhD thesis
theses.gla.ac.uk/2975/01/2011shannonphd.pdf

3) Talk at EuroPython 2011
https://ep2012.europython.eu/conference/talks/making-cpython-fast-using-trace-based-optimisations

4) Talk at EuroPython 2012
https://ep2012.europython.eu/conference/talks/hotpy-2-a-high-performance-binary-compatible-virtual-machine-for-python





