[pypy-dev] The first JITing pypy-c!
arigo at tunes.org
Fri Dec 8 03:19:50 CET 2006
Two days ago, we got our first JITing version of pypy-c!
This was mostly thanks to Arre, who compiled a version even though we
knew it would segfault due to missing support for C structures with
non-int-sized fields. Writing to a 1-byte-sized Bool field would
overwrite the next 3 bytes of memory with zeroes... Nevertheless, the
result managed to successfully run one function (and indeed segfault on
many other functions). Our first JIT-run function! A recursive
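To make the field-size bug concrete, here is a small sketch in plain Python (not PyPy back-end code). It assumes a hypothetical structure whose 1-byte Bool field is followed by three other 1-byte fields, and it models the faulty store as a 4-byte little-endian int write. The helper names and the layout are made up for illustration.

```python
import struct

# Hypothetical layout, for illustration only: a 1-byte Bool field
# followed by three other 1-byte fields in the same structure.
def fresh_struct():
    return bytearray([0x00, 0xAA, 0xBB, 0xCC])

def write_bool_int_sized(mem, offset, value):
    # Faulty store: treats the Bool as a 4-byte little-endian int,
    # so the three bytes after it are overwritten with zeroes.
    struct.pack_into("<i", mem, offset, int(value))

def write_bool_byte_sized(mem, offset, value):
    # Correct store: writes exactly one byte.
    struct.pack_into("<B", mem, offset, int(value))

buggy = fresh_struct()
write_bool_int_sized(buggy, 0, True)

fixed = fresh_struct()
write_bool_byte_sized(fixed, 0, True)

print(buggy.hex())  # 01000000
print(fixed.hex())  # 01aabbcc
```

In the faulty case the neighbouring fields are silently zeroed, which is the kind of corruption that would explain the segfaults.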
Today the field size problem is fixed. Playing around seems to show
that it's harder to provoke a segfault now. The generated machine code
is completely incredible, in size and complexity, but the following
example runs. It could even be said to run faster with the JIT (8.2
seconds versus 11.4 seconds) but that's unfair, as all normal
optimizations are turned off in this example (a regular pypy-c runs this
example in 2.8 seconds). It still shows that our JIT already gives an
improvement over completely-unoptimized C code, which is already some
kind of success!
    def f(n):
        while n > 0:
            n -= 2
Many other examples give an UnboundLocalError, probably due to some
minor bug somewhere either in the JIT transformation or in the back-end
(along the lines of a == compiled as a !=).
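As a purely illustrative sketch of how an inverted comparison could surface as an UnboundLocalError, consider the function below. The `miscompiled` flag is a made-up stand-in for "a == compiled as a !="; it is not how the JIT or the back-end actually works, and the real bug may well be something else.

```python
def classify(n, miscompiled=False):
    is_zero = (n == 0)
    if miscompiled:
        # Stand-in for the hypothetical bug: '==' behaves like '!='.
        is_zero = not is_zero
    if is_zero:
        return "zero"
    if n != 0:
        label = "nonzero"
    # With a correct comparison, 'label' is always bound here.
    return label

print(classify(0))   # zero
print(classify(5))   # nonzero

try:
    classify(0, miscompiled=True)
except UnboundLocalError as e:
    print("UnboundLocalError:", e)
```

With the inverted test, the call with 0 skips both the early return and the assignment, so the final `return label` reads a local that was never set.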
If you want to try for yourself:
- check out or switch to the branch
- run "translate.py /path/to/pypy/jit/goal/targetjit.py"
  (that's the usual translate.py from translator/goal)
- you get uncompiled C sources for now; copy them safely away
  from /tmp/usession-yourname/testing_1/ before they get deleted,
  and compile them ("make" or "make debug").
- run "PYPYJITLOG=log ./testing_1" and, at its prompt, enter:
      import pypyjit; pypyjit.enable(f.func_code)
      f(7000000)     # see f above
- the above PYPYJITLOG env var causes a file called 'log' to be
  produced, containing the generated assembler code. It can be viewed
  in a flowgraph-like fashion with pypy/jit/codegen/i386/viewcode.py.
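The interactive part of the steps above could be collected into a small script along these lines. This is only a sketch: `pypyjit.enable(f.func_code)` is the call quoted above, while the `return n`, the helper names `enable_jit_for` and `timed`, and the fallback for interpreters without `pypyjit.enable` are assumptions added so that the script also runs on an ordinary Python.

```python
import time

def f(n):
    while n > 0:
        n -= 2
    return n  # assumption: the return value is added here for checking

try:
    import pypyjit  # only available in a PyPy build
except ImportError:
    pypyjit = None

def enable_jit_for(func):
    # Returns True if the JIT was enabled for func's code object,
    # False if this interpreter has no pypyjit.enable.
    enable = getattr(pypyjit, "enable", None)
    if enable is None:
        return False
    # 'func_code' is the attribute name used in the steps above;
    # '__code__' is the fallback for interpreters that lack it.
    code = getattr(func, "func_code", None) or func.__code__
    enable(code)
    return True

def timed(func, arg):
    # Wall-clock timing of a single call.
    start = time.time()
    result = func(arg)
    return result, time.time() - start

jitted = enable_jit_for(f)
result, elapsed = timed(f, 7000000)
print("JIT enabled: %s, result: %s, seconds: %.2f" % (jitted, result, elapsed))
```

Running it once with and once without the `enable_jit_for` call gives a rough version of the comparison quoted earlier; the numbers will of course depend on the build and the machine.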
Don't ask me yet to describe the result, nor where the while loop is
:-) If you are familiar with i386 assembler, you'll laugh at the
obviously bad code, too. That's where your help would be appreciated!
Making the backend produce more reasonable code, starting with some
basic register allocation, is a mostly-independent project. The PPC
backend, btw, already has this kind of technique (I wonder what
speed-ups we get on PPC from a jitting pypy-c).