
Hi all,

Two days ago, we got our first JITing version of pypy-c! This was mostly thanks to Arre, who compiled a version even though we knew it would segfault due to missing support for C structures with non-int-sized fields. Writing to a 1-byte-sized Bool field would overwrite the next 3 bytes of memory with zeroes... Nevertheless, the result managed to successfully run one function (and indeed segfault on many other functions). Our first JIT-run function! A recursive factorial.

Today the field size problem is fixed. Playing around seems to show that it's harder to provoke a segfault now. The generated machine code is completely incredible, in size and complexity, but the following example runs. It could even be said to run faster with the JIT (8.2 seconds versus 11.4 seconds) but that's unfair, as all normal optimizations are turned off in this example (a regular pypy-c runs this example in 2.8 seconds). It still shows that our JIT already gives an improvement over completely-unoptimized C code, which is already some kind of success!

    def f(n):
        while n > 0:
            n -= 2
        return n

Many other examples give an UnboundLocalError, due probably to some minor bug somewhere either in the JIT transformation or in the back-end (along the lines of a == compiled as a !=).

If you want to try for yourself:

- check out or switch to the branch http://codespeak.net/svn/pypy/branch/jit-real-world

- run "translate.py /path/to/pypy/jit/goal/targetjit.py" (that's the usual translate.py from translator/goal)

- you get uncompiled C sources for now; copy it safely away from /tmp/usession-yourname/testing_1/ before it gets deleted, and compile it ("make" or "make debug").

- run "PYPYJITLOG=log ./testing_1"

      import pypyjit; pypyjit.enable(f.func_code)
      f(7000000)     # see f above

- the above PYPYJITLOG env var causes a file called 'log' to be produced, containing the generated assembler code. It can be viewed in a flowgraph-like fashion with pypy/jit/codegen/i386/viewcode.py.
Don't ask me yet to describe the result, nor where the while loop is :-) If you are familiar with i386 assembler, you'll laugh at the obviously bad code, too.

That's where your help would be appreciated! Making the backend produce more reasonable code, starting with some basic register allocation, is a mostly-independent project. The PPC backend, btw, already has this kind of technique (I wonder what speed-ups we get on PPC from a jitting pypy-c).

Have fun,

Armin
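
For readers who want to picture the field-size bug mentioned at the top of this post, here is a minimal sketch in plain Python. It is not the JIT backend's code: the 4-byte structure layout, the `write_bool` helper and the `fresh_struct` helper are all hypothetical and exist only for illustration. It assumes a little-endian machine such as i386, and shows how an int-sized store into a 1-byte field zeroes the three bytes that follow, while a byte-sized store leaves them alone.

```python
import struct

def write_bool(memory, offset, value, store_size):
    """Store a boolean at `offset` with a store that is `store_size` bytes wide.

    Hypothetical helper: it only mimics the width of the machine-level store.
    """
    fmt = {1: "<B", 4: "<I"}[store_size]   # little-endian, 1-byte or 4-byte store
    struct.pack_into(fmt, memory, offset, int(value))

def fresh_struct():
    # Hypothetical layout: a 1-byte Bool at offset 0, then three 1-byte fields
    # holding recognizable marker values.
    return bytearray([0x00, 0xAA, 0xBB, 0xCC])

# Int-sized store: the Bool is set, but the three neighbouring bytes are zeroed.
buggy = fresh_struct()
write_bool(buggy, 0, True, store_size=4)

# Byte-sized store: only the Bool field changes.
fixed = fresh_struct()
write_bool(fixed, 0, True, store_size=1)

print(list(buggy))   # [1, 0, 0, 0]
print(list(fixed))   # [1, 170, 187, 204]
```

In a real structure the clobbered bytes would be whatever fields happen to follow the Bool, which is consistent with one function running fine while many others segfault.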