[Python-Dev] A new JIT compiler for a faster CPython?

Wed Jul 18 01:09:23 CEST 2012

> As your original message shows, there has already been
> enough duplication of effort in this area.

I haven't yet found a project that reuses ceval.c: most projects implement
their own eval loop and don't use CPython at all.

My idea is not to write something new, but just to try to optimize the
existing ceval.c code. Pseudo-code:

 * read the bytecode of a function
 * replace each bytecode instruction with its "C code"
 * optimize
 * compile the "C code" to machine code

(I don't know if "C code" is the right expression here, it's just for
the example)

Dummy example:
----
def mysum(a, b):
   return a+b
----

Python compiles it to bytecode as:
----
>>> dis.dis(mysum)
0 LOAD_FAST                0 (a)
3 LOAD_FAST                1 (b)
6 BINARY_ADD
7 RETURN_VALUE
----

The bytecode can be compiled to something like:
----
x = GETLOCAL(0); /* "a" */
if (x == NULL) { /* error */ }
Py_INCREF(x);
PUSH(x);

x = GETLOCAL(1); /* "b" */
if (x == NULL) { /* error */ }
Py_INCREF(x);
PUSH(x);

w = POP();
v = TOP();
x = PyNumber_Add(v, w);
Py_DECREF(v);
Py_DECREF(w);
if (x == NULL) { /* error */ }
SET_TOP(x);

retval = POP();

return retval;
----
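
A minimal Python sketch of the first two steps (read the bytecode, replace
each instruction by its "C code"). Everything here is an assumption made for
illustration: TEMPLATES, translate(), read_bytecode() and MYSUM_BYTECODE are
hypothetical names, and the templates are just the hand-written snippets from
the example, not code extracted from ceval.c. Opcode names differ between
CPython versions (BINARY_ADD became BINARY_OP in 3.11), so the sketch
translates the instruction list shown in this message rather than whatever
the running interpreter produces.

```python
import dis

# Hypothetical templates: opcode name -> C-like lines (illustrative only).
# "{arg}" is filled with the instruction argument; "{{" / "}}" are literal braces.
TEMPLATES = {
    "LOAD_FAST": [
        "x = GETLOCAL({arg});",
        "if (x == NULL) {{ /* error */ }}",
        "Py_INCREF(x);",
        "PUSH(x);",
    ],
    "BINARY_ADD": [
        "w = POP();",
        "v = TOP();",
        "x = PyNumber_Add(v, w);",
        "Py_DECREF(v);",
        "Py_DECREF(w);",
        "if (x == NULL) {{ /* error */ }}",
        "SET_TOP(x);",
    ],
    "RETURN_VALUE": [
        "retval = POP();",
        "return retval;",
    ],
}


def read_bytecode(func):
    """Step 1: read the bytecode of a function as (opname, arg) pairs."""
    return [(ins.opname, ins.arg) for ins in dis.get_instructions(func)]


def translate(instructions):
    """Step 2: replace each instruction by its C-like template."""
    c_lines = []
    for opname, arg in instructions:
        if opname not in TEMPLATES:
            raise NotImplementedError("no template for " + opname)
        c_lines.extend(line.format(arg=arg) for line in TEMPLATES[opname])
    return c_lines


# The bytecode of mysum() as listed in this message.
MYSUM_BYTECODE = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

if __name__ == "__main__":
    print("\n".join(translate(MYSUM_BYTECODE)))
```

The output is straight-line text with no dispatch loop; the "optimize" and
"compile to machine code" steps are left out of the sketch.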

The calls to Py_INCREF() and Py_DECREF() can be removed. The code is no
longer based on a loop: CPUs prefer sequential code. The stack can be
replaced by variables: the compiler (LLVM?) knows how to replace many
variables with a few variables, or even use CPU registers instead.

Example:
----
a = GETLOCAL(0); /* "a" */
if (a == NULL) { /* error */ }
b = GETLOCAL(1); /* "b" */
if (b == NULL) { /* error */ }
return PyNumber_Add(a, b);
----
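
One possible way to replace the stack by variables is to simulate the stack
while translating: pushes and pops then happen on a list of C expressions at
translation time instead of on a real stack at run time. The sketch below is
only an illustration under that assumption; stack_to_variables() and
MYSUM_BYTECODE are hypothetical names, and it only handles straight-line code
using the three opcodes of the example.

```python
def stack_to_variables(instructions, local_names):
    """Translate (opname, arg) pairs to C-like lines without a runtime stack.

    `stack` holds C expressions (strings), not Python values: it models
    what would be on the interpreter stack at each point.
    """
    stack = []
    out = []
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            name = local_names[arg]
            out.append(f"{name} = GETLOCAL({arg});")
            out.append(f"if ({name} == NULL) {{ /* error */ }}")
            stack.append(name)          # "push" the variable name
        elif opname == "BINARY_ADD":
            right = stack.pop()         # top of stack is the right operand
            left = stack.pop()
            stack.append(f"PyNumber_Add({left}, {right})")
        elif opname == "RETURN_VALUE":
            out.append(f"return {stack.pop()};")
        else:
            raise NotImplementedError("no rule for " + opname)
    return out


# The bytecode of mysum() as listed in this message.
MYSUM_BYTECODE = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

if __name__ == "__main__":
    print("\n".join(stack_to_variables(MYSUM_BYTECODE, ("a", "b"))))
```

For mysum() this yields five lines ending in "return PyNumber_Add(a, b);".
Branches and loops would need the simulated stack to be tracked per basic
block, which the sketch does not attempt.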

I don't expect programs to run 10x faster, but I would be happy if I
could run arbitrary Python code 25% faster.

--

Specialization or a tracing JIT can be seen as another project, or at
least as something to add later.

Victor