[Numpy-discussion] Looking for people interested in helping with Python compiler to LLVM

Sturla Molden sturla at molden.no
Sun Mar 11 19:01:26 EDT 2012


On 11.03.2012 15:52, Pauli Virtanen wrote:
> To get speed gains, you need to optimize not only the bytecode 
> interpreter side, but also the object space --- Python classes, 
> strings and all that. Keeping in mind Python's dynamism, there are 
> potential side effects everywhere. I guess this is what sunk the swallow. 

I'm not sure what scared off or killed the bird. Maybe they just 
approached it from the wrong side.

Psyco has shown that some "algorithmic" Python code can be accelerated 
by an order of magnitude or two. But Psyco did not use an optimizing 
compiler like LLVM; it just generated unoptimized x86 machine code from 
Python.

Also, a JIT for a dynamic language is possible. One example is the 
Strongtalk JIT compiler for Smalltalk. The team that built it was 
acquired by Sun, and their technology went into the HotSpot Java VM. 
So the static nature of Java does not have anything to do with the 
performance of its JIT compiler.

Maybe there are differences between Python and Smalltalk that make the 
latter easier to JIT compile? Or perhaps there are differences 
between Hotspot/Strongtalk and LLVM that make the latter less efficient 
for dynamic languages? I don't know. But making a performant JIT for all 
of Python is obviously not easy.

A third example of a fast JIT for a dynamic language is LuaJIT. It can 
often make "interpreted" and duck-typed Lua run faster than statically 
compiled C. Like Psyco, LuaJIT just focuses on making algorithmic code 
with a few elementary types run fast.

Another approach to speeding up dynamic languages is optional static 
typing and static compilation. Cython, Bigloo (et al.), and CMUCL/SBCL 
are excellent examples of that. Maybe one could use type annotations like 
Cython in pure Python mode? Python 3 even has function annotations in 
the syntax.
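
To make the annotation idea concrete, here is a minimal sketch in plain 
Python 3. The function name axpy_sum is made up for illustration, and 
nothing here compiles anything. It only shows that the declared types 
are available at run time in __annotations__, which is where a 
hypothetical annotation-driven compiler could look them up.

```python
def axpy_sum(a: float, x: list, y: list) -> float:
    """Sum of a*x[i] + y[i]: plain algorithmic code over elementary types."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += a * xi + yi
    return total

# The interpreter does not enforce the annotations when the function runs,
# but it keeps them as ordinary metadata on the function object.
declared = axpy_sum.__annotations__
result = axpy_sum(2.0, [1.0, 2.0], [3.0, 4.0])

print(declared)
print(result)
```

The annotations are hints only: calling axpy_sum with integers still 
works, so a compiler relying on them would have to check or enforce the 
types itself.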

I think (but I am not 100% sure) that the main problem with JIT'ing 
Python is its dynamic attributes. So maybe some magic with __slots__ or 
__metaclass__ could be used to turn that dynamism off? A JIT like 
Numba would then only work with built-in types (int, float, list, dict, 
set, tuple), NumPy arrays, and some less-dynamic classes. It would not 
speed up all of Python, but a sufficient subset to make scientists happy.
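
The __slots__ part of this can be sketched in a few lines. The class 
names below are hypothetical, and the example only shows the restriction 
itself. Whether a JIT such as Numba could actually exploit it is an 
assumption, not something demonstrated here.

```python
class DynamicPoint:
    """Ordinary class: instances carry a __dict__ and accept new attributes."""
    def __init__(self, x):
        self.x = x


class FixedPoint:
    """Class with __slots__: the set of instance attributes is fixed."""
    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


d = DynamicPoint(1.0)
d.y = 2.0  # allowed: the attribute is added to d.__dict__ at run time

f = FixedPoint(1.0)
try:
    f.y = 2.0  # not allowed: "y" is not listed in __slots__
    slot_error = False
except AttributeError:
    slot_error = True

print(slot_error)
print(hasattr(f, "__dict__"))
```

Note that __slots__ only fixes the attributes of instances. The class 
object itself can still be modified at run time, so a compiler would 
need further guarantees before treating such a class as static.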

Sturla


