[pypy-dev] PyPy + LLVM

Chris Lattner sabre at nondot.org
Sun Mar 18 00:25:26 CET 2007


On Wed, 14 Mar 2007, Armin Rigo wrote:
> Hi Chris,

Hey Armin, I'm cc'ing pypy-dev, because this is suitable for a broader 
audience.  I'm sorry for the delay in my reply, I have been travelling 
recently.

> After reading your slides from yesterday's presentation, it occurred to
> me that you might have a wrong idea about what PyPy is.  I'm

Sorry about that.  The third section of my talk was only loosely based on 
pypy.  Some of it was "inspired" by pypy, some of it was "unification" 
ideas that I have been thinking about for a while, some is based on my own 
experience with static analysis, and some is just wishful thinking.  I'm 
sorry if it came across as "here is what pypy is doing"; that is not what 
I intended.

> double-guessing here, so of course I may be completely off - please bear
> with me.  What PyPy is definitely not, is a Python compiler.  In order
> to get to a common ground of discussion, may I suggest the following
> references:

Right.  If I had to do a layman's summary, I would say that pypy provides 
infrastructure for building interpreters in [r]python.  This 
infrastructure makes it much easier than starting from scratch, e.g. by 
providing reusable components for language runtimes (like GC's).

My talk (available here 
http://llvm.org/pubs/2007-03-12-BossaLLVMIntro.html ) clearly doesn't 
describe pypy as it exists today, but I think it describes a place that 
pypy could get to if the community desired it (in other words, I think 
the strengths of pypy and of its community both play well into this).  The 
members of the pypy project clearly have experience with type inference, 
clearly understand dynamic language issues, and clearly understand that 
the semantics of these languages differ widely, but there is also a lot 
of commonality.  If the pypy community isn't interested, I will approach 
others, eventually falling back to doing it directly in LLVM if needed.

For the record, I have read Brett Cannon's thesis.  I don't think his 
results are applicable for a number of reasons.  First, his work was in 
the context of an interpreter that used type information for optimization. 
The approach I'm describing uses type information to give substantially 
more information to a static compiler.  The result of this is that the 
codegen would be dramatically more efficient, and without the overhead of 
an interpreter loop, this is far more significant than in his evaluation. 
Also, significantly more type information would be available to the type 
propagator if you used more aggressive techniques than he did.

To me, the much more significant issue with python (and ruby, and others) 
for integer operations is that they automatically promote to big integers 
on demand.  In practice, this means that the code generated by a static 
compiler would be optimized for the small case, but then would fall back 
gracefully in the large case.  If you're familiar with X86 asm, I'd expect 
code for a series of adds to look like this:

   add EAX, EBX
   jo recover_1
   add EAX, 17
   jo recover_2
   add EAX, ECX
   jo recover_3

The idea of this is that (in the common case where the app deals with 
small integers) you have *extremely* fast codegen with easily 
predictable branches.  The recovery code could, for example, package up 
the registers into real "integer" objects, and then call back into the 
interpreter to handle the hard case (note that this is very similar to the 
fast paths in the standard python interpreter, which eventually falls back 
to calling PyNumber_Add).

Floating point code, if type inference is successful, does not need this 
sort of recovery code (unlike integer code). Other languages (like 
Javascript) don't have this issue, but they have others (e.g. there are no 
integers :), only floats.

Of course I'm biased, but I think that this sort of code could easily and 
naturally be generated and optimized by LLVM, if you chose to target it. 
This would let LLVM do the range analysis required to eliminate the 
branches (when possible), handle other low level optimization, codegen, 
etc.  You would get extremely high performance code and a very 
retargetable compiler.

Has python even been executed with a sustained performance of two 
instructions per add? :)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
