[pypy-dev] RE: LLVM backend

Thu Feb 17 11:08:53 CET 2005

Hi Chris,

On Wed, 16 Feb 2005 23:29:01 -0600 (CST), Chris Lattner <sabre at nondot.org>
wrote:
> 
> <sorry I'm joining in late here>
> 
> Carl Friedrich Bolz wrote:
> > I just checked my LLVM-backend in (I hope I did nothing wrong). It 
> > resides in pypy/translator/llvm.
> 
> Wow cool, I had no idea that you guys were this far along!

Depends on what you call 'this' :-). At the moment I can only compile a
small subset of RPython. RPython itself is an informally defined restricted
subset of Python that is designed to be easy to compile (Disclaimer: I'm not
really qualified to explain all this since I started with pypy rather
recently). For example, a variable should only hold values of a single type.
The RPython source code is then converted to a flow graph in SSA form. The
types of the variables in the flow graph can then be inferred, if the
argument types of the entry functions are given. This is all done by some
parts of pypy. The LLVM-backend now has a very easy job: A flow graph in SSA
form with type-annotations maps directly to LLVM code.
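
To make the "maps directly" point concrete, here is a minimal sketch in
Python. It is not the code in pypy/translator/llvm: the Op class, the two
lookup tables and emit_function are hypothetical names made up for this
illustration, and the output only approximates the LLVM assembly syntax of
the time. It assumes a single basic block whose operations already carry
their inferred types.

```python
from dataclasses import dataclass

# Hypothetical sketch: these are NOT PyPy's real flow-graph classes.
# Assumed mapping from annotated types to LLVM type names.
LLVM_TYPES = {"int": "int", "float": "double", "bool": "bool"}
# Assumed mapping from flow-graph operation names to LLVM instructions.
LLVM_OPS = {"add": "add", "sub": "sub", "mul": "mul"}


@dataclass
class Op:
    result: str   # SSA variable defined by this operation
    opname: str   # operation name, e.g. "add"
    args: tuple   # names of the SSA variables used as operands
    type: str     # annotated (inferred) type of the result


def emit_function(name, params, ops, ret):
    """Emit LLVM-style text for a function with one basic block.

    params is a list of (name, type) pairs; ret names the returned variable.
    """
    types = dict(params)
    for op in ops:
        types[op.result] = op.type
    ret_type = LLVM_TYPES[types[ret]]
    arglist = ", ".join(f"{LLVM_TYPES[t]} %{n}" for n, t in params)
    lines = [f"{ret_type} %{name}({arglist}) {{"]
    for op in ops:
        operands = ", ".join(f"%{a}" for a in op.args)
        # One SSA operation becomes exactly one LLVM instruction.
        lines.append(
            f"  %{op.result} = {LLVM_OPS[op.opname]} "
            f"{LLVM_TYPES[op.type]} {operands}"
        )
    lines.append(f"  ret {ret_type} %{ret}")
    lines.append("}")
    return "\n".join(lines)


example_ops = [
    Op("v0", "add", ("a", "b"), "int"),
    Op("v1", "mul", ("v0", "a"), "int"),
]
print(emit_function("f", [("a", "int"), ("b", "int")], example_ops, "v1"))
```

Since every variable is assigned exactly once and already has a type, the
emitter needs no analysis of its own; each operation is translated in
isolation.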

[snip]
> 
> > - I think there should be some more intelligent way to produce the
> >   necessary LLVM-implementations for the space operations of more
> >   complex types than just writing them in LLVM-assembler, which can be
> >   quite tedious (it's no fun writing programs in SSA form).
> 
> Oh yeah, generating SSA is quite a pain.  The traditional way that LLVM 
> front-ends deal with this is to use 'alloca's for local variables and use 
> explicit loads/stores to them.  This generates really gross looking code, 
> but the LLVM optimizers (specifically mem2reg) rip them up.  For example, 
> for something simple like:
> 
>     X = 1;
>     ...
>       = X;
>     ...
>     X = 2;
> 
> You can generate code that looks like this:
> 
>    %X = alloca int   ;; in the entry block for the fn
> ...
>    store int 1, int* %X
> ...
>      = load int* %X
> ...
>    store int 2, int* %X
> ...
> 
> If you run this sort of code through the LLVM "-mem2reg" optimization, it 
> will promote all of these to SSA values, so you don't have to do it 
> yourself.  If the "..."'s contain control flow, this is a non-trivial 
> task. :)
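
For illustration, the alloca pattern described above could be produced by a
front-end along the following lines. This is only a sketch in Python: the
LocalEmitter class is a hypothetical helper (not part of LLVM or pypy), it
handles only int variables, and it emits text in the style of the example
above.

```python
class LocalEmitter:
    """Hypothetical helper: one stack slot per local variable."""

    def __init__(self):
        self.entry = []    # allocas, to be placed in the entry block
        self.body = []     # loads and stores, in program order
        self.slots = set()
        self.counter = 0

    def assign(self, var, value):
        # The first assignment creates the stack slot for the variable.
        if var not in self.slots:
            self.slots.add(var)
            self.entry.append(f"%{var} = alloca int")
        self.body.append(f"store int {value}, int* %{var}")

    def read(self, var):
        if var not in self.slots:
            raise NameError(f"{var} is read before it is assigned")
        # Every read loads into a fresh temporary, so no phi nodes are
        # needed here; mem2reg is expected to clean this up afterwards.
        self.counter += 1
        tmp = f"%tmp{self.counter}"
        self.body.append(f"{tmp} = load int* %{var}")
        return tmp

    def lines(self):
        return self.entry + self.body


emitter = LocalEmitter()
emitter.assign("X", 1)
first_read = emitter.read("X")
emitter.assign("X", 2)
print("\n".join(emitter.lines()))
```

The emitter never has to track which definition of X reaches a given use;
that bookkeeping is exactly what is left to the "-mem2reg" pass.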

The problem is not transforming RPython to SSA, since the pypy-tools do all
the difficult work. I was talking about the implementation of Python's more
interesting types like lists and dictionaries. All the methods of these
types have to be implemented somehow, which I did by hand in LLVM
assembler at first. This is what I meant when I talked about 'pain' ;-).

I still have no good solution for this. At the moment I do the following:
The methods of the list objects are implemented in C, with lists represented
as arrays of pointers to "object", and turned into LLVM code (by compiling
and disassembling it). The result is used as a kind of template: All the
occurrences of the pointers to "object" are replaced by the type of the
values the list is supposed to hold. This sounds rather brittle but works
quite well at the moment.
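
The substitution step can be sketched roughly as follows. The type name
"%struct.object*", the template text and the specialize function are
placeholders invented for this illustration; the real disassembled code and
the names in it differ. Renaming the struct and the functions per element
type is an extra assumption of the sketch, so that several specializations
could live in one module.

```python
import re

# Assumed spelling of "pointer to object" in the disassembled C code.
OBJECT_PTR = "%struct.object*"

# Illustrative stand-in for the LLVM assembly obtained from the C code.
TEMPLATE = """\
%struct.list = type { uint, %struct.object** }
declare %struct.object* %list_getitem(%struct.list*, uint)
declare void %list_append(%struct.list*, %struct.object*)
"""


def specialize(template, elem_type, suffix):
    """Return the template with "pointer to object" replaced by elem_type."""
    text = template.replace(OBJECT_PTR, elem_type)
    # Give each specialization its own names (an assumption of this sketch).
    text = text.replace("%struct.list", f"%struct.list_{suffix}")
    text = re.sub(r"%list_(\w+)", rf"%list_{suffix}_\1", text)
    return text


int_list_code = specialize(TEMPLATE, "int", "int")
print(int_list_code)
```

Because this is a purely textual replacement, it relies on the placeholder
type never appearing with any other meaning in the template, which is where
the brittleness comes from.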

I will probably run into limitations with this later. For example, if I
implement exceptions (which should not be too complicated using invoke and
unwind) I can't raise them from within the C code that produces the list
implementation.

[snip]
> Another thought: I see that you're currently using llc to build your 
> programs, have you considered using the LLVM JIT?

At the moment I don't produce standalone programs but rather shared
libraries that can be loaded into Python as modules to get access to the
LLVM-compiled functions. So I really need to use llc.

> 
> Anyway, if you have any questions or run into problems, again, we'd love 
> to help, just let us know. :)
> 
> -Chris

I'll do that. Thanks a lot.

Regards,

Carl Friedrich