[pypy-dev] Hi!

Thu Feb 3 09:46:08 CET 2005

Hi Carl! 

On Thu, Feb 03, 2005 at 03:59 +0100, Carl Friedrich Bolz wrote:
> I've [1] been following the pypy-dev mailing list for quite some time now
> and am really excited about this project. This weekend I checked out the
> code and started to play around with it a bit. Since there has been some
> talk about adding an LLVM backend and since this doesn't seem to have
> happened I decided to take a stab at it. I installed LLVM (which really
> is a pain), read the LLVM documentation and started to write a (very
> rudimentary) genllvm.py. It can already generate LLVM-assembler for
> simple functions (e.g. just ints, no function calls, no default
> arguments...). Then a Pyrex-wrapper for the functions is generated so
> that they can be imported.

sounds good. Before i comment further a disclaimer: i think
that Armin, Samuele, Michael, Christian or others can probably
provide more in-depth comments since i haven't worked much
on the current translator/annotator codebase.

One of the obstacles regarding LLVM is indeed its installation
process, as is often the case with large C++ codebases not
packaged by the distributions.  If we want to use LLVM more
then we should try to provide supplemental installation
instructions i guess. 

> For the function snippet.my_gcd the following LLVM-assembler code is
> generated:
> 
> int %my_gcd(int %a_2, int %b_3) {
> block0:
> 	%r_7 = call int %mod(int %a_2, int %b_3)
> 	br label %block1
> block3:
> 	%a_29 = phi int [%a_8, %block1]
> 	%b_30 = phi int [%b_9, %block1]
> 	%r_31 = phi int [%r_10, %block1]
> 	%v32 = phi bool [%v11, %block1]
> 	%r_21 = call int %mod(int %b_30, int %r_31)
> 	br label %block1
> block2:
> 	%v4 = phi int [%b_9, %block1]
> 	ret int %v4
> block1:
> 	%a_8 = phi int [%a_2, %block0], [%b_30, %block3]
> 	%b_9 = phi int [%b_3, %block0], [%r_31, %block3]
> 	%r_10 = phi int [%r_7, %block0], [%r_21, %block3]
> 	%v11 = call bool %is_true(int %r_10)
> 	br bool %v11, label %block3, label %block2
> }
> 
> Note that this exactly mirrors the flowgraph. I just use function calls
> for all SpaceOperations (though some probably have to be special-cased
> later). It is not necessary to rename these functions since LLVM
> considers functions to be different if their signatures differ.
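
For reference, the control flow of the blocks above can be read back as
ordinary Python. This is only a sketch reconstructed from the generated
assembler, not the actual snippet.my_gcd source; it assumes that %mod
corresponds to Python's % on ints and that %is_true is a plain truth test.

```python
def my_gcd(a, b):
    r = a % b                  # block0: %r_7 = mod(a, b)
    while r:                   # block1: %is_true(r) selects block3 or block2
        a, b, r = b, r, b % r  # block3 plus the phi nodes of block1
    return b                   # block2: ret int %v4
```

Each phi node in block1 simply picks the value coming from block0 on the
first pass and the value coming from block3 on every later pass.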

> (... nice example ...) 

> This is then compiled to native code. At the moment I'm not using the
> LLVM-API to generate this code since it was simpler to just do the string
> shuffling in python than to wrap and learn the LLVM-API.

hehe. 
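
The "string shuffling" approach can be sketched in a few lines of Python.
This is a hypothetical illustration, not code from genllvm.py: the helper
names emit_call and emit_function are made up, every operand is assumed to
be an int, and the output follows the assembler syntax quoted above.

```python
def emit_call(result, op, operands, restype="int"):
    """Render one operation as a call (all operands assumed to be ints)."""
    args = ", ".join(f"int %{name}" for name in operands)
    return f"%{result} = call {restype} %{op}({args})"


def emit_function(name, args, blocks):
    """Render a function; blocks is a list of (label, instructions) pairs."""
    signature = ", ".join(f"int %{arg}" for arg in args)
    lines = [f"int %{name}({signature}) {{"]
    for label, instructions in blocks:
        lines.append(f"{label}:")
        lines.extend("\t" + instruction for instruction in instructions)
    lines.append("}")
    return "\n".join(lines)


# Only the entry block of my_gcd, to keep the example short.
entry = [emit_call("r_7", "mod", ["a_2", "b_3"]), "br label %block1"]
print(emit_function("my_gcd", ["a_2", "b_3"], [("block0", entry)]))
```

The resulting text would then be handed to the LLVM tools, so nothing on
the Python side needs to link against the LLVM API.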

> In my opinion this can be extended to nearly all of Python's data types
> as long as the annotation succeeds. I cannot yet judge whether other things
> like classes, exception handling, garbage collection etc. will be easy but
> we shall see.

Oh don't worry, the other backends don't care too much for
this, either :-) The Pyrex and GenC backends still cooperate with the
CPython runtime and borrow its garbage collection among other
things.  Once we target a standalone version (without the CPython
runtime), garbage collection needs to be done. (Usually
at this point someone drops in the two words "Boehm collector" :-)
Exceptions get analyzed by the flow space in a way that makes
generation of low-level code rather straightforward. 

> As for the code: It is quite convoluted and ad-hoc, I need to clean it up,
> write some more tests (I already wrote some) and extend it a bit before it
> is fit for someone else to see. Should I just post it or apply for checkin
> rights?

Apply for checkin rights, i'd say.  I guess you are aware of at least 
our coding-style document http://codespeak.net/pypy/index.cgi?doc/coding-style
and of the fact that we generally want MIT-licensed (BSD-licensed) code.
If so, how about you send me privately your desired account name? 

> What do you all think? Does my approach make sense or are there some
> obvious problems that I didn't see?

I think it makes sense. It would be great to have you at one of our
next sprints and further explore the LLVM backend, i think.  Btw, 
Armin and Christian intend to do cleanup work on the translator 
backends and it is sensible to already have LLVM in mind. 

What i would like to find out is whether we could use a stripped
down version of LLVM, because i also guess that many
supplemental (code generation) tasks are better done in Python
than by wrapping and using parts of the LLVM API. I guess i am
going to download the thing and try installing it again :-)

cheers, 

    holger