Hi Carl!

On Thu, Feb 03, 2005 at 03:59 +0100, Carl Friedrich Bolz wrote:
I've [1] been following the pypy-dev mailing list for quite some time now and am really excited about this project. This weekend I checked out the code and started to play around with it a bit. Since there has been some talk about adding an LLVM backend and since this doesn't seem to have happened, I decided to take a stab at it. I installed LLVM (which really is a pain), read the LLVM documentation and started to write a (very rudimentary) genllvm.py. It can already generate LLVM-assembler for simple functions (e.g. just ints, no function calls, no default arguments...). Then a Pyrex-wrapper for the functions is generated so that they can be imported.
sounds good. Before i comment further a disclaimer: i think that Armin, Samuele, Michael or Christian or others can probably provide more in-depth comments, since i haven't worked much on the current translator/annotator codebase. One of the obstacles regarding LLVM is indeed its installation process, as is often the case with large C++ codebases not packaged by the distributions. If we want to use LLVM more, then we should try to provide supplemental installation instructions i guess.
For the function snippet.my_gcd the following LLVM-assembler code is generated:
    int %my_gcd(int %a_2, int %b_3) {
    block0:
        %r_7 = call int %mod(int %a_2, int %b_3)
        br label %block1
    block3:
        %a_29 = phi int [%a_8, %block1]
        %b_30 = phi int [%b_9, %block1]
        %r_31 = phi int [%r_10, %block1]
        %v32 = phi bool [%v11, %block1]
        %r_21 = call int %mod(int %b_30, int %r_31)
        br label %block1
    block2:
        %v4 = phi int [%b_9, %block1]
        ret int %v4
    block1:
        %a_8 = phi int [%a_2, %block0], [%b_30, %block3]
        %b_9 = phi int [%b_3, %block0], [%r_31, %block3]
        %r_10 = phi int [%r_7, %block0], [%r_21, %block3]
        %v11 = call bool %is_true(int %r_10)
        br bool %v11, label %block3, label %block2
    }
Note that this exactly mirrors the flowgraph. I just use function calls for all SpaceOperations (though some probably have to be special-cased later). It is not necessary to rename these functions since LLVM considers functions to be different if their signatures differ.
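For readers without the PyPy checkout at hand: the listing is consistent with a Python function along the following lines. This is a reconstruction from the generated assembler (block0 computes the first remainder, block1 tests it, block3 is the loop body, block2 returns), so treat it as an assumption about what snippet.my_gcd looks like, not a verbatim copy.

```python
def my_gcd(a, b):
    # block0: first remainder, corresponds to the initial call to %mod
    r = a % b
    # block1: loop test, corresponds to the call to %is_true on r
    while r:
        # block3: shift the values along and compute the next remainder
        a = b
        b = r
        r = a % b
    # block2: the result is the current value of b
    return b
```

Each variable that is live across the loop (a, b, r) shows up as one phi node at the top of block1, with one incoming value from block0 and one from block3.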
(... nice example ...)
This is then compiled to native code. At the moment I'm not using the LLVM-API to generate this code since it was simpler to just do the string shuffling in Python than to wrap and learn the LLVM-API.
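To give an idea of what such string shuffling can look like, here is a minimal sketch. It is not the actual genllvm.py code; the names Op, emit_call and emit_phi are hypothetical, and the output follows the LLVM-assembler syntax of the listing quoted in this mail, not necessarily that of other LLVM versions.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a flow graph operation.
# args is a list of (type, name) pairs.
Op = namedtuple("Op", "result rettype opname args")


def emit_call(op):
    """Render one operation as a call to a helper function of the same name."""
    arglist = ", ".join("%s %%%s" % (typ, name) for typ, name in op.args)
    return "%%%s = call %s %%%s(%s)" % (op.result, op.rettype, op.opname, arglist)


def emit_phi(result, typ, incoming):
    """Render a phi node; incoming is a list of (value, predecessor block) pairs."""
    pairs = ", ".join("[%%%s, %%%s]" % (val, blk) for val, blk in incoming)
    return "%%%s = phi %s %s" % (result, typ, pairs)


if __name__ == "__main__":
    # Reproduce two lines of the my_gcd listing.
    print(emit_call(Op("r_7", "int", "mod", [("int", "a_2"), ("int", "b_3")])))
    print(emit_phi("a_8", "int", [("a_2", "block0"), ("b_30", "block3")]))
```

The point of the sketch is only that every operation becomes one formatted line and every block input becomes one phi node, so no LLVM bindings are needed at generation time.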
hehe.
In my opinion this can be extended to nearly all of Python's data types as long as the annotation succeeds. I cannot yet judge whether other things like classes, exception handling, garbage collection etc. will be easy but we shall see.
Oh don't worry, the other backends don't care too much for this, either :-) The Pyrex and GenC backends still cooperate with the CPython runtime and borrow its garbage collection among other things. Once we target a standalone version (without the CPython runtime), garbage collection needs to be done. (Usually at this point someone drops in the two words "Boehm collector" :-) Exceptions get analyzed by the flow space in a way that makes generation of low-level code rather straightforward.
As for the code: It is quite convoluted and ad hoc; I need to clean it up, write some more tests (I already wrote some) and extend it a bit before it is fit for someone else to see. Should I just post it or apply for checkin rights?
Apply for checkin rights, i'd say. I guess you are aware of at least our coding-style document http://codespeak.net/pypy/index.cgi?doc/coding-style and of the fact that we generally want MIT-licensed (BSD-licensed) code. If so, how about you send me privately your desired account name?
What do you all think? Does my approach make sense or are there some obvious problems that I didn't see?
I think it makes sense. It would be great to have you at one of our next sprints and further explore the LLVM backend, i think. Btw, Armin and Christian intend to do cleanup work on the translator backends and it is sensible to already have LLVM in mind.

What i would like to find out is if we could use a stripped down version of LLVM because i also guess that many supplemental (code generation) tasks are better done in Python than with wrapping and using some of the LLVM API. I guess i am going to download and try-installing the thing again :-)

cheers,

holger