
Hello Chris,

On Fri, Oct 31, 2003 at 10:58:45AM -0600, Chris Lattner wrote:
Great!
It is central to Psyco, the Python just-in-time specializer (http://psyco.sourceforge.net), whose techniques we plan to integrate with PyPy. Unlike environments such as Self, which collect execution profiles during interpretation and use them to recompile whole functions, Psyco has no interpretation stage: it directly emits a basic block and runs it; the values found at run-time trigger the compilation of more basic blocks, which are run, and so on. Each function's machine code is thus a dynamic network of basic blocks, each a specialized version of some part of the original function. This network is not statically known, in particular because basic blocks often have a "switch" exit based on some value or type collected at run-time. Every new value encountered at this point triggers the compilation of a new switch case jumping to a new basic block. We will also certainly consider Self-style recompilations, as they allow more aggressive optimizations. (Register allocation in Psyco is done with a simple round-robin scheme; code generation is very fast.)
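In plain C++, the switch-exit discipline looks roughly like this. It is only a minimal sketch of the idea, not Psyco code: the type tags and the "compiler" below are stand-ins (Psyco emits real machine code and keys on actual Python types).

    #include <cstdio>
    #include <map>

    // A sketch of a "switch" exit: compiled blocks are specialized on a
    // run-time type tag; the first time an unseen tag reaches the exit,
    // a new case is "compiled", added to the switch, and jumped to.

    enum TypeTag { TAG_INT, TAG_FLOAT };

    typedef void (*BlockFn)(void *value);   // a specialized basic block

    static void block_int(void *v)   { std::printf("int block: %d\n", *(int *)v); }
    static void block_float(void *v) { std::printf("float block: %g\n", *(double *)v); }

    // Stand-in for the real compiler: here we just select a pre-written
    // specialized block so the sketch stays runnable.
    static BlockFn compile_case(TypeTag tag) {
        return tag == TAG_INT ? block_int : block_float;
    }

    static std::map<TypeTag, BlockFn> switch_cases;   // the growing "switch"

    // The switch exit itself: dispatch on the tag found at run time,
    // compiling a new case on first encounter.
    static void switch_exit(TypeTag tag, void *value) {
        std::map<TypeTag, BlockFn>::iterator it = switch_cases.find(tag);
        if (it == switch_cases.end())
            it = switch_cases.insert(std::make_pair(tag, compile_case(tag))).first;
        it->second(value);
    }

    int main() {
        int i = 42;       switch_exit(TAG_INT, &i);     // compiles the int case
        double d = 3.14;  switch_exit(TAG_FLOAT, &d);   // compiles the float case
        int j = 7;        switch_exit(TAG_INT, &j);     // reuses the int case
        return 0;
    }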
Well, as the C++ API is nice and clean, it is probably simpler to bind it directly to Python. We would probably go for Boost.Python, which makes C++ objects directly accessible to Python. But nothing is settled yet; maybe driving LLVM from LLVM code is closer to our needs. Is there a specific interface to do that? Is it possible to extract only the required code from LLVM and link it into the final executable? In my experience, there are a few limitations of C that require explicit assembly code, like building calls dynamically, i.e. the caller's equivalent of varargs (see the sketch after this message). See you soon, Armin.
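To make that last point concrete: the usual portable escape hatch, short of hand-written assembly, is libffi, which lets you describe a call signature at run-time and then perform the call. A minimal sketch, assuming libffi is available:

    #include <cstdio>
    #include <ffi.h>   // libffi: build calls whose signature is only known at run time

    static int add(int a, int b) { return a + b; }

    int main() {
        // Describe the signature at run time: (int, int) -> int.
        ffi_type *arg_types[2] = { &ffi_type_sint, &ffi_type_sint };
        ffi_cif cif;
        if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 2, &ffi_type_sint, arg_types) != FFI_OK)
            return 1;

        int a = 2, b = 3;
        void *arg_values[2] = { &a, &b };
        ffi_arg result;   // wide enough for any integral return value
        ffi_call(&cif, FFI_FN(add), &result, arg_values);

        std::printf("add(2, 3) = %d\n", (int)result);
        return 0;
    }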

Ok, makes sense.
That would be great! We've tossed around the idea of creating C bindings for LLVM, which would make interfacing from other languages easier than going through the C++ API.
Ok, I didn't know the boost bindings allowed calling C++ code from python. In retrospect, that makes a lot of sense. :)
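Presumably something along these lines, then (a made-up class just to picture the mechanism; not LLVM API):

    #include <boost/python.hpp>

    // A made-up C++ class, standing in for a real one.
    class Counter {
        int value_;
    public:
        Counter() : value_(0) {}
        void increment() { ++value_; }
        int get() const { return value_; }
    };

    // Expose the class to Python; this compiles into an extension
    // module importable as `demo`.
    BOOST_PYTHON_MODULE(demo) {
        using namespace boost::python;
        class_<Counter>("Counter")
            .def("increment", &Counter::increment)
            .def("get", &Counter::get);
    }

From Python it would then just be: import demo; c = demo.Counter(); c.increment(); print c.get()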
driving LLVM from LLVM code is closer to our needs. Is there a specific interface to do that?
Sure, what exactly do you mean by driving LLVM from LLVM code? The main interface for executing LLVM code is the ExecutionEngine interface:

http://llvm.cs.uiuc.edu/doxygen/classExecutionEngine.html

There are concrete implementations of this interface for the JIT and for the interpreter. Note that we will probably need to add some additional methods to this class to enable all of the functionality that you need (that's not a problem though :).
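Roughly, driving the engine looks like this. Treat it as an outline rather than exact code: the interface is still evolving, header paths and signatures have moved around between releases, and makeAdd1Module() below is just a placeholder for whatever builds your Module.

    #include "llvm/Module.h"
    #include "llvm/ExecutionEngine/ExecutionEngine.h"
    #include "llvm/ExecutionEngine/GenericValue.h"
    #include <vector>

    // Placeholder: builds a Module defining `int add1(int)`.
    llvm::Module *makeAdd1Module();

    void runThroughEngine() {
        llvm::Module *M = makeAdd1Module();

        // Concrete implementations of ExecutionEngine exist for both
        // the JIT and the interpreter; create() picks one for us.
        llvm::ExecutionEngine *EE = llvm::ExecutionEngine::create(M);

        // One way in: run a function through the portable interface.
        llvm::Function *F = M->getFunction("add1");
        std::vector<llvm::GenericValue> Args(1);
        Args[0].IntVal = 41;   // GenericValue's layout differs by release
        llvm::GenericValue Res = EE->runFunction(F, Args);
        (void)Res;

        // Another way in: JIT the function and call the raw pointer.
        int (*add1)(int) = (int (*)(int)) EE->getPointerToFunction(F);
        int result = add1(41);
        (void)result;
    }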
What do you mean by the "required code only"? LLVM itself is very modular; you only have to link in the libraries that you use. It's also very easy to slice and dice LLVM code from programs or functions, etc. For example, the simple 'extract' tool rips a function out of a module (though this is typically useful only when debugging)...

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Hello Chris,

On Sun, Nov 02, 2003 at 10:56:34AM -0600, Chris Lattner wrote:
Writing LLVM code that contains calls to the LLVM framework's compilation routines. Sorry if this is a newbie question, but are we supposed to be able to use all the classes like ExecutionEngine from LLVM code produced by our tools (as opposed to by the C++ front-end)? Or would that be a real hack? In other words, can we write a JIT in LLVM code? I understand this is not what you have in mind for Java, for example, where you'd rather write the JIT *for* but not *in* LLVM code.

In PyPy we are considering generating different versions of the low-level code. The first is a regular Python interpreter (I), similar in general design to interpreters written in C; it is the direct translation of the Python source code of PyPy. Now consider a clever meta-interpreter (M) that interprets (I), taking a user program (P) in Python as its argument. Note that we include the user program's run-time arguments in (P). Using feedback, (M) could specialize (I) to some partial information about (P); a typical choice is the user code plus the types of the user variables, but in general it is a more dynamic part of (P). Finally, consider specializing (M) itself statically for its first argument (I), as an optimization. The result is efficient low-level code that can dynamically instrument and compile any user program (P) (a tiny illustration follows this message).

This efficient low-level code can also be written by hand; it is what I did in Psyco. Now that I know exactly how such code must be written, it is not difficult to actually generate it out of the regular Python source code of (I), i.e. PyPy. We won't actually write (M).

See you soon, Armin.
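To picture the relationship between (I), (M) and (P), here is a deliberately tiny illustration in C++; it is nothing like the real code, just the classic partial-evaluation picture of an interpreter and its residual program.

    #include <cstdio>
    #include <vector>

    // A toy stack-machine interpreter (I): the generic loop that a
    // specializer (M) would feed with a user program (P).
    enum Op { PUSH, ADD, MUL };
    struct Instr { Op op; int arg; };

    static int interp(const std::vector<Instr> &prog) {
        std::vector<int> stack;
        for (size_t pc = 0; pc < prog.size(); ++pc) {
            switch (prog[pc].op) {
            case PUSH: stack.push_back(prog[pc].arg); break;
            case ADD: { int b = stack.back(); stack.pop_back();
                        stack.back() += b; break; }
            case MUL: { int b = stack.back(); stack.pop_back();
                        stack.back() *= b; break; }
            }
        }
        return stack.back();
    }

    // The residual code that specializing interp() for the fixed
    // program "PUSH x; PUSH 3; MUL; PUSH 1; ADD" would leave behind:
    // the dispatch loop and the stack evaporate, only the computation
    // on the run-time input remains.
    static int specialized(int x) { return x * 3 + 1; }

    int main() {
        std::vector<Instr> p;
        Instr i0 = { PUSH, 5 }; p.push_back(i0);
        Instr i1 = { PUSH, 3 }; p.push_back(i1);
        Instr i2 = { MUL, 0 };  p.push_back(i2);
        Instr i3 = { PUSH, 1 }; p.push_back(i3);
        Instr i4 = { ADD, 0 };  p.push_back(i4);
        std::printf("%d %d\n", interp(p), specialized(5));  // both print 16
        return 0;
    }

The point of (M) is to produce the specialized() form automatically, at run-time, from interp() and the program text.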

Oh, I see. :)
This is not something that we had considered or planned to do, but there is no reason it shouldn't work. LLVM compiled code follows the same ABI as the G++ compiler, so you can even mix and match translation units or libraries.
I understand this is not what you have in mind for Java, for example, where you'd rather write the JIT *for* but not *in* LLVM code.
Well, that's sort of true. You can ask Alkis for more details, but I think that he's writing the Java->LLVM converter in Java, which will mean that the converter is going to be compiled to LLVM as well. He's doing this work in the context of the Jikes RVM.
In PyPy we are considering generating different versions of the low-level code:
This should be doable. :) I've read up a little bit on Psyco, but I'm still not sure I understand the advantage of translating a basic block at a time. An easier way to tackle the above problem is to use an already-supported language for the bootstrap, but I think the above should work...

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
