[pypy-dev] Questions from a lurker
Carl Friedrich Bolz
cfbolz at gmx.de
Sat Jul 16 12:55:01 CEST 2005
Arthur Peters wrote:
> Hello, I am Arthur. I've been lurking on the list for a month or so and
> I've had a good bit of fun playing with the code.
Welcome to PyPy! I'll try to take a stab at some of the questions.
Corrections are welcome.
> My questions are:
> - Why did you start a second LLVM backend? Was the first one badly miss
When I wrote the first LLVM backend, the RTyper did not exist yet. This
resulted in a big duplication of work: I basically implemented a lot of
things at the LLVM level (like lists, tuples, classes...) which would have
been useful for other C-ish backends as well. Later the RTyper was
written, which made all that work superfluous because the RTyper does
exactly that: it implements all those things at a level where every other
C-ish backend can use them. Eric and I tried to adapt the old LLVM
backend to the new model, but it didn't work very well (because the
assumptions were totally different). Thus Holger and I started the new
LLVM backend one weekend.
> - So I take it the initial goal will be to translate pypy into C (or
> LLVM) and create a binary executable. At that point will it be possible
> to translate any given program or will the translator still only work
> with a subset of the python language? Will the final goal be a JIT VM
> for python or will I at some point be able to statically compile python
> to machine code?
The translator will always only work on a subset of Python. It's
probably not really possible to statically compile arbitrary Python
code to machine code (at least not without cheating and using the whole
interpreter in the machine code again :-) ).
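To make the "subset of Python" point concrete, here is a small hedged sketch (not taken from PyPy's actual restrictions, just an illustration of the kind of thing a static translator can and cannot handle): code whose types stay consistent is statically compilable, while code whose types depend on runtime data is not.

```python
# Illustrative sketch only -- not PyPy's real restricted subset.

def translatable(n):
    # Types stay consistent: n is always an int, total is always an int,
    # so a translator can infer everything statically.
    total = 0
    for i in range(n):
        total += i
    return total

def not_translatable(flag):
    # The type of x depends on runtime data, so static compilation
    # would effectively need the whole interpreter along for the ride.
    x = 1 if flag else "one"
    return x
```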
> - Will there be support for multiple specialized versions of functions
> like in psyco? I know this is a long way down the road but I'm curious
> what people think.
The idea is to integrate Psyco's ideas, yes. Although strictly speaking
there are not multiple specialized versions of a function in Psyco; it's
rather one function that does different things depending on the types of
its arguments.
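As a hedged sketch of that idea (this is not Psyco's actual machinery, just a toy model): a single function can dispatch to behavior specialized for the argument types it actually sees, caching each specialization so the cost is paid once per type.

```python
# Toy model of type-driven specialization -- illustrative names only.

_specialized = {}  # maps argument type -> specialized implementation

def add(a, b):
    key = type(a)
    impl = _specialized.get(key)
    if impl is None:
        # "Compile" a version specialized for this type, once.
        if key is int:
            impl = lambda x, y: x + y          # fast integer path
        else:
            impl = lambda x, y: x.__add__(y)   # generic fallback
        _specialized[key] = impl
    return impl(a, b)
```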
> - I read some of the discussion of the GIL as it relates to pypy and I agree
> that the GIL needs to be implemented and that a different thread model
> might be a good way to go. However I thought of the following: Would it
> be possible to detect when a section of code only uses local variables
> and unlock the GIL? This seems possible because local variables
> cannot be shared between threads anyway. In addition, local variables
> will likely be translated into more basic types than a python object (native
> ints and floats for instance), which would not require any interaction
> with the object-space to manipulate (not sure about that usage of the
> term "object-space"). Thoughts?
First a comment about the usage of the term "object space": I think you
are mixing levels here (and I might misunderstand you). The PyPy
interpreter (meaning the bytecode interpreter plus the standard
objectspace) gets translated to low level code. This "interpreter level
code" deals with the standard object space as a regular class instance
that is in principle no different from any other class. The
classes that appear at interpreter level are all translated to more
basic types (what would be the alternative? there is no layer below that
could deal with anything else), probably to something like a struct.
Thus the operations of objects at this level never need to be done via
the object space -- the object space is rather a regular object in itself.
The object space /is/ used, if you interpret a Python program with
PyPy's interpreter. The bytecode interpreter does not know how to deal
with /any/ object (except for True and False), it has to delegate to the
object space for every single operation. Even for basic types like ints
and such -- at this level ("application level") there isn't any type
inference or anything like that!
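The delegation described above can be sketched as follows. This is a deliberately trivial stand-in, not PyPy's real object space code, though the `wrap`/`unwrap`/`add` method names are modeled on the object space interface: the bytecode interpreter never touches application-level values directly, it hands every operation, even `2 + 3`, to the space.

```python
# Hedged sketch of object-space delegation -- names are illustrative.

class TrivialObjectSpace:
    def wrap(self, value):
        # Turn an interpreter-level value into an application-level object.
        return ("wrapped", value)

    def unwrap(self, w_obj):
        return w_obj[1]

    def add(self, w_a, w_b):
        # Even basic int addition is delegated here by the
        # bytecode interpreter; it never adds values itself.
        return self.wrap(self.unwrap(w_a) + self.unwrap(w_b))

space = TrivialObjectSpace()
w_result = space.add(space.wrap(2), space.wrap(3))
```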
Now on threading: I'm not really the right person to say much about it.
As far as I understand it, the general idea is that we don't want to
clutter our whole interpreter implementation with threading details.
Instead threading is supposed to become a translation aspect: The
translation process is meant to "weave" the threading model into the
translated interpreter. This would have a lot of advantages: Instead of
being stuck with a single threading model which is deeply integrated
into all the parts of our interpreter and hard to get rid of again, we
can change it by changing a small localized part of whatever (probably
the translator). Thus it would be possible to translate an interpreter
which uses a GIL -- appropriate for an environment where threads are
rarely used. Or we could translate an interpreter with, say, more finely
grained locking which would be slower for a single threaded program but
could speed up applications with multiple threads.
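One way to picture "threading as a translation aspect" is the following hedged sketch (purely illustrative, nothing like the actual weaving machinery): the interpreter core is written against one abstract locking interface, and the translation step picks which concrete policy gets woven in.

```python
import threading

# Illustrative sketch: two interchangeable locking policies.

class GILPolicy:
    """One global lock: simple, fast for single-threaded programs."""
    def __init__(self):
        self._gil = threading.Lock()

    def around(self, operation, *args):
        with self._gil:
            return operation(*args)

class FineGrainedPolicy:
    """Per-object locks: more overhead per operation, but
    concurrent threads rarely contend with each other."""
    def __init__(self):
        self._locks = {}

    def around(self, operation, *args):
        key = id(args[0]) if args else 0
        lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            return operation(*args)

# The "translation" step would weave exactly one policy into the
# interpreter; the interpreter source itself stays policy-free.
policy = GILPolicy()
result = policy.around(lambda a, b: a + b, 2, 3)
```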
Hope that helped a bit and wasn't too confused :-).