Possible student project on PyPy
Hi, for a student project we are evaluating the possibility of experimenting with some ideas on PyPy. Even though it is a student project, we have high expectations (our aim is to really improve PyPy's speed, if possible). We are still trying to choose which features to implement (we started with some nice ideas, but some of them are already implemented). We will work for something like 2 months on a separate branch - maybe we can give read-only access to the source code repository, but write access would be a problem for our exam, for obvious reasons. Would it be possible for you to host a development branch for this project?

To let you know who we are, I'll introduce us:
- I, Paolo 'Blaisorblade' Giarrusso, am a past Linux kernel hacker; I have also worked with Java, C++, and a bit of Python, and I am currently a graduate student at Aarhus University, Denmark;
- Sigurd is a PhD candidate at Aarhus University, Denmark, currently working, among other things, on a cryptography research project in Python (viff.dk);
- our professor is Lars Bak, the lead architect of the Google V8 JavaScript engine, on which we implemented various optimizations in the previous months.

We are obviously open to suggestions, and we have been looking at the status blog and at various blogs. It seems that there is still room for improvement in the area of garbage collectors, as mentioned here:
- http://codespeak.net/pypy/extradoc/talk/osdc2008/paper.txt
Working on that could be interesting.

The main idea we wanted to apply to Python, based on a suggestion from Lars, is V8's most distinctive one: hidden classes with transitions. This avoids dictionary lookups for property accesses on objects, using instead a Java-like data representation in memory, without any visible effect on Python's semantics. While revising this email, we saw that this is already partially implemented: from what the blog says, the memory savings were achieved, but not the corresponding speed-ups.
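The hidden-class idea being discussed can be sketched in plain Python (all names here are hypothetical; V8's real implementation is in C++). Objects built by adding the same attributes in the same order end up sharing one map, and each attribute lives at a fixed index in a flat storage list:

```python
class Map:
    """A shared 'hidden class': maps attribute names to storage indexes."""
    def __init__(self, attrs=()):
        self.indexes = {name: i for i, name in enumerate(attrs)}
        self.transitions = {}  # attr name -> successor Map after adding it

    def with_attr(self, name):
        # Adding the same attribute in the same order always yields the
        # same successor map, so objects built alike share one map object.
        if name not in self.transitions:
            self.transitions[name] = Map(tuple(self.indexes) + (name,))
        return self.transitions[name]

EMPTY_MAP = Map()

class Obj:
    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []  # flat list of values, no per-object dictionary

    def set_attr(self, name, value):
        idx = self.map.indexes.get(name)
        if idx is None:
            self.map = self.map.with_attr(name)  # follow (or create) a transition
            self.storage.append(value)
        else:
            self.storage[idx] = value

    def get_attr(self, name):
        return self.storage[self.map.indexes[name]]
```

Two objects that go through the same sequence of attribute additions share the identical `Map`, which is what later makes inline caching on the map pointer possible.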
Also, the blog does not mention whether and how class transitions are used. Some of the initial ideas below may already have been implemented, so please point out what is there and what could still be interesting. I've also read the discussion about the JIT, which will be missing from the next release.

Obviously, this would require partially reworking PyPy's core object model. Still, we think that specialization would especially benefit from it, because it allows specializing not only on primitive types but also on class types. It is not clear to me how you handle changes (like the addition of a property) to a single object - is it possible to specialize code on the (anonymous) type of such a new object? Can you give me pointers into the documentation, if this is already explained? I've read the EU reports about this and about partial compilation, but it is still unclear to us how things work.

Btw, the first thing to do is to study the object model and reflection capabilities of Python. Since Lars said that the goal of a VM designer is to let programmers work in the most pleasant way and use advanced features without incurring any cost, our development plan should try to optimize reflective features as much as possible, too. Just as an example, in Python you can register a handler for unknown method calls on an object (I don't remember how, but I'm pretty sure something like this exists). This means that our design should be able to do inline caching by storing a call to this handler (maybe we'll defer the implementation, but the initial design should allow for it, IMHO).

Any comments? -- Paolo Giarrusso
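The hook Paolo half-remembers is Python's `__getattr__` special method (Armin names it further down in the thread); it is invoked only when normal attribute lookup fails. A minimal illustration, with a hypothetical `Proxy` class:

```python
class Proxy:
    """Forwards any attribute not defined on Proxy to a wrapped target."""
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for every
        # attribute that is not defined on Proxy itself.
        return getattr(self._target, name)

p = Proxy([1, 2, 3])
p.append(4)  # 'append' is not on Proxy, so __getattr__ forwards it to the list
```

This is exactly the kind of dynamic fallback an inline-caching design has to leave room for: the cached target of a "method call" may be such a handler rather than an ordinary method.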
Hi Paolo,

Thanks for the introduction e-mail. I'd like to answer it more directly, but maybe I should first give a quick overview of what I think are possible misunderstandings. The main issue is that, unlike Google V8, in PyPy there is no machine code generation at all (ignoring the JIT compiler generator for the first part of this e-mail). That means that the notion of an inline cache simply doesn't make sense there. Let me explain this in more detail.

What we implemented long ago in PyPy is quite similar to the "maps" of the Self language. It is a dictionary implementation that stores the keys in a separate, shared "map" object, and the values in a flat list. In PyPy this still presents the interface of a full-featured dictionary, so it supports adding and removing arbitrary elements, which is done by changing the dictionary's current map. This saves memory, as you know. But I don't see how it can provide much speed-up in a pure interpreter (as opposed to generated machine code with inline caches). It seems to me that whatever you do, you need to perform a lookup *somewhere* in order to implement the "getattr" operation. The lookup can be done directly in the object's dictionary, or in the shadow "map" object, but in both cases it needs to be done at runtime for every attribute access.

There are probably advanced ways to cache recent lookups, which we haven't tried so far, like attaching to each GETATTR opcode in the Python bytecode a tiny structure that caches the result of the most recent lookup performed at that source code position. I don't know how much speed-up this would give; in fact, it is rather unclear to me why people think that dictionary lookups are seriously bad.
Of course they are more expensive than just reading an item out of a list, but I don't think it's worse than roughly 3 times slower; this number looks big in the context of machine code, but insignificant in the context of a complicated bytecode interpreter. I'm ready to be proven wrong, of course.

However, this is only half of the story. Another very important point is that the mid-term goal for PyPy is to again have a JIT compiler generator (it had one in the 1.0 release, but it was far from providing consistent speed-ups, which is why we are re-experimenting at the moment). In the context of our JIT, the inline cache technique you describe - as well as the polymorphic inline caches of Java VMs - are special cases of a generic construct called "promotion" in [1] or "unlifting" in Psyco [2]. Using this construct, the JIT compiler generator takes as input our "dictionary-with-maps"-enabled interpreter and produces as output a JIT compiler that itself emits inline caches similar to Google V8's.

Another way to put it is that in PyPy the real purpose of our "maps", besides saving memory, is to be JIT-generator-friendly, enabling good speed-ups in the machine code generated by the JIT (and not in the pure interpreter). Thus, what we implemented in the interpreter is already enough; the main missing piece is a (completely generic) JIT generator producing a consistent JIT. Note that this is not just an abstract wish: such inline caches are what Psyco emits already, and the automatically generated JIT of PyPy 1.0 definitely did that too.

To repeat a point that we have tried to make many times but apparently keep failing to: in the context of the PyPy JIT there is no point in thinking deeply about the various semantics of Python specifically, like reflection, operator overloading, the handlers for sends to unsupported methods (that's __getattr__ in Python), and so on.
These semantics are encoded in the interpreter already, where the JIT compiler generator can see them and use them to produce a JIT that emits reasonably efficient machine code for a given user program. The relevant question in this context is not how to encode some complicated Python semantics in the machine code; it is to identify the cases where the generated JIT emits definitely inefficient code, and how to tweak the interpreter to be more JIT-friendly (e.g. by adding maps to the interpreter). To make it clearer that thinking about advanced Python semantics is not useful in this context: the "Python" in the previous paragraph can be replaced by any language, preferably a dynamic one, and we still get a JIT for it (e.g. we did it for Prolog, and we have partial or in-progress interpreters for JavaScript, Smalltalk, and other languages).

Does the above make any sense? There are three relatively distinct areas to work on - tweaking the interpreter to maximize its speed, tweaking it to maximize its JIT-friendliness (better done after the JIT generator is ready), or helping develop the JIT generator itself; I hope the above helped clarify this aspect of PyPy a bit. In any case, we would definitely be interested in working with you in these areas, or in the other areas that you mentioned in your e-mail (GC...). Feel free to discuss matters further, here or on IRC (#pypy on irc.freenode.net).

[1] http://codespeak.net/pypy/dist/pypy/doc/jit/
[2] http://psyco.sourceforge.net/doc.html [PEPM'04 paper]

A bientot,
Armin.
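The per-opcode lookup cache Armin sketches can be illustrated in a few lines of Python (names hypothetical; a real interpreter would attach one cache slot to each GETATTR bytecode position and would also have to invalidate or re-key it correctly). The cache is keyed on the object's shared map, so a hit skips the dictionary lookup entirely:

```python
class Map:
    """Shared attribute layout: attr name -> index into the flat storage."""
    def __init__(self, indexes):
        self.indexes = indexes

class Obj:
    def __init__(self, map_, values):
        self.map = map_
        self.storage = list(values)

class LookupCache:
    """One slot per GETATTR bytecode position (monomorphic cache sketch)."""
    def __init__(self):
        self.cached_map = None
        self.cached_index = -1

def getattr_cached(obj, name, cache):
    # Fast path: same map as the last object seen at this bytecode
    # position, so the storage index is known without any dict lookup.
    if obj.map is cache.cached_map:
        return obj.storage[cache.cached_index]
    # Slow path: do the real lookup, then fill the cache for next time.
    index = obj.map.indexes[name]
    cache.cached_map = obj.map
    cache.cached_index = index
    return obj.storage[index]
```

Under a JIT, "promotion" turns the `obj.map is cache.cached_map` test into a guard in the generated machine code, which is how the same maps yield V8-style inline caches for free.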
Paolo Giarrusso wrote:
Hi, for a student project, we are evaluating the possibility to experiment with some ideas on PyPy.
Even if it is a student project, we have high expectations (our aim is to really improve PyPy's speed, if possible). But we are still trying to choose which features to implement (we started with some nice ideas, but part of them is already implemented).
We will work for something like 2 months on a separate branch - maybe we can give readonly access to the source code repository, but write access would be a problem for our exam, for obvious reasons.
Do you have any possibility to host a development branch for this project?
To let you know who we are, I'll present us:
- I, Paolo 'Blaisorblade' Giarrusso, am a past Linux kernel hacker, I worked also with Java, C++, and a bit of Python, and I am currently a graduate student at Aarhus University, Denmark; - Sigurd is a PhD candidate at Aarhus University, Denmark, currently working, among other things, on a cryptography research project in Python (viff.dk).
- our professor is Lars Bak, the lead architect of the Google V8 Javascript engine, on which we implemented various optimizations in the previous months.
We are obviously open to suggestions, and we have been looking at the status blog and at various blogs.
It seems that there is still space for improvement in the space of garbage collectors, as mentioned here: - http://codespeak.net/pypy/extradoc/talk/osdc2008/paper.txt working on that could be interesting.
Just to go in a slightly different direction from what Armin already wrote: yes, we would be very interested in having somebody work on our GCs. They currently work, but they are not very advanced in many respects. So if somebody is interested in having a go at them, that would probably be quite useful for PyPy. I can't really say how interesting it would be research-wise, though.

Cheers,
Carl Friedrich
On Wed, Oct 29, 2008 at 15:40, Carl Friedrich Bolz <cfbolz@gmx.de> wrote:
Paolo Giarrusso wrote:
Hi, for a student project, we are evaluating the possibility to experiment with some ideas on PyPy.
I'm sorry to reply so late - basically, after consulting with the professors, we were advised to start out on our own, to really learn how to write a VM and make it fast, before moving on to more ambitious projects. The target we have settled on for now is to write a Python bytecode interpreter using threading and code-copying, and then to work on the GC. We'll see if we get any interesting results (but since we are not aiming to implement the full Python semantics, I guess it will be pretty easy to be faster than a full-blown interpreter).

Regards
-- Paolo Giarrusso
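"Threading" here refers to threaded-code dispatch, not concurrency: the bytecode is translated once so that each instruction points directly at its handler, and the main loop never decodes opcodes. Real implementations do this with computed gotos in C (code-copying goes further and pastes the handlers' machine code together), but the resolve-once idea can be sketched in Python with hypothetical opcode names:

```python
# One handler function per opcode (names and opcode set are hypothetical).
def op_push(vm, arg):
    vm.stack.append(arg)

def op_add(vm, arg):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a + b)

OPCODES = {"PUSH": op_push, "ADD": op_add}

class VM:
    def __init__(self):
        self.stack = []

def translate(program):
    # The "threading" step: resolve each opcode name to its handler once,
    # so the main loop below never looks up or decodes opcodes again.
    return [(OPCODES[name], arg) for name, arg in program]

def run(vm, threaded):
    for handler, arg in threaded:  # pure dispatch, no decoding
        handler(vm, arg)
    return vm.stack[-1]
```

In C the same structure becomes an array of label or function addresses, which removes the central switch and its branch misprediction cost from the dispatch loop.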
participants (3)
- Armin Rigo
- Carl Friedrich Bolz
- Paolo Giarrusso