
Hi,

for a student project, we are evaluating the possibility of experimenting with some ideas on PyPy. Even though it is a student project, we have high expectations (our aim is to really improve PyPy's speed, if possible). We are still trying to choose which features to implement (we started with some nice ideas, but some of them are already implemented).

We will work for about 2 months on a separate branch. Maybe we can give read-only access to the source code repository, but write access would be a problem for our exam, for obvious reasons. Would it be possible for you to host a development branch for this project?

To let you know who we are, let me introduce us:
- I, Paolo 'Blaisorblade' Giarrusso, am a past Linux kernel hacker; I have also worked with Java, C++, and a bit of Python, and I am currently a graduate student at Aarhus University, Denmark;
- Sigurd is a PhD candidate at Aarhus University, Denmark, currently working, among other things, on a cryptography research project in Python (viff.dk);
- our professor is Lars Bak, the lead architect of the Google V8 JavaScript engine, on which we implemented various optimizations in the previous months.

We are obviously open to suggestions, and we have been looking at the status blog and at various other blogs. It seems that there is still room for improvement in the area of garbage collectors, as mentioned here:
- http://codespeak.net/pypy/extradoc/talk/osdc2008/paper.txt
Working on that could be interesting.

The main idea we wanted to apply to Python, based on a suggestion from Lars, is the most distinctive V8 idea: hidden classes with transitions. That allows avoiding dictionary lookups for property accesses on objects, using instead a Java-like data representation in memory, without any visible effect on the Python semantics. While revising this email, we saw that this is already partially implemented: from what the blog says, the memory savings were achieved but not the speedups.
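
To make the hidden-class idea concrete, here is a minimal sketch in plain Python of what we have in mind. It is only an illustration under our own assumptions: the names HiddenClass, Obj and GetAttrSite are made up for this email, and the code is not taken from V8 or from PyPy. Objects that receive the same attributes in the same order end up sharing one hidden class, so an attribute read at a given call site can be served from a cached slot index instead of a dictionary lookup.

```python
class HiddenClass:
    """Layout descriptor shared by all objects built the same way (hypothetical name)."""

    def __init__(self, slots=None):
        self.slots = slots or {}   # attribute name -> index into Obj.storage
        self.transitions = {}      # attribute name -> successor HiddenClass

    def with_attr(self, name):
        # Follow an existing transition if there is one, so that objects
        # adding the same attribute from the same layout share the result.
        nxt = self.transitions.get(name)
        if nxt is None:
            new_slots = dict(self.slots)
            new_slots[name] = len(new_slots)
            nxt = HiddenClass(new_slots)
            self.transitions[name] = nxt
        return nxt


EMPTY = HiddenClass()


class Obj:
    """An object whose attributes live in a flat list, described by a hidden class."""

    def __init__(self):
        self.hidden = EMPTY
        self.storage = []

    def set(self, name, value):
        idx = self.hidden.slots.get(name)
        if idx is None:
            # New attribute: move to the successor hidden class and grow storage.
            self.hidden = self.hidden.with_attr(name)
            self.storage.append(value)
        else:
            self.storage[idx] = value

    def get(self, name):
        return self.storage[self.hidden.slots[name]]


class GetAttrSite:
    """Monomorphic inline cache for a single attribute-read site (hypothetical name)."""

    def __init__(self, name):
        self.name = name
        self.cached_class = None
        self.cached_index = None
        self.misses = 0

    def load(self, obj):
        if obj.hidden is self.cached_class:
            # Fast path: one identity check, then a direct indexed read.
            return obj.storage[self.cached_index]
        # Slow path: look the slot up once and remember it for next time.
        self.misses += 1
        self.cached_class = obj.hidden
        self.cached_index = obj.hidden.slots[self.name]
        return obj.storage[self.cached_index]


# Usage: a and b are built the same way, so they share one hidden class.
a = Obj(); a.set("x", 1); a.set("y", 2)
b = Obj(); b.set("x", 10); b.set("y", 20)
site = GetAttrSite("y")
values = [site.load(a), site.load(b)]
```

In this sketch the second load hits the cache because a and b share a hidden class; an object that adds "y" before "x" would follow a different transition chain and cause a miss. A real implementation would of course also have to deal with attribute deletion, class-level attributes and descriptors, which are left out here.
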
Also, the blog does not mention whether and how class transitions are used. Some of the initial ideas below may already have been implemented, so please point out what is already there and what could be interesting.

I've also read the discussion about the JIT, which will be missing in the next release. Obviously, introducing hidden classes would require partially reworking the core object model of PyPy. Still, we think that specialization will especially benefit from them, not least because they allow specializing on class types as well, not only on primitive types.

It is not clear to me how you handle changes (like the addition of a property) to a single object: is it possible to specialize the code on the (anonymous) type of this new object? Can you give me pointers into the documentation, if this is already explained? I've read the EU reports about this and partial compilation, but it is still unclear to us how things work.

Btw, the first thing to do is to study the object model and the reflection capabilities of Python. Since Lars said that the goal of a VM designer is to let programmers work in the most pleasant way and use advanced features without incurring any cost, our development plan should also try to optimize reflective features as much as possible. To give an example, in Python you can register a handler for unknown method calls on an object (I don't remember how, but I'm pretty sure something like this exists). This means that our design should be able to do inline caching by storing a call to this handler (maybe we'll defer the implementation, but the initial design should allow doing that IMHO).

Any comments?

--
Paolo Giarrusso