[pypy-dev] How to turn a crawling caterpillar of a VM into a graceful butterfly
laurie at tratt.net
Sat Dec 31 11:03:04 CET 2011
As many of you know, over the past few months I've been creating an RPython
VM for the Converge language <http://convergepl.org>. This is now mostly
complete at a basic level - enough to run pretty much all of my Converge
programs at least on Linux and OpenBSD. [Major remaining issues are: no
floating point numbers; 32 bit support is not finished; may not compile on OS
First, the good news. The RPython VM (~3 months effort) is currently close to
3 times faster than the old C-based VM (~18 months effort). I think the
Converge VM is the first medium-scale VM to be created by someone outside the
core PyPy group, so those numbers are a testament to the power of the RPython
approach. I'd like to thank (alphabetically) Carl Friedrich Bolz, Maciej
Fijalkowski, and Armin Rigo, who've offered help, encouragement and (in
Armin's case) big changes to RPython to help the Converge VM. I wouldn't have
got this far without their help, or that of others on the PyPy IRC channel.
Now, the bad news: the RPython VM should be a lot faster than it currently is
as the old VM is (to say the least) not very good. In large part this is
because I simply don't know how best to optimise an RPython VM, particularly
the JIT (in fact, at the moment the JITted VM seems to be sometimes slower
than the non-JIT VM: I don't have an explanation for why). I'm therefore
soliciting advice / code from those more knowledgeable than myself on how to
optimise an RPython VM. Note that I'm intentionally being more general than
the Converge VM: I hope that some of the ideas generated will prove useful to
the many VMs that I hope will come to be implemented in RPython. Some of my
ignorance is simply that many parts of RPython have little or no
documentation: by playing the part of a clueless outsider (it comes naturally
to me!), I hope I may help pinpoint areas where further documentation is most
If you want to test out the VM it's here:
Building should mostly be:
$ export PYPY_SRC=<location of PyPy source directory>
$ cd converge
That will build a JITted VM. If you want a lower level of optimisation,
specify it to configure e.g. "./configure --opt=3". Please note that, for the
time being, on Linux the JIT won't work with the default GC root finder, so
you'll need to manually specify --gcrootfinder=shadowstack in vm/Makefile.
Apart from that, building on 64 bit Unix should hopefully be reasonably
A simple performance benchmark is along the lines of "make clean ; cd vm ;
make ; cd .. ; time make regress" which builds the compiler, various
examples, and the handful of tests that Converge comes with. [Please note,
Converge wasn't built with a TDD philosophy; while I welcome contributions of
tests, I am unable to make any major efforts in that regard myself in the
short-term. I know this sits uneasily with many in the PyPy community, but I
hope you are able to overlook this difference in development philosophy.]
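
Since a single "time make regress" run can be noisy, one way to compare builds
is to repeat the command and look at the spread of timings. The following is
only a sketch under stated assumptions: the names time_command and summarise,
the default of 5 runs, and the min/median/max summary are hypothetical choices
for illustration, not part of the Converge build system.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5, cwd=None):
    """Run cmd `runs` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails, so a
        # broken build is never mistaken for a fast one.
        subprocess.run(cmd, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarise(timings):
    """Reduce a list of timings to min / median / max, in seconds."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

Assuming the VM has already been built, something like
summarise(time_command(["make", "regress"], cwd="converge")) could then be
run once against a JITted build and once against an --opt=3 build. A large gap
between min and max within one build would suggest the variation comes from
run-to-run noise and not from the build itself.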
Here are some examples of questions I'd love to know answers to:
* Why is the JITted VM that's built sometimes 2x slower than --opt=3,
but other times a few percent faster *on the same benchmark*?!
* What are virtualrefs?
* What are the precise semantics of elidable?
* What is 'specialize'?
* Is it worth manually inlining functions in the main VM loop?
* Can I avoid the (many) calls to rffi.charp[size]2str, given that
they're mostly taken from the never-free'd mod_bc? Would this work
well on some GCs but not others?
No doubt there are many other things I would do well to know, but plainly do
not - please educate me!
If you have any questions or comments about Converge or the VM, please don't
hesitate to ask me - and thank you in advance for your help.
The Converge programming language http://convergepl.org/