[Python-ideas] optimized VM ideas

Antoine Pitrou solipsis at pitrou.net
Fri Jan 23 13:19:39 CET 2009


Leonardo Santagada <santagada at ...> writes:
> 
> The way TraceMonkey works reminds me a bit of Psyco, although I might  
> be mixing it with the PyPy JIT.

I'm not sure it works like Psyco. Read the paper on trace trees; it is quite
interesting, and it may be doable to retrofit at least the trace construction
part into the current bytecode execution loop.
http://www.ics.uci.edu/~franz/Site/pubs-pdf/ICS-TR-06-16.pdf

> But talking about Psyco, why don't people
> go help Psyco if all they want is a JIT? It is not like the idea
> of having a JIT on Python is even new... Psyco was optimizing code
> even before WebKit/V8 existed.

Three problems with Psyco:
1) it is not 100% compatible with Python semantics
2) it only works on i386 (not even x86-64)
3) it's (roughly) unmaintained

> Don't know, but a good GC is way faster than what CPython is doing  
> already, but maybe it is a good idea to explore some others  
> perspectives on this.

Without changing how the GC works, it's probably possible to improve things a
bit.
See e.g. http://bugs.python.org/issue4688

> > * Possibly modify the bytecode to be register-based, as in  
> > SquirrelFish.
> >  Not sure if this is worth it with python code.
> 
> Maybe it would help a bit. I don't think it would help more than 10%  
> tops (but I am completely guessing here)

I had thought about this, and reference counting seems to be a problem.
You have to DECREF a register as soon as it is no longer used, which adds
some bookkeeping.
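
To make that bookkeeping concrete, here is a minimal sketch in Python. It
assumes a made-up three-address instruction format and straight-line code
only; the opcode names, register names and the registers_to_release()
helper are all hypothetical, not anything CPython has. It only computes
where each register is read for the last time, which is where a
register-based VM would have to release the reference. Real code with
branches and loops would need a proper liveness analysis.

```python
# Hypothetical three-address instructions:
# (opcode, destination register, source registers).
CODE = [
    ("load_a", "r0", ()),
    ("load_b", "r1", ()),
    ("add", "r2", ("r0", "r1")),   # r0 and r1 are read for the last time here
    ("mul", "r3", ("r2", "r2")),   # r2 is read for the last time here
    ("return", None, ("r3",)),
]


def registers_to_release(code):
    """Return {instruction index: sorted registers whose last read is there}.

    Only valid for straight-line code (no jumps).
    """
    last_read = {}
    for index, (_opcode, _dest, sources) in enumerate(code):
        for reg in sources:
            last_read[reg] = index  # a later read overwrites an earlier one

    release = {}
    for reg, index in last_read.items():
        release.setdefault(index, set()).add(reg)
    return {index: sorted(regs) for index, regs in release.items()}


if __name__ == "__main__":
    for index, regs in sorted(registers_to_release(CODE).items()):
        print(index, CODE[index][0], "release after:", regs)
```

A stack machine gets this for free, since popping a value is the point at
which its reference is dropped.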

> The problem with this is (besides the error someone has already stated
> about your phrasing) that python has really complex bytecodes, so this
> would also only gain around 10% and it only works with compilers that
> accept goto labels which the MSVC for example does not (maybe there
> are more compilers that also don't).

Well, the bytecodes are not that complex. A bunch of them are implemented
entirely inline in the evaluation loop. Also, since it is a stack machine,
there are many trivial bytecodes such as LOAD_FAST, POP_TOP etc.
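
As an illustration of how little work such bytecodes do, here is a toy
stack machine in Python. It is only a sketch: the opcode names are
borrowed from CPython, but the run() function and the (opcode, argument)
tuple encoding are assumptions made up for this example, and the real
evaluation loop is written in C.

```python
def run(code, fast_locals):
    """Execute a list of (opcode, argument) pairs on a toy value stack."""
    stack = []
    for opcode, arg in code:
        if opcode == "LOAD_FAST":
            # Push a local variable: one array read and one push.
            stack.append(fast_locals[arg])
        elif opcode == "POP_TOP":
            # Discard the top of the stack.
            stack.pop()
        elif opcode == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
    raise RuntimeError("code ended without RETURN_VALUE")


if __name__ == "__main__":
    program = [
        ("LOAD_FAST", 0),
        ("LOAD_FAST", 1),
        ("BINARY_ADD", None),
        ("RETURN_VALUE", None),
    ]
    print(run(program, [2, 3]))
```

In a loop like this, the cost of LOAD_FAST or POP_TOP is dominated by the
dispatch itself rather than by the work the opcode does.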

The threaded code patch in http://bugs.python.org/issue4753 has given between
0% and 20% speedups on pybench totals for the various people who posted
results (and a 5% slowdown in one case).

cheers

Antoine.




