[pypy-dev] compiler optimizations: collecting ideas

Mon Nov 17 15:05:31 CET 2008

Paolo Giarrusso wrote:

> specialized bytecode can be significant, I guess, only if the
> interpreter is really fast (either a threaded one, or a code-copying
> one). Is the PyPy interpreter threaded?

Some time ago I tried to measure whether, and how much, we could gain 
with a threaded interpreter. I manually modified the generated C code to 
make the main loop threaded, but we didn't gain anything. I think there 
are three possible reasons:

1) in Python a lot of opcodes are quite complex and time-consuming, so 
the time spent dispatching to them is a small percentage of the total 
execution time

2) due to Python's semantics, it's not possible to just jump from one 
opcode to the next, as we need to do a lot of bookkeeping, like 
remembering the last line executed, etc. This means that the 
trampolines at the end of each opcode contain a lot of duplicated code, 
leading to a bigger main loop, with possibly bad effects on the cache 
(didn't measure this, though)

3) it's possible that I did something wrong, so in that case my 
measurements are completely useless :-).  If anyone wants to try again, 
it cannot hurt.
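
For anyone who does want to try again, here is a minimal sketch of the 
two dispatch styles, written as a toy and not taken from the generated 
PyPy code. It assumes GCC or Clang, because `goto *` and `&&label` are 
the GNU C "labels as values" extension. The opcode set and the names 
`Instr`, `Frame`, `run_switch`, `run_threaded` and `NEXT` are all made up 
for illustration. The thing to look at is that the `last_line` 
bookkeeping is written once in the switch loop, but gets expanded at the 
end of every handler in the threaded one.

```c
#include <assert.h>

/* Hypothetical toy bytecode, not PyPy's or CPython's real opcode set. */
enum { OP_PUSH, OP_ADD, OP_HALT };

typedef struct {
    int op;    /* one of the OP_* values above */
    int arg;   /* operand, used by OP_PUSH only */
    int line;  /* source line this instruction belongs to */
} Instr;

typedef struct {
    int stack[16];
    int sp;         /* number of values currently on the stack */
    int last_line;  /* bookkeeping: last source line executed */
} Frame;

/* Classic loop: one shared dispatch point, bookkeeping written once. */
int run_switch(const Instr *code, Frame *f)
{
    const Instr *ip = code;
    for (;;) {
        f->last_line = ip->line;  /* shared bookkeeping */
        switch (ip->op) {
        case OP_PUSH:
            assert(f->sp < 16);
            f->stack[f->sp++] = ip->arg;
            break;
        case OP_ADD:
            assert(f->sp >= 2);
            f->sp--;
            f->stack[f->sp - 1] += f->stack[f->sp];
            break;
        case OP_HALT:
            assert(f->sp >= 1);
            return f->stack[f->sp - 1];
        default:
            return -1;  /* unknown opcode in this toy */
        }
        ip++;
    }
}

/* Threaded loop: each handler ends with its own copy of the bookkeeping
 * plus an indirect jump, so the epilogue is duplicated per opcode. */
#define NEXT()                        \
    do {                              \
        ip++;                         \
        f->last_line = ip->line;      \
        goto *targets[ip->op];        \
    } while (0)

int run_threaded(const Instr *code, Frame *f)
{
    /* Label addresses, indexed by opcode (GNU C extension). */
    static void *const targets[] = { &&do_push, &&do_add, &&do_halt };
    const Instr *ip = code;

    f->last_line = ip->line;
    goto *targets[ip->op];

do_push:
    assert(f->sp < 16);
    f->stack[f->sp++] = ip->arg;
    NEXT();
do_add:
    assert(f->sp >= 2);
    f->sp--;
    f->stack[f->sp - 1] += f->stack[f->sp];
    NEXT();
do_halt:
    assert(f->sp >= 1);
    return f->stack[f->sp - 1];
}
```

Both functions take the same program and a zeroed `Frame` and should 
agree on the result and on `last_line`. With only three trivial opcodes 
the duplicated epilogue is tiny, so this only shows the shape of the 
problem, not its real cost.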

ciao,
Anto