[pypy-dev] compiler optimizations: collecting ideas

Paolo Giarrusso p.giarrusso at gmail.com
Mon Nov 17 15:32:24 CET 2008


On Mon, Nov 17, 2008 at 15:05, Antonio Cuni <anto.cuni at gmail.com> wrote:
> Paolo Giarrusso wrote:
>
>> specialized bytecode can be significant, I guess, only if the
>> interpreter is really fast (either a threaded one, or a code-copying
>> one). Is the PyPy interpreter threaded?
>
> some time ago I tried to measure if/how much we can gain with a threaded
> interpreter. I manually modified the produced C code to make the main loop
> threaded, but we didn't gain anything; I think there are three possible
> reasons:

> 1) in Python a lot of opcodes are quite complex and time-consuming, so the
> time spent to dispatch to them is a small percentage of the total time
> spent for the execution

I find that hard to believe. In any case, one can profile the
execution and count branch mispredictions instead of guessing.

Then again, since arithmetic ops may involve a method lookup, I
understand what you mean. OTOH, a method lookup costs just one
dictionary lookup when the bytecode is first seen (I would actually
save the hashcode inline in the bytecode after the first execution, to
speed that up), plus an unpredictable indirect branch; making the
dispatch branch more predictable by threading should still have a
significant impact.
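
To make the caching idea concrete, here is a minimal sketch in Python.
Mind the assumptions: it caches the looked-up method per operand type
(a monomorphic inline cache) rather than the hashcode I mentioned, the
names BinaryAdd and lookup_in_type are made up for illustration, and
it ignores __radd__ and NotImplemented handling. It is not how PyPy or
CPython actually do this.

```python
def lookup_in_type(tp, name):
    """Slow path: walk the MRO and do a dictionary lookup per class."""
    for klass in tp.__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    raise AttributeError(name)


class BinaryAdd:
    """Hypothetical bytecode instruction carrying its own cache slot."""

    def __init__(self):
        self.cached_type = None    # operand type seen at this site
        self.cached_method = None  # result of the lookup for that type
        self.misses = 0            # how often the slow path was taken

    def execute(self, left, right):
        tp = type(left)
        if tp is not self.cached_type:
            # Cache miss: do the lookup once and remember it inline.
            self.misses += 1
            self.cached_method = lookup_in_type(tp, "__add__")
            self.cached_type = tp
        # Cache hit: no dictionary lookup, just a call.
        return self.cached_method(left, right)
```

As long as the same instruction keeps seeing the same operand type,
only the first execution pays for the lookup; a type change simply
refills the slot.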

> 2) due to Python's semantics, it's not possible to just jump from one opcode
> to the next, as we need to do a lot of bookkeeping, like remembering what
> was the last line executed, etc.

This is a more likely culprit.
But... all the languages I know of (Java, compiled languages) just
keep a reverse map from code positions to the original line
information, to be consulted only when and if it is needed (e.g. as
debug info, or to unwind the stack when an exception is thrown, I
guess). Isn't that possible for Python?
In any
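
As a rough sketch of what I mean by a reverse map (the LineTable class
and the offsets below are hypothetical, not taken from any real
interpreter): the main loop never touches the table, and the line is
computed only when somebody asks for it, e.g. while building a
traceback.

```python
import bisect


class LineTable:
    """Maps bytecode offsets back to source lines, on demand."""

    def __init__(self, entries):
        # entries: (first_bytecode_offset, source_line) pairs,
        # sorted by offset; each pair covers offsets up to the next one.
        self.offsets = [offset for offset, _ in entries]
        self.lines = [line for _, line in entries]

    def line_for(self, offset):
        # Find the last entry whose starting offset is <= offset.
        index = bisect.bisect_right(self.offsets, offset) - 1
        if index < 0:
            raise ValueError("offset precedes the first table entry")
        return self.lines[index]


# Hypothetical table for a small function: three source lines.
table = LineTable([(0, 10), (6, 11), (14, 13)])
current_line = table.line_for(8)  # looked up only when needed
```

With this layout the per-opcode cost is zero; the binary search is
paid only on the rare paths that need a line number.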

> This means that the trampolines at the end
> of each opcode contain a lot of code duplication, leading to a bigger main
> loop, with possibly bad effects on the cache (didn't measure this, though)


> 3) it's possible that I did something wrong, so in that case my measurements
> are completely useless :-).  If anyone wants to try again, it cannot hurt.

Do you have the original code somewhere? I know one has to redo it
anyway, since that's generated code, but some guidance would still be
helpful.

-- 
Paolo Giarrusso


