[pypy-dev] array performace?
Carl Friedrich Bolz
cfbolz at gmx.de
Fri Jul 2 20:35:46 CEST 2010
On 07/02/2010 02:08 PM, Paolo Giarrusso wrote:
>>> Unsupported claim is for example that fast interpreters are 10x
>>> slower than C.
> That's the only unsupported claim, but it comes from "The Structure
> and Performance of Eﬃcient Interpreters". I studied that as a
> student on VM, you are writing one, so I (unconsciously) guessed
> that everybody knows that paper - I know that's a completely broken
> way of writing, but I didn't spot it.
Even if something is claimed by a well-known paper, it doesn't
necessarily have to be true. The paper considers a class of interpreters
where each specific bytecode does very little work (the paper does not
make this assumption explicit). This is not the case for Python at all,
so I think that the conclusions of the paper don't apply directly.
This is explained quite clearly in the following paper:
Virtual-Machine Abstraction and Optimization Techniques by Stefan
Brunthaler in Bytecode 2009.
> Well, at the abstraction level I'm speaking, it sounds like there in
> the end, the JIT will be able to do what is needed. I am not aware
> of the details. But then, at the end of that project, it seems to me
> that it should be possible to write the array module in pure Python
> using this new FFI interface and have the JIT look at it, shouldn't
> it? I do not concentrate on array specifically - rewriting a few
> modules at interpreter level is fine. But as a Python developer I
> should have no need for that.
That's a noble goal :-). I agree with the goal, but I still wanted to
point out that the case of array is really quite outside of the range of
possibilities of typical JIT compilers. Consider the hypothetical
problem of having to write a pure-Python array module without using any
other module, only builtin types. Then you would have to map arrays to
be normal Python lists, and you would have no way to circumvent the fact
that all objects in the lists are boxed. The JIT is now not helping you
at all, because it only optimizes on a code level, and cannot change the
way your data is structured in memory.
I know that this is not at all how you are proposing the array module
should be written, but I still wanted to point out that current JITs
don't help you much if your data is represented in a bad way. We have
some ideas how data representations could be optimized at runtime, but
nothing implemented yet.
More information about the Pypy-dev