[pypy-dev] array performace?
p.giarrusso at gmail.com
Fri Jul 2 21:35:55 CEST 2010
On Fri, Jul 2, 2010 at 20:35, Carl Friedrich Bolz <cfbolz at gmx.de> wrote:
> Hi Paolo,
> On 07/02/2010 02:08 PM, Paolo Giarrusso wrote:
>>>> Unsupported claim is for example that fast interpreters are 10x
>>>> slower than C.
>> That's the only unsupported claim, but it comes from "The Structure
>> and Performance of Eﬃcient Interpreters". I studied that as a
>> student on VM, you are writing one, so I (unconsciously) guessed
>> that everybody knows that paper - I know that's a completely broken
>> way of writing, but I didn't spot it.
> Even if something is claimed by a well-known paper, it doesn't
> necessarily have to be true. The paper considers a class of interpreters
> where each specific bytecode does very little work (the paper does not
> make this assumption explicit). This is not the case for Python at all,
> so I think that the conclusions of the paper don't apply directly.
Well, actually what I mention is not a conclusion of that paper, but
what you say probably applies to the original paper which is
referenced, so it doesn't matter.
> This is explained quite clearly in the following paper:
> Virtual-Machine Abstraction and Optimization Techniques by Stefan
> Brunthaler in Bytecode 2009.
I already mentioned that paper, a couple of years ago, when discussing
threading in PyPy, and my point was dismissed on general arguments.
I'm happy to see now a paper stating your point, so that it can be
discussed more precisely.
But the obvious question is: given the mixed characteristics of the
Lua interpreter, what is the instruction subdivision in that case?
They write it's in the same class without any measurement, while it
can complete an addition in 5 instructions instead of 3, and avoiding
the need for separate loads. In Python, instead, refcounting alone is
a very expensive operation.
Beyond that, that paper also acknowledges that a virtual machine for
Prolog, even if using dynamic types like Python, was in the same
efficiency class as lower-level VMs. I agree however that other
optimizations are needed first.
I would expect Lua to seem more 'low-level' also from this point of
view, and thus able to benefit more from threading. And with Python
3.0, where the distinction between int and long is gone, the Lua
implementation would be almost fine, if one uses tagged integer and
optimizes overflow checking through assembler (it's two lines of
assembly code on x86/x86_64).
> That's a noble goal :-). I agree with the goal, but I still wanted to
> point out that the case of array is really quite outside of the range of
> possibilities of typical JIT compilers. Consider the hypothetical
> problem of having to write a pure-Python array module without using any
> other module, only builtin types. Then you would have to map arrays to
> be normal Python lists, and you would have no way to circumvent the fact
> that all objects in the lists are boxed. The JIT is now not helping you
> at all, because it only optimizes on a code level, and cannot change the
> way your data is structured in memory.
> I know that this is not at all how you are proposing the array module
> should be written, but I still wanted to point out that current JITs
> don't help you much if your data is represented in a bad way. We have
> some ideas how data representations could be optimized at runtime, but
> nothing implemented yet.
OK, agreed. It would still be generally useful if the JIT _could_
optimize such cases, but that's hard enough. Especially, trying to
recognize that the list is used with homogeneous element does not look
easy in such a setting. However, again, what about tagged integers?
They wouldn't allow optimizing all uses of arrays, but they would be
generally useful on at least 31-bit integers and narrow characters.
If I had more free time, and then also enough disk space to translate
PyPy (I recall I hadn't when I conceived trying), I could maybe try
doing that myself, with some help. Don't hold your breath for that,
Paolo Giarrusso - Ph.D. Student
More information about the Pypy-dev