[pypy-dev] array performace?
bokr at oz.net
Sat Jul 3 00:56:39 CEST 2010
On 07/02/2010 11:35 AM Carl Friedrich Bolz wrote:
> Hi Paolo,
> On 07/02/2010 02:08 PM, Paolo Giarrusso wrote:
>>>> Unsupported claim is for example that fast interpreters are 10x
>>>> slower than C.
>> That's the only unsupported claim, but it comes from "The Structure
>> and Performance of Eï¬ƒcient Interpreters". I studied that as a
>> student on VM, you are writing one, so I (unconsciously) guessed
>> that everybody knows that paper - I know that's a completely broken
>> way of writing, but I didn't spot it.
> Even if something is claimed by a well-known paper, it doesn't
> necessarily have to be true. The paper considers a class of interpreters
> where each specific bytecode does very little work (the paper does not
> make this assumption explicit). This is not the case for Python at all,
> so I think that the conclusions of the paper don't apply directly.
> This is explained quite clearly in the following paper:
> Virtual-Machine Abstraction and Optimization Techniques by Stefan
> Brunthaler in Bytecode 2009.
>> Well, at the abstraction level I'm speaking, it sounds like there in
>> the end, the JIT will be able to do what is needed. I am not aware
>> of the details. But then, at the end of that project, it seems to me
>> that it should be possible to write the array module in pure Python
>> using this new FFI interface and have the JIT look at it, shouldn't
>> it? I do not concentrate on array specifically - rewriting a few
>> modules at interpreter level is fine. But as a Python developer I
>> should have no need for that.
> That's a noble goal :-). I agree with the goal, but I still wanted to
> point out that the case of array is really quite outside of the range of
> possibilities of typical JIT compilers. Consider the hypothetical
> problem of having to write a pure-Python array module without using any
> other module, only builtin types. Then you would have to map arrays to
> be normal Python lists, and you would have no way to circumvent the fact
> that all objects in the lists are boxed. The JIT is now not helping you
> at all, because it only optimizes on a code level, and cannot change the
> way your data is structured in memory.
> I know that this is not at all how you are proposing the array module
> should be written, but I still wanted to point out that current JITs
> don't help you much if your data is represented in a bad way. We have
> some ideas how data representations could be optimized at runtime, but
> nothing implemented yet.
Could/does JIT make use of information in an assert statement? E.g., could we write
assert set(type(x) for x in img) == set([float]) and len(img)==640*480
in front of a loop operating on img and have JIT use the info as assumed true
even when "if __debug__:" suites are optimized away?
Could such assertions allow e.g. a list to be implemented as a homogeneous vector
of unboxed representations?
What kind of guidelines for writing assertions would have to exist to make them
useful to JIT most easily?
More information about the Pypy-dev