[pypy-dev] Interpreter level array implementation
fijall at gmail.com
Fri Jul 2 23:21:17 CEST 2010
General note - we consider 2x optimized C a pretty good result :) Details below
On Fri, Jul 2, 2010 at 1:59 PM, Hakan Ardo <hakan at debian.org> wrote:
> we got the simplest possible interpreter level implementation of an
> array-like object running (in the interplevel-array branch) and it
> executes my previous example about 2 times slower than optimized C.
> Attached is the trace generated by the following example:
> img=array(640*480); l=0; i=0;
> while i<640*480:
> a simplified version of that trace is:
> 1. [p0, p1, p2, p3, i4, p5, p6, p7, p8, p9, p10, f11, i]
> 2. i14 = int_lt(i, 307200)
> 3. guard_true(i14, descr=<Guard1>)
> 4. guard_nonnull_class(p10, 145745952, descr=<Guard2>)
> 5. img = getfield_gc(p10, descr=<GcPtrFieldDescr 8>)
> 6. f17 = getarrayitem_gc(img, i, descr=<FloatArrayDescr>)
> 7. f18 = float_add(f11, f17)
> 8. i20 = int_add_ovf(i, 1)
> 9. guard_no_overflow(, descr=<Guard3>) #
> 10. i23 = getfield_raw(149604768, descr=<SignedFieldDescr 0>)
> 11. i25 = int_add(i23, 1)
> 12. setfield_raw(149604768, i25, descr=<SignedFieldDescr 0>)
> 13. i28 = int_and(i25, -2131755008)
> 14. i29 = int_is_true(i28)
> 15. guard_false(i29, descr=<Guard4>)
> 16. jump(p0, p1, p2, p3, 27, ConstPtr(ptr31), ConstPtr(ptr32),
> ConstPtr(ptr33), p8, p9, p10, f18, i20)
> Does these operation more or less correspond to assembler
Yes. Use PYPYJITLOG=log pypy-c ... to get assembler. View using
> I guess that the extra overhead here as compared to the
> the C version would be line 4, 5, 9 and 10-15. What's 10-15 all about?
It's about a couple of things that python interpreter has to perform.
Notably asynchronous signal checking and thread swapping with GIL.
> I guess that most of these additional operation would not affect the
> performance of more complicated loops as they will only occur once per
> loop (although combining the guard on line 9 with line 3 might be a
> possible optimization)? Line 4 will appear once for each array used in
> the loop and line 5 once for every array access, right?
Yes. We don't do loop invariant optimizations for some reasons, the
best of it being the fact that to loop you can always add a bridge
which will invalidate this invariant.
> Can the array implementation be designed in someway that would not
> generate line 5 above? Or would it be possible to get rid of it by
> some optimization?
No, it's about optimizations of JIT itself (it's an artifact of python
looping rather than array module).
> Håkan Ardö
> pypy-dev at codespeak.net
More information about the Pypy-dev