[pypy-dev] Interpreter level array implementation

Maciej Fijalkowski fijall at gmail.com
Fri Jul 2 23:21:17 CEST 2010


General note - we consider 2x optimized C a pretty good result :) Details below

On Fri, Jul 2, 2010 at 1:59 PM, Hakan Ardo <hakan at debian.org> wrote:
> Hi,
> we got the simplest possible interpreter level implementation of an
> array-like object running (in the interplevel-array branch) and it
> executes my previous example about 2 times slower than optimized C.
> Attached is the trace generated by the following example:
>
>    img=array(640*480);   l=0;   i=0;
>    while i<640*480:
>        l+=img[i]
>        i+=1
>
> a simplified version of that trace is:
>
>   1. [p0, p1, p2, p3, i4, p5, p6, p7, p8, p9, p10, f11, i]
>   2. i14 = int_lt(i, 307200)
>   3.   guard_true(i14, descr=<Guard1>)
>   4.   guard_nonnull_class(p10, 145745952, descr=<Guard2>)
>   5. img = getfield_gc(p10, descr=<GcPtrFieldDescr 8>)
>   6. f17 = getarrayitem_gc(img, i, descr=<FloatArrayDescr>)
>   7. f18 = float_add(f11, f17)
>   8. i20 = int_add_ovf(i, 1)
>   9.   guard_no_overflow(, descr=<Guard3>) #
>  10. i23 = getfield_raw(149604768, descr=<SignedFieldDescr 0>)
>  11. i25 = int_add(i23, 1)
>  12. setfield_raw(149604768, i25, descr=<SignedFieldDescr 0>)
>  13. i28 = int_and(i25, -2131755008)
>  14. i29 = int_is_true(i28)
>  15.   guard_false(i29, descr=<Guard4>)
>  16. jump(p0, p1, p2, p3, 27, ConstPtr(ptr31), ConstPtr(ptr32),
>           ConstPtr(ptr33), p8, p9, p10, f18, i20)
>
> Does these operation more or less correspond to assembler
> instructions?

Yes. Use PYPYJITLOG=log pypy-c ... to get assembler. View using
pypy/jit/backend/x86/tool/viewcode.py

>  I guess that the extra overhead here as compared to the
> the C version would be line 4, 5, 9 and 10-15. What's 10-15 all about?

It's about a couple of things that python interpreter has to perform.
Notably asynchronous signal checking and thread swapping with GIL.

> I guess that most of these additional operation would not affect the
> performance of more complicated loops as they will only occur once per
> loop (although combining the guard on line 9 with line 3 might be a
> possible optimization)? Line 4 will appear once for each array used in
> the loop and line 5 once for every array access, right?

Yes. We don't do loop invariant optimizations for some reasons, the
best of it being the fact that to loop you can always add a bridge
which will invalidate this invariant.

>
> Can the array implementation be designed in someway that would not
> generate line 5 above? Or would it be possible to get rid of it by
> some optimization?

No, it's about optimizations of JIT itself (it's an artifact of python
looping rather than array module).

>
> --
> Håkan Ardö
>
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

Cheers,
fijal



More information about the Pypy-dev mailing list