[pypy-dev] Interpreter for vectorized language

René Dudfield renesd at gmail.com
Thu Dec 16 23:43:19 CET 2010


On Thu, Dec 16, 2010 at 9:16 AM, Armin Rigo <arigo at tunes.org> wrote:

> Hi,
>
> On Wed, Dec 15, 2010 at 6:21 PM, René Dudfield <renesd at gmail.com> wrote:
> >> Is PyPy suitable for writing an interpreter for a vectorized language
> >> like Matlab or R, in which vectors and matrices are first-class objects?
> >> This includes matrix shape inference and efficient linear algebra code
> >> generation.
> >
> > have you seen numpy/scipy?
>
> The first aspect is simply if RPython would be suitable for writing an
> interpreter for, say, Matlab.  The answer is "probably yes": PyPy
> would be suitable for such dynamic languages, giving you a JIT
> compiler for free.  I don't really know how complex the cores of these
> languages are, but I suspect not too much.
>
> Of course you are then going to hit the same problems that Ademan
> tries to solve for numpy/scipy, notably how to implement at least the
> basic linear algebra operations in such a way that the JIT can improve
> them.  There are various goals there, e.g. to turn Python (or Matlab)
> code like A+B+C, adding three matrices together, into one matrix
> operation instead of two (as it is now: (A+B)+C).  This is all a bit
> experimental so far.
>
>
> A bientôt,
>
> Armin.
> _______________________________________________
> pypy-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/pypy-dev
>

Hi,

Numexpr is really worth looking at:
    http://code.google.com/p/numexpr/

Especially as it combines expressions together, and does things like
chunking (called tiling in the graphics world):

>>> timeit a**2 + b**2 + 2*a*b
10 loops, best of 3: 35.9 ms per loop
>>> ne.set_num_threads(1)  # using 1 thread (on a 6-core machine)
>>> timeit ne.evaluate("a**2 + b**2 + 2*a*b")
100 loops, best of 3: 9.28 ms per loop   # 3.9x faster than NumPy
>>> ne.set_num_threads(4)  # using 4 threads (on a 6-core machine)
>>> timeit ne.evaluate("a**2 + b**2 + 2*a*b")
100 loops, best of 3: 4.17 ms per loop   # 8.6x faster than NumPy
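For reference, the chunking idea can be sketched in plain NumPy.  This is
only an illustrative toy, not numexpr's actual implementation; the function
name `evaluate_chunked` and the chunk size are made up for the example:

```python
import numpy as np

def evaluate_chunked(a, b, chunk=16384):
    """Evaluate a**2 + b**2 + 2*a*b one chunk at a time, so the
    intermediate results stay in cache instead of being written out
    to main memory as full-size temporary arrays (as plain NumPy
    expression evaluation does)."""
    out = np.empty_like(a)
    for start in range(0, a.size, chunk):
        s = slice(start, start + chunk)
        ca, cb = a[s], b[s]
        # One fused pass over this chunk; no array-sized temporaries.
        out[s] = ca * ca + cb * cb + 2.0 * ca * cb
    return out
```

The win comes purely from memory traffic: the arithmetic is identical, but
each temporary now fits in cache.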


This approach could be applied in the PyPy JIT, I guess.
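To make the A+B+C point above concrete: today NumPy evaluates it as
(A+B)+C, two full passes over memory with a temporary in between, where a
fusing JIT could emit a single pass.  A rough sketch of the difference
(plain interpreted Python, just to show the loop shapes, so the fused
version is of course slow here):

```python
import numpy as np

def add3_naive(A, B, C):
    # What NumPy does today: (A + B) allocates a full temporary
    # array, then a second pass over memory adds C to it.
    return (A + B) + C

def add3_fused(A, B, C):
    # What a fusing JIT could emit instead: one pass, no temporary.
    out = np.empty_like(A)
    for i in range(A.size):
        out[i] = A[i] + B[i] + C[i]
    return out

A, B, C = (np.random.rand(1000) for _ in range(3))
assert np.allclose(add3_naive(A, B, C), add3_fused(A, B, C))
```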

For a vectorised cross-platform runtime assembler, see:
    http://code.entropywave.com/git?p=orc.git;a=summary

This allows you to write assembly code in one language and have it compile
to ARM, SSE, MMX, PPC, NEON, DSP, and even C.  Instructions that are
missing from one instruction set are emulated with multiple instructions.
It's mainly used for video, sound and image processing - which involve big
arrays.


Just some interesting vectorised thingies to look at.

cya.

