Hi there,
I am a recent convert to Python, which I started using for my numerical computing needs (I am a PhD student in signal processing) to replace Matlab.
For those who do not know Matlab, it is a big (and expensive) piece of software which implements a 'language' optimized for linear algebra, and which over time has become one of the most used packages for numerical computation. In my field of research (signal processing), Matlab is almost a standard (by standard, I mean everybody knows it and uses it). I think Python with numpy/scipy is better in almost every way if you are ready to invest some time.
Now, concerning pypy. The main idea of numpy is to give an array class to Python, so that most inner loops are not interpreted, but run in compiled C code or through highly optimized Fortran libraries. With that, Python is fast enough for most cases. But there are some cases where this paradigm of using linear algebra to speed things up does not work really well (recursive algorithms); in those cases, the loop and function call cost of Python makes any implementation for non-toy problems really slow. Right now, the only choice is to code the thing in C, with a big loss on the flexibility side.
I was wondering if pypy has some solution or new approach to this problem. For example, when using numpy, I would suspect that many function calls are 'static', that is, they always expect the same type of arguments; also, for simple loops on integers, my understanding is that JIT compilation can give much better performance (Matlab has a JIT compiler to make loops faster for interpreted code).
It looks like at some point there was some work done in pypy related to numpy arrays, but I didn't find any documentation on that.
cheers,
David
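A minimal sketch of the kind of recursion meant above, assuming a first-order recursive filter as the example. The filter and the name one_pole_filter are illustrative choices, not taken from the message. Each output depends on the previous output, so the computation does not reduce to a single elementwise array expression, and every iteration pays the interpreter's loop overhead:

```python
def one_pole_filter(x, alpha):
    """Return y with y[n] = alpha * y[n-1] + (1 - alpha) * x[n], taking y[-1] = 0.

    Hypothetical example of a recursive algorithm: y[n] needs y[n-1],
    so the samples have to be processed one after another.
    """
    y = []
    prev = 0.0  # assumed initial state y[-1] = 0
    for sample in x:
        # The dependency on `prev` is what forces a sequential loop.
        prev = alpha * prev + (1.0 - alpha) * sample
        y.append(prev)
    return y
```

For long signals, the per-sample interpreter cost of a loop like this is the overhead the message refers to.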
David Cournapeau <david@ar.media.kyoto-u.ac.jp> writes:
I was wondering if pypy has some solution or new approach to this problem. For example, when using numpy, I would suspect that many function calls are 'static', that is, they always expect the same type of arguments; also, for simple loops on integers, my understanding is that JIT compilation can give much better performance (Matlab has a JIT compiler to make loops faster for interpreted code).
It may be that the JIT is good for this sort of code. It's not yet though :-)
It looks like at some point there was some work done in pypy related to numpy arrays, but I didn't find any documentation on that.
As far as I'm aware this was something else: teaching PyPy's annotator to recognise code that uses Numeric arrays, and teaching the code generator how to compile this to equivalent code that manipulates arrays at a lower level. This could be seen as an alternative to rewriting your code in C. I'm not sure what the state of this code is, but I don't think it's very advanced.
Cheers,
mwh
--
The Internet is full. Go away. -- http://www.disobey.com/devilshat/ds011101.htm
Michael Hudson wrote:
It may be that the JIT is good for this sort of code. It's not yet though :-)
I understand that pypy is still in its (relative) infancy, but I have to confess I am more and more interested in all those concepts of JIT, etc., even if I don't know much beyond the concept and the big picture.
As far as I'm aware this was something else: teaching PyPy's annotator to recognise code that uses Numeric arrays and the code generator how to compile this to equivalent code that manipulates arrays at a lower level. This could be seen as an alternative to rewriting your code in C. I'm not sure what the state of this code is, but I don't think it's very advanced.
I am not familiar with the pypy vocabulary yet, so let me rephrase to be sure I understand: the idea is to detect numpy arrays and, instead of using a C extension for fast computation, to automatically generate code in the target language (let's say C, for C generation)?
So for example, if a and b are numpy arrays, a Python expression b *= a would be translated into C as something like for(i = 0; i < b.size; ++i) {b[i] *= a[i];}?
Is anyone working on that?
cheers,
David
David Cournapeau <david@ar.media.kyoto-u.ac.jp> writes:
Michael Hudson wrote:
It may be that the JIT is good for this sort of code. It's not yet though :-)
I understand that pypy is still in its (relative) infancy, but I have to confess I am more and more interested in all those concepts of JIT, etc., even if I don't know much beyond the concept and the big picture.
It's definitely interesting stuff :-)
As far as I'm aware this was something else: teaching PyPy's annotator to recognise code that uses Numeric arrays and the code generator how to compile this to equivalent code that manipulates arrays at a lower level. This could be seen as an alternative to rewriting your code in C. I'm not sure what the state of this code is, but I don't think it's very advanced.
I am not familiar with the pypy vocabulary yet, so let me rephrase to be sure I understand: the idea is to detect numpy arrays and, instead of using a C extension for fast computation, to automatically generate code in the target language (let's say C, for C generation)?
So for example, if a and b are numpy arrays, a Python expression b *= a would be translated into C as something like for(i = 0; i < b.size; ++i) {b[i] *= a[i];}?
Yes, I think that's a pretty good description.
Is anyone working on that?
Not that I know of.
Cheers,
mwh
--
say-hi-to-the-flying-pink-elephants-for-me-ly y'rs, No way, the flying pink elephants are carrying MACHINE GUNS! Aiiee!! Time for a kinder, gentler hallucinogen... -- Barry Warsaw & Greg Ward, python-dev
On Wed, 20 Dec 2006 11:50:37 +0900 David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
So for example, if a and b are numpy arrays, a Python expression b *= a would be translated into C as something like for(i = 0; i < b.size; ++i) {b[i] *= a[i];}?
Yes, but numpy already does this. The ultimate goal was to be able to collapse many such operations into an "optimal" (minimal) number of loops, e.g. a = b + c + d => for(...) {a[i]=b[i]+c[i]+d[i]}, which is much more cache friendly.
Simon.
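The difference between the two evaluation strategies can be sketched as follows. This is only an illustration under stated assumptions: plain Python lists stand in for arrays, and the names add_unfused and add_fused are hypothetical, not part of numpy or PyPy. The first function mimics evaluating a = b + c + d one operation at a time, with a temporary array and two passes over memory; the second does the same work in the single collapsed loop described above.

```python
def add_unfused(b, c, d):
    """Evaluate b + c + d one operation at a time (two loops, one temporary)."""
    # First pass: tmp = b + c, materialised as a full intermediate array.
    tmp = [b[i] + c[i] for i in range(len(b))]
    # Second pass: result = tmp + d, which walks the temporary again.
    return [tmp[i] + d[i] for i in range(len(tmp))]


def add_fused(b, c, d):
    """Evaluate b + c + d in one loop, with no intermediate array."""
    # Single pass: each element of b, c and d is read exactly once.
    return [b[i] + c[i] + d[i] for i in range(len(b))]
```

Both functions return the same values. The sketch shows only the structure (number of passes and temporaries); in pure Python both versions are interpreted, so it says nothing about the speed a compiled, fused loop would reach.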