Is Numerical Python only for prototyping?
Hi,

This is not supposed to be an evil question; instead I'm hoping for the answer: "No, generally we get >=95% of the speed of a pure C/Fortran implementation" ;-)

But as I am the strongest Python/numarray advocate in our group, I often get the answer that Matlab is (of course) also very convenient, but that memory handling and overall execution performance are generally so bad that for the final implementation one would have to reimplement in C.

We are a biophysics group at UCSF developing new algorithms for deconvolution (often in 3D). Our data sets are regularly larger than several hundred MB. When deciding on numarray I assumed that the "Hubble Crowd" had a similar situation and that all the operations are therefore very much optimized for this type of data.

Is 95% a reasonable number to hope for? I did wrap my own version of FFTW (with "plan caching"), which should give 100% of the C speed. But concerns arise from expressions like "a = b + c*a" (think "convenience"!): if a, b, c are each 3D data stacks, creating temporary arrays for 'c*a' AND then also for 'b + ...' would be very costly. (I think this is at least what happens in Numeric; I don't know about Matlab or numarray.)

Hoping for comments,
Thanks,
Sebastian Haase
UCSF, Sedat Lab
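For illustration, a minimal sketch of the plan-caching idea, assuming a hypothetical low-level FFTW wrapper (the module name fftw_wrapper and its make_plan/execute functions are placeholders, not the actual wrapper mentioned above):

    # Reuse an FFTW plan for arrays of the same shape and type instead of
    # re-planning on every transform.  'fftw_wrapper', 'make_plan' and
    # 'execute' are hypothetical placeholder names.
    import fftw_wrapper

    _plan_cache = {}

    def cached_fft(a):
        key = (a.shape, a.type())      # numarray arrays report their type via .type()
        plan = _plan_cache.get(key)
        if plan is None:
            plan = fftw_wrapper.make_plan(a.shape, a.type())
            _plan_cache[key] = plan
        return fftw_wrapper.execute(plan, a)

Since planning is expensive in FFTW, caching the plan keyed on shape and type means repeated transforms of same-sized stacks pay the planning cost only once.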
On 26 Jul 2005, at 18:41, Sebastian Haase wrote:
Hi, this is not supposed to be an evil question; instead I'm hoping for the answer: "No, generally we get >=95% of the speed of a pure C/Fortran implementation" ;-)
You won't, generally. The question is: since you are certainly not going to gain an order of magnitude by doing it in C, do you really care?
But as I am the strongest Python/numarray advocate in our group, I often get the answer that Matlab is (of course) also very convenient, but that memory handling and overall execution performance are generally so bad that for the final implementation one would have to reimplement in C.
Well, it's true that implementations in C will be faster. And memory handling in Numeric/numarray can be a pain, since the tendency is to create and destroy a lot of arrays if you are not careful.
We are a biophysics group at UCSF developing new algorithms for deconvolution (often in 3D). Our data sets are regularly larger than several hundred MB. When deciding on numarray I assumed that the "Hubble Crowd" had a similar situation and that all the operations are therefore very much optimized for this type of data.
Funny you mention that example. I did my PhD in exactly the same field (considering you are from Sedat's lab, I guess you are in exactly the same field as I was/am, i.e. fluorescence microscopy. What are you guys up to these days?) and I developed all my algorithms in C at the time.

Now, about 7 years later, I have returned to the field to re-implement and extend some of my old algorithms for use with microscopy data that can consist of multiple sets, each several hundred MB at least. Now I use Python with numarray, and I am actually quite happy with it. I am pushing it by using up to 2 GB of memory (per process, after splitting the problem up and distributing it on a cluster...), but it works. I am sure I could squeeze out maybe a factor of two or three in terms of speed and memory usage by rewriting in C, but that is currently not worth my time. So I guess that counts as using numarray as a prototyping environment, but the result is also suitable for production use.
Is 95% a reasonable number to hope for? I did wrap my own version of FFTW (with "plan caching"), which should give 100% of the C speed.
That should help a lot, as the standard FFTs that come with numarray/Numeric suck big time. I do use them, but I have to go through all kinds of tricks to get decent memory usage in 32-bit floating point. The FFT module is in fact very badly written for use with large multi-dimensional data sets.
But concerns arise from expressions like "a = b + c*a" (think "convenience"!): if a, b, c are each 3D data stacks, creating temporary arrays for 'c*a' AND then also for 'b + ...' would be very costly. (I think this is at least what happens in Numeric; I don't know about Matlab or numarray.)
That is indeed a problem, although I think in your case you may be limited by your FFTs anyway, at least in terms of speed. One thing you should consider is replacing expressions such as 'c = a + b' with add(a, b, c). If you do that cleverly you can avoid quite a few memory allocations, and you 'should' get closer to C. That does not solve everything, though:

1) Complex expressions still need to be broken up into sequences of operations, which is likely slower than iterating once over your array and evaluating the expression at each point.

2) Unfortunately not all numarray functions support an output array (maybe only the ufuncs?). This can be a big problem, as then temporary arrays must be allocated. (It sure was a problem for me.)

You can of course always re-implement the parts that are critical in C and wrap them (as you did with FFTW). In fact, I think numarray now provides a relatively easy way to write ufuncs, which would allow you to write a single Python function in C for complex expressions.
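For concreteness, a minimal sketch of that rewrite using ufunc output arguments so that no temporaries are allocated; the shape and type below are only placeholders:

    from numarray import ones, add, multiply, Float32

    # Example 3-D stacks; the shape is only illustrative.
    shape = (64, 256, 256)
    a = ones(shape, Float32)
    b = ones(shape, Float32)
    c = ones(shape, Float32)

    # Convenient form, but it allocates two temporaries
    # (one for c*a, one for b + ...):
    # a = b + c * a

    # Same computation with ufunc output arguments, reusing 'a' in place:
    multiply(c, a, a)   # a <- c * a      (no temporary)
    add(b, a, a)        # a <- b + c*a    (no temporary)

Whether this actually buys much depends on how large a fraction of the total time is spent in the FFTs.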
Hoping for comments,
Hope this gives some insight. I guess I have had similar experiences; there are definitely some limits to the use of numarray/Numeric that could be relieved, for instance by having a consistent implementation of output arrays. That would allow writing algorithms in which you could very strictly control the allocation and de-allocation of arrays, which would be a big help when working with large arrays.

Cheers,
Peter

PS. I would not mind hearing a bit about your experiences doing the big deconvolutions in numarray/Numeric, but that may not be a good topic for this list.
--
Dr Peter J Verveer
European Molecular Biology Laboratory
Meyerhofstrasse 1
D-69117 Heidelberg
Germany
Tel. +49 6221 387 8245
Fax. +49 6221 397 8306
On Jul 26, 2005, at 12:41 PM, Sebastian Haase wrote:
Hi, this is not supposed to be an evil question; instead I'm hoping for the answer: "No, generally we get >=95% of the speed of a pure C/Fortran implementation" ;-)

But as I am the strongest Python/numarray advocate in our group, I often get the answer that Matlab is (of course) also very convenient, but that memory handling and overall execution performance are generally so bad that for the final implementation one would have to reimplement in C.

We are a biophysics group at UCSF developing new algorithms for deconvolution (often in 3D). Our data sets are regularly larger than several hundred MB. When deciding on numarray I assumed that the "Hubble Crowd" had a similar situation and that all the operations are therefore very much optimized for this type of data.

Is 95% a reasonable number to hope for? I did wrap my own version of FFTW (with "plan caching"), which should give 100% of the C speed. But concerns arise from expressions like "a = b + c*a" (think "convenience"!): if a, b, c are each 3D data stacks, creating temporary arrays for 'c*a' AND then also for 'b + ...' would be very costly. (I think this is at least what happens in Numeric; I don't know about Matlab or numarray.)
Is it speed or memory usage you are worried about? Where are you actually seeing unacceptable performance? Offhand, I would say the temporaries are not likely to be serious speed issues (unless you are running out of memory). We did envision at some point (though we haven't done it yet) recognizing situations where the temporaries could be reused.

As for 95% speed, that's not what is required for our work (I think what counts as an acceptable speed ratio depends on the problem). In many cases being within 50% is good enough, except for heavily used things, where it would need to be faster. But yes, we do plan on using it (and indeed already use it now) for large data sets where speed is important. We generally don't compare the numarray speed to C speed all the time (if we were writing C equivalents every time, that would defeat the purpose of using Python :-). Perhaps you could give a more specific example with some measurements?

I don't think I would promise anyone that all one's code could be done in Python. Undoubtedly, there are going to be some cases where coding in C or similar is going to be needed. I'd just argue that Python lets you keep as much as possible in a higher-level language and as little as necessary in a low-level language such as C.

Perry
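As a rough illustration of the kind of measurement asked for above, one could time the convenient expression against the in-place ufunc form; the array size below is an arbitrary choice for the sketch:

    import time
    from numarray import ones, add, multiply, Float32

    shape = (64, 256, 256)            # about 16 MB per Float32 stack
    a = ones(shape, Float32)
    b = ones(shape, Float32)
    c = ones(shape, Float32)

    t0 = time.time()
    a = b + c * a                     # allocates two temporary arrays
    t1 = time.time()

    multiply(c, a, a)                 # in-place: a <- c * a
    add(b, a, a)                      # in-place: a <- b + c*a
    t2 = time.time()

    print("with temporaries: %.3f s" % (t1 - t0))
    print("in-place ufuncs:  %.3f s" % (t2 - t1))

The gap between the two numbers (and the peak memory while the first form runs) gives a feel for how much the temporaries actually cost for a given stack size.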