Mailman 3 April 2007 - NumPy-Discussion

Bus Error with string in ndarray with named fields
by Per B. Sederberg 20 Apr '07

20 Apr '07

>>> Hi Folks: I'm getting a very strange bus error in the recent versions of numpy (almost current svn). Here's how you can (hopefully) replicate it: On my MacBook: Python 2.4.3 (#1, Apr 7 2006, 10:54:33) [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as N >>> N.version.version '1.0.2.dev3569' >>> fields = [('x',(N.str,40))] >>> dat = N.zeros(1,fields) >>> dat Bus error On my linux cluster: Python 2.4.3 (#1, May 8 2006, 11:36:25) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-49)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as N >>> N.version.version '1.0.2.dev3567' >>> fields = [('x',(N.str,40))] >>> dat = N.zeros(1,fields) >>> dat Segmentation fault But it works on my linux desktop with an older numpy: Python 2.4.4c1 (#2, Oct 11 2006, 20:00:03) [GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as N >>> N.version.version '1.0rc1' >>> fields = [('x',(N.str,40))] >>> dat = N.zeros(1,fields) >>> dat array([('',)], dtype=[('x', '|S40')]) >>> Does anyone have any clue as to how to deal with this? Is there a better way to create an ndarray with named fields (recarray)? My real code has a number of fields of various types, but I was able to narrow it down to anytime I use a string type. For example, this works on the machines where using a string fails: >>> import numpy as N >>> fields = [('x',N.bool),('y',N.int32)] >>> dat = N.zeros(1,fields) >>> dat array([(False, 0)], dtype=[('x', '|b1'), ('y', '<i4')]) >>> Thanks for any help, Per

2 1

histogram2d bug?
by Emanuele Olivetti 19 Apr '07

19 Apr '07

While using histogram2d on simple examples I got these errors: import numpy x = numpy.array([0,0]) y = numpy.array([0,1]) numpy.histogram2d(x,y,bins=[2,2]) ----------------------------------------------------------------- Warning: divide by zero encountered in log10 --------------------------------------------------------------------------- exceptions.OverflowError Traceback (most recent call last) /home/ele/<ipython console> /usr/lib/python2.4/site-packages/numpy/lib/twodim_base.py in histogram2d(x, y, bins, range, normed, weights) 180 if N != 1 and N != 2: 181 xedges = yedges = asarray(bins, float) 182 bins = [xedges, yedges] --> 183 hist, edges = histogramdd([x,y], bins, range, normed, weights) 184 return hist, edges[0], edges[1] /usr/lib/python2.4/site-packages/numpy/lib/function_base.py in histogramdd(sample, bins, range, normed, weights) 206 decimal = int(-log10(dedges[i].min())) +6 207 # Find which points are on the rightmost edge. --> 208 on_edge = where(around(sample[:,i], decimal) == around(edges[i][-1], decimal))[0] 209 # Shift these points one bin to the left. 210 Ncount[i][on_edge] -= 1 /usr/lib/python2.4/site-packages/numpy/core/fromnumeric.py in round_(a, decimals, out) 687 except AttributeError: 688 return _wrapit(a, 'round', decimals, out) --> 689 return round(decimals, out) 690 691 around = round_ OverflowError: long int too large to convert to int ----------------- numpy.__version__ '1.0.3.dev3719' Hope this report helps, Emanuele

2 3

Building numpy on Solaris x86 with sun CC and libsunperf
by Peter C. Norton 19 Apr '07

19 Apr '07

Hello all, I'm trying to build numpy for some of my users, and I can't seem to get the [blas_opt] or the [lapack_opt] settings to be honored in my site.cfg: $ CFLAGS="-L$STUDIODIR/lib/ -l=sunperf" CPPFLAGS='-DNO_APPEND_FORTRAN' \ /scratch/nortonp/python-2.5.1c1/bin/python setup.py config Running from numpy source directory. F2PY Version 2_3649 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /lang/SunOS.5.i386/studio-11.0/SUNWspro/lib NOT AVAILABLE [etc, with nothing found] And after all this, I get /projects/python-2.5/numpy-1.0.2/numpy/distutils/system_info.py:1210: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) and a similar thing happens with lapack. My site.cfg boils down to this: [DEFAULT] library_dirs = /usr/local/lib:/lang/SunOS.5.i386/studio-11.0/SUNWspro/lib include_dirs = /usr/local/include:/lang/SunOS.5.i386/studio-11.0/SUNWspro/include [blas_opt] libraries = sunperf [lapack_opt] libraries = sunperf If I mess around with system_info.py I can get setup to acknowledge the addition to the list, but it seems from the output that the optimized libraries section in the site.cfg is ignored (eg. never added to the classes _lib_names array). Is there a known way to get around this? Also, since the lapack and blas libraries are already essentially part of libsunperf, built, do I still need a fortran compiler to link build numpy or can I just bypass that (somehow) and link the .so and go on my merry way? Thanks, -Peter -- The 5 year plan: In five years we'll make up another plan. Or just re-use this one.

2 3

Four bugs in Numexpr and another repository
by Ivan Vilata i Balaguer 19 Apr '07

19 Apr '07

Hi all, Francesc and I have found four bugs in Numexpr while developing PyTables. To spare you from the gory details, I'll point you to the PyTables Trac instance, where bugs are commented and patches are available which should be more or less readily applicable to mainstream Numexpr: 1. VariableNode error when numexpr is asked to evaluate some functions (arc*, pow): http://www.pytables.org/trac/ticket/54 >>> numexpr.evaluate('arcsin(0.5)') Traceback (most recent call last): ... TypeError: 'VariableNode' object is not callable 2. 'int' object has no attribute '__gt__' error in tables.numexpr (also affects to original numexpr): http://www.pytables.org/trac/ticket/55 >>> numexpr.evaluate('cos(1.3) < 0') Traceback (most recent call last): ... AttributeError: 'int' object has no attribute '__gt__' 3. Wrong behaviour of where function of numexpr (als affects to original numexpr): http://www.pytables.org/trac/ticket/56 >>> a = numpy.arange(0,5) >>> numexpr.evaluate('where(a < 2, 1, 0)') Traceback (most recent call last): ... NotImplementedError: couldn't find matching opcode for 'where_bbbb' 4. Segmentation fault with string constant expression: http://www.pytables.org/trac/ticket/58 >>> tables.numexpr.evaluate('"foo"') Segmentation fault (core dumped) This in fact only applies to the PyTables version of Numexpr, which includes the patches I've been sending here for a while. I'm taking this opportunity to point out that you can check that version out from the PyTables Subversion repository: http://www.pytables.org/svn/pytables/trunk/tables/numexpr/ If you need more help on this, don't hesitate to ask. Thanks a lot! (Is it OK to report Numexpr bugs in the SciPy Trac instance? And, what's the right capitalisation of "Numexpr"? ;) ) :: Ivan Vilata i Balaguer >qo< http://www.carabos.com/ Cárabos Coop. V. V V Enjoy Data ""

2 1

Problem with roots and complex coefficients
by lorenzo bolla 19 Apr '07

19 Apr '07

dear all, I've some problems with numpy.roots. take a look at the following code: ======================================== import numpy OK = numpy.roots([1, 1, 1]) OK = numpy.roots([1j, 1]) KO = numpy.roots([1, 1j, 1]) ======================================== it fails with this error message, trying to execute the last line: TypeError: can't convert complex to float; use abs(z)/usr/lib/python2.4/site-packages/numpy/lib/polynomial.py in roots(p) 119 if N > 1: 120 # build companion matrix and find its eigenvalues (the roots) --> 121 A = diag(NX.ones((N-2,), p.dtype), -1) 122 A[0, :] = -p[1:] / p[0] 123 roots = _eigvals(A) /usr/lib/python2.4/site-packages/numpy/lib/twodim_base.py in diag(v, k) 66 i = arange(0,n+k) 67 fi = i+(i-k)*n ---> 68 res.flat[fi] = v 69 return res 70 elif len(s)==2: TypeError: can't convert complex to float; use abs(z) any ideas? thanks, Lorenzo

2 2

building numpy with atlas on ubuntu edgy
by Christian K 19 Apr '07

19 Apr '07

Hi, I'm trying to build numpy from svn on ubuntu edgy with atlas provided by ubuntu package atlas3-sse2-dev which contains: /usr /usr/lib /usr/lib/sse2 /usr/lib/sse2/libatlas.a /usr/lib/sse2/libcblas.a /usr/lib/sse2/libf77blas.a /usr/lib/sse2/liblapack_atlas.a /usr/lib/atlas /usr/lib/atlas/sse2 /usr/lib/atlas/sse2/libblas.a /usr/lib/atlas/sse2/liblapack.a /usr/share /usr/share/doc /usr/share/doc/atlas3-sse2-dev /usr/share/doc/atlas3-sse2-dev/copyright /usr/share/doc/atlas3-sse2-dev/changelog.Debian.gz /usr/lib/sse2/libatlas.so /usr/lib/sse2/libcblas.so /usr/lib/sse2/libf77blas.so /usr/lib/sse2/liblapack_atlas.so /usr/lib/atlas/sse2/libblas.so /usr/lib/atlas/sse2/liblapack.so I tried both with and without a site.cfg: [DEFAULT] library_dirs = /usr/lib/sse2 include_dirs = /usr/include [blas_opt] libraries = f77blas, cblas, atlas [lapack_opt] libraries = lapack, f77blas, cblas, atlas and tested wether numpy is actually using the optimized libs as demonstrated in a posting by Simon Burton (http://article.gmane.org/gmane.comp.python.numeric.general/5849). It apparently is linked to /usr/lib/sse2/libatlas.so.3.0 /usr/lib/sse2/libcblas.so.3.0 /usr/lib/sse2/libf77blas.so.3.0 /usr/lib/python2.4/site-packages/numpy/linalg/lapack_lite.so The optimized lapack lib is not used. This is consistent with the output of the build script: ck@kiste:~/prog/scipy/numpy$ python setup.py build Running from numpy source directory. non-existing path in 'numpy/distutils': 'site.cfg' F2PY Version 2_3714 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/atlas FOUND: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2'] language = c customize GnuFCompiler customize GnuFCompiler customize GnuFCompiler using config compiling '_configtest.c': /* This file is generated from numpy_distutils/system_info.py */ void ATL_buildinfo(void); int main(void) { ATL_buildinfo(); return 0; } C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O2 -Wall -Wstrict-prototypes -fPIC compile options: '-c' gcc: _configtest.c gcc -pthread _configtest.o -L/usr/lib/sse2 -lf77blas -lcblas -latlas -o _configtest ATLAS version 3.6.0 built by root on Fri Jan 9 15:57:20 UTC 2004: UNAME : Linux intech67 2.4.20 #1 SMP Fri Jan 10 18:29:51 EST 2003 i686 GNU/Linux INSTFLG : MMDEF : /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/gemm ARCHDEF : /fix/g/camm/atlas3-3.6.0/CONFIG/ARCHS/P4SSE2/gcc/misc F2CDEFS : -DAdd__ -DStringSunStyle CACHEEDGE: 1048576 F77 : /usr/bin/g77, version GNU Fortran (GCC) 3.3.3 20031229 (prerelease) (Debian) F77FLAGS : -fomit-frame-pointer -O CC : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229 (prerelease) (Debian) CC FLAGS : -fomit-frame-pointer -O3 -funroll-all-loops MCC : /usr/bin/gcc, version gcc (GCC) 3.3.3 20031229 (prerelease) (Debian) MCCFLAGS : -fomit-frame-pointer -O success! removing: _configtest.c _configtest.o _configtest FOUND: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2'] language = c define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')] lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib/atlas libraries lapack_atlas not found in /usr/lib/atlas libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib libraries lapack_atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib/atlas libraries lapack_atlas not found in /usr/lib/atlas libraries lapack not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib libraries lapack_atlas not found in /usr/lib numpy.distutils.system_info.atlas_info /media/hda6/home/ck/prog/scipy/numpy/numpy/distutils/system_info.py:903: UserWarning: ********************************************************************* Could not find lapack library within the ATLAS installation. ********************************************************************* warnings.warn(message) FOUND: libraries = ['f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/sse2'] language = c define_macros = [('ATLAS_WITHOUT_LAPACK', None)] lapack_info: libraries lapack not found in /usr/local/lib libraries lapack not found in /usr/lib NOT AVAILABLE Confusingly lapack_atlas resides in /usr/lib but even though setup.py looks for it in that place it reports 'not found'. What should I try next? Thanks, Christian

9 28

Efficient operator overloading
by Sturla Molden 18 Apr '07

18 Apr '07

On 4/18/2007 7:33 AM, Anne Archibald wrote: >copying. And the scope of improvement would be very limited; an expression like A*B+C*D would be much more efficient, probably, if the whole expression were evaluated at once for each element (due to memory locality and temporary allocation). But it is impossible for numpy, sitting inside python as it does, to do that. Most numerical array/matrix libraries dependent on operator overloading generates temporaries. That is why Fortran is usually perceived as superior to C++ for scientific programming. The Fortran compiler knows about arrays and can avoid allocating three temporary arrays to evaluate and expression like y = a * b + c * d If this expression is evaluated bya Fortran 90/95 compiler, it will automatically generate code like do i = 1,n y(i) = a(i) * b(i) + c(i) * d(i) enddo On the other hand, conventional use of overloaded operators would result in something like this: allocate(tmp1,n) do i = 1,n tmp1(i) = a(i) * b(i) enddo allocate(tmp2,n) do i = 1,n tmp2(i) = c(i) * d(i) enddo allocate(tmp3,n) do i = 1,n tmp3(i) = tmp1(i) + tmp2(i) enddo deallocate(tmp1) deallocate(tmp2) do i = 1,n y(i) = tmp3(i) enddo deallocate(tmp3) Traversing memory is one of the most expensive thing a CPU can do. This approach is therefore extremely inefficient compared with what a Fortran compiler can do. This does not mean that all use of operator overloading is inherently bad. Notably, there is a C++ numerical library called Blitz++ which can avoid these tempraries for small fixed-size arrays. As it depends on template metaprogramming, the size must be known at compile time. But if this is the case, it can be just as efficient as Fortran if the optimizer is smart enough to remove the redundant operations. Most modern C++ compilers is smart enough to do this. Note that it only works for fixed size arrays. Fortran compilers can do this on a more general basis. It is therefore advisable to have array syntax built into the language syntax it self, as in Fortran 90/95 and Ada 95. But if we implement the operator overloading a bit more intelligent, it should be possible to get rid of most of the temporary arrays. We could replace temporary arrays with an "unevaluated expression class" and let the library pretend it is a compiler. Let us assume again we have an expression like y = a * b + c * d where a,b,c and d are all arrays or matrices. In this case, the overloaded * and + operators woud not return a temporary array but an unevaluated expression of class Expr. Thus we would get tmp1 = Expr('__mul__',a,b) # symbolic representation of 'a * b' tmp2 = Expr('__mul__',c,d) # symbolic representation of 'c * d' tmp3 = Expr('__add__',tmp1,tmp1) # symbolic "a * b + c * d" del tmp1 del tmp2 y = tmp3 # y becomes a reference to an unevaluated expression Finally, we need a mechanism to 'flush' the unevaluated expression. Python does not allow the assignment operator to be evaluated, so one could not depend on that. But one could a 'flush on read/write' mechanism, and let an Expr object exist in different finite states (e.g. unevaluated and evaluated). If anyone tries to read an element from y or change any of the objects it involves, the expression gets evaluated without temporaries. Before that, there is no need to evalute the expression at all! We can just keep a symbolic representation of it. Procrastination is good! Thus later on... x = y[i] # The expression 'a * b + c * d' gets evaluated. The # object referred to by y now holds an actual array. or a[i] = 2.0 # The expression 'a * b + c * d' gets evaluated. The # object referred to by y now holds an actual array. # Finally, 2.0 is written to a[i]. or y[i] = 2.0 # The expression 'a * b + c * d' gets evaluated. The # object referred to by y now holds an actual array. # Finally, 2.0 is written to y[i]. When the expression 'a * b + c * d' is finally evaluated, we should through symbolic manipulation get something (at least close to) a single efficient loop: do i = 1,n y(i) = a(i) * b(i) + c(i) * d(i) enddo I'd really like to see a Python extension library do this one day. It would be very cool and (almost) as efficient as plain Fortran - though not quite, we would still get some small temporary objects created. But that is a sacrifice I am willing to pay to use Python. We would gain some efficacy over Fortran by postponing indefinitely evaluation of computations that are not needed, when this is not known at compile time. Any comments? Sturla Molden Ph.D.

7 10

Scimark, icc, & Core 2 Duo
by rex 18 Apr '07

18 Apr '07

Keith Goodman <kwgoodman(a)gmail.com> [2007-04-18 12:46]: > Thanks for that. For a variety of reasons I'm sticking with atlas. > Does the parallel flag give you a big speed increase? I imagine it > speeds things up more for larger matrices. Surprisingly little. Below are the results of running Scimark with various icc and gcc compiler flags set. The maximum Scimark score is 55% larger with icc than with gcc, though there may be flags other than -O3 that would help gcc. The optimized (for Xeon, not for Core 2 Duo) LINPACK that ships with MKL runs at about 7 gigaflops max on my Core 2 Duo overclocked to 2.93 GHz (it's different from LINPACK 1000). There is a Core 2 Duo optimized version for OSX. icc with no flags set: > icc *.c -o no_flags > ./noflags -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 605.84 FFT Mflops: 111.70 (N=1048576) SOR Mflops: 868.52 (1000 x 1000) MonteCarlo: Mflops: 120.37 Sparse matmult Mflops: 853.33 (N=100000, nz=1000000) LU Mflops: 1075.27 (M=1000, N=1000) > icc -fast *.c -o fast > ./fast -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 785.63 FFT Mflops: 108.31 (N=1048576) SOR Mflops: 985.81 (1000 x 1000) MonteCarlo: Mflops: 848.81 Sparse matmult Mflops: 825.81 (N=100000, nz=1000000) LU Mflops: 1159.42 (M=1000, N=1000) > icc -fast -parallel *.c -o fast_para IPO: performing multi-file optimizations IPO: generating object file /tmp/ipo_iccvHW42m.o scimark2.c(63) : (col. 18) remark: LOOP WAS VECTORIZED. kernel.c(157) : (col. 13) remark: LOOP WAS VECTORIZED. kernel.c(212) : (col. 17) remark: LOOP WAS VECTORIZED. > ./fast_para -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 796.33 FFT Mflops: 111.70 (N=1048576) SOR Mflops: 1001.91 (1000 x 1000) MonteCarlo: Mflops: 855.57 Sparse matmult Mflops: 832.52 (N=100000, nz=1000000) LU Mflops: 1179.94 (M=1000, N=1000) > icc -fast -parallel -fno-alias *.c -o fast_para_noali IPO: performing multi-file optimizations IPO: generating object file /tmp/ipo_iccLUySDv.o scimark2.c(63) : (col. 18) remark: LOOP WAS VECTORIZED. kernel.c(157) : (col. 13) remark: LOOP WAS VECTORIZED. kernel.c(212) : (col. 17) remark: LOOP WAS VECTORIZED. > ./fast_para_noali -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 890.46 FFT Mflops: 109.70 (N=1048576) SOR Mflops: 1488.28 (1000 x 1000) MonteCarlo: Mflops: 855.57 Sparse matmult Mflops: 829.15 (N=100000, nz=1000000) LU Mflops: 1169.59 (M=1000, N=1000) > icc -fast -parallel -fno-alias -funroll-loops *.c -o fast_para_noali_unr IPO: performing multi-file optimizations IPO: generating object file /tmp/ipo_icc2KA1ui.o scimark2.c(63) : (col. 18) remark: LOOP WAS VECTORIZED. kernel.c(157) : (col. 13) remark: LOOP WAS VECTORIZED. kernel.c(212) : (col. 17) remark: LOOP WAS VECTORIZED. > ./fast_para_noali_unr -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 901.11 FFT Mflops: 113.48 (N=1048576) SOR Mflops: 1510.28 (1000 x 1000) MonteCarlo: Mflops: 865.92 Sparse matmult Mflops: 835.92 (N=100000, nz=1000000) LU Mflops: 1179.94 (M=1000, N=1000) > gcc -lm *.c -o ggc_none > ./ggc_none -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 323.63 FFT Mflops: 83.56 (N=1048576) SOR Mflops: 729.97 (1000 x 1000) MonteCarlo: Mflops: 73.75 Sparse matmult Mflops: 329.26 (N=100000, nz=1000000) LU Mflops: 401.61 (M=1000, N=1000) > gcc -lm -O3 *.c -o ggc_O3 > ./gcc_O3 -large ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to pozo(a)nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 580.55 FFT Mflops: 108.86 (N=1048576) SOR Mflops: 842.27 (1000 x 1000) MonteCarlo: Mflops: 115.70 Sparse matmult Mflops: 825.81 (N=100000, nz=1000000) LU Mflops: 1010.10 (M=1000, N=1000) -rex

1 0

Question about Optimization (Inline and Pyrex)
by Simon Berube 18 Apr '07

18 Apr '07

I recently made the switch from Matlab to Python and am very interested in optimizing certain routines that I find too slow in python/numpy (long loops). I have looked and learned about the different methods used for such problems such as blitz, weave and pyrex but had a question for more experienced developpers. It appears that pyrex is the fastest of the bunch with weave very close behind but at the same time pyrex requires entirely different modules while weave can be inserted almost painlessly into existing code. Is the speed gain and usefulness of pyrex severely limited by the extra maintenance required y having separate "fast" routines from the rest of the code files? I am greatly interested in finding out what more experienced developers feel about these issues given that I may be completely off track and missing on a useful tool(pyrex) thinking weave is better than it actually is and I am quite frankly afraid of writing routines in one format and realizing later that it creates problems that I need to rewrite. I have tried searching for previous similar posts but could not find any. My apologies if this is a repeat or a severly dumb question. Regards, Simon Berube

11 26

fumfunction question
by Mark.Miller 18 Apr '07

18 Apr '07

Can additional function arguments (aside from the dimensions of an array) be used in conjunction with fromfunction? Thanks, -Mark

3 4