I'm sure this is well known, but I just realized that numpy cannot use the _dotblas.so that it builds when it is compiled against ACML. This is because ACML provides only the Fortran BLAS libraries, not cblas.

numpy will find the ACML libraries and use them to build a _dotblas.so without complaining if you have

    [blas]
    blas_libs = acml
    language = f77

in your site.cfg. But attempting to import this _dotblas.so fails, so numpy never actually uses it:

    import _dotblas.so
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ImportError: ./_dotblas.so: undefined symbol: cblas_zaxpy

Running nm on _dotblas.so gives a whole stream of undefined cblas_xxxx symbols.

So what are the options?
Forget about ACML?
Find an optimized cblas for numpy's _dotblas but use the ACML flapack for scipy?
Persuade somebody to write a scipy-style f2py interface to generate _dotblas.so using the fblas?

George Nurser.
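One way to make the nm check above repeatable is to filter its output for undefined cblas_ entries. The following is only a minimal sketch: the function names and the sample nm output are hypothetical, and it assumes the GNU binutils nm format, in which an undefined symbol is printed as "U name" with no address column.

```python
import subprocess


def undefined_cblas_symbols(nm_output):
    """Return the undefined cblas_* symbol names found in nm output text."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols have no address column, so the line is "U name".
        if len(fields) == 2 and fields[0] == "U" and fields[1].startswith("cblas_"):
            names.append(fields[1])
    return names


def check_library(path="_dotblas.so"):
    """Run nm on a shared object; assumes nm is installed and on PATH."""
    result = subprocess.run(["nm", path], capture_output=True, text=True, check=True)
    return undefined_cblas_symbols(result.stdout)


# Hypothetical nm output, for illustration only (not taken from a real build).
SAMPLE = """\
                 U cblas_zaxpy
                 U cblas_dgemm
0000000000001f40 T init_dotblas
                 U PyArg_ParseTuple
"""

print(undefined_cblas_symbols(SAMPLE))
```

An undefined symbol in a shared object is not a problem by itself; it only becomes one when, as here, none of the libraries it was linked against supplies it.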
On Saturday 28 January 2006 19:36, George Nurser wrote:
> I'm sure this is well known, but I just realized that numpy cannot use the _dotblas.so that it makes when it is compiled with ACML. This is because ACML only has the fortran blas libraries, not cblas.
>
> numpy will find the acml libraries and use them to make a _dotblas.so without complaining, if you have
>
>     [blas]
>     blas_libs = acml
>     language = f77
>
> in your site.cfg.
>
> But attempting to import this _dotblas.so gives errors .... so numpy never actually uses it.
>
>     import _dotblas.so
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     ImportError: ./_dotblas.so: undefined symbol: cblas_zaxpy
>
> nm _dotblas.so gives a whole stream of undefined cblas_xxxx symbols.
>
> So what are the options? Forget about ACML? Find an optimized cblas for numpy _dotblas but use the ACML flapack for scipy? Persuade somebody to write a scipy-style f2py interface to generate _dotblas.so using the fblas?
>
> George Nurser.
There is code for that on netlib: http://www.netlib.org/blas/blast-forum/cblas.tgz
I used it myself for my C code before and it worked just fine.
Piotr
Piotr,

Thanks. I got numpy to work using the cblas & acml. Details are at the bottom of the email.

I then ran the bench.py tests on numpy [1 processor Opteron 1.8 GHz] and got slightly unexpected answers. The numpy times are given both linked to cblas+acml (rows marked "-- acml+cblas") and not linked (unmarked Array and Matrix rows). Neither numarray nor Numeric was linked to any BLAS.

python bench.py

Tests        x.T*y     x*y.T       A*x       A*B     A.T*x      half      2in2

Dimension: 5
Array       0.5700    0.1600    0.1200    0.1600    0.6200    0.4300    0.4800  -- acml+cblas
Matrix      3.1000    0.9300    0.4000    0.4600    0.6500    1.7000    2.6200  -- acml+cblas
Array       0.6400    0.1700    0.1500    0.1800    0.6100    0.3600    0.4000
Matrix      3.2300    0.6900    0.4100    0.4600    0.6700    1.4900    2.3400
NumArr      1.2100    2.8500    0.2700    2.8600    5.0000    4.1100    6.8300
Numeri      0.7300    0.1800    0.1600    0.2000    0.4100    0.3300    0.4300

Dimension: 50
Array       5.9200    0.8400    0.2900    6.9300    8.0900    2.3600    2.4500  -- acml+cblas
Matrix     30.5500    1.8500    0.6000    7.4500    0.9300    3.7100    4.6400  -- acml+cblas
Array       6.5900    2.7100    0.7500   25.3100    8.5000    0.5600    0.6100
Matrix     32.5200    3.2600    1.0200   25.6100    1.2900    1.7400    2.5900
NumArr     12.6600    3.9700    0.7400   27.7900    6.4900    4.5500    7.1900
Numeri      7.9700    1.5000    0.6500   24.2700    7.4200    0.6000    2.3200

Dimension: 500
Array       0.9800    3.2900    0.6100   65.0000   10.8600    2.3100    2.5500  -- acml+cblas
Matrix      3.5300    3.3500    0.6400   64.9300    0.6500    2.3300    2.6100  -- acml+cblas
Array       1.0900    4.5600    0.8300  589.0000   11.0700    0.1300    0.2600
Matrix      3.7000    4.5800    0.8400  593.7300    1.1700    0.1300    0.3200
NumArr      1.6700    3.3100    0.7700  417.5600    4.3900    0.8500    1.1000
Numeri      1.1900    3.5200    0.7800  559.8100    9.7400    0.8000    2.4100

-- acml+cblas does indeed speed up matrix multiplication (A*B) by about a factor of 10 at dimension 500, but
-- it doesn't really help vector dot products, and
-- it slows down the searching operations (half, 2in2) by roughly a factor of 10 at dimension 500.

Matrices are generally much slower than arrays, except for A.T*x, which is ~10x faster for matrices.

I also tried the goto blas library linked in with cblas. The results were similar, except for a slightly faster x.T*y, but it was trickier to get linked.
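The factor-of-10 figures quoted above can be checked directly from the reported numbers. This is a small sketch using only the two numpy Array rows for Dimension: 500; the variable and function names are mine, and nothing here reruns bench.py.

```python
# numpy Array rows for Dimension: 500, copied from the bench.py output above.
COLUMNS = ["x.T*y", "x*y.T", "A*x", "A*B", "A.T*x", "half", "2in2"]
WITH_BLAS = [0.98, 3.29, 0.61, 65.00, 10.86, 2.31, 2.55]      # linked to cblas + acml
WITHOUT_BLAS = [1.09, 4.56, 0.83, 589.00, 11.07, 0.13, 0.26]  # not linked


def speedups(linked, unlinked):
    """Return {column: unlinked_time / linked_time}; above 1 means the BLAS build is faster."""
    return {name: u / l for name, l, u in zip(COLUMNS, linked, unlinked)}


RATIOS = speedups(WITH_BLAS, WITHOUT_BLAS)
for name in COLUMNS:
    print(f"{name:6s} {RATIOS[name]:7.2f}x")
```

A ratio well above 1 (as for A*B) means the linked build wins; a ratio well below 1 (as for half and 2in2) means it loses.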
--George Nurser

------------------------------------------------------------------------

Making the cblas.a library was straightforward. I just changed the flags in Makefile.LINUX to:

    CFLAGS = -O3 -DADD_ -pthread -fno-strict-aliasing -m64 -msse2 -mfpmath=sse -march=opteron -fPIC
    FFLAGS = -Wall -fno-second-underscore -fPIC -O3 -funroll-loops -march=opteron -mmmx -msse2 -msse -m3dnow
    RANLIB = ranlib
    BLLIB = where libacml.so lives/libacml.so

then link Makefile.LINUX to Makefile.in and make.

The resulting cblas.a must then be moved or linked to libcblas.a in the *same* directory as the libacml.so. This directory then needs to be added to the $LD_LIBRARY_PATH if it is not a standard one.

I needed a site.cfg in numpy/numpy/distutils/site.cfg as follows:

    [blas]
    blas_libs = cblas, acml
    library_dirs = where libacml.so lives
    include_dirs = where cblas.h lives

    [lapack]
    language = f77
    lapack_libs = acml
    library_dirs = where libacml.so lives
    include_dirs = where acml *.h live

Then numpy and scipy both seem to build fine. numpy passes t=numpy.test(), scipy passes scipy.test(level=10).
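Since the original failure came from a [blas] section that listed acml without cblas, a quick sanity check on site.cfg before building may be useful. This is a hedged sketch using Python's standard configparser; the /opt/... paths are hypothetical stand-ins for the "where ... lives" placeholders above, and the helper names are mine, not part of numpy.distutils.

```python
import configparser

# Hypothetical paths standing in for "where libacml.so lives" and friends.
SITE_CFG = """\
[blas]
blas_libs = cblas, acml
library_dirs = /opt/acml/lib
include_dirs = /opt/cblas/include

[lapack]
language = f77
lapack_libs = acml
library_dirs = /opt/acml/lib
include_dirs = /opt/acml/include
"""


def read_libs(text):
    """Parse site.cfg text and return the BLAS and LAPACK library lists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    def split(value):
        return [item.strip() for item in value.split(",") if item.strip()]

    return {
        "blas": split(parser.get("blas", "blas_libs", fallback="")),
        "lapack": split(parser.get("lapack", "lapack_libs", fallback="")),
    }


def has_cblas(libs):
    """True if the [blas] section lists a library named cblas."""
    return "cblas" in libs["blas"]


LIBS = read_libs(SITE_CFG)
print(LIBS, has_cblas(LIBS))
```

This only inspects the configuration text; it does not confirm that libcblas.a actually exists in library_dirs or that the resulting _dotblas.so imports cleanly.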
participants (2)
- George Nurser
- Piotr Luszczek