Problems building NumPy with GotoBLAS
I'm having problems getting the GotoBLAS library (Nehalem-optimized BLAS, "http://www.tacc.utexas.edu/taccprojects/gotoblas2/") working properly under the Python NumPy package ("http://numpy.scipy.org/") on a quad-core Nehalem under FC10.
The command used to build the library is:
make BINARY=64 USE_THREAD=1 MAX_CPU_NUMBER=4
I'm limiting this to four cores, as I believe Hyper-Threading will slow it down (I've seen this happen with other scientific code). I'll benchmark later to see whether or not Hyper-Threading helps.
I built the library (it uses -fPIC), then installed it under /usr/local/lib64, and created the appropriate links:
# cp libgoto2_nehalempr1.13.a /usr/local/lib64
# cp libgoto2_nehalempr1.13.so /usr/local/lib64
# cd /usr/local/lib64
# ln -s libgoto2_nehalempr1.13.a libgoto2.a
# ln -s libgoto2_nehalempr1.13.so libgoto2.so
# ln -s libgoto2_nehalempr1.13.a libblas.a
# ln -s libgoto2_nehalempr1.13.so libblas.so
Without the libblas links, the NumPy configuration used the system default BLAS library (single-threaded NetLib under FC10); the NumPy build is set up to look for the NetLib and Atlas BLAS libraries.
I used the latest release of NumPy, with no site.cfg file and no existing NumPy directory ("rm -rf /usr/local/python2.6/lib/python2.6/site-packages/numpy"). The configuration step ("python setup.py config") appears to run OK, as do the build ("python setup.py build") and install ("python setup.py install") steps. The problem is that the benchmark below takes 8.5 seconds, which is what it took before I changed the library.
python -c "import numpy as N; a=N.random.randn(1000, 1000); N.dot(a, a)"
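When comparing runs of this benchmark, it can help to convert the wall-clock time into an approximate floating-point rate. The following is a minimal sketch, assuming Python 3 and the usual estimate of about 2*n**3 floating-point operations for an n x n matrix product. The helper names matmul_gflops and time_dot are made up for illustration and are not part of NumPy.

```python
import time


def matmul_gflops(n, seconds):
    """Approximate GFLOP/s for one n x n by n x n product (about 2*n**3 flops)."""
    return 2.0 * n ** 3 / seconds / 1e9


def time_dot(n=1000):
    """Time a single n x n dot product and return the elapsed seconds."""
    import numpy as np  # imported lazily so matmul_gflops works without NumPy

    a = np.random.randn(n, n)
    start = time.perf_counter()
    np.dot(a, a)
    return time.perf_counter() - start


if __name__ == "__main__":
    # 8.5 s is the time reported in this message for n = 1000.
    print("reported: %.2f GFLOP/s" % matmul_gflops(1000, 8.5))
    try:
        elapsed = time_dot(1000)
        print("measured: %.3f s, %.2f GFLOP/s" % (elapsed, matmul_gflops(1000, elapsed)))
    except ImportError:
        print("NumPy is not installed; skipping the measurement")
```

Unlike the one-liner, time_dot excludes interpreter start-up and random-number generation from the measured interval, so its figure will be somewhat lower than what "time python -c ..." reports.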
I expect I'm missing something really simple here, but I've spent >10 hours on it, and I have no idea as to what it could be. I've tried various permutations on the site.cfg file, all to no avail. I've also tried different names on the library, and different locations. I've even tried a set of symbolic links in /usr/local/lib64 for liblapack that point to libgoto.
Could someone offer some suggestions?
Thank you for your time.
Peter Ashford
On 08/17/2010 01:58 PM, ashford@whisperpc.com wrote:
I built the library (it uses -fPIC), then installed it under /usr/local/lib64, and created the appropriate links:
# cp libgoto2_nehalempr1.13.a /usr/local/lib64
# cp libgoto2_nehalempr1.13.so /usr/local/lib64
# cd /usr/local/lib64
# ln -s libgoto2_nehalempr1.13.a libgoto2.a
# ln -s libgoto2_nehalempr1.13.so libgoto2.so
# ln -s libgoto2_nehalempr1.13.a libblas.a
# ln -s libgoto2_nehalempr1.13.so libblas.so
The unversioned .so files are only used at link time; they are generally not the ones used at runtime (the fully versioned file, e.g. .so.1.2.3, is). Exactly which version is used depends on your installation, but I would advise against creating those symlinks. You should instead link the GOTO library to numpy specifically, by customizing site.cfg.
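To see which file an unversioned library name ends up pointing at, one can follow the symlink chain. The sketch below only illustrates that lookup with a throwaway directory and an empty placeholder file; the name libexample is hypothetical, and the code does not reproduce what the dynamic loader does at runtime.

```python
import os
import tempfile


def resolve_library(path):
    """Return the real file a library path points to after following symlinks."""
    return os.path.realpath(path)


def demo():
    """Create libexample.so -> libexample.so.1.2.3 in a temp dir and resolve it."""
    with tempfile.TemporaryDirectory() as lib_dir:
        versioned = os.path.join(lib_dir, "libexample.so.1.2.3")
        open(versioned, "w").close()  # empty stand-in for a real shared library
        link = os.path.join(lib_dir, "libexample.so")
        os.symlink("libexample.so.1.2.3", link)
        return os.path.basename(resolve_library(link))


if __name__ == "__main__":
    print(demo())
```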
cheers,
David
Peter,
Please find below a script that will build numpy using a relevant site.cfg for your configuration (you need to update GOTODIR, LAPACKDIR and PYTHONDIR):
#!/bin/sh

#BLAS/LAPACK configuration file
echo "[blas]" > ./site.cfg
echo "library_dirs = GOTODIR" >> ./site.cfg
echo "blas_libs = goto2_nehalempr1.13" >> ./site.cfg
echo "[lapack]" >> ./site.cfg
echo "library_dirs = LAPACKDIR" >> ./site.cfg
echo "lapack_libs = lapack" >> ./site.cfg

#compilation variables
export CC=gcc
export F77=gfortran
export F90=gfortran
export F95=gfortran
export LDFLAGS="-shared -Wl,-rpath=\'\$ORIGIN/../../../..\'"
export CFLAGS="-O1 -pthread"
export FFLAGS="-O2"

#build
python setup.py config
python setup.py build
python setup.py install

#copy site.cfg
cp ./site.cfg PYTHONDIR/lib/python2.6/site-packages/numpy/distutils/.

# Test
PWD=`pwd`
cd $HOME
python -c "import numpy ; numpy.test()"
cd $PWD
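The site.cfg part of the script can also be generated from Python, which avoids hand-editing placeholders. This is only a sketch, assuming Python 3; the function name make_site_cfg, the default library name goto2 (which presumes a libgoto2.so link exists) and the example directories are hypothetical, while the section and key names are the ones used in the script above.

```python
import configparser
import io


def make_site_cfg(goto_dir, lapack_dir, blas_lib="goto2", lapack_lib="lapack"):
    """Return site.cfg text with [blas] and [lapack] sections."""
    cfg = configparser.ConfigParser()
    cfg["blas"] = {"library_dirs": goto_dir, "blas_libs": blas_lib}
    cfg["lapack"] = {"library_dirs": lapack_dir, "lapack_libs": lapack_lib}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical directories; substitute the real install locations.
    print(make_site_cfg("/opt/gotoblas2/lib", "/opt/lapack/lib"))
```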
Regards, Eloi
On Tuesday 17 August 2010 07:14:29 ashford@whisperpc.com wrote:
David,
You should instead specificaly link the GOTO library to numpy, by customizing the site.cfg,
That was one of the many things I tried before I came to the list.
Peter Ashford
NumPyDiscussion mailing list NumPyDiscussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpydiscussion
Eloi,
Please find below a script that will build numpy using a relevant site.cfg for your configuration (you need to update GOTODIR, LAPACKDIR and PYTHONDIR):
#copy site.cfg
cp ./site.cfg PYTHONDIR/lib/python2.6/site-packages/numpy/distutils/.
I believe this needs a $ prior to PYTHONDIR.
I tried this, and the benchmark still came in at 8.5 s.
Any ideas? How long should the following benchmark take on a Core i7 930, with Atlas or Goto?
time python -c "import numpy as N; a=N.random.randn(1000, 1000); N.dot(a, a)"
Thank you.
Peter Ashford
On 08/18/2010 07:39 AM, ashford@whisperpc.com wrote:
Any ideas? How long should the following benchmark take on a Core i7 930, with Atlas or Goto?
time python -c "import numpy as N; a=N.random.randn(1000, 1000); N.dot(a, a)"
Do you have a _dotblas.so file? AFAIK we only support _dotblas linked against atlas, which means goto won't be used in that case. To check which libraries are used by your extensions, you should run ldd on the .so files (for example, ldd .../numpy/linalg/lapack_lite.so).
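The ldd check can be scripted by filtering its output for BLAS-like library names. The sketch below parses ldd output that has already been captured as text; the function name, the hint list and the sample lines are illustrative assumptions, not output from the systems discussed in this thread.

```python
import re

# Substrings that suggest a BLAS/LAPACK implementation; an illustrative list only.
BLAS_HINTS = ("goto", "atlas", "blas", "lapack")

# Matches lines such as "libfoo.so.3 => /usr/lib64/libfoo.so.3 (0x...)".
LDD_LINE = re.compile(r"\s*(\S+)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def blas_libs_from_ldd(ldd_output):
    """Return (name, target) pairs for BLAS-like libraries found in ldd output."""
    found = []
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and any(hint in match.group(1).lower() for hint in BLAS_HINTS):
            found.append((match.group(1), match.group(2)))
    return found


# Made-up sample in the usual ldd layout.
SAMPLE = """
    libgoto2.so => /usr/local/lib64/libgoto2.so (0x00007f2c1a000000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f2c19c00000)
"""

if __name__ == "__main__":
    for name, target in blas_libs_from_ldd(SAMPLE):
        print(name, "->", target)
```

If the list comes back empty for lapack_lite.so or _dotblas.so, the extension was most likely built against NumPy's bundled fallback sources rather than an external BLAS.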
cheers,
David
Peter,
As mentioned, you need to replace the values of GOTODIR, LAPACKDIR and PYTHONDIR in this script with the ones matching your environment.
Eloi
Hi all,
I wonder if Peter finally got Gotoblas working with numpy. I am trying with gotoblas 1.13 installed in the same way:
$ ls -R
.:
include  lib

./include:
goto

./include/goto:
blaswrap.h  cblas.h  clapack.h  f2c.h

./lib:
libgoto2.a  libgoto2_nehalempr1.13.a  libgoto2_nehalempr1.13.so  libgoto2.so
and numpy 1.5.1 with this site.cfg
[DEFAULT]
library_dirs = /usr/local/gcc/4.5.2/gcc/lib64:/usr/local/gcc/4.5.2/gcc/lib:/usr/local/gcc/4.5.2/gcc/lib32:/usr/local/gcc/4.5.2/gotoblas2/1.13/lib:/usr/local/gcc/4.5.2/suiteSparse/3.6.0/lib:/usr/local/gcc/4.5.2/fftw/3.2.2/lib
include_dirs = /usr/local/gcc/4.5.2/gcc/include:/usr/local/gcc/4.5.2/gotoblas2/1.13/include/goto:/usr/local/gcc/4.5.2/suiteSparse/3.6.0/include:/usr/local/gcc/4.5.2/fftw/3.2.2/include
search_static_first = 1
[blas_opt]
libraries = goto2
language = fortran
[lapack_opt]
libraries = goto2
language = fortran
[amd]
amd_libs = amd
[umfpack]
umfpack_libs = umfpack
[fftw]
libraries = fftw3
(I also tried without "_opt" and "language = fortran".) I used goto2 for lapack because I read that lapack should be included in libgoto (in any case, nothing changes when I use "lapack"). I am quite sure the system is not using my goto/lapack stuff, since every time I build I get:
building extension "numpy.numarray._capi" sources
building extension "numpy.fft.fftpack_lite" sources
building extension "numpy.linalg.lapack_lite" sources
creating build/src.linux-x86_64-2.7/numpy/linalg
### Warning: Using unoptimized lapack ###
adding 'numpy/linalg/lapack_litemodule.c' to sources.
adding 'numpy/linalg/python_xerbla.c' to sources.
adding 'numpy/linalg/zlapack_lite.c' to sources.
adding 'numpy/linalg/dlapack_lite.c' to sources.
adding 'numpy/linalg/blas_lite.c' to sources.
adding 'numpy/linalg/dlamch.c' to sources.
adding 'numpy/linalg/f2c_lite.c' to sources.
building extension "numpy.random.mtrand" sources
creating build/src.linux-x86_64-2.7/numpy/random
C compiler: /usr/local/gcc/4.5.2//gcc/bin/gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -O1 -pthread -fPIC -march=native -mtune=native -I/usr/local/gcc/4.5.2//gcc/include -I/usr/local/gcc/4.5.2//suiteSparse/3.6.0/include -I/usr/local/gcc/4.5.2//fftw/3.2.2/include -fPIC
during numpy installation (which ends successfully). Moreover, I cannot see any -lgoto2 as I would have expected. Incidentally, I cannot see -lamd, -lumfpack or -lfftw3 (or any reference to amd, umfpack or fftw3) either, although there seems to be something to handle them in system_info.py. The failure is so complete that I must have made some big mistake, but I can't correct my site.cfg even after searching the internet. This seems to be one of the major discussions on this topic, so I am asking here for some help, please. Is the problem related to site.cfg or to the gotoblas2 installation? Is it true that gotoblas2 hosts a full lapack inside?
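The two symptoms described here (the unoptimized-lapack warning and the missing -lgoto2 flag) can be checked mechanically on a saved build log. This is a rough sketch; check_build_log is a hypothetical helper, and it assumes the log was captured to text, for example by redirecting the output of "python setup.py build".

```python
def check_build_log(log_text, library="goto2"):
    """Report whether a build log warns about unoptimized LAPACK and links -l<library>."""
    lines = log_text.splitlines()
    return {
        # The warning text is the one numpy prints in the log quoted above.
        "unoptimized_lapack": any("Using unoptimized lapack" in line for line in lines),
        # Look for the linker flag as a separate whitespace-delimited token.
        "links_library": any(("-l" + library) in line.split() for line in lines),
    }


if __name__ == "__main__":
    sample = "### Warning: Using unoptimized lapack ###\ngcc -shared foo.o -o foo.so\n"
    print(check_build_log(sample))
```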
thank you very much!
giuseppe
I'm no expert, but I just pulled off the scipy+numpy+GotoBLAS2 installation.
From what I gather, the Makefile for libgoto2 downloads and compiles the generic lapack from netlib. It also wraps lapack into libgoto2.so/.a. I believe the idea is as long as the BLAS implementation is fast(TM), the lapack performance will be good.
To wit*, what I did was to tell numpy where libgoto2 was:
env BLAS=/path/to/libgoto2.so python setup.py install
Scipy also wants the path to lapack, which is wrapped inside libgoto2:
env BLAS=/path/to/libgoto2.so LAPACK=/path/to/libgoto2.so python setup.py install
Afterwards, I added the path to LD_LIBRARY_PATH. This was on a linux cluster, if that matters. At any rate, I can testify that it was not a big job to get numpy and scipy working with goto blas.
Good luck, Paul.
*) I have notes on this on a different computer, but not available right now.
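The environment-variable approach in this message can be wrapped in a small Python helper when the build is driven from a script. This is a sketch only: build_env is a made-up name, /path/to/libgoto2.so is the same placeholder used in the message, and the build command itself is left as a comment rather than executed.

```python
import os


def build_env(libgoto_path, base_env=None):
    """Return a copy of the environment with BLAS and LAPACK pointing at libgoto2."""
    env = dict(os.environ if base_env is None else base_env)
    env["BLAS"] = libgoto_path
    env["LAPACK"] = libgoto_path
    return env


if __name__ == "__main__":
    env = build_env("/path/to/libgoto2.so")
    # To actually build, one could pass this environment to subprocess, e.g.
    #   subprocess.check_call([sys.executable, "setup.py", "install"], env=env)
    print(env["BLAS"], env["LAPACK"])
```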
On Tue, Mar 22, 2011 at 10:13 AM, Giuseppe Aprea <giuseppe.aprea@gmail.com> wrote:
Hi all,
I wonder if Peter finally got Gotoblas working with numpy. [...]
Den 22.03.2011 23:18, skrev Paul Anton Letnes:
I'm no expert, but I just pulled off the scipy+numpy+GotoBLAS2 installation. From what I gather, the Makefile for libgoto2 downloads and compiles the generic lapack from netlib. It also wraps lapack into libgoto2.so/.a. I believe the idea is as long as the BLAS implementation is fast(TM), the lapack performance will be good.
GotoBLAS replaces a few LAPACK routines where BLAS optimization is not sufficient. Last time I built GotoBLAS2 it came with Netlib LAPACK sources in the tarball.
What really matters for LAPACK performance is not even BLAS, but the general matrix multiply routines *GEMM in BLAS. That is why AMD has made a GPU version of ACML where matrix multiplication in BLAS can be deferred to the ATI GPU.
Sturla
Dear Paul Anton,
thanks a lot for your suggestion. I was also successful with
[blas]
libraries = blas
library_dirs = $PREFIX/gotoblas2/lib
[lapack]
libraries = lapack
library_dirs = $PREFIX/gotoblas2/lib
but I had to compile clapack as a shared library and place a symbolic link:
ln -s $PREFIX/gotoblas2/lib/libgoto2.so $PREFIX/gotoblas2/lib/libblas.so
I think your solution works better (the only difference is that I had to define the LAPACK variable before installing numpy, to avoid the warning about a missing optimized lapack library). But what about the source directories? Do you also set BLAS_SRC and LAPACK_SRC, and if so, how?
cheers
On 08/17/2010 08:43 PM, Eloi Gaudry wrote:
#copy site.cfg
cp ./site.cfg PYTHONDIR/lib/python2.6/site-packages/numpy/distutils/.
This should be PYTHONPATH, not PYTHONDIR. Also, on 64-bit systems, you need -fPIC in every *FLAGS variable.
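A quick way to confirm the -fPIC remark before starting a build is to scan the flag variables. The sketch below assumes that "*FLAGS" refers to CFLAGS, FFLAGS and LDFLAGS (the three set in the script); that list and the helper name vars_missing_fpic are assumptions for illustration.

```python
import os

# Assumed set of flag variables; extend as needed for a given build.
FLAG_VARS = ("CFLAGS", "FFLAGS", "LDFLAGS")


def vars_missing_fpic(env=None, names=FLAG_VARS):
    """Return the names of flag variables that are set but do not contain -fPIC."""
    env = os.environ if env is None else env
    return [name for name in names if name in env and "-fPIC" not in env[name].split()]


if __name__ == "__main__":
    # Values taken from the script quoted above, after restoring the option dashes.
    script_env = {"CFLAGS": "-O1 -pthread", "FFLAGS": "-O2"}
    print(vars_missing_fpic(script_env))
```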
cheers,
David
participants (6)
ashford@whisperpc.com
David
Eloi Gaudry
Giuseppe Aprea
Paul Anton Letnes
Sturla Molden