[Numpy-discussion] Numpy, BLAS, and CBLAS questions
eric at ericmart.in
Mon Jul 13 02:40:47 EDT 2015
I've been playing around recently with linking Numpy to different BLAS
implementations, particularly Eigen and ACML 6 (with openCL support!). I've
successfully linked Numpy to both of these libraries, but I found the
process overly difficult and confusing. I'm interested in either writing a
blog post or adding to Numpy docs to make it easier for others, but I'm
hoping to clear up some of my own confusion first.
I'll start with a rough outline of what I did to link Numpy with Eigen and
ACML 6:
Modify numpy/core/setup.py. Change
- blas_info = get_info('blas_opt', 0)
+ blas_info = get_info('blas', 0)
and change get_dotblas_sources to
def get_dotblas_sources(ext, build_dir):
(to remove the check for ('NO_ATLAS_INFO', 1)).
Compile CBLAS with BLLIB in Makefile.in pointing to the shared object for
your BLAS. Make a shared object (not a static library) out of CBLAS. This
requires adding -fPIC to the CFLAGS and FFLAGS.
*Question: Is it a bug that I couldn't get Numpy working with a static
CBLAS library and a shared object BLAS?*
Modify site.cfg at the top level of the Numpy directory with
library_dirs = /path/to/directory/containing/shared_objects
include_dirs = /path/to/headers/from/CBLAS
blas_libs = cblas, your_blas_lib
where the headers from CBLAS are cblas_f77.h and cblas.h. For the
blas_libs variable, the library name "foo" loads libfoo.so, so with the
above example the libraries should be called libcblas.so and
libyour_blas_lib.so and lie in the listed library_dirs directory.
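The naming rule above can be sketched in a few lines of Python. This is only an illustration, assuming Linux-style "lib<name>.so" file names; the helper names expected_shared_objects and missing_shared_objects are hypothetical and are not part of numpy.distutils.

```python
import os


def expected_shared_objects(blas_libs, library_dir):
    """Map a site.cfg-style blas_libs value to the .so paths it implies.

    Assumes the Linux convention that library "foo" lives in "libfoo.so".
    """
    names = [name.strip() for name in blas_libs.split(",") if name.strip()]
    return [os.path.join(library_dir, "lib" + name + ".so") for name in names]


def missing_shared_objects(blas_libs, library_dir):
    """Return the implied .so paths that do not exist on disk."""
    return [path
            for path in expected_shared_objects(blas_libs, library_dir)
            if not os.path.isfile(path)]


if __name__ == "__main__":
    # Placeholder values taken from the site.cfg fragment above.
    lib_dir = "/path/to/directory/containing/shared_objects"
    for path in missing_shared_objects("cblas, your_blas_lib", lib_dir):
        print("not found:", path)
```

Running a check like this before the build can catch a misnamed or misplaced shared object early, rather than at link time.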
Finally, run "python setup.py build" from the root of the Numpy codebase
(same directory that site.cfg lives in).
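One way to confirm which BLAS the resulting build picked up is numpy.show_config(), which prints the build-time library information. The snippet below is a minimal sketch; the wrapper name report_blas_config is hypothetical, and it assumes the freshly built Numpy is importable and that Python is started from outside the Numpy source tree (importing Numpy from inside its source directory fails).

```python
import importlib


def report_blas_config():
    """Print Numpy's build configuration and return its version string.

    Returns None if Numpy cannot be imported in this environment.
    """
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return None
    # show_config() prints the BLAS/LAPACK details recorded at build time.
    np.show_config()
    return np.__version__


if __name__ == "__main__":
    version = report_blas_config()
    print("numpy version:", version)
```

If the library_dirs and libraries from site.cfg show up in the printed output, the build used the intended BLAS rather than a system default.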
*My questions about this:*
What does CBLAS do, and why/when is it necessary? For both ACML 6 and
Eigen, I could not link directly to the library but could with CBLAS. My
understanding is that the BLAS interface is a Fortran ABI, and that CBLAS
provides a C ABI (cdecl?) to BLAS.
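Besides the calling convention, one concrete thing a CBLAS layer does is accept row-major (C-style) arrays while the Fortran routines assume column-major storage. The toy sketch below illustrates that idea in pure Python on flat lists. It is not real BLAS code, and both function names are made up for the illustration. It relies on the fact that a row-major matrix occupies the same memory as its column-major transpose, so a row-major C = A*B can be computed as the column-major product C^T = B^T * A^T by swapping the operands.

```python
def fortran_style_dgemm(m, n, k, a, lda, b, ldb):
    """Toy column-major C = A @ B on flat lists, as a Fortran BLAS sees memory.

    A is m x k with leading dimension lda, B is k x n with leading
    dimension ldb; the result is m x n with leading dimension m.
    """
    c = [0.0] * (m * n)
    for j in range(n):
        for i in range(m):
            total = 0.0
            for p in range(k):
                total += a[i + p * lda] * b[p + j * ldb]
            c[i + j * m] = total
    return c


def cblas_style_row_major_dgemm(m, n, k, a, b):
    """Toy row-major C = A @ B that delegates to the column-major routine.

    Row-major A (m x k) is read as column-major A^T (k x m), and likewise
    for B, so the operands and dimensions are swapped in the call.
    """
    return fortran_style_dgemm(n, m, k, b, n, a, k)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0,
         4.0, 5.0, 6.0]      # 2 x 3, row-major
    b = [7.0, 8.0,
         9.0, 10.0,
         11.0, 12.0]         # 3 x 2, row-major
    print(cblas_style_row_major_dgemm(2, 2, 3, a, b))
```

A real Fortran BLAS also takes every argument by reference and exports symbols such as dgemm_, which is the part of the ABI difference this sketch does not model.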
Why can't the user link Numpy directly to the Fortran ABI? How are ATLAS
and openBLAS handled?
My procedure questions:
Is the installation procedure I outlined above reasonable, or does it
contain steps that could/should be removed? Having to edit Numpy source
seems sketchy to me. I largely came up with this procedure by looking up
tutorials online and by trial and error. I don't want to write
documentation that encourages people to do something in a non-optimal way,
so if there is a better way to do this, please let me know.
*Some final thoughts:*
Although I linked properly to the library, I discovered ACML 6 didn't work
at all on my computer (the ACML6 example code didn't even work). This is
very disappointing, as openCL support in ACML 6 + integrated GPU on laptop +
openCL on Intel integrated GPUs on Linux through beignet seemed like a
potentially very promising performance boost for all of us running Numpy on
laptops.
Eigen has excellent performance. On my i5-5200U (Broadwell) CPU, I found
Eigen BLAS compiled with AVX and FMA instructions took 3.93s to multiply
two 4000x4000 double-precision matrices with a single thread, while my
install of Numpy from Ubuntu took 9s (and used 4 threads on my 2 cores). My
Ubuntu Numpy appears to be built against "libblas", which I think is the
reference BLAS implementation.
Eigen gave about 32 GFLOPS of 64-bit performance from a single laptop core,
which I find quite impressive!
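The GFLOPS figure follows from the timings above under the usual convention that multiplying two n x n matrices costs about 2*n^3 floating-point operations (n^3 multiplies plus roughly n^3 adds). That operation count is an assumption of this sketch, and the function name gemm_gflops is made up for illustration.

```python
def gemm_gflops(n, seconds):
    """Estimate GFLOPS for an n x n by n x n matrix multiply.

    Uses the conventional 2 * n**3 operation count, which is an
    approximation (the exact count is n**2 * (2*n - 1)).
    """
    flops = 2.0 * n ** 3
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Timings quoted above for 4000x4000 double-precision matrices.
    print("Eigen, 1 thread:   %.2f GFLOPS" % gemm_gflops(4000, 3.93))
    print("Ubuntu libblas:    %.2f GFLOPS" % gemm_gflops(4000, 9.0))
```

With these inputs the Eigen run works out to roughly 32.6 GFLOPS and the Ubuntu build to roughly 14.2 GFLOPS.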
Thanks for any feedback and response to the questions!