On Tue, Apr 29, 2014 at 1:10 AM, Julian Taylor wrote:
On 29.04.2014 02:05, Matthew Brett wrote:
On Mon, Apr 28, 2014 at 4:30 PM, Nathaniel Smith wrote:
It would be really interesting if someone were to try hacking simple runtime CPU detection into BLIS and see how far you could get -- right now they do kernel selection via the C preprocessor, but hacking in some function pointer thing instead would not be that hard I think. A maintainable library that builds on Linux/OSX/Windows, gets competitive performance on last-but-one generation x86-64 CPUs, and gets better-than-reference-BLAS performance everywhere else, would be a very very compelling product that I bet would quickly attract the necessary attention to make it competitive on all CPUs.
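As a rough illustration of the "function pointer thing" described above, here is a minimal sketch in C. It is not BLIS code: the names axpy_kernel_fn, axpy_generic, axpy_avx, select_axpy_kernel and axpy are all hypothetical, the "AVX" kernel is only an unrolled stand-in for a real tuned kernel, and CPU detection assumes the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports on x86. On other compilers or architectures the sketch simply falls back to the generic kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature: y[i] += alpha * x[i] for i in [0, n). */
typedef void (*axpy_kernel_fn)(size_t n, double alpha,
                               const double *x, double *y);

/* Portable fallback kernel; works on any CPU. */
static void axpy_generic(size_t n, double alpha,
                         const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Stand-in for a tuned kernel. A real one would use AVX intrinsics or
 * assembly; here it is only unrolled by four so the sketch stays portable. */
static void axpy_avx(size_t n, double alpha,
                     const double *x, double *y)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     += alpha * x[i];
        y[i + 1] += alpha * x[i + 1];
        y[i + 2] += alpha * x[i + 2];
        y[i + 3] += alpha * x[i + 3];
    }
    for (; i < n; ++i)          /* remainder loop */
        y[i] += alpha * x[i];
}

/* Choose a kernel at run time instead of via the C preprocessor.
 * Assumption: GCC/Clang x86 builtins are available; otherwise fall back. */
static axpy_kernel_fn select_axpy_kernel(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return axpy_avx;
#endif
    return axpy_generic;
}

/* Cached choice; NULL until the first call. */
static axpy_kernel_fn axpy_kernel = NULL;

/* Public entry point: resolves the kernel lazily, then dispatches.
 * Note: this lazy initialisation is not thread-safe as written. */
void axpy(size_t n, double alpha, const double *x, double *y)
{
    if (axpy_kernel == NULL)
        axpy_kernel = select_axpy_kernel();
    assert(axpy_kernel != NULL);
    axpy_kernel(n, alpha, x, y);
}
```

Callers only ever see axpy(); which kernel runs is decided once, on the machine where the library is loaded, so a single binary can carry several kernels. A real library would also need a thread-safe initialisation step and one such pointer per kernel.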
I wonder - is there anyone who might be able to do this work, if we found funding for a couple of months to do it?
On scipy-dev an interesting BLIS-related message was posted recently: http://mail.scipy.org/pipermail/scipy-dev/2014-April/019790.html http://www.cs.utexas.edu/~flame/web/
It seems some work on integrating BLIS into a proper BLAS/LAPACK library has already been done.
BLIS itself ships with a BLAS-compatible interface that you can use with reference LAPACK (just like OpenBLAS). I wouldn't be surprised if there are various annoying Fortran/C ABI hacks remaining to be worked out, but at least in principle BLIS is a BLAS. The problem is that this BLAS has no threading, no runtime configuration (you have to edit a config file and recompile to change CPU support), and no Windows build goop. Basically the authors seem to still be thinking of a BLAS library's target audience as being supercomputer sysadmins, not naive end-users.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org