[Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows)

Dr. Michael Lehn michael.lehn at uni-ulm.de
Fri Jul 11 02:21:29 EDT 2014


Am 29.04.2014 um 02:01 schrieb Nathaniel Smith <njs at pobox.com>:

> On Tue, Apr 29, 2014 at 12:52 AM, Sturla Molden <sturla.molden at gmail.com> wrote:
>> On 29/04/14 01:30, Nathaniel Smith wrote:
>> 
>>> I finally read this paper:
>>> 
>>>    http://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev2.pdf
>>> 
>>> and I have to say that I'm no longer so convinced that OpenBLAS is the
>>> right starting point.
>> 
>> I think OpenBLAS in the long run is doomed as an OSS project. Having
>> huge portions of the source in assembly is not sustainable in 2014.
>> OpenBLAS (like GotoBLAS2 before it) runs a high risk of becoming
>> abandonware.
> 
> Have you read the paper I linked? I really recommend it. BLIS is
> apparently 95% straight-up-C, plus a slot where you stick in a tiny
> CPU-specific super-optimized kernel [1]. So this localizes the nasty
> stuff to one tiny function, plus most of the kernels that have been
> written so far do in fact use intrinsics [2].
> 
> [1] https://code.google.com/p/blis/wiki/KernelsHowTo
> [2] https://code.google.com/p/blis/wiki/HardwareSupport
> 

This summer I taught an undergraduate class, "Software Basics on HPC".  Of course, one topic
was the efficient implementation of the matrix-matrix product GEMM.  The BLIS paper [1] is a great
source for that.

In my opinion, hands-on experience is very important for actually understanding these
concepts.  In particular, that means we implemented our own matrix-matrix product.  The pure C
(ANSI C) implementation has fewer than 450 lines of code.  The code consists of several functions, and
the students developed these functions one by one, from one assignment to the next.  You can see the
result here:

	http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/page02/index.html#toc4
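
For readers who do not want to follow the link, here is a minimal sketch of the layered structure
the BLIS paper describes: blocking loops, packing routines, a macro kernel and a small micro kernel.
It is not the course code.  The function names, the tiny block sizes, the row-major storage and the
restriction to C = C + A*B are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative block sizes (assumed, untuned). MC and NC must be
 * multiples of MR and NR so that packed panels fit the buffers. */
enum { MC = 8, KC = 8, NC = 8, MR = 4, NR = 4 };

static double Abuf[MC * KC];   /* packed block of A */
static double Bbuf[KC * NC];   /* packed block of B */

/* Pack an mc x kc block of row-major A into panels of MR rows.
 * Inside a panel, the MR entries of one column are contiguous.
 * Rows beyond mc are filled with zeros. */
static void pack_A(int mc, int kc, const double *A, int lda, double *buf)
{
    int i, j, l;
    for (i = 0; i < mc; i += MR)
        for (l = 0; l < kc; ++l)
            for (j = 0; j < MR; ++j)
                *buf++ = (i + j < mc) ? A[(i + j) * lda + l] : 0.0;
}

/* Pack a kc x nc block of row-major B into panels of NR columns.
 * Inside a panel, the NR entries of one row are contiguous.
 * Columns beyond nc are filled with zeros. */
static void pack_B(int kc, int nc, const double *B, int ldb, double *buf)
{
    int i, j, l;
    for (j = 0; j < nc; j += NR)
        for (l = 0; l < kc; ++l)
            for (i = 0; i < NR; ++i)
                *buf++ = (j + i < nc) ? B[l * ldb + j + i] : 0.0;
}

/* Micro kernel: multiply one packed A panel by one packed B panel into a
 * local MR x NR tile, then add its valid mr x nr part to C. This is the
 * one function that would be replaced by a CPU-specific version. */
static void micro_kernel(int kc, const double *a, const double *b,
                         double *C, int ldc, int mr, int nr)
{
    double AB[MR * NR] = {0.0};
    int i, j, l;
    for (l = 0; l < kc; ++l)
        for (i = 0; i < MR; ++i)
            for (j = 0; j < NR; ++j)
                AB[i * NR + j] += a[l * MR + i] * b[l * NR + j];
    for (i = 0; i < mr; ++i)
        for (j = 0; j < nr; ++j)
            C[i * ldc + j] += AB[i * NR + j];
}

/* Macro kernel: run the micro kernel over all panel pairs of the
 * currently packed blocks. */
static void macro_kernel(int mc, int nc, int kc, double *C, int ldc)
{
    int i, j;
    for (j = 0; j < nc; j += NR)
        for (i = 0; i < mc; i += MR) {
            int mr = (mc - i < MR) ? mc - i : MR;
            int nr = (nc - j < NR) ? nc - j : NR;
            micro_kernel(kc, &Abuf[i * kc], &Bbuf[j * kc],
                         &C[i * ldc + j], ldc, mr, nr);
        }
}

/* C += A*B with A (m x k), B (k x n), C (m x n), all row-major. */
void gemm_blocked(int m, int n, int k,
                  const double *A, const double *B, double *C)
{
    int i, j, l;
    assert(m >= 0 && n >= 0 && k >= 0);
    for (j = 0; j < n; j += NC) {
        int nc = (n - j < NC) ? n - j : NC;
        for (l = 0; l < k; l += KC) {
            int kc = (k - l < KC) ? k - l : KC;
            pack_B(kc, nc, &B[l * n + j], n, Bbuf);
            for (i = 0; i < m; i += MC) {
                int mc = (m - i < MC) ? m - i : MC;
                pack_A(mc, kc, &A[i * k + l], k, Abuf);
                macro_kernel(mc, nc, kc, &C[i * n + j], n);
            }
        }
    }
}

/* Plain triple loop, used only as a reference for checking results. */
void gemm_ref(int m, int n, int k,
              const double *A, const double *B, double *C)
{
    int i, j, l;
    for (i = 0; i < m; ++i)
        for (j = 0; j < n; ++j)
            for (l = 0; l < k; ++l)
                C[i * n + j] += A[i * k + l] * B[l * n + j];
}
```

Calling gemm_blocked and gemm_ref on the same inputs should give matching results (exactly so for
small integer-valued entries).  Real block sizes would be chosen to match the cache sizes of the
target machine; the values above are only large enough to exercise all the edge cases.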

Other assignments were about improving the micro kernel with SSE instructions.  You can browse
through the pages to see how we did this step by step.
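
To give an idea of what such a micro kernel can look like, here is a hedged sketch of a 4x4 double
precision micro kernel written with SSE2 intrinsics from <emmintrin.h>.  It is not taken from the
course pages, which may use different instructions or a different register layout.  The name
micro_kernel_sse2, the panel layout (MR entries of A and NR entries of B contiguous per step) and
the row-major C tile are assumptions.  A scalar fallback is used when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

enum { MR = 4, NR = 4 };   /* tile size handled by the micro kernel */

/* C (MR x NR, row-major, leading dimension ldc) += a * b, where
 * a holds kc groups of MR entries (one column of the A panel each) and
 * b holds kc groups of NR entries (one row of the B panel each). */
void micro_kernel_sse2(int kc, const double *a, const double *b,
                       double *C, int ldc)
{
#ifdef __SSE2__
    __m128d c[MR][NR / 2];   /* each register holds two entries of a row */
    int i, l;
    assert(kc >= 0);
    for (i = 0; i < MR; ++i) {
        c[i][0] = _mm_setzero_pd();
        c[i][1] = _mm_setzero_pd();
    }
    for (l = 0; l < kc; ++l) {
        /* one row of the B panel, as two pairs of doubles */
        __m128d b01 = _mm_loadu_pd(&b[l * NR]);
        __m128d b23 = _mm_loadu_pd(&b[l * NR + 2]);
        for (i = 0; i < MR; ++i) {
            /* broadcast one entry of A and update row i of the tile */
            __m128d ai = _mm_set1_pd(a[l * MR + i]);
            c[i][0] = _mm_add_pd(c[i][0], _mm_mul_pd(ai, b01));
            c[i][1] = _mm_add_pd(c[i][1], _mm_mul_pd(ai, b23));
        }
    }
    for (i = 0; i < MR; ++i) {
        double *row = &C[i * ldc];
        _mm_storeu_pd(row, _mm_add_pd(_mm_loadu_pd(row), c[i][0]));
        _mm_storeu_pd(row + 2, _mm_add_pd(_mm_loadu_pd(row + 2), c[i][1]));
    }
#else
    /* Scalar fallback with the same semantics. */
    int i, j, l;
    assert(kc >= 0);
    for (l = 0; l < kc; ++l)
        for (i = 0; i < MR; ++i)
            for (j = 0; j < NR; ++j)
                C[i * ldc + j] += a[l * MR + i] * b[l * NR + j];
#endif
}
```

Unaligned loads and stores are used so that the sketch works with any buffer; a tuned kernel would
typically align the packed panels and use the aligned variants instead.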

Please understand that this course material is still a work in progress and needs some polish here and
there.  Still, it could be useful for others and even serve as a starting point for a simple BLAS implementation.

Cheers,

Michael


[1]: http://www.cs.utexas.edu/users/flame/pubs/BLISTOMSrev2.pdf


-----------------------------------------------------------------------------------
Dr. Michael Lehn
University of Ulm, Institute for Numerical Mathematics
Helmholtzstr. 20
D-89069 Ulm, Germany
Phone: (+49) 731 50-23534, Fax: (+49) 731 50-23548
-----------------------------------------------------------------------------------

