[Numpy-discussion] inplace matrix multiplication

Wed Apr 29 02:19:17 EDT 2009

On 28-Apr-09, at 10:56 AM, Dan Goodman wrote:

> Can anyone explain the results below? It seems that for small matrices
> dot(x,y) is outperforming dgemm(1,x,y,0,y,overwrite_c=1), but for larger
> matrices the latter is winning. In principle it seems like I ought to be
> able to always do better with inplace rather than making copies?

It sounds as though there is some per-call overhead associated with
calling dgemm directly that exceeds the cost of copying for small
matrices, but that is outweighed on large matrices by the time saved by
not allocating an output array.
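
One way to check this would be to time both calls over a range of sizes
and see where they cross over. The following is only a rough sketch, not
a reproduction of the original benchmark: time_products is a made-up
helper name, the out= variant of numpy.dot is included purely as another
way to skip the allocation, and the dgemm timing assumes SciPy is
installed and exposes scipy.linalg.blas.dgemm. Unlike the call quoted
above, it writes into a separate array c rather than into y, so that the
output never aliases an input.

```python
import timeit

import numpy as np

try:
    # Assumption: SciPy is installed; the dgemm timing is skipped otherwise.
    from scipy.linalg.blas import dgemm
except ImportError:
    dgemm = None


def time_products(n, repeat=5, number=20):
    """Return the best per-call time in seconds for several n x n products.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(0)
    x = rng.random((n, n))
    y = rng.random((n, n))
    out = np.empty((n, n))  # preallocated C-contiguous buffer for out=

    def best(func):
        # Take the minimum over repeats to reduce timing noise.
        return min(timeit.repeat(func, repeat=repeat, number=number)) / number

    times = {
        "dot (allocates)": best(lambda: np.dot(x, y)),
        "dot (out=)": best(lambda: np.dot(x, y, out=out)),
    }

    if dgemm is not None:
        # Fortran-ordered float64 arrays, so the wrapper has no reason to copy.
        xf = np.asfortranarray(x)
        yf = np.asfortranarray(y)
        c = np.zeros((n, n), order="F")
        times["dgemm (overwrite_c)"] = best(
            lambda: dgemm(1.0, xf, yf, 0.0, c, overwrite_c=1)
        )
    return times


if __name__ == "__main__":
    for n in (4, 16, 64, 256, 1024):
        for label, seconds in time_products(n).items():
            print(f"n={n:5d}  {label:22s} {seconds * 1e6:10.2f} us")
```

If the gap between the two at small n stays roughly constant while the
matrices grow, that would point to call overhead rather than to the
multiplication itself.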

I'm not all that familiar with the core numpy/scipy linear algebra
systems. It could be some cost associated with making a Fortran call,
if numpy.dot is using CBLAS and dgemm is from FBLAS, but I really have
no idea. You could try stepping through the code with a debugger to
find out. ;)

David