Hi, I would like to know how I can make a call to the BLAS function gemm from numpy. I need a multiply-and-accumulate for matrices, and I don't want to allocate a new matrix each time I do it. Thanks for your time, Frédéric Bastien
On Fri, Jan 9, 2009 at 08:25, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I would like to know how I can make a call to the BLAS function gemm from numpy. I need a multiply-and-accumulate for matrices, and I don't want to allocate a new matrix each time I do it.
You can't in numpy. With scipy.linalg.fblas.dgemm() and the right arguments, you can.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
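[For readers finding this thread later: the scipy.linalg.fblas module referenced above no longer exists in current SciPy; the same routine is exposed as scipy.linalg.blas.dgemm. A minimal sketch of the in-place call on a modern NumPy/SciPy install might look like this -- the array names are just for illustration:]

```python
import numpy as np
from scipy.linalg.blas import dgemm  # was scipy.linalg.fblas.dgemm in 2009-era SciPy

# c must be Fortran-ordered (and the right dtype) for the update to happen in place
a = np.asfortranarray(np.random.randn(3, 3))
b = np.asfortranarray(np.random.randn(3, 4))
c = np.asfortranarray(np.random.randn(3, 4))

expected = a @ b + 2.0 * c  # what the BLAS call should compute, using c's original values

# c <- alpha*a*b + beta*c; overwrite_c=1 lets BLAS write into c's existing buffer
out = dgemm(alpha=1.0, a=a, b=b, beta=2.0, c=c, overwrite_c=1)

assert np.shares_memory(out, c)   # no new matrix was allocated
assert np.allclose(out, expected)
```

[If c were C-ordered or the wrong dtype, the wrapper would silently copy it and the "in place" property would be lost, which is exactly the caveat David raises below.]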
Thanks, and sorry for the delayed reply.

Fred

On Fri, Jan 9, 2009 at 4:31 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Fri, Jan 9, 2009 at 08:25, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I would like to know how I can make a call to the BLAS function gemm from numpy. I need a multiply-and-accumulate for matrices, and I don't want to allocate a new matrix each time I do it.
You can't in numpy. With scipy.linalg.fblas.dgemm() and the right arguments, you can.
 Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."  Umberto Eco _______________________________________________ Numpydiscussion mailing list Numpydiscussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpydiscussion
You are right, someone here told me about this. I should have posted it here for reference.

thanks

Fred

On Fri, Apr 24, 2009 at 1:45 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
On 9-Jan-09, at 4:31 PM, Robert Kern wrote:
You can't in numpy. With scipy.linalg.fblas.dgemm() and the right arguments, you can.
Make sure your output array is Fortran-ordered, however; otherwise copies will be made.
David
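[To make David's point concrete, here is a quick way to check the ordering before calling dgemm. This snippet is an illustration added for reference, not part of the original thread:]

```python
import numpy as np

c = np.zeros((3, 5))             # NumPy default: C-ordered (row-major)
print(c.flags['F_CONTIGUOUS'])   # False -- dgemm would silently copy this array

cf = np.asfortranarray(c)        # convert to column-major once, up front
print(cf.flags['F_CONTIGUOUS'])  # True -- safe to pass with overwrite_c=1
```

[Doing the np.asfortranarray conversion once, outside any loop, keeps the per-call path copy-free.]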
Can anyone explain the results below? It seems that for small matrices dot(x,y) is outperforming dgemm(1,x,y,0,y,overwrite_c=1), but for larger matrices the latter is winning. In principle it seems like I ought to be able to always do better in-place rather than making copies? From looking at the RAM used by python.exe, it does seem that dot(x,y) is allocating lots of memory and making copies, and dgemm is not.

My system is a Windows PC with a 1.86 GHz Pentium M processor, family 6 model 13 (which I think supports SSE2 but not SSE3), Python 2.5, scipy 0.7.0.dev5410 and numpy 1.2.1.

In [71]: x=array(randn(3,3), order='F')
In [73]: n=1000
In [74]: y=array(randn(3,n), order='F')
In [75]: y0=copy(y)
In [76]: %timeit y[:]=y0[:]; dot(x,y)
10000 loops, best of 3: 48.5 µs per loop
In [77]: %timeit y[:]=y0[:]; dgemm(1,x,y,0,y,overwrite_c=1)
10000 loops, best of 3: 61.6 µs per loop
In [79]: n=100000
In [80]: y=array(randn(3,n), order='F')
In [81]: y0=copy(y)
In [82]: %timeit y[:]=y0[:]; dot(x,y)
10 loops, best of 3: 22.9 ms per loop
In [83]: %timeit y[:]=y0[:]; dgemm(1,x,y,0,y,overwrite_c=1)
100 loops, best of 3: 8.37 ms per loop

Dan

Frédéric Bastien wrote:
You are right, someone here told me about this. I should have posted it here for reference.
thanks
Fred
On Fri, Apr 24, 2009 at 1:45 PM, David Warde-Farley <dwf@cs.toronto.edu> wrote:
On 9-Jan-09, at 4:31 PM, Robert Kern wrote:
> You can't in numpy. With scipy.linalg.fblas.dgemm() and the right
> arguments, you can.
Make sure your output array is Fortran-ordered, however; otherwise copies will be made.
David

On 28-Apr-09, at 10:56 AM, Dan Goodman wrote:
Can anyone explain the results below? It seems that for small matrices dot(x,y) is outperforming dgemm(1,x,y,0,y,overwrite_c=1), but for larger matrices the latter is winning. In principle it seems like I ought to be able to always do better in-place rather than making copies?
It sounds as though there is some overhead associated with calling dgemm directly that exceeds the time cost of copying for small matrices, but is recouped on large matrices by the time saved in not allocating an output array. I'm not all that familiar with the core numpy/scipy linear algebra internals; it could be some cost associated with making a Fortran call if numpy.dot is using CBLAS and dgemm comes from FBLAS, but I really have no idea. You could try stepping through the code with a debugger and trying to find out. ;)

David
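[For reference, a self-contained version of Dan's comparison, written against the modern scipy.linalg.blas path, might look like the sketch below. Note it uses a separate preallocated output array rather than aliasing y as both input and output the way the original dgemm(1,x,y,0,y,...) call does, since aliasing b and c is not strictly legal BLAS. The bench helper and its repeat count are illustrative, and absolute timings will depend entirely on the BLAS build:]

```python
import timeit
import numpy as np
from scipy.linalg.blas import dgemm  # was scipy.linalg.fblas.dgemm in 2009-era SciPy

def bench(n, repeats=100):
    """Time np.dot against in-place dgemm for a (3,3) @ (3,n) product.

    Sketch only: whether dgemm wins for large n, as in the thread,
    depends on the BLAS library NumPy and SciPy are linked against.
    """
    x = np.asfortranarray(np.random.randn(3, 3))
    y = np.asfortranarray(np.random.randn(3, n))
    out = np.empty((3, n), order='F')  # preallocated Fortran-ordered output

    t_dot = timeit.timeit(lambda: np.dot(x, y), number=repeats)
    t_gemm = timeit.timeit(
        lambda: dgemm(1.0, x, y, 0.0, out, overwrite_c=1), number=repeats)
    return t_dot, t_gemm

# sanity check: both paths compute the same product
x = np.asfortranarray(np.random.randn(3, 3))
y = np.asfortranarray(np.random.randn(3, 4))
out = np.empty((3, 4), order='F')
res = dgemm(1.0, x, y, 0.0, out, overwrite_c=1)
assert np.allclose(res, np.dot(x, y))
```

[With beta=0 the initial contents of out are never read, so np.empty is safe there; only its shape, dtype, and Fortran ordering matter.]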
participants (4)

Dan Goodman

David Warde-Farley

Frédéric Bastien

Robert Kern