Hi, I have a quick question: is there a reason why scipy doesn't use the 'gemm' Lapack function to perform matrix multiplication? It is already wrapped and is in my experience much faster. Here is a small python function if you want to test it (warning: this function doesn't control its arguments). def matmult(a,b, alpha=1.0, beta=0.0, c=None, trans_a=0, trans_b=0): """Return alpha*(a*b) + beta*c. a,b,c : matrices alpha, beta: scalars trans_a : 0 (a not transposed), 1 (a transposed), 2 (a conjugate transposed) trans_b : 0 (b not transposed), 1 (b transposed), 2 (b conjugate transposed) """ if c: gemm,=scipy.linalg.get_blas_funcs(('gemm',),(a,b,c)) else: gemm,=scipy.linalg.get_blas_funcs(('gemm',),(a,b)) return gemm(alpha, a, b, beta, c, trans_a, trans_b) Thank you for any feedback, Pietro.
On Mon, 26 May 2003, Pietro Berkes wrote:
Hi,
I have a quick question: is there a reason why scipy doesn't use the 'gemm' Lapack function to perform matrix multiplication? It is already wrapped and is in my experience much faster.
Quick answer: you just used gemm from scipy;-) I guess a better [*] documentation is needed on how to use blas/lapack functions provided by scipy. [*] any is better than none. Pearu
This seems like a reasonable alias to make, though, to the Numeric routine. With the following code, I get a marked speed up: C:\temp>python matmult_tst.py std matmult: 1.28100001812 gemm matmult: 0.171999931335 Other opinions? Eric ###################### import time from scipy import * N = 500 a = stats.random((N,N)) b = stats.random((N,N)) t1 = time.time() matrixmultiply(a,b) t2 = time.time() print 'std matmult:', t2 - t1 def matrixmultiply(a,b, alpha=1.0, beta=0.0, c=None, trans_a=0, trans_b=0): """ Return alpha*(a*b) + beta*c. a,b,c : matrices alpha, beta: scalars trans_a : 0 (a not transposed), 1 (a transposed), 2 (a conjugate transposed) trans_b : 0 (b not transposed), 1 (b transposed), 2 (b conjugate transposed) """ if c: gemm,=linalg.get_blas_funcs(('gemm',),(a,b,c)) else: gemm,=linalg.get_blas_funcs(('gemm',),(a,b)) return gemm(alpha, a, b, beta, c, trans_a, trans_b) t1 = time.time() matrixmultiply(a,b) t2 = time.time() print 'gemm matmult:', t2 - t1 ################## ---------------------------------------------- eric jones 515 Congress Ave www.enthought.com Suite 1614 512 536-1057 Austin, Tx 78701
-----Original Message----- From: scipy-user-admin@scipy.net [mailto:scipy-user-admin@scipy.net] On Behalf Of Pearu Peterson Sent: Monday, May 26, 2003 3:26 AM To: scipy-user@scipy.net Subject: Re: [SciPy-user] matrix multiplication
On Mon, 26 May 2003, Pietro Berkes wrote:
Hi,
I have a quick question: is there a reason why scipy doesn't use the 'gemm' Lapack function to perform matrix multiplication? It is already wrapped and is in my experience much faster.
Quick answer: you just used gemm from scipy;-)
I guess a better [*] documentation is needed on how to use blas/lapack functions provided by scipy.
[*] any is better than none.
Pearu
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.net http://www.scipy.net/mailman/listinfo/scipy-user
On Mon, 26 May 2003, eric jones wrote:
This seems like a reasonable alias to make, though, to the Numeric routine. With the following code, I get a marked speed up:
C:\temp>python matmult_tst.py std matmult: 1.28100001812 gemm matmult: 0.171999931335
Other opinions?
I get speed up factor around 5. Before making an alias to Numeric.dot, we should test how gemm matmult measures on non-contiguous arrays as linalg.gemm will make a contiguous copy of its arguments while Numeric.dot just runs the multiplication loop. Pearu
I did some benchmarks using the attached code (which is actually a cut-and-paste from linalg/test_basic.py and scipy_test/testing.py). The test function multiplies contiguous/non-contiguous arrays of type 'f'/'d'. Here is the result: Multiplying matrices of type f ================================== | contiguous | non-contiguous ---------------------------------------------- size | scipy | matmult | scipy | matmult 20 | 0.11 | 0.20 | 0.11 | 0.20 (secs for 2000 calls) 50 | 1.00 | 0.57 | 1.00 | 0.57 (secs for 2000 calls) 75 | 3.11 | 1.40 | 3.12 | 1.40 (secs for 2000 calls) 100 | 3.74 | 0.81 | 3.74 | 0.83 (secs for 1000 calls) 500 | 5.04 | 0.25 | 5.04 | 0.25 (secs for 4 calls) 1000 | 27.67 | 0.92 | 27.75 | 0.92 (secs for 2 calls) Multiplying matrices of type d ================================== | contiguous | non-contiguous ---------------------------------------------- size | scipy | matmult | scipy | matmult 20 | 0.11 | 0.22 | 0.10 | 0.22 (secs for 2000 calls) 50 | 1.01 | 0.61 | 1.02 | 0.61 (secs for 2000 calls) 75 | 3.43 | 1.78 | 3.42 | 1.78 (secs for 2000 calls) 100 | 3.78 | 1.52 | 3.80 | 1.56 (secs for 1000 calls) 500 | 7.96 | 0.50 | 7.95 | 0.50 (secs for 4 calls) 1000 | 35.71 | 1.82 | 35.65 | 1.82 (secs for 2 calls) Calling the lapack routine is always is always faster but for small (20x20) arrays. For large matrices it can get up to 30 time faster on my machine (Pentium 4, 1.8 GHz, 1.5Gb memory)! Pietro. On Monday 26 May 2003 22:56, Pearu Peterson wrote:
On Mon, 26 May 2003, eric jones wrote:
This seems like a reasonable alias to make, though, to the Numeric routine. With the following code, I get a marked speed up:
C:\temp>python matmult_tst.py std matmult: 1.28100001812 gemm matmult: 0.171999931335
Other opinions?
I get speed up factor around 5. Before making an alias to Numeric.dot, we should test how gemm matmult measures on non-contiguous arrays as linalg.gemm will make a contiguous copy of its arguments while Numeric.dot just runs the multiplication loop.
Pearu
_______________________________________________ SciPy-user mailing list SciPy-user@scipy.net http://www.scipy.net/mailman/listinfo/scipy-user
participants (3)
-
eric jones -
Pearu Peterson -
Pietro Berkes