matrixmultiply implementation

I'm curious to see if anyone has looked at the implementation of dot/matrixmultiply recently. I was doing a benchmark lapack-based matrix inversion and was surprised to find that the accuracy check was taking longer than the inversion. The bottleneck was the matrix multiply. It appears that a blas dgemm implementation might give a large improvement. Here is a sample benchmark using PyLapack's dblas.dgemm for comparison. Results for a 1000 x 1000 matrix inversion: time elapsed for inverse (sec) 10.050768 time elapsed for matrixmultiply (sec) 24.142714 time elapsed for dgemm-matrixmultiply (sec) 5.997188 max error is: 3.30398486348e-10 Given the "dot" code doesn't look too bad, so I might be able to add dgemm support myself, at least for some cases. -chris p.s. I'll include the test code below ##################################### import time from Numeric import * from LinearAlgebra import * import dblas for N in [5,100, 1000]: a = reshape(arange(float(N*N)), (N,N)) + identity(N) # print "typecode of array is: ", a.typecode() start = time.time() inv_a = inverse(a) stop = time.time() # print inv_a startmult = time.time() b = matrixmultiply(a, inv_a) stopmult = time.time() residual = b - identity(N) #DGEMM(TRANSA, TRANSB, M, N, K, ALPHA,A, LDA, B, LDB, BETA,C,LDC) C = zeros(a.shape, a.typecode()) alpha = array([1.0]) beta = array([1.0]) startdgemm = time.time() dblas.dgemm('N','N', N,N,N, alpha, a,N, # A, LDA inv_a, N, # B, LDB beta, # BETA C, N) stopdgemm = time.time() # print C print "%d x %d matrix inversion" % (N,N) print "max error is:", print maximum.reduce(fabs(ravel(residual))) print "time elapsed for inverse (sec) %f" % (stop-start) print "time elapsed for matrixmultiply (sec) %f" % (stopmult-startmult) print "time elapsed for dgemm-matrixmultiply (sec) %f" % (stopdgemm-startdgemm) print

Not to put a damper on anything, but it would be a big change to REQUIRE dgemm in the Numeric core. The issues we shunted aside to the subpackage would return. -----Original Message----- From: numpy-discussion-admin@lists.sourceforge.net [mailto:numpy-discussion-admin@lists.sourceforge.net]On Behalf Of Christopher Lee Sent: Friday, December 01, 2000 4:43 PM To: numpy-discussion@lists.sourceforge.net Subject: [Numpy-discussion] matrixmultiply implementation I'm curious to see if anyone has looked at the implementation of dot/matrixmultiply recently. I was doing a benchmark lapack-based matrix inversion and was surprised to find that the accuracy check was taking longer than the inversion. The bottleneck was the matrix multiply. It appears that a blas dgemm implementation might give a large improvement. Here is a sample benchmark using PyLapack's dblas.dgemm for comparison. Results for a 1000 x 1000 matrix inversion: time elapsed for inverse (sec) 10.050768 time elapsed for matrixmultiply (sec) 24.142714 time elapsed for dgemm-matrixmultiply (sec) 5.997188 max error is: 3.30398486348e-10 Given the "dot" code doesn't look too bad, so I might be able to add dgemm support myself, at least for some cases. -chris p.s. I'll include the test code below ##################################### import time from Numeric import * from LinearAlgebra import * import dblas for N in [5,100, 1000]: a = reshape(arange(float(N*N)), (N,N)) + identity(N) # print "typecode of array is: ", a.typecode() start = time.time() inv_a = inverse(a) stop = time.time() # print inv_a startmult = time.time() b = matrixmultiply(a, inv_a) stopmult = time.time() residual = b - identity(N) #DGEMM(TRANSA, TRANSB, M, N, K, ALPHA,A, LDA, B, LDB, BETA,C,LDC) C = zeros(a.shape, a.typecode()) alpha = array([1.0]) beta = array([1.0]) startdgemm = time.time() dblas.dgemm('N','N', N,N,N, alpha, a,N, # A, LDA inv_a, N, # B, LDB beta, # BETA C, N) stopdgemm = time.time() # print C print "%d x %d matrix inversion" % (N,N) print "max error is:", print maximum.reduce(fabs(ravel(residual))) print "time elapsed for inverse (sec) %f" % (stop-start) print "time elapsed for matrixmultiply (sec) %f" % (stopmult-startmult) print "time elapsed for dgemm-matrixmultiply (sec) %f" % (stopdgemm-startdgemm) print _______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net http://lists.sourceforge.net/mailman/listinfo/numpy-discussion

Not to put a damper on anything, but it would be a big change to REQUIRE dgemm in the Numeric core. The issues we shunted aside to the subpackage would return. -----Original Message----- From: numpy-discussion-admin@lists.sourceforge.net [mailto:numpy-discussion-admin@lists.sourceforge.net]On Behalf Of Christopher Lee Sent: Friday, December 01, 2000 4:43 PM To: numpy-discussion@lists.sourceforge.net Subject: [Numpy-discussion] matrixmultiply implementation I'm curious to see if anyone has looked at the implementation of dot/matrixmultiply recently. I was doing a benchmark lapack-based matrix inversion and was surprised to find that the accuracy check was taking longer than the inversion. The bottleneck was the matrix multiply. It appears that a blas dgemm implementation might give a large improvement. Here is a sample benchmark using PyLapack's dblas.dgemm for comparison. Results for a 1000 x 1000 matrix inversion: time elapsed for inverse (sec) 10.050768 time elapsed for matrixmultiply (sec) 24.142714 time elapsed for dgemm-matrixmultiply (sec) 5.997188 max error is: 3.30398486348e-10 Given the "dot" code doesn't look too bad, so I might be able to add dgemm support myself, at least for some cases. -chris p.s. I'll include the test code below ##################################### import time from Numeric import * from LinearAlgebra import * import dblas for N in [5,100, 1000]: a = reshape(arange(float(N*N)), (N,N)) + identity(N) # print "typecode of array is: ", a.typecode() start = time.time() inv_a = inverse(a) stop = time.time() # print inv_a startmult = time.time() b = matrixmultiply(a, inv_a) stopmult = time.time() residual = b - identity(N) #DGEMM(TRANSA, TRANSB, M, N, K, ALPHA,A, LDA, B, LDB, BETA,C,LDC) C = zeros(a.shape, a.typecode()) alpha = array([1.0]) beta = array([1.0]) startdgemm = time.time() dblas.dgemm('N','N', N,N,N, alpha, a,N, # A, LDA inv_a, N, # B, LDB beta, # BETA C, N) stopdgemm = time.time() # print C print "%d x %d matrix inversion" % (N,N) print "max error is:", print maximum.reduce(fabs(ravel(residual))) print "time elapsed for inverse (sec) %f" % (stop-start) print "time elapsed for matrixmultiply (sec) %f" % (stopmult-startmult) print "time elapsed for dgemm-matrixmultiply (sec) %f" % (stopdgemm-startdgemm) print _______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net http://lists.sourceforge.net/mailman/listinfo/numpy-discussion
participants (2)
-
Christopher Lee
-
Paul F. Dubois