[Numpy-discussion] Changes to improve performance on small matrices

Gary Bishop gb at cs.unc.edu
Wed Oct 3 12:08:08 EDT 2001

We are porting some code from Matlab that does many thousands of 
operations on small matrices. We have found that some small changes to 
LinearAlgebra.py, Numeric.py, and multiarraymodule.c make our code run 
approximately twice as fast as with the unmodified Numeric-20.2 code.

Our changes specifically replace LinearAlgebra._castCopyAndTranspose 
with a version we call _fastCopyAndTranspose that does most of the work 
in C instead of calling Numeric.transpose, which calls arange, which is 
quite expensive. We also optimized Numeric.dot to call a new function 
multiarray.matrixproduct which does the axis swap on the fly instead of 
calling swapaxes (which calls arange) and then calling innerproduct 
(which, as its very first step, copies the transposed matrix to make it 
contiguous).
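
To illustrate the difference between the two code paths, here is a minimal 
sketch for the 2-D case. It uses modern NumPy as a stand-in for Numeric-20.2, 
and the function names dot_via_transposed_copy and dot_on_the_fly are 
hypothetical; the actual change is C code in multiarraymodule.c and is not 
reproduced here.

```python
import numpy as np


def dot_via_transposed_copy(a, b):
    """Mimic the old path: swap axes, copy to contiguous, then inner product."""
    # swapaxes only produces a strided view; forcing contiguity is the extra
    # copy that the old path paid for on every call.
    bt = np.ascontiguousarray(np.swapaxes(b, -1, -2))
    # inner(a, bt)[i, j] = sum_p a[i, p] * bt[j, p] = sum_p a[i, p] * b[p, j]
    return np.inner(a, bt)


def dot_on_the_fly(a, b):
    """Mimic the new path: index b column-wise directly, with no transposed copy."""
    n, k = a.shape
    k2, m = b.shape
    if k != k2:
        raise ValueError("inner dimensions do not match")
    out = np.zeros((n, m), dtype=np.result_type(a, b))
    for i in range(n):
        for j in range(m):
            acc = 0
            for p in range(k):
                # b[p, j] walks down column j of b in place.
                acc += a[i, p] * b[p, j]
            out[i, j] = acc
    return out
```

Both functions compute the same product; the second simply avoids building 
and copying the transposed operand. In C the explicit loops are cheap, which 
is presumably why this pays off for very small matrices.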

Formerly our code spent much of its time in arange (which we never 
explicitly call), dot, and _castCopyAndTranspose. The changes described 
above eliminate this overhead.
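
For anyone who wants to look for this kind of hidden overhead in their own 
code, the following is a rough sketch of the sort of measurement involved. It 
uses the standard cProfile and pstats modules from current Python, and the 
functions helper, small_op, and workload are hypothetical stand-ins (helper 
plays the role of a routine such as arange that is never called explicitly).

```python
import cProfile
import io
import pstats


def helper(n):
    # Stand-in for a utility that is only called indirectly.
    return list(range(n))


def small_op(n):
    # Stand-in for a cheap operation on a small matrix.
    return sum(helper(n))


def workload(repeats=10000):
    # Many thousands of small operations, as in the ported code.
    total = 0
    for _ in range(repeats):
        total += small_op(3)
    return total


def profile_report(func, top=10):
    """Run func under the profiler and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


report = profile_report(workload)
```

Printing report lists the functions by cumulative time, so an indirectly 
called routine like helper shows up even though workload never names it.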

I'm writing to ask if these changes might make a worthy patch to NumPy? 
We have tested them on Windows 2000 (both natively and under Cygwin) 
and on Linux. Soon, we'll have a test on Mac OS X.1.

If anyone is interested, I will figure out how to generate a patch file 
to submit.

