On Sat, 2004-10-16 at 07:27, Francesc Alted wrote:
On Friday 15 October 2004 19:03, Francesc Alted wrote:
>>> import timeit
>>> t1 = timeit.Timer("m3=numarray.dot(m1,m2)", "import numarray;dim1=500;m1=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32);m2=numarray.arange(dim1*dim1,shape=(dim1,dim1), type=numarray.Float32)")
>>> t1.repeat(3,10)
[3.7274820804595947, 3.8542821407318115, 3.7117569446563721]
However, Numeric seems to get it:
>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.0093162059783935547, 0.0096318721771240234, 0.0092968940734863281]
i.e. almost 300 times faster than numarray
Ooops! The Numeric test had a bug in it: the result of Numeric.reshape() was never assigned back, so dot() ran on 1-D arrays. The correct test would be:
>>> t3 = timeit.Timer("m3=Numeric.dot(m1,m2)", "import Numeric;dim1=500;m1=Numeric.arange(dim1*dim1, typecode='f');m1=Numeric.reshape(m1, (dim1,dim1));m2=Numeric.arange(dim1*dim1,typecode='f');m2=Numeric.reshape(m2,(dim1,dim1))")
>>> t3.repeat(3,10)
[0.47363090515136719, 0.47403502464294434, 0.47770595550537109]
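The difference between the buggy and the corrected test can be reproduced without Numeric installed. The following is only a sketch using the standard library: `reshape` and `dot` are hypothetical pure-Python stand-ins written for this illustration (they are not Numeric's implementation), and `dim` is kept small so that it runs quickly. It shows that discarding the value returned by a reshape leaves the operands 1-D, so the timed statement is a cheap inner product rather than a matrix product.

```python
import timeit

def reshape(flat, shape):
    """Hypothetical stand-in: return a NEW nested list; `flat` is not modified."""
    rows, cols = shape
    if rows * cols != len(flat):
        raise ValueError("shape does not match number of elements")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def dot(a, b):
    """Hypothetical stand-in: inner product for 1-D input, matrix product for 2-D."""
    if not isinstance(a[0], list):
        return sum(x * y for x, y in zip(a, b))
    cols_b = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

dim = 40  # much smaller than the 500 used in the thread
m1 = [float(i) for i in range(dim * dim)]
m2 = [float(i) for i in range(dim * dim)]

reshape(m1, (dim, dim))        # bug: result discarded, m1 is still 1-D
buggy = dot(m1, m2)            # one float, O(dim**2) work

a = reshape(m1, (dim, dim))    # fix: keep the returned value
b = reshape(m2, (dim, dim))
fixed = dot(a, b)              # dim x dim result, O(dim**3) work

# repeat=3, number=5 mirrors the shape of t.repeat(3, 10): three totals, each
# covering `number` executions; the minimum is the least noisy of the three.
t_buggy = min(timeit.repeat(lambda: dot(m1, m2), repeat=3, number=5))
t_fixed = min(timeit.repeat(lambda: dot(a, b), repeat=3, number=5))

print(type(buggy).__name__, len(fixed), len(fixed[0]))
print("buggy: %.6f s, fixed: %.6f s" % (t_buggy, t_fixed))
```

Each number returned by `repeat` is the total time for all executions in that run, so the figures quoted in this thread are totals for 10 calls to dot(), not per-call times.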
which is 8 times faster, more or less, than numarray (or Numeric) without ATLAS.
Just to clarify things ;)
Hi Francesc,

I don't think numarray dot() will pick up any boost at all from ATLAS because it's not written to do it. Besides that, there are two performance problems I know of with numarray's dot() which may dominate or dilute any ATLAS benefits:

1. dot() requires array creation.
2. dot() requires array copies.

Because it has a class hierarchy and a memory buffer object, numarray is at a disadvantage for (1). (2) just hasn't been optimized yet for noncontiguous arrays which (I think) are always present when dot() starts with two contiguous array parameters.

Regards,
Todd
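To illustrate what a copy of a noncontiguous array means in point (2), here is a small sketch. It uses NumPy purely as a stand-in (an assumption, since numarray may not be installed) and is not numarray's dot() code. It shows that transposing a contiguous array yields a noncontiguous view over the same memory, and that turning this view into a contiguous array requires a copy.

```python
import numpy as np

# A contiguous 2-D float32 array, analogous to the operands in the benchmark.
b = np.arange(6, dtype=np.float32).reshape(2, 3)

# Transposing only changes the strides: no data is moved, so the result is a
# noncontiguous view over the same buffer.
bt = b.T
view_is_contiguous = bool(bt.flags["C_CONTIGUOUS"])
view_shares_memory = bool(np.shares_memory(b, bt))

# Making the transposed data contiguous forces an element-by-element copy
# into a freshly allocated buffer.
bt_copy = np.ascontiguousarray(bt)
copy_is_contiguous = bool(bt_copy.flags["C_CONTIGUOUS"])
copy_shares_memory = bool(np.shares_memory(b, bt_copy))

print(view_is_contiguous, view_shares_memory)
print(copy_is_contiguous, copy_shares_memory)
```

If a dot() implementation works on a transposed operand internally, a copy of this kind is made on every call, which is the sort of overhead described above.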