I finally got it to work, but the Numpy-ized version runs slower than the plain Python one. I think that I can transpose the NodeCord matrix once in the program and feed that in, rather than the scratch matrix that is generated here. Evidently transpose is not very fast even for smal matrices. Here is my test program: from Numeric import * Qpnt=array(([20,21,22],[23,24,25],[26,27,28])) NodeCord=array(([1,2,3],[4,5,6],[7,8,9])) TrngleNode=array((1,2,0)) #the original routine def ComputeGaussQuadPoint(QuadPoint,TrngleNode,Qpnt,NodeCord): SrcPointCol=zeros((3)) SrcPointCol[0] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],0]\ + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],0]\ + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],0] SrcPointCol[1] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],1]\ + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],1]\ + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],1] SrcPointCol[2] = Qpnt[QuadPoint,0]*NodeCord[TrngleNode[0],2]\ + Qpnt[QuadPoint,1]*NodeCord[TrngleNode[1],2]\ + Qpnt[QuadPoint,2]*NodeCord[TrngleNode[2],2] return SrcPointCol #the yet-to-be-faster routine def Quad(QuadPoint, TrngleNode, Qpnt,NodeCord): s = Qpnt[QuadPoint,:] c= take(NodeCord, TrngleNode) SrcPointCol= add.reduce(s * transpose(c),1) return SrcPointCol QuadPoint=1 print "The Correct:" print ComputeGaussQuadPoint(QuadPoint,TrngleNode,Qpnt,NodeCord) print "The New" print Quad(QuadPoint,TrngleNode,Qpnt,NodeCord) -- The Numeric Python EM Project www.members.home.net/europax