Charles R Harris wrote:
On Thu, Jul 10, 2008 at 10:38 AM, Dan Lussier
mailto:dtlussier@gmail.com> wrote: Hello,
I am relatively new to numpy and am having trouble with the speed of a specific array based calculation that I'm trying to do.
What I'm trying to do is to calculate the total total potential energy and coordination number of each atom within a relatively large simulation. Each atom is at a position (x,y,z) given by a row in a large array (approximately 1e6 by 3) and presently I have no information about its nearest neighbours so each its position must be checked against all others before cutting the list down prior to calculating the energy.
This looks to be O(n^2) and might well be the bottle neck. There are various ways to speed up such things but more information would help determine the method, i.e., is this operation within a loop so that the values change a lot. However, one quick thing to try is a sort on one of the coordinates so you only need to check a subset of the vectors. Searchsorted could be useful here also.
Chuck
------------------------------------------------------------------------
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
If I understand correctly, I notice that you are doing many scalar multiplications where some are the same or constants. For example, you compute r2*r2 a few times: numpy.power(r2,2)= r2*r2 numpy.power(r2,3)= r2*r2*r2 numpy.power(r2,6)= r2*r2*r2*r2*r2*r2 But you don't need this numpy.power(r2,6) as it can be factored. Also the division is unnecessary. So the liTotal is really: ljTotal[i] = 2.0*((numpy.power(r2,-3)*(numpy.power(r2,-3)-1)).sum(axis=0)) However, the real problem is getting the coordinates that I do not follow and limits how you can remove the loop over xyz. Like why the need for the repeated (r0-xyz)? This is a huge cost that you need to avoid perhaps by just extracting those elements within your criteria. Bruce