[Numpy-discussion] Rookie problems - Why is C-code much faster?
Zachary Pincus
zpincus at stanford.edu
Tue Feb 21 09:19:02 EST 2006
Mads,
The game with numpy, just as it is with Matlab or any other
interpreted numeric environment, is to try push as much of the
looping down into the C code as you can. This is because, as you now
know, compiled C can loop much faster than interpreted python.
A simple example for averaging 1000 (x,y,z) points:
print data.shape
(1000, 3)
# bad: explicit for loop in python
avg = numpy.zeros(3, numpy.float_)
for i in data: avg += i
avg /= 1000.0
# good: implicit for loop in C
avg = numpy.add.reduce(data, axis = 0)
avg /= 1000.0
In your case, instead of explicitly looping through each point, why
not do the calculations in parallel, operating on entire vectors of
points at one time? Then the looping is "pushed down" into compiled C
code. Or if you're really lucky, it's pushed all the way down to the
vector math units on your cpu if you have a good BLAS or whatever
installed.
Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine
> 2. Here is the main loop for finding all possible pair distances,
> which corresponds to a loop over the upper triangular part of a
> square matrix
>
> # Loop over all particles
> for i in range(n-1):
> dx = x[i+1:] - x[i]
> dy = y[i+1:] - y[i]
>
> dx -= box*rint(dx/box)
> dy -= box*rint(dy/box)
>
> r2 = dx**2 + dy**2 # square of dist. between points
>
> where x and y contain the positions of the particles. A naive
> implementation in C is
>
>
> // loop over all particles
> for (int i=0; i<n-1; i++) {
> for (int j=i+1; j<n; j++) {
> dx = x[j] - x[i];
> dy = y[j] - y[i];
>
> dx -= box*rint(dx/box);
> dy -= box*rint(dy/box);
>
> r2 = dx*dx + dy*dy;
> }
> }
>
> For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is
> app. 10 times faster than the Python/Numeric counterpart. This is of
> course not satisfactory.
>
> Are there any things I am doing completely wrong here, basic
> approaches completely misunderstood, misuses etc?
>
> Any suggestions, guidelines, hints are most welcome.
>
> Best regards,
>
> Mads Ipsen
>
>
> +---------------------------------+-------------------------+
> | Mads Ipsen | |
> | Dept. of Chemistry | phone: +45-35320220 |
> | H.C.Ørsted Institute | fax: +45-35320322 |
> | Universitetsparken 5 | |
> | DK-2100 Copenhagen Ø, Denmark | mpi at osc.kiku.dk |
> +---------------------------------+-------------------------+
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through
> log files
> for problems? Stop! Download the new AJAX search engine that makes
> searching your log files as easy as surfing the web. DOWNLOAD
> SPLUNK!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid3432&bid#0486&dat1642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
More information about the NumPy-Discussion
mailing list