[Numpy-discussion] Distance Matrix speed
Tim Hochberg
tim.hochberg at cox.net
Fri Jun 16 13:48:49 EDT 2006
Christopher Barker wrote:
>Bruce Southey wrote:
>
>
>>Please run the exact same code in Matlab that you are running in
>>NumPy. Many of Matlab's functions are very highly optimized, so these are
>>provided as binary functions. I think that you are running into this,
>>so you are not doing the correct comparison.
>>
>>
>
>He is doing the correct comparison: if Matlab has some built-in compiled
>utility functions that numpy doesn't -- it really is faster!
>
>It looks like others' suggestions show that well-written numpy code is
>plenty fast, however.
>
>One more suggestion I don't think I've seen: numpy provides a built-in
>compiled utility function: hypot()
>
>
> >>> x = N.arange(5)
> >>> y = N.arange(5)
> >>> N.hypot(x,y)
>array([ 0. , 1.41421356, 2.82842712, 4.24264069, 5.65685425])
> >>> N.sqrt(x**2 + y**2)
>array([ 0. , 1.41421356, 2.82842712, 4.24264069, 5.65685425])
>
>Timings:
> >>> timeit.Timer('N.sqrt(x**2 + y**2)','import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
>0.49785208702087402
> >>> timeit.Timer('N.hypot(x,y)','import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
>0.081479072570800781
>
>A factor of 6 improvement.
>
>
Here's another thing to note: much of the time distance**2 works as well
as distance (for instance if you are looking for the nearest point). If
you're in that situation, computing the square of the distance is much
cheaper:
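from numpy import zeros

def d_2():
    # A and B are module-level arrays; shapes inferred from the indexing:
    # A holds 4 (x, y) points, B holds 10000 (x, y) points.
    d = zeros([4, 10000], dtype=float)
    for i in range(4):
        xy = A[i] - B                        # offsets from point A[i] to every point in B
        d[i] = xy[:,0]**2 + xy[:,1]**2       # squared distance, no sqrt
    return d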
This is something like 250 times as fast as the naive Python solution,
and five times faster than the fastest distance-computing version
that I could come up with (using hypot).
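For the nearest-point case mentioned above, here is a minimal sketch of how the squared distances might be used directly. It is not the code that was timed: the function name squared_distances, the random test data, and the use of broadcasting in place of the explicit loop are all assumptions made for illustration.

```python
import numpy as np

def squared_distances(A, B):
    """Squared Euclidean distances between rows of A (m, 2) and rows of B (n, 2).

    Hypothetical helper for illustration; returns an (m, n) array.
    """
    # Broadcasting: (m, 1, 2) - (1, n, 2) -> (m, n, 2) array of offsets.
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    # Sum the squared x and y offsets; no square root is taken.
    return (diff ** 2).sum(axis=-1)

# Made-up data with the same shapes as in the post: 4 points against 10000.
rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((10000, 2))

d2 = squared_distances(A, B)

# Index of the nearest point in B for each point in A. Because sqrt is
# monotonic, ranking by d2 should pick the same neighbours as ranking by
# the true distance, so the sqrt can be skipped here.
nearest = d2.argmin(axis=1)
```

Note that the broadcast version builds an (m, n, 2) temporary, so for very large inputs the row-by-row loop may be the more memory-friendly choice.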
-tim