[Numpy-discussion] Distance Matrix speed

Tim Hochberg tim.hochberg at cox.net
Fri Jun 16 13:48:49 EDT 2006


Christopher Barker wrote:

>Bruce Southey wrote:
>
>>Please run the exact same code in Matlab that you are running in
>>NumPy. Many of Matlab's functions are very highly optimized, so they are
>>provided as binary functions. I think that you are running into this,
>>so you are not doing the correct comparison.
>
>He is doing the correct comparison: if Matlab has some built-in compiled 
>utility functions that numpy doesn't -- it really is faster!
>
>It looks like others' suggestions show that well-written numpy code is 
>plenty fast, however.
>
>One more suggestion I don't think I've seen: numpy provides a built-in 
>compiled utility function: hypot()
>
> >>> x = N.arange(5)
> >>> y = N.arange(5)
> >>> N.hypot(x,y)
>array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])
> >>> N.sqrt(x**2 + y**2)
>array([ 0.        ,  1.41421356,  2.82842712,  4.24264069,  5.65685425])
>
>Timings:
> >>> timeit.Timer('N.sqrt(x**2 + y**2)','import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
>0.49785208702087402
> >>> timeit.Timer('N.hypot(x,y)','import numpy as N; x = N.arange(5000); y = N.arange(5000)').timeit(100)
>0.081479072570800781
>
>A factor of 6 improvement.
>
Here's another thing to note: much of the time distance**2 works just as 
well as distance (for instance, if you are looking for the nearest point, 
since taking the square root doesn't change the ordering). If you're in 
that situation, computing the square of the distance is much cheaper:

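    from numpy import zeros

    # A (4 points) and B (10000 points) are the (n, 2) coordinate arrays
    # from earlier in the thread; the shapes are inferred from the code.
    def d_2():
        d = zeros([4, 10000], dtype=float)
        for i in range(4):
            xy = A[i] - B  # offsets from point A[i] to every point in B
            d[i] = xy[:,0]**2 + xy[:,1]**2  # squared distance, no sqrt
        return d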

This is something like 250 times as fast as the naive Python solution, 
and about five times faster than the fastest distance-computing version 
that I could come up with (using hypot).
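
For the nearest-point case, here is a rough sketch of how the squared 
distances might be used directly. The random A and B below are made-up 
stand-ins (not the data from this thread), and squared_distances is just 
a name I picked for illustration; it uses broadcasting instead of the loop:

```python
import numpy as np

# Hypothetical stand-in data: 4 query points and 10000 candidates in 2-D.
rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((10000, 2))

def squared_distances(A, B):
    """Return d2 with d2[i, j] = squared distance from A[i] to B[j]."""
    # Broadcasting gives an array of offsets with shape (len(A), len(B), 2).
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return (diff ** 2).sum(axis=-1)

d2 = squared_distances(A, B)

# Index of the nearest point in B for each point in A; no sqrt needed,
# because sqrt is monotonic and so preserves the ordering.
nearest = d2.argmin(axis=1)

# If the actual distance is wanted, take the root of the 4 winners only.
nearest_dist = np.sqrt(d2[np.arange(len(A)), nearest])
```

The point of the last line is that the square root gets applied to 4 
values rather than 40000, which is where the savings come from.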

-tim





