[Numpy-discussion] distance matrix speed

Michael Sorich michael.sorich at gmail.com
Fri Jun 16 02:26:37 EDT 2006


Hi Sebastian,

I am not sure whether numpy already defines a function for this, but
something like this may be what you are after:

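from numpy import newaxis, sqrt, sum
def distance(a1, a2):
    # a1 is (m, k), a2 is (n, k); the result is the (m, n) matrix of row-to-row distances
    return sqrt(sum((a1[:,newaxis,:] - a2[newaxis,:,:])**2, axis=2))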

The general idea is to avoid explicit Python loops and let numpy's
broadcasting do the work over whole arrays if you want the code to
execute fast. I hope this helps.
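
To illustrate the point about avoiding loops, here is a minimal sketch
that checks the broadcast version against an explicit double loop on
arrays of the same shapes as in the original question. The names
distance_loop and distance_broadcast are only illustrative, not
functions that numpy provides, and the random inputs are placeholders.

```python
import numpy as np

def distance_loop(a1, a2):
    # Reference version: one Python-level iteration per pair of rows.
    d = np.zeros((a1.shape[0], a2.shape[0]))
    for i in range(a1.shape[0]):
        for j in range(a2.shape[0]):
            d[i, j] = np.sqrt(np.sum((a1[i] - a2[j]) ** 2))
    return d

def distance_broadcast(a1, a2):
    # a1[:, np.newaxis, :] has shape (m, 1, k) and a2[np.newaxis, :, :]
    # has shape (1, n, k), so the subtraction broadcasts to (m, n, k).
    diff = a1[:, np.newaxis, :] - a2[np.newaxis, :, :]
    # Summing the squared differences over the last axis leaves (m, n).
    return np.sqrt(np.sum(diff ** 2, axis=2))

A = np.random.random((4, 2))
B = np.random.random((1000, 2))

d_loop = distance_loop(A, B)
d_fast = distance_broadcast(A, B)

print(d_fast.shape)                 # (4, 1000)
print(np.allclose(d_loop, d_fast))  # True
```

One trade-off of this approach is that it builds an intermediate array
of shape (m, n, k), so memory use grows with the product of all three
sizes. For the 4 x 1000 x 2 case here that is negligible.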

Mike

On 6/16/06, Sebastian Beca <sebastian.beca at gmail.com> wrote:
> Hi,
> I'm working with NumPy/SciPy on some algorithms and I've run into some
> important speed differences with respect to Matlab 7. I've narrowed the
> main speed problem down to the operation of finding the Euclidean
> distance between two matrices that share one dimension (dist in Matlab):
>
> Python:
> from numpy import sqrt, sum, zeros
> from numpy.random import random
>
> def dtest():
>     A = random([4, 2])
>     B = random([1000, 2])
>
>     d = zeros([4, 1000], dtype='f')
>     for i in range(4):
>         for j in range(1000):
>             d[i, j] = sqrt(sum((A[i] - B[j])**2))
>     return d
>
> Matlab:
>     A = rand( [4,2])
>     B = rand( [1000,2])
>     d = dist(A, B')
>
> Running both of these 100 times, I've found the Python version to run
> between 10 and 20 times slower. Is there a faster way to do this?
> Perhaps I'm not using the correct functions/structures? Or is this as
> good as it gets?
>
> Thanks in advance,
>
> Sebastian Beca
> Department of Computer Science Engineering
> University of Chile
>
> PS: I'm using NumPy 0.9.8, SciPy 0.4.8. I also understand I have
> ATLAS, BLAS and LAPACK all installed, but I haven't confirmed that.
>
>



