[Numpy-discussion] Distance Matrix speed

Sebastian Beca sebastian.beca at gmail.com
Wed Jun 14 18:19:19 EDT 2006


Hi,
I'm working with NumPy/SciPy on some algorithms and I've run into some
significant speed differences with respect to Matlab 7. I've narrowed the
main speed problem down to the operation of finding the Euclidean distance
between the rows of two matrices that share one dimension (dist in Matlab):

Python:
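from numpy import zeros, sqrt, sum
from numpy.random import random

def dtest():
    A = random([4, 2])
    B = random([1000, 2])

    # d[i, j] holds the Euclidean distance between row i of A and row j of B
    d = zeros([4, 1000], dtype='f')
    for i in range(4):
        for j in range(1000):
            d[i, j] = sqrt(sum((A[i] - B[j])**2))
    return d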

Matlab:
    A = rand( [4,2])
    B = rand( [1000,2])
    d = dist(A, B')

Running both of these 100 times, I've found the Python version to run
10 to 20 times slower. Is there a faster way to do this? Perhaps I'm not
using the correct functions/structures? Or is this as good as it gets?
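
For illustration, the double loop in the Python version can be written
without Python-level loops by using broadcasting. The following is only a
sketch, not a benchmarked answer: it assumes a NumPy version that supports
broadcasting with newaxis, and the function name dist_broadcast is made up
for this example. It builds an intermediate array of shape (4, 1000, 2), so
memory use grows with the product of the two row counts.

```python
import numpy as np

def dist_broadcast(A, B):
    """Pairwise Euclidean distances between rows of A (m, k) and rows of B (n, k).

    Hypothetical helper for illustration; returns an array of shape (m, n).
    """
    # diff[i, j, :] = A[i] - B[j], so diff has shape (m, n, k)
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    # Sum the squared differences over the shared dimension, then take the root
    return np.sqrt((diff ** 2).sum(axis=-1))

A = np.random.random((4, 2))
B = np.random.random((1000, 2))
d = dist_broadcast(A, B)   # same layout as d in dtest(): d[i, j] for A[i], B[j]
```

Whether this closes the gap with Matlab would have to be measured on the
same machine with the same inputs.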

Thanks in advance,

Sebastian Beca
Department of Computer Science Engineering
University of Chile

PS: I'm using NumPy 0.9.8 and SciPy 0.4.8. I also believe I have
ATLAS, BLAS and LAPACK all installed, but I haven't confirmed that.



