[Numpy-discussion] faster way for calculating distance matrix...

Kasper Souren Kasper.Souren at ircam.fr
Wed May 21 12:56:05 EDT 2003


I was happy to see solutions coming up for Cliff's problems and it made me 
think: maybe I can mention my problem here as well.

From an array of feature vectors I want to calculate it's distance matrix. 
Something like [1]. Currently it can take quite a while to calculate the 
stuff for a long array.

Some questions:

1) Is there a smart speed up possible? Like, a way to avoid the double loop? 
It's no problem if this would lead to less generality (like the choice for a 
distance function). I know there is a little speed up to be gained by leaving 
out the array(f) thing, but that's not what I'm looking for.

2) Is it possible (in Numeric or numarray) to define a class DiagonalMatrix 
that at least saves half of the memory?

3) If 1) is not possible, what would be the way to go for speeding it up by 
writing it in C? weave because of its availability in scipy or would pyrex be 
more interesting, or are there even more options..?


[1] Example program:

import Numeric

def euclidean_dist(a, b):
	diff = a - b
	return Numeric.dot(diff, diff)

def calc_dist_matrix(f, distance_function):
	W = Numeric.array(f)
	length = W.shape[0]
	S = Numeric.zeros((length, length)) * 1.0

	for i in range(length):
		for j in range(i):
			S[j, i] = S[i, j] = distance_function(W[i], W[j])
	return S

print calc_dist_matrix(Numeric.sin(Numeric.arange(30)), euclidean_dist) 

More information about the NumPy-Discussion mailing list