faster way for calculating distance matrix...
![](https://secure.gravatar.com/avatar/8c27a792722f48451744daff21b4e123.jpg?s=120&d=mm&r=g)
Hi, I was happy to see solutions coming up for Cliff's problems and it made me think: maybe I can mention my problem here as well. From an array of feature vectors I want to calculate it's distance matrix. Something like [1]. Currently it can take quite a while to calculate the stuff for a long array. Some questions: 1) Is there a smart speed up possible? Like, a way to avoid the double loop? It's no problem if this would lead to less generality (like the choice for a distance function). I know there is a little speed up to be gained by leaving out the array(f) thing, but that's not what I'm looking for. 2) Is it possible (in Numeric or numarray) to define a class DiagonalMatrix that at least saves half of the memory? 3) If 1) is not possible, what would be the way to go for speeding it up by writing it in C? weave because of its availability in scipy or would pyrex be more interesting, or are there even more options..? bye, Kasper [1] Example program: import Numeric def euclidean_dist(a, b): diff = a - b return Numeric.dot(diff, diff) def calc_dist_matrix(f, distance_function): W = Numeric.array(f) length = W.shape[0] S = Numeric.zeros((length, length)) * 1.0 for i in range(length): for j in range(i): S[j, i] = S[i, j] = distance_function(W[i], W[j]) return S print calc_dist_matrix(Numeric.sin(Numeric.arange(30)), euclidean_dist)
![](https://secure.gravatar.com/avatar/ac01d2df296363441e93634887ab5a12.jpg?s=120&d=mm&r=g)
Kasper Souren wrote:
The following is a bit faster (at the cost of some memory): def euclidean_dist(a, b): diff = a - b return diff**2 def calc_dist_matrix2(f, distance_function): W = Numeric.array(f) i,j = indices((len(W),)*2) S = distance_function(take(W,i), take(W,j)) return S -- Jeffery Collins (http://www.boulder.net/~jcollins)
![](https://secure.gravatar.com/avatar/ac01d2df296363441e93634887ab5a12.jpg?s=120&d=mm&r=g)
Kasper Souren wrote:
The following is a bit faster (at the cost of some memory): def euclidean_dist(a, b): diff = a - b return diff**2 def calc_dist_matrix2(f, distance_function): W = Numeric.array(f) i,j = indices((len(W),)*2) S = distance_function(take(W,i), take(W,j)) return S -- Jeffery Collins (http://www.boulder.net/~jcollins)
participants (2)
-
Jeffery D. Collins
-
Kasper Souren