# [Numpy-discussion] faster way for calculating distance matrix...

Jeffery D. Collins jcollins at boulder.net
Wed May 21 18:03:02 EDT 2003

```Kasper Souren wrote:

>Hi,
>
>I was happy to see solutions coming up for Cliff's problems and it made me
>think: maybe I can mention my problem here as well.
>
>>From an array of feature vectors I want to calculate it's distance matrix.
>Something like . Currently it can take quite a while to calculate the
>stuff for a long array.
>
>Some questions:
>
>1) Is there a smart speed up possible? Like, a way to avoid the double loop?
>It's no problem if this would lead to less generality (like the choice for a
>distance function). I know there is a little speed up to be gained by leaving
>out the array(f) thing, but that's not what I'm looking for.
>
>2) Is it possible (in Numeric or numarray) to define a class DiagonalMatrix
>that at least saves half of the memory?
>
>3) If 1) is not possible, what would be the way to go for speeding it up by
>writing it in C? weave because of its availability in scipy or would pyrex be
>more interesting, or are there even more options..?
>
>bye,
>Kasper
>
>
>
> Example program:
>
>import Numeric
>
>def euclidean_dist(a, b):
>	diff = a - b
>	return Numeric.dot(diff, diff)
>
>def calc_dist_matrix(f, distance_function):
>	W = Numeric.array(f)
>	length = W.shape
>	S = Numeric.zeros((length, length)) * 1.0
>
>	for i in range(length):
>		for j in range(i):
>			S[j, i] = S[i, j] = distance_function(W[i], W[j])
>	return S
>
>print calc_dist_matrix(Numeric.sin(Numeric.arange(30)), euclidean_dist)
>
>

The following is a bit faster (at the cost of some memory):

def euclidean_dist(a, b):
diff = a - b
return diff**2

def calc_dist_matrix2(f, distance_function):
W = Numeric.array(f)
i,j = indices((len(W),)*2)
S = distance_function(take(W,i), take(W,j))
return S

--
Jeffery Collins (http://www.boulder.net/~jcollins)

```