[Numpy-discussion] distance matrix and (weighted) p-norm

David Cournapeau david at ar.media.kyoto-u.ac.jp
Wed Sep 3 09:16:42 EDT 2008


Emanuele Olivetti wrote:
>
> Thanks for the pointer but the distance subpackage in cluster is about
> the distance matrix of vectors in one set of vectors. So the resulting
> matrix is symmetric. I need to compute distances between two
> different sets of vectors (i.e. a non-symmetric distance matrix).
> It is not clear to me how to use it in my case.
>   

You may indeed need to extend the code. (Although I am more or less
responsible for scipy.cluster these days, I have not yet looked carefully
at all of the code in distance.)
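
In the meantime, a rough sketch of the two-set case in pure NumPy, in case
it helps. This is not the code in scipy.cluster; the function name
weighted_pnorm_cdist, the weight vector w, and the convention of applying
the weights inside the sum (sum_k w[k] * |a_k - b_k|**p) are assumptions
made for illustration only.

```python
import numpy as np

def weighted_pnorm_cdist(A, B, w=None, p=2.0):
    """Distances between every row of A (m x d) and every row of B (n x d).

    Hypothetical helper, not part of scipy.cluster. Returns an (m, n) array D
    with D[i, j] = (sum_k w[k] * |A[i, k] - B[j, k]|**p) ** (1/p).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.ndim != 2 or B.ndim != 2 or A.shape[1] != B.shape[1]:
        raise ValueError("A and B must be 2-D with the same number of columns")
    if w is None:
        w = np.ones(A.shape[1])  # unweighted p-norm by default
    w = np.asarray(w, dtype=float)
    # Broadcasting: (m, 1, d) - (1, n, d) gives an (m, n, d) array of differences.
    diff = np.abs(A[:, np.newaxis, :] - B[np.newaxis, :, :])
    return np.sum(w * diff ** p, axis=2) ** (1.0 / p)

# Two different sets of vectors: the result is rectangular, not symmetric.
A = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
B = np.array([[0.0, 0.0], [3.0, 4.0]])
D = weighted_pnorm_cdist(A, B, p=2)
```

Note that the intermediate array has m * n * d entries, so for large sets
this trades memory for speed; a C implementation or a loop over blocks of
rows would avoid that.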

> Then cluster.distance offers:
> 1) slow python double for loop for computing each entry of the matrix
> 2) or fast C implementation (numpy/cluster/distance/src/distance.c).
>
> I guess I need to extend distance.c, then work on the wrapper and then
> on distance.py. But after that it would be meaningless to have those
> distances under 'cluster', since clustering doesn't need distances between
> two sets of vectors.
>   

FWIW, distance is expected to move to a separate package, because distance
computation is useful in contexts other than clustering.

cheers,

David


