[Numpy-discussion] distance matrix and (weighted) p-norm

Damian Eads eads at soe.ucsc.edu
Sun Sep 7 15:56:50 EDT 2008


Hi there,

The pdist function computes pairwise distances between vectors in a 
single collection, storing the distances in a condensed distance matrix. 
This is not exactly what you want: you want to compute distances 
between two collections of vectors.
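
To make the "condensed" storage concrete, here is a minimal sketch. It 
assumes the import path scipy.spatial.distance, which is where pdist 
lives in current SciPy releases; the three sample points and the 
Euclidean metric are made up for illustration only.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three illustrative 2-D points (hypothetical data).
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

# Condensed form: one entry per unordered pair, in the order
# (0,1), (0,2), (1,2), so the length is m*(m-1)/2 = 3.
condensed = pdist(X, metric='euclidean')

# Expand to the full symmetric m-by-m matrix with a zero diagonal.
square = squareform(condensed)
```

The condensed vector only covers pairs drawn from the one collection X, 
so it cannot hold distances between two different collections.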

Suppose XA is an m_A by n array and XB is an m_B by n array,

   M=scipy.cluster.distance.cdist(XA, XB, metric='mahalanobis')

computes an m_A by m_B distance matrix M. The ij'th entry is the distance 
between XA[i,:] and XB[j,:]. The core computation is implemented in C 
for efficiency. I've committed the new function along with documentation 
and about two dozen tests.
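
As a rough sketch of the shape and indexing described above, the 
following compares cdist against a plain Python double loop. It 
assumes the import path scipy.spatial.distance used by current SciPy 
releases; the random data, the Euclidean metric, and the choice p=3 
for the Minkowski (p-norm) metric are illustrative assumptions, not 
anything taken from the committed tests.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical data: m_A = 4 and m_B = 5 vectors, each of dimension n = 3.
rng = np.random.default_rng(0)
XA = rng.normal(size=(4, 3))
XB = rng.normal(size=(5, 3))

# M[i, j] is the distance between XA[i, :] and XB[j, :].
M = cdist(XA, XB, metric='euclidean')

# Naive reference: the same quantity computed pair by pair.
M_naive = np.array([[np.linalg.norm(a - b) for b in XB] for a in XA])

# The p-norm case, here with p = 3, checked against the definition.
M_p3 = cdist(XA, XB, metric='minkowski', p=3)
M_p3_naive = np.array([[np.sum(np.abs(a - b) ** 3) ** (1.0 / 3.0)
                        for b in XB] for a in XA])
```

Both results have shape (m_A, m_B), and the loop versions should agree 
with cdist up to floating-point rounding.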

Cheers,

Damian

Emanuele Olivetti wrote:
> David Cournapeau wrote:
>> FWIW, distance is deemed to move to a separate package, because distance
>> computation is useful in other contexts than clustering.
>>
>>   
> 
> Excellent. I was thinking about something similar. I'll have a look
> to the separate package. Please drop an email to this list when
> distance will be moved.
> 
> Thanks,
> 
> Emanuele

-----------------------------------------------------
Damian Eads                             Ph.D. Student
Jack Baskin School of Engineering, UCSC        E2-479
1156 High Street
Santa Cruz, CA 95064    http://www.soe.ucsc.edu/~eads


