[Numpy-discussion] fast way of doing "cross-multiplications" ?

Bill Baxter wbaxter at gmail.com
Tue Jul 18 10:45:34 EDT 2006


Maybe this will help -- it computes the squared distances between a
bunch of points:

def dist2(x,c):
    """
    Calculates squared distance between two sets of points.

    n2 = dist2(x, c)
    D = DIST2(X, C) takes two matrices of vectors and calculates the
    squared Euclidean distance between them.  Both matrices must be of
    the same column dimension.  If X has M rows and N columns, and C has
    L rows and N columns, then the result has M rows and L columns.  The
    I, Jth entry is the  squared distance from the Ith row of X to the
    Jth row of C.
    """
    ndata, dimx = x.shape
    ncentres, dimc = c.shape
    if dimx != dimc:
        raise ValueError('Data dimension does not match dimension of centres')

    n2 = (x*x).sum(1)[:,numpy.newaxis] + (c*c).sum(1) - 2*dot(x,T(c))

    # Rounding errors occasionally cause negative entries in n2
    #if (n2<0).any():
    #   n2[n2<0] = 0

    return n2

Take 1.0/numpy.sqrt(dist2(V,V)) and from there you can probably get
the sum with sum() calls.  It's obviously not as efficient as it could
be since it'll be computing both halves of a symmetric matrix, but it
might be faster than nested Python loops.
(The above was adapted from a routine in Netlab).

--bb

On 7/18/06, Eric Emsellem <emsellem at obs.univ-lyon1.fr> wrote:
> Hi,
>
> I have a specific quantity to derive from an array, and I am at the
> moment unable to do it for a too large array because it just takes too
> long! So I am looking for an advice on how to efficiently compute such a
> quantity:
>
> I have 3 arrays of N floats (x[...], y[..], z[..]) and I wish to do:
>
> result = 0.
> for i in range(N) :
>    for j in range(i+1,N,1) :
>       result += 1. / sqrt((x[j] - x[i])**2 + (y[j] - y[i])**2 + (z[j] -
> z[i])**2)
>
>
> Of course the procedure written above is very inefficient and I thought
> of doing:
>
> result = 0.
> for i in range(N) :
>    result += 1. / sqrt((x[i+1:] - x[i])**2 + (y[i+1:] - y[i])**2 +
> (z[i+1:] - z[i])**2)
>
> Still, this is quite slow and not workable for very large arrays (> 10^6
> floats per array).
>
> Any hint on how to speed things up here?
>
> Thanks in advance!
>
> Eric




More information about the NumPy-Discussion mailing list