[Numpy-discussion] Faster

Fri May 2 22:21:55 EDT 2008

On Fri, May 2, 2008 at 8:02 PM, Keith Goodman <kwgoodman at gmail.com> wrote:

> On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> > Isn't the lengthy part finding the distance between clusters?  I can think
> > of several ways to do that, but I think you will get a real speedup by doing
> > that in c or c++. I have a module made in boost python that holds clusters
> > and returns a list of lists containing their elements. Clusters are joined
> > by joining any two elements, one from each. It wouldn't take much to add a
> > distance function, but you could use the list of indices in each cluster to
> > pull a subset out of the distance matrix and then find the minimum function
> > in that. This also reminds me of Huffman codes.
>
> You're right. Finding the distance is slow. Is there any way to speed
> up the function below? It returns the row and column indices of the
> min value of the NxN array x.
>
> def dist(x):
>    x = x + 1e10 * np.eye(x.shape[0])

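x += np.diag(np.ones(x.shape[0]) * 1e10)

would be faster, since it adds to x in place instead of allocating a new
array for the sum. Note that it modifies the caller's array.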

>    i, j = np.where(x == x.min())
>    return i[0], j[0]
>

i = x.argmin()
j = i % x.shape[0]
i = i // x.shape[0]

But I wouldn't worry about speed yet if you are just trying things out.
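
Putting the two suggestions together, a minimal sketch of the function might
look like the one below. It is only an illustration, not tested against the
original clustering code. Two things in it are my own assumptions rather than
anything from the thread: it masks the diagonal with np.fill_diagonal and
np.inf instead of adding 1e10, and it works on a float copy so the caller's
array is left alone.

```python
import numpy as np

def dist(x):
    """Return (row, col) of the smallest off-diagonal entry of square array x."""
    # Work on a float copy so the caller's distance matrix is not modified.
    x = np.array(x, dtype=float)
    # Mask the self-distances on the diagonal (assumption: inf instead of 1e10).
    np.fill_diagonal(x, np.inf)
    # argmin returns an index into the flattened array; split it into row, col.
    i, j = divmod(int(x.argmin()), x.shape[1])
    return i, j

d = np.array([[0.0, 3.0, 1.0],
              [3.0, 0.0, 2.0],
              [1.0, 2.0, 0.0]])
print(dist(d))  # (0, 2)
```

If there are ties, argmin reports the first minimum in row-major order, which
matches what i[0], j[0] picks out of the np.where result.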

Chuck