I think that I found the problem.  It was in not recognizing that the inner for loop is actually a dot product.  If I replace the following lines of code:

s = 0

for k in range(i+1,j):

    s = s + R[i,k]*R[k,j]


with


s = np.dot(R[i,(i+1):j],R[(i+1):j,j])


Run time decreases from 367 to 15.6 seconds.  My guess is that you could get considerable further speedup, but I'm pleased with the 15.6 seconds.  If you copy the sqrtm function from scipy and make that change I think that you'll see considerable improvement.



On Saturday, April 13, 2013 8:47:25 AM UTC-7, Vivek Kulkarni wrote:
Hi,
I am implementing spectral clustering for my course work and am using the sqrtm function to find the square root of a matrix . But its far too slow, my matrix is of size (1258,1258). Any suggestions on how I can speed things up or some other function that scipy supports which can be used.

I am implementing the algorithm as described here:
http://books.nips.cc/papers/files/nips14/AA35.pdf

Finding the square root of D^(-1) is way too slow for D^-1 of size (1258,1258).


Thanks.
Vivek.