On 14 Jun 2013 14:46, "Paul Blelloch" <paul.blelloch@gmail.com> wrote:
>
> I think that I found the problem. It was in not recognizing that the inner for loop is actually a dot product. If I replace the following lines of code:
>
>
> s = 0
>
> for k in range(i+1,j):
>
> s = s + R[i,k]*R[k,j]
>
>
> with
>
>
> s = np.dot(R[i,(i+1):j],R[(i+1):j,j])
>
>
> Run time decreases from 367 to 15.6 seconds. My guess is that you could get considerable further speedup, but I'm pleased with the 15.6 seconds. If you copy the sqrtm function from scipy and make that change I think that you'll see considerable improvement.
If you'd like to submit a pull request with this change then I bet the scipy developers will be very interested...
-n