2010/3/5 Ian Mallett <geometrian@gmail.com>:
Cool--this works perfectly now :-)
:-)
Unfortunately, it's actually slower :P Most of the slowest part is in the removing doubles section.
Hmm. Let's see ... Can you tell me how I can measure the time that the calls in a script take? I have no idea.
#takes 0.04 seconds
inner = np.inner(ns, v1s - some_point)
I think I can do nothing about that at the moment.
#0.0840001106262
sum_1 = sum.reshape((len(sum), 1)).repeat(len(sum), axis = 1)
#0.0329999923706
sum_2 = sum.reshape((1, len(sum))).repeat(len(sum), axis = 0)
#0.0269999504089
comparison_sum = (sum_1 == sum_2)
We can leave out the repeat() calls and keep only the reshape() calls. Numpy broadcasts dimensions of length 1 as if they had stride 0, i.e., it effectively repeats along those dimensions, just as we did explicitly.
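To illustrate, here is a minimal sketch with made-up data. The array `values` is hypothetical and only stands in for `sum` (or `diff`) from the script. Comparing a column-shaped view against a row-shaped view broadcasts to the full N x N matrix, and the result should match the explicit repeat() version:

```python
import numpy as np

# Hypothetical stand-in for the `sum` / `diff` arrays from the script.
values = np.array([3, 5, 3, 7, 5])
n = len(values)

# Explicit version: materialise two full (n, n) arrays with repeat().
rep_1 = values.reshape((n, 1)).repeat(n, axis=1)
rep_2 = values.reshape((1, n)).repeat(n, axis=0)
comparison_repeat = (rep_1 == rep_2)

# Broadcasting version: only reshape(), no repeat().
col = values.reshape((n, 1))   # shape (n, 1)
row = values.reshape((1, n))   # shape (1, n)
comparison_broadcast = (col == row)   # broadcasts to shape (n, n)

# Both approaches give the same boolean matrix.
same_result = np.array_equal(comparison_repeat, comparison_broadcast)
```

The broadcasting version never allocates the two intermediate (n, n) arrays, so it should save both time and memory, although I have not measured by how much.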
#0.0909998416901
diff_1 = diff.reshape((len(diff), 1)).repeat(len(diff), axis = 1)
#0.0340001583099
diff_2 = diff.reshape((1, len(diff))).repeat(len(diff), axis = 0)
#0.0269999504089
comparison_diff = (diff_1 == diff_2)
Same here. Delete the repeat() calls, but not the reshape() calls.
#0.0230000019073
same_edges = comparison_sum * comparison_diff
Hmm, maybe use numpy.logical_and(comparison_sum, comparison_diff)? I don't know, but I guess it is in some way optimised for such things.
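As a small sketch of what I mean (the two input arrays below are made-up examples, not data from your script): on boolean arrays the product, numpy.logical_and() and the & operator should all give the same boolean result, so swapping one for another does not change same_edges. Whether one of them is actually faster I have not measured.

```python
import numpy as np

# Made-up boolean matrices standing in for the real comparison results.
comparison_sum = np.array([[True, False], [True, True]])
comparison_diff = np.array([[True, True], [False, True]])

# Three ways to combine them element-wise.
by_product = comparison_sum * comparison_diff
by_logical_and = np.logical_and(comparison_sum, comparison_diff)
by_operator = comparison_sum & comparison_diff

# All three produce the same boolean array.
all_equal = (np.array_equal(by_product, by_logical_and)
             and np.array_equal(by_product, by_operator))
```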
#0.128999948502
doublet_count = same_edges.sum(axis = 0)
Maybe try axis = 1 instead. I wonder why this is so slow. Or maybe it is because numpy converts the booleans to ints on the fly, so maybe try same_edges.astype(numpy.int8).sum(axis = 0).

Hope this gives some improvement. I attach the modified version.

Ah, one thing to mention: are you sure you have not accidentally timed the printout functions as well? They should be pretty slow.

Friedrich
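P.S. A minimal sketch of the counting variants, again with a made-up `values` array instead of your real data. Because same_edges[i, j] compares entry i with entry j, the matrix is symmetric, so summing along axis 0 or axis 1 should give the same counts. The int8 conversion should not change the counts either; which variant is fastest would still have to be measured.

```python
import numpy as np

# Hypothetical stand-in for the real edge data.
values = np.array([3, 5, 3, 7, 5])

# Symmetric boolean matrix: entry (i, j) is True where values[i] == values[j].
same_edges = values.reshape((-1, 1)) == values.reshape((1, -1))

# Three ways to count, per entry, how many entries equal it (itself included).
count_axis0 = same_edges.sum(axis=0)
count_axis1 = same_edges.sum(axis=1)
count_int8 = same_edges.astype(np.int8).sum(axis=0)
```

Note that sum() on an int8 array accumulates in the default platform integer, so the counts do not wrap around at 127.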