<br><br><div class="gmail_quote">On Tue, May 13, 2008 at 5:59 PM, Andrew Straw <<a href="mailto:strawman@astraw.com">strawman@astraw.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Thanks for all the comments on my original question. I was more offline<br>

than intended after I sent it until now, so I'm sorry I wasn't<br>

immediately able to participate in the discussion.<br>

<br>

Anyhow, after working on this a bit more, I came up with a few<br>

implementations of search algorithms doing just what I needed with the<br>

same interface available using bazaar and launchpad at<br>

<a href="http://launchpad.net/%7Eastraw/+junk/fastsearch" target="_blank">http://launchpad.net/~astraw/+junk/fastsearch</a> (MIT license). I have<br>

attached the output of the plot_comparisons.py benchmarking script to<br>

this email (note that this benchmarking is pretty crude).<br>

<br>

For the problem I originally wrote about, I get what a nearly<br>

unbelievable speedup of ~250x using the<br>

fastsearch.downsamp.DownSampledPreSearcher class, which is very similar<br>

in spirit to Charles' suggestion. It takes 1000 values from the original<br>

array to create a new first-level array that is itself localized in<br>

memory and points to a more localized region of the full original array.<br>

Also, I get a similar (though slightly slower) result using AVL trees<br>

using the fastsearch.avlsearch.AvlSearcher class, which uses pyavl (<br>

<a href="http://sourceforge.net/projects/pyavl" target="_blank">http://sourceforge.net/projects/pyavl</a> ).<br>

<br>

Using the benchmarking code included in the bzr branch, I don't get<br>

anything like this speedup (e.g. the attached figure), so I'm not sure<br>

exactly what's going on at this point, but I'm not going to argue with a<br>

250x speedup, so the fastsearch.downsamp code is now being put to use in<br>

one of my projects.<br>

<br>

Stefan -- I think your code simply implements the classic binary search<br>

-- I don't see how it will reduce cache misses.<br>

<br>

Anyhow, perhaps someone will find the above useful. I guess it would<br>

still be a substantial amount of work to make a numpy-types-aware<br>

implementation of AVL trees or similar algorithms. These sorts of binary<br>

search trees seem like the right way to solve this problem and thus<br>

there might be an interesting project in this. I imagine that a<br>

numpy-types-aware Cython might make such implementation significantly<br>

easier and still blazingly fast compared to the binary search<br>

implemented in searchsorted() given today's cached memory architectures.<br>

<font color="#888888"></font></blockquote><div><br>That's pretty amazing, but I don't understand the graph. The  DownSampled search looks like the worst. Are the curves mislabled? Are the axis correct? I'm assuming smaller is better here.<br>

<br>Chuck<br></div><br></div><br>