
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times. Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm? Thanks, Andrew

On Thu, May 8, 2008 at 10:51 PM, Andrew Straw <strawman@astraw.com> wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search,
Yes.
which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I'm no help. You seem to know more than I do. Sadly, the first few Google hits I get for "binary search minimize cache misses" are patents. I don't know what the substantive content of those patents are; I have a strict policy of not reading patents. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

On Thu, May 8, 2008 at 9:55 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Thu, May 8, 2008 at 10:51 PM, Andrew Straw <strawman@astraw.com> wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search,
Yes.
which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I'm no help. You seem to know more than I do. Sadly, the first few Google hits I get for "binary search minimize cache misses" are patents. I don't know what the substantive content of those patents are; I have a strict policy of not reading patents.
I would be interested in adding such a thing if it wasn't patent encumbered. A good start would be a prototype in python to show how it all went together and whether it needed a separate indexing/lookup function or could be fit into the current setup. Chuck

On Thu, May 8, 2008 at 10:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, May 8, 2008 at 9:55 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Thu, May 8, 2008 at 10:51 PM, Andrew Straw <strawman@astraw.com> wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search,
Yes.
which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I'm no help. You seem to know more than I do. Sadly, the first few Google hits I get for "binary search minimize cache misses" are patents. I don't know what the substantive content of those patents are; I have a strict policy of not reading patents.
I would be interested in adding such a thing if it wasn't patent encumbered. A good start would be a prototype in python to show how it all went together and whether it needed a separate indexing/lookup function or could be fit into the current setup.
One way I can think of doing this is to have two indices. One is the usual sorted list, the second consists of, say, every 1024'th entry in the first list. Then search the second list first to find the part of the first list to search. That won't get you into the very best cache, but it could buy you a factor of 2x-4x in speed. It's sort of splitting the binary tree into two levels. Chuck

On Thu, May 8, 2008 at 11:06 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, May 8, 2008 at 10:30 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, May 8, 2008 at 9:55 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Thu, May 8, 2008 at 10:51 PM, Andrew Straw <strawman@astraw.com> wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search,
Yes.
which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I'm no help. You seem to know more than I do. Sadly, the first few Google hits I get for "binary search minimize cache misses" are patents. I don't know what the substantive content of those patents are; I have a strict policy of not reading patents.
I would be interested in adding such a thing if it wasn't patent encumbered. A good start would be a prototype in python to show how it all went together and whether it needed a separate indexing/lookup function or could be fit into the current setup.
One way I can think of doing this is to have two indices. One is the usual sorted list, the second consists of, say, every 1024'th entry in the first list. Then search the second list first to find the part of the first list to search. That won't get you into the very best cache, but it could buy you a factor of 2x-4x in speed. It's sort of splitting the binary tree into two levels.
You could even do it with your data from python, generating the second list and calling searchsorted multiple times. If you are searching for a bunch of values, it's probably good to also sort them first so they bunch together in the same part of the big list. Chuck
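[A minimal sketch of that two-level idea in plain numpy might look like the following. The function name, the block size of 1024, and the synthetic data are illustrative assumptions only, not anything from the fastsearch code discussed later in the thread.]

import numpy as np

def two_level_searchsorted(big, coarse, step, key):
    # `coarse` holds every `step`-th element of the sorted array `big`,
    # so it is small enough to stay cache-resident.
    j = np.searchsorted(coarse, key)        # first coarse entry >= key
    lo = (j - 1) * step if j > 0 else 0     # the answer lies in big[lo:hi]
    hi = min(j * step + 1, len(big))
    # Finish with a binary search over one small, contiguous slice of big.
    return lo + np.searchsorted(big[lo:hi], key)

# Illustrative usage with synthetic data:
big = np.sort(np.random.randint(0, 10**9, size=10**6)).astype(np.int64)
step = 1024
coarse = big[::step].copy()                 # compact first-level index
key = int(big[123456])
assert two_level_searchsorted(big, coarse, step, key) == np.searchsorted(big, key)

[The coarse index costs only about len(big)/step extra elements, which is the space/time trade-off described above.]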

On Friday 09 May 2008, Andrew Straw wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
Well, if you can afford extra space for the hashes you can always use a dictionary for doing the lookups. In pure Python they are around 3x faster (for arrays of 8 million elements) than binary searches. If your space is tight, you can build an extension (for example in Pyrex) for doing binary search for your specific type (int64), for a small improvement. Finally, if you combine this approach with what Charles Harris is suggesting (i.e. creating several levels of caches, but not more than two, which in my experience works best), you can have pretty optimal lookups with relatively low space overhead. See this thread for a discussion of the binary/hash lookup approaches: http://mail.python.org/pipermail/python-list/2007-November/465900.html Hope that helps, -- Francesc Alted
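[A minimal sketch of the dictionary approach, assuming every key you will ever look up is actually present in the array (a hash can answer "at which position is this exact value", but unlike searchsorted it cannot answer "where would this value be inserted"). The variable names and sizes are only illustrative.]

import numpy as np

# Synthetic sorted array of unique int64 values.
haystack = np.unique(np.random.randint(0, 10**9, size=10**6)).astype(np.int64)

# Build the hash table once: value -> position in the sorted array.
lookup = dict((int(v), i) for i, v in enumerate(haystack))

# Each lookup is then O(1) instead of O(log n), at the cost of the dict's memory.
key = int(haystack[54321])
assert lookup[key] == np.searchsorted(haystack, key)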

Hi, I don't know if it helps, but Ulrich Drepper had a 9-part series about memory in Linux Weekly News (http://lwn.net). You can find all 9 linked under his name in the Guest archives (http://lwn.net/Archives/GuestIndex/), as not all are linked together. The first one is "What every programmer should know about memory, Part 1" (September 21, 2007): http://lwn.net/Articles/250967/ In part 5 there was a comment on 'Cache-oblivious algorithms' by akapoor: "i guess it's worth mentioning harald-prokop's 1999 thesis on "cache oblivious algorithms" (http://citeseer.ist.psu.edu/prokop99cacheobliviou.html)."
Regards,
Bruce
Francesc Alted wrote:
On Friday 09 May 2008, Andrew Straw wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
Well, if you can afford extra space for the hashes you can always use a dictionary for doing the lookups. In pure Python they are around 3x faster (for arrays of 8 million elements) than binary searches. If your space is tight, you can build an extension (for example in Pyrex) for doing binary search for your specific type (int64), for a small improvement. Finally, if you combine this approach with what Charles Harris is suggesting (i.e. creating several levels of caches, but not more than two, which in my experience works best), you can have pretty optimal lookups with relatively low space overhead.
See this thread for a discussion of the binary/hash lookup approaches:
http://mail.python.org/pipermail/python-list/2007-November/465900.html
Hope that helps,

On Thu, May 8, 2008 at 9:51 PM, Andrew Straw <strawman@astraw.com> wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
What sort of algorithm is best also depends on the use. If you have a 25e6 sized table that you want to interpolate through with another set of 25e6 indices, then binary search is the wrong approach. In that case you really want to start from the last position and search forward with increasing steps to bracket the next value. Basically, binary search is order n*ln(m), where n is the size of the index list and m the size of the table. The sequential way is nearly n + m, which will be much better if n and m are of comparable size. Chuck
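[A sketch of that sequential/bracketing idea (essentially a galloping search), assuming the query values are already sorted so each search can pick up where the previous one stopped. The function name and data are illustrative.]

import numpy as np

def galloping_searchsorted(table, sorted_keys):
    # Insertion indices of ascending `sorted_keys` into the sorted `table`.
    out = np.empty(len(sorted_keys), dtype=np.intp)
    pos = 0
    n = len(table)
    for i, key in enumerate(sorted_keys):
        # Walk forward with doubling steps until the key is bracketed.
        step = 1
        while pos + step < n and table[pos + step] < key:
            step *= 2
        hi = min(pos + step, n)
        # Finish with a binary search inside the small bracketed window.
        pos += int(np.searchsorted(table[pos:hi], key))
        out[i] = pos
    return out

# Illustrative check against numpy's searchsorted:
table = np.sort(np.random.randint(0, 10**9, size=10**5)).astype(np.int64)
queries = np.sort(np.random.randint(0, 10**9, size=10**4)).astype(np.int64)
assert np.array_equal(galloping_searchsorted(table, queries),
                      np.searchsorted(table, queries))

[This is pure Python and therefore slow as written; the point is only the access pattern, which sweeps the table roughly once from front to back instead of doing log2(m) scattered probes per key.]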

Hi Andrew
2008/5/9 Andrew Straw <strawman@astraw.com>:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I found Francesc Altet's Pyrex implementation at
http://mail.python.org/pipermail/python-list/2007-November/466503.html
I modified it for use with Cython and added some tests:
https://code.launchpad.net/~stefanv/+junk/my_bisect
That may be a good starting point for further experimentation. As it is, it is already about 10 times faster than the built-in version (since I can assume we're working with int64's, so no special type checking is done).
Regards
Stéfan

On Sat, May 10, 2008 at 9:31 AM, Stéfan van der Walt <stefan@sun.ac.za> wrote:
Hi Andrew
2008/5/9 Andrew Straw <strawman@astraw.com>:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
I found Francesc Altet's Pyrex implementation at
http://mail.python.org/pipermail/python-list/2007-November/466503.html
I modified it for use with Cython and added some tests:
That may be a good starting point for further experimentation. As it is, it is already about 10 times faster than the built-in version.
The built-in version is in C, but not type specific. It could be moved to the generated code base easily enough. The slow part is here:
if (compare(parr + elsize*imid, pkey, key) < 0)
Chuck

Thanks for all the comments on my original question. I was more offline than intended after I sent it until now, so I'm sorry I wasn't immediately able to participate in the discussion.

Anyhow, after working on this a bit more, I came up with a few implementations of search algorithms doing just what I needed with the same interface, available using bazaar and launchpad at http://launchpad.net/~astraw/+junk/fastsearch (MIT license). I have attached the output of the plot_comparisons.py benchmarking script to this email (note that this benchmarking is pretty crude).

For the problem I originally wrote about, I get a nearly unbelievable speedup of ~250x using the fastsearch.downsamp.DownSampledPreSearcher class, which is very similar in spirit to Charles' suggestion. It takes 1000 values from the original array to create a new first-level array that is itself localized in memory and points to a more localized region of the full original array. Also, I get a similar (though slightly slower) result using AVL trees via the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( http://sourceforge.net/projects/pyavl ).

Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure), so I'm not sure exactly what's going on at this point, but I'm not going to argue with a 250x speedup, so the fastsearch.downsamp code is now being put to use in one of my projects.

Stéfan -- I think your code simply implements the classic binary search -- I don't see how it will reduce cache misses.

Anyhow, perhaps someone will find the above useful. I guess it would still be a substantial amount of work to make a numpy-types-aware implementation of AVL trees or similar algorithms. These sorts of binary search trees seem like the right way to solve this problem, and thus there might be an interesting project in this. I imagine that a numpy-types-aware Cython might make such an implementation significantly easier and still blazingly fast compared to the binary search implemented in searchsorted(), given today's cached memory architectures.

-Andrew

Andrew Straw wrote:
I've got a big element array (25 million int64s) that searchsorted() takes a long time to grind through. After a bit of digging in the literature and the numpy source code, I believe that searchsorted() is implementing a classic binary search, which is pretty bad in terms of cache misses. There are several modern implementations of binary search which arrange items in memory such that cache misses are much more rare. Clearly making such an indexing arrangement would take time, but in my particular case, I can spare the time to create an index if searching was faster, since I'd make the index once but do the searching many times.
Is there an implementation of such an algorithm that works easily with numpy? Also, can you offer any advice, suggestions, and comments to me if I attempted to implement such an algorithm?
Thanks, Andrew

On Tue, May 13, 2008 at 5:59 PM, Andrew Straw <strawman@astraw.com> wrote:
Thanks for all the comments on my original question. I was more offline than intended after I sent it until now, so I'm sorry I wasn't immediately able to participate in the discussion.
Anyhow, after working on this a bit more, I came up with a few implementations of search algorithms doing just what I needed with the same interface, available using bazaar and launchpad at http://launchpad.net/~astraw/+junk/fastsearch (MIT license). I have attached the output of the plot_comparisons.py benchmarking script to this email (note that this benchmarking is pretty crude).
For the problem I originally wrote about, I get a nearly unbelievable speedup of ~250x using the fastsearch.downsamp.DownSampledPreSearcher class, which is very similar in spirit to Charles' suggestion. It takes 1000 values from the original array to create a new first-level array that is itself localized in memory and points to a more localized region of the full original array. Also, I get a similar (though slightly slower) result using AVL trees via the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( http://sourceforge.net/projects/pyavl ).
Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure), so I'm not sure exactly what's going on at this point, but I'm not going to argue with a 250x speedup, so the fastsearch.downsamp code is now being put to use in one of my projects.
Stefan -- I think your code simply implements the classic binary search -- I don't see how it will reduce cache misses.
Anyhow, perhaps someone will find the above useful. I guess it would still be a substantial amount of work to make a numpy-types-aware implementation of AVL trees or similar algorithms. These sorts of binary search trees seem like the right way to solve this problem and thus there might be an interesting project in this. I imagine that a numpy-types-aware Cython might make such implementation significantly easier and still blazingly fast compared to the binary search implemented in searchsorted() given today's cached memory architectures.
That's pretty amazing, but I don't understand the graph. The DownSampled search looks like the worst. Are the curves mislabeled? Are the axes correct? I'm assuming smaller is better here. Chuck

Charles R Harris wrote:
On Tue, May 13, 2008 at 5:59 PM, Andrew Straw <strawman@astraw.com> wrote:
Thanks for all the comments on my original question. I was more offline than intended after I sent it until now, so I'm sorry I wasn't immediately able to participate in the discussion.
Anyhow, after working on this a bit more, I came up with a few implementations of search algorithms doing just what I needed with the same interface, available using bazaar and launchpad at http://launchpad.net/~astraw/+junk/fastsearch (MIT license). I have attached the output of the plot_comparisons.py benchmarking script to this email (note that this benchmarking is pretty crude).
For the problem I originally wrote about, I get a nearly unbelievable speedup of ~250x using the fastsearch.downsamp.DownSampledPreSearcher class, which is very similar in spirit to Charles' suggestion. It takes 1000 values from the original array to create a new first-level array that is itself localized in memory and points to a more localized region of the full original array. Also, I get a similar (though slightly slower) result using AVL trees via the fastsearch.avlsearch.AvlSearcher class, which uses pyavl ( http://sourceforge.net/projects/pyavl ).
Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure), so I'm not sure exactly what's going on at this point, but I'm not going to argue with a 250x speedup, so the fastsearch.downsamp code is now being put to use in one of my projects.
Stefan -- I think your code simply implements the classic binary search -- I don't see how it will reduce cache misses.
Anyhow, perhaps someone will find the above useful. I guess it would still be a substantial amount of work to make a numpy-types-aware implementation of AVL trees or similar algorithms. These sorts of binary search trees seem like the right way to solve this problem and thus there might be an interesting project in this. I imagine that a numpy-types-aware Cython might make such implementation significantly easier and still blazingly fast compared to the binary search implemented in searchsorted() given today's cached memory architectures.
That's pretty amazing, but I don't understand the graph. The DownSampled search looks like the worst. Are the curves mislabeled? Are the axes correct? I'm assuming smaller is better here.
The lines are labeled properly -- the graph is inconsistent with the findings on my real data (not shown), which is what I meant by "Using the benchmarking code included in the bzr branch, I don't get anything like this speedup (e.g. the attached figure)". My guess is that the BinarySearcher's time climbs terribly under some usage pattern that isn't being exhibited with this test. I'm really not sure yet what the important difference is between my real data and these synthetic data. I will keep the list posted as I find out more. Clearly, on the synthetic data for the benchmark, the BinarySearcher does pretty well when N items is large. This is quite contrary to my theory about cache misses being the root of my problem with the binary search, so I don't understand it at the moment, but certainly both of the other searchers perform better on my real data. I will post any new insights as I continue to work on this... -Andrew

I will post any new insights as I continue to work on this...
OK, I've isolated a sample of my data that illustrates the terrible performance with the binarysearch. I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 in case anyone wants to have a look themselves. Here's an example of the type of benchmark I've been running:

import fastsearch.downsamp
import fastsearch.binarysearch
import tables

h5 = tables.openFile('framenumbers.h5', mode='r')
framenumbers = h5.root.framenumbers.read()
keys = h5.root.keys.read()
h5.close()

def bench( implementation ):
    for key in keys:
        implementation.index( key )

downsamp = fastsearch.downsamp.DownSampledPreSearcher( framenumbers )
binary = fastsearch.binarysearch.BinarySearcher( framenumbers )

# The next two lines are IPython-specific, and the 2nd takes a looong time:
%timeit bench(downsamp)
%timeit bench(binary)

Running the above gives:

In [14]: %timeit bench(downsamp)
10 loops, best of 3: 64 ms per loop

In [15]: %timeit bench(binary)
10 loops, best of 3: 184 s per loop

Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.

Andrew Straw wrote:
I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5
Ahh, forgot to mention a potentially important point -- this data file is 191 MB.

On Wed, May 14, 2008 at 8:09 AM, Andrew Straw <strawman@astraw.com> wrote:
I will post any new insights as I continue to work on this...
OK, I've isolated a sample of my data that illustrates the terrible performance with the binarysearch. I have uploaded it as a pytables file to http://astraw.com/framenumbers.h5 in case anyone wants to have a look themselves. Here's an example of the type of benchmark I've been running:
import fastsearch.downsamp
import fastsearch.binarysearch
import tables

h5 = tables.openFile('framenumbers.h5', mode='r')
framenumbers = h5.root.framenumbers.read()
keys = h5.root.keys.read()
h5.close()

def bench( implementation ):
    for key in keys:
        implementation.index( key )

downsamp = fastsearch.downsamp.DownSampledPreSearcher( framenumbers )
binary = fastsearch.binarysearch.BinarySearcher( framenumbers )

# The next two lines are IPython-specific, and the 2nd takes a looong time:
%timeit bench(downsamp)
%timeit bench(binary)
Running the above gives:
In [14]: %timeit bench(downsamp) 10 loops, best of 3: 64 ms per loop
In [15]: %timeit bench(binary)
10 loops, best of 3: 184 s per loop
Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.
It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem. Chuck

Charles R Harris wrote:
On Wed, May 14, 2008 at 8:09 AM, Andrew Straw <strawman@astraw.com> wrote:
Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.
It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem.
I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas?

On Wed, May 14, 2008 at 2:00 PM, Andrew Straw <strawman@astraw.com> wrote:
Charles R Harris wrote:
On Wed, May 14, 2008 at 8:09 AM, Andrew Straw <strawman@astraw.com> wrote:
Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.
It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem.
I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas?
I just searched for 2**25/10 keys in a 2**25 array of reals. It took less than a second when vectorized. In a python loop it took about 7.7 seconds. The only thing I can think of is that the search isn't getting any cpu cycles for some reason. How much memory is it using? Do you have any nans and such in the data? Chuck
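[For reference, a rough way to reproduce that comparison. The sizes match what is described above; the absolute times are machine-dependent, and the penalty of the loop is mostly per-call Python overhead rather than the search itself.]

import time
import numpy as np

table = np.sort(np.random.random(2**25))   # 2**25 sorted reals
keys = np.random.random(2**25 // 10)       # about 2**25/10 query values

t0 = time.time()
np.searchsorted(table, keys)                # one vectorized call into C code
t1 = time.time()
for k in keys:                              # one Python-level call per key
    np.searchsorted(table, k)
t2 = time.time()

print('vectorized: %.2f s, python loop: %.2f s' % (t1 - t0, t2 - t1))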

Aha, I've found the problem -- my values were int64 and my keys were uint64. Switching to the same data type immediately fixes the issue! It's not a memory cache issue at all. Perhaps searchsorted() should emit a warning if the keys require casting... I can't believe how bad the hit was.
-Andrew
Charles R Harris wrote:
On Wed, May 14, 2008 at 2:00 PM, Andrew Straw <strawman@astraw.com> wrote:
Charles R Harris wrote:
On Wed, May 14, 2008 at 8:09 AM, Andrew Straw <strawman@astraw.com> wrote:
Quite a difference (a factor of about 3000)! At this point, I haven't delved into the dataset to see what makes it so pathological -- performance is nowhere near this bad for the binary search algorithm with other sets of keys.
It can't be that bad Andrew, something else is going on. And 191 MB isn't *that* big, I expect it should fit in memory with no problem.
I agree the performance difference seems beyond what one would expect due to cache misses alone. I'm at a loss to propose other explanations, though. Ideas?
I just searched for 2**25/10 keys in a 2**25 array of reals. It took less than a second when vectorized. In a python loop it took about 7.7 seconds. The only thing I can think of is that the search isn't getting any cpu cycles for some reason. How much memory is it using? Do you have any nans and such in the data?

On Wed, May 14, 2008 at 8:50 PM, Andrew Straw <strawman@astraw.com> wrote:
Aha, I've found the problem -- my values were int64 and my keys were uint64. Switching to the same data type immediately fixes the issue! It's not a memory cache issue at all.
Perhaps searchsorted() should emit a warning if the keys require casting... I can't believe how bad the hit was.
I think it used to fail and that was fixed not so long ago. Was it casting the keys or was it casting the big sorted array? The latter would certainly slow things down and would be a bug, and a bug of the sort that might have slipped in when the casting was added. Chuck
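[A small experiment along these lines can show whether a dtype mismatch between the keys and the sorted array triggers a slow path on a given NumPy version. The data here is synthetic, values are kept small enough that the int64/uint64 comparison is exact, and casting the keys once up front sidesteps the question entirely.]

import numpy as np

haystack = np.sort(np.random.randint(0, 10**9, size=10**6)).astype(np.int64)
keys_mismatched = haystack[:10000].astype(np.uint64)   # uint64 keys vs int64 table
keys_matched = keys_mismatched.astype(haystack.dtype)  # cast once before searching

# Same answers either way; only the mismatched call risks extra casting work
# inside searchsorted (how costly that is depends on the NumPy version).
assert np.array_equal(np.searchsorted(haystack, keys_mismatched),
                      np.searchsorted(haystack, keys_matched))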

On Tue, May 13, 2008 at 6:59 PM, Andrew Straw <strawman@astraw.com> wrote:
easier and still blazingly fast compared to the binary search implemented in searchsorted() given today's cached memory architectures.
Andrew, I looked at your code and I don't quite understand something. Why are you looking up single values? Is it not the case that the overhead of several Python calls completely dominates the actual cost of performing a binary search (e.g. in C code)? For instance, the 'v' argument of searchsorted(a,v) can be an array:

from scipy import *
haystack = rand(1e7)
needles = haystack[:10000].copy()
haystack.sort()

timeit searchsorted(haystack, needles)
100 loops, best of 3: 12.7 ms per loop

Which seems to be much faster than:

timeit for k in needles: x = searchsorted(haystack,k)
10 loops, best of 3: 43.8 ms per loop

The other thing to consider is that a reasonably smart CPU cache manager may retain the first few levels that are explored in a binary search tree. Clearly you could speed this up by increasing the branching factor of the tree, or using a fast index into the large array. However, I think that these effects are being masked by Python overhead in your tests.

I whipped up a weave implementation of searchsorted() that uses the STL. It clocks in at 9.72 ms per loop, so I think NumPy's searchsorted() is fairly good.

import scipy
import scipy.weave

def searchsorted2(a,v):
    N_a = len(a)
    N_v = len(v)
    indices = scipy.empty(N_v, dtype='intc')
    code = """
    for(int i = 0; i < N_v; i++){
        indices(i) = lower_bound(&a(0), &a(N_a), v(i)) - &a(0);
    }
    """
    err = scipy.weave.inline(code, ['a','v','N_a', 'N_v','indices'],
                             type_converters = scipy.weave.converters.blitz,
                             compiler = 'gcc',
                             support_code = '#include <algo.h>')
    return indices

--
Nathan Bell wnbell@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

Nathan Bell wrote:
On Tue, May 13, 2008 at 6:59 PM, Andrew Straw <strawman@astraw.com> wrote:
easier and still blazingly fast compared to the binary search implemented in searchsorted() given today's cached memory architectures.
Andrew, I looked at your code and I don't quite understand something. Why are you looking up single values?
Hi Nathan, The Python overhead was nothing compared to the speed problems I was having... Now, I'm quite sure that this kind of optimization could take things a little further. Nevertheless, for my motivating use case, it wouldn't be trivial to vectorize this, and a "little further" in this case is too little to justify the investment of my time at the moment. -Andrew
participants (7)
- Andrew Straw
- Bruce Southey
- Charles R Harris
- Francesc Alted
- Nathan Bell
- Robert Kern
- Stéfan van der Walt