[Numpy-discussion] Proposal: np.search() to complement np.searchsorted()

Martin Spacek numpy at mspacek.mm.st
Tue May 9 12:46:25 EDT 2017


Hello,

I've opened up a pull request to add a function called np.search(), or something 
like it, to complement np.searchsorted():

https://github.com/numpy/numpy/pull/9055

There's also this issue I opened before starting the PR:

https://github.com/numpy/numpy/issues/9052

Proposed API changes require discussion on the list, so here I am!

This proposed function (and perhaps array method?) returns the same indices as 
np.searchsorted(a, v), but doesn't require `a` to be sorted, and explicitly 
checks that every value in `v` is also present in `a`. If that check fails, it 
currently raises an error, but that could be controlled via a kwarg.
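
For illustration, here is a minimal sketch of the behaviour described above, 
built on np.argsort() and the `sorter` argument of np.searchsorted(). It is not 
the code in the PR: the name `search`, the choice of ValueError, and returning 
the first match for duplicated values are all assumptions made for this example.

```python
import numpy as np

def search(a, v):
    """Hypothetical sketch: index in 1-D `a` of each value in `v`.

    `a` need not be sorted. Raises ValueError if any value in `v`
    is missing from `a` (assumed error type, not taken from the PR).
    """
    a = np.asarray(a)
    v = np.asarray(v)
    if a.size == 0:
        if v.size:
            raise ValueError("some values in v are not present in a")
        return np.empty(v.shape, dtype=np.intp)
    # A stable sort makes duplicated values resolve to their first occurrence.
    sorter = np.argsort(a, kind="stable")
    pos = np.searchsorted(a, v, sorter=sorter)
    # Values larger than everything in `a` give pos == a.size; clip so the
    # lookup below stays in bounds, and let the equality check reject them.
    pos = np.minimum(pos, a.size - 1)
    idx = sorter[pos]
    if not np.array_equal(a[idx], v):
        raise ValueError("some values in v are not present in a")
    return idx

a = np.array([30, 10, 20, 10])   # unsorted, with a duplicate
print(search(a, [20, 10, 30]))   # [2 1 0]
```

With this sketch, search(a, [25]) raises instead of returning an insertion index.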

As I mentioned in the PR, I often find myself abusing np.searchsorted() by not 
explicitly checking these two assumptions. The temptation to use it is great, 
because it's such a fast and convenient function, and most of the time that I 
use it, the assumptions are indeed valid. Explicitly checking them each and 
every time before I use np.searchsorted() is tedious and easy to forget. I 
wouldn't be surprised if many others abuse np.searchsorted() in the same way.
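
A small example of the failure mode, with made-up arrays: when a value in `v` 
is missing from `a`, np.searchsorted() still returns a valid-looking insertion 
index without complaint. The two checks at the end are one possible way to 
spell out the assumptions by hand, not necessarily how the PR does it.

```python
import numpy as np

a = np.array([10, 20, 30])    # sorted
v = np.array([20, 25])        # 25 is not in `a`

idx = np.searchsorted(a, v)   # insertion indices, not match indices
print(idx)                    # [1 2]
print(a[idx])                 # [20 30], so 25 was silently mapped to 30

# The checks that are easy to forget:
is_sorted = bool(np.all(a[:-1] <= a[1:]))   # `a` is in ascending order
all_present = bool(np.isin(v, a).all())     # every value in `v` is in `a`
print(is_sorted, all_present)               # True False
```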

Looking at my own habits and uses, it seems to me that finding the indices of 
matching values of one array in another is a more common use case than finding 
insertion indices of one array into another sorted array. So, I propose that 
np.search(), or something like it, could be even more useful than np.searchsorted().

Thoughts?

Martin

