Re: [Numpy-discussion] Proposal: np.search() to complement np.searchsorted()
16 May 2017, 2:02 a.m.
From: Stephan Hoyer
I like the idea of a strategy keyword argument. strategy='auto' leaves the door open for future improvements, e.g., if we ever add hash tables to numpy.
For the algorithm, I think we actually want to sort the needles array as well in most (all?) cases.
If haystack is also sorted, advancing through both arrays at once brings the cost of the actual search itself down to O(n+k). (Possibly this is worth exposing as np.searchbothsorted or something similar?)
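To make the two-pointer idea concrete, here is a minimal pure-Python sketch, assuming n and k stand for the lengths of haystack and needles. The names search_both_sorted and search_via_sorted_needles are hypothetical and are not existing NumPy functions. The loop is only meant to show the access pattern, so it will not be fast in practice; a real implementation would presumably live in C.

```python
import numpy as np


def search_both_sorted(haystack, needles):
    """Left insertion index in sorted `haystack` for each item of sorted `needles`.

    Hypothetical helper (not part of NumPy). Both inputs must already be
    sorted in ascending order. One forward pass over each array gives
    O(n + k) comparisons.
    """
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    out = np.empty(len(needles), dtype=np.intp)
    i = 0
    n = len(haystack)
    for j, needle in enumerate(needles):
        # The haystack cursor never moves backwards because needles is sorted.
        while i < n and haystack[i] < needle:
            i += 1
        out[j] = i
    return out


def search_via_sorted_needles(haystack, needles):
    """Sort the needles first, search, then restore the original needle order.

    Hypothetical wrapper; `haystack` must be sorted, `needles` need not be.
    """
    needles = np.asarray(needles)
    order = np.argsort(needles, kind="stable")
    out = np.empty(len(needles), dtype=np.intp)
    # Scatter the results back to the positions the needles came from.
    out[order] = search_both_sorted(haystack, needles[order])
    return out
```

For sorted input the result should agree with np.searchsorted(haystack, needles, side='left'); the only difference is that the haystack is walked once instead of being bisected for every needle.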
I actually suggest reducing the scope of the problem to
search_unique(haystack, needles)
which finds the index* in a haystack (of unique values) of each needle
and can make the assumption (e.g. as Stephan Hoyer points out) that if
len(needles)<
Peter Creasey