[Numpy-discussion] Find n closest values

Eelco Hoogendoorn hoogendoorn.eelco at gmail.com
Sun Jun 22 13:06:22 EDT 2014


Also, if you use scipy.spatial.KDTree, make sure to use cKDTree; the native
python kdtree is sure to be slow as hell.


On Sun, Jun 22, 2014 at 7:05 PM, Eelco Hoogendoorn <
hoogendoorn.eelco at gmail.com> wrote:

> Well, if the spacing is truly uniform, then of course you don't really
> need the search, and you can do away with the extra log-n, and there is a
> purely linear solution:
>
> def find_closest_direct(start, end, count, A):
>     Q = (A-start)/(end-start)*count
>     mid = ((Q[1:]+Q[:-1]+1)/2).astype(np.int)
>     boundary = np.zeros(count, np.int)
>     boundary[mid] = 1
>     return np.add.accumulate(boundary)
>
> I expect this to be a bit faster, but nothing dramatic, unless your
> datasets are huge. It isn't really more or less elegant either, id say.
> Note that the output isn't 100% identical; youd need to do a little
> tinkering to figure out the correct/desired rounding behavior.
>
>
> On Sun, Jun 22, 2014 at 5:16 PM, Nicolas P. Rougier <
> Nicolas.Rougier at inria.fr> wrote:
>
>>
>> Thanks for the answer.
>> I was secretly hoping for some kind of hardly-known numpy function that
>> would make things faster auto-magically...
>>
>>
>> Nicolas
>>
>>
>> On 22 Jun 2014, at 10:30, Eelco Hoogendoorn <hoogendoorn.eelco at gmail.com>
>> wrote:
>>
>> > Perhaps you could simplify some statements, but at least the
>> algorithmic complexity is fine, and everything is vectorized, so I doubt
>> you will get huge gains.
>> >
>> > You could take a look at the functions in scipy.spatial, and see how
>> they perform for your problem parameters.
>> >
>> >
>> > On Sun, Jun 22, 2014 at 10:22 AM, Nicolas P. Rougier <
>> Nicolas.Rougier at inria.fr> wrote:
>> >
>> >
>> > Hi,
>> >
>> > I have an array L with regular spaced values between 0 and width.
>> > I have a (sorted) array I with irregular spaced values between 0 and
>> width.
>> >
>> > I would like to find the closest value in I for any value in L.
>> >
>> > Currently, I'm using the following script but I wonder if I missed an
>> obvious (and faster) solution:
>> >
>> >
>> > import numpy as np
>> >
>> > def find_closest(A, target):
>> >     idx = A.searchsorted(target)
>> >     idx = np.clip(idx, 1, len(A) - 1)
>> >     left = A[idx - 1]
>> >     right = A[idx]
>> >     idx -= target - left < right - target
>> >     return idx
>> >
>> > n, width = 256, 100.0
>> >
>> > # 10 random sorted values in [0,width]
>> > I = np.sort(np.random.randint(0,width,10))
>> >
>> > # n regular spaced values in [0,width]
>> > L = np.linspace(0, width, n)
>> >
>> > print I[find_closest(I,L)]
>> >
>> >
>> >
>> > Nicolas
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140622/a953ae86/attachment.html>


More information about the NumPy-Discussion mailing list