[SciPy-User] Finding closest point in array - inverse of KDTree

Wed Oct 11 13:01:31 EDT 2017

Hello all,

I have the same question I posted on stack overflow: https://stackoverflow.com/questions/46693557/finding-closest-point-in-array-inverse-of-kdtree

I have a very large ndarray A, and a sorted list of points k (a small list, about 30 points).

For every element of A, I want to determine the closest element in the list of points k, together with the index. So something like:

    >>> A = np.asarray([3, 4, 5, 6])
    >>> k = np.asarray([4.1, 3])
    >>> values, indices
    [3, 4.1, 4.1, 4.1], [1, 0, 0, 0]

Now, the problem is that A is very very large. So I can't do something inefficient like adding one dimension to A, take the abs difference to k, and then take the minimum of each column.

For now I have been using np.searchsorted, as shown in the second answer here: https://stackoverflow.com/questions/2566412/find-nearest-value-in-numpy-array but even this is too slow.

I thought of using scipy.spatial.KDTree:

    >>> d = scipy.spatial.KDTree(k)
    >>> d.query(A)

This turns out to be much slower than the searchsorted solution.

On the other hand, the array A is always the same, only k changes. So it would be beneficial to use some auxiliary structure (like a "inverse KDTree") on A, and then query the results on the small array k.

Is there something like that?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-user/attachments/20171011/c11967ff/attachment.html>