nearest neighbor in 2D

Mon Nov 3 23:15:34 EST 2003

>>>>> "Bengt" == Bengt Richter <bokr at oz.net> writes:

    Bengt> Are you trying to find closest location to a mouse cursor
    Bengt> as it moves, and then adding a point when there's a click?
    Bengt> I.e., your particular use case might suggest a strategy
    Bengt> that's different from, e.g., what you'd do if each new
    Bengt> point's coordinates where read from file or came from a
    Bengt> generator, and you had exactly one search leading to
    Bengt> exactly one update of the set.  And also what you wanted to
    Bengt> do with the completed set.

I had two use cases just yesterday.  The one that prompted the
question arose in making a contour plot.  I'm defining a contour as an
ordered sequence of values over a 2D MxN matrix where the values
differ from some target value by at most some tolerance.  I maintain a
list of i,j indices into the matrix for a given contour value, and
follow the contour from a given i,j location by examining its
neighbors.  In order to close the loop (eg, when the contour finder
has passed once around a level curve of a mountain, I want to test
whether a given point i,j is close to a previously discovered point
k,l.  Since I have a list of these 2 tuple coordinates, I want to find
the nearest neighbor in the list and terminate the contour when the
nearest neighbor falls within some minimum distance

     3 4 5 
    2     6
13  1      7
12         8
 11   10  9

In the example above, I am traversing a contour identifying points in
the order 1,2,3...; as above each point represents an i,j tuple which
is an index into the matrix I am contouring.  I would like to
terminate the search at 13 rather than spiral around the existing
contour 1-12. Each time I add a new point to the contour, I would like
to query the existing list (excluding the most recently added points
which are close by construction) of locations and find the minimum
distance.  If I'm not too close to the preexisting contour, I add the
new point and proceed.

As I write this I realize there is an important efficiency.  Since
from an existing point I add the closest neighbor, the biggest step I
can make is 1,1.  If on the last nearest neighbor query I find a
minimum distance of d, it will take me d minimum steps to approach the
existing contour.  So I don't need to check the distance again for at
least d steps.  So the algorithm can proceed 1) obtain the distance d
from the existing contour to the most recently obtained point 2) make
d steps adding points that meet the value criteria 3) repeat.

The second use case arose with gdmodule, which can only allocate 256
colors, which I cache as a dict from rgb tuples (eg, 0.0, 0.05, 1.0)
to color.  When the total number of color allocations is made, and a
new rgb request comes into the color manager, I pick the already
allocated point in rgb space closest to the requested point.

I'll try David Eppstein's approach tomorrow and see how this fares.

Thanks to all for suggestions,
John Hunter