[Numpy-discussion] finding close together points.

Anne Archibald peridot.faceted at gmail.com
Fri Nov 13 02:31:47 EST 2009

2009/11/11 Christopher Barker <Chris.Barker at noaa.gov>:
> Anne Archibald wrote:
>> 2009/11/10 Christopher Barker <Chris.Barker at noaa.gov>:
>>> I have a bunch of points in 2-d space, and I need to find out which
>>> pairs of points are within a certain distance of one-another (regular
>>> old Euclidean norm).
>> This is an eminently reasonable thing to want, and KDTree should
>> support it. Unfortunately it doesn't.
> darn.
>> It's slow both because you're using the python implementation rather
>> than the C,
> I noticed that.
>> The one good thing about using the python implementation rather than
>> the Cython one is that you can subclass it, providing a new method.
>> There's still a certain amount of boilerplate code to write, but it
>> shouldn't be too bad.
> Can you give me a hint as to where to start?  I have no idea how kd
> trees work.
>> If this is still too slow, I have no problem incorporating additional
>> code into cKDTree; the only reason none of the ball queries are in
>> there is because ball queries must return variable-size results, so
>> you lose a certain amount of speed because you're forced to manipulate
>> python objects. But if there are relatively few result points, this
>> need not be much of a slowdown.
> And certainly in this case, there would not be very many results, and
> lists are pretty fast.

This is now implemented in SVN. I (tentatively?) used a set to store
the collection of pairs, because my tree traversal is not smart enough
to reliably uniquify the pairs without using sets. With more time and
energy, I'm sure the algorithm could be improved to avoid using sets
(both internally and on return), but I think that's something to save
for the Cython version.


More information about the NumPy-Discussion mailing list