[SciPy-User] efficient computation of point cloud nearest neighbors

denis denis-bz-gg at t-online.de
Mon May 30 13:01:27 EDT 2011


Christian, folks,
  a couple of comments:

FLANN gets a lot of its speed by quitting early, after looking at e.g.
0.1 N or 0.01 N or (FLANN default) 32*leafsize points.
Accuracy *may* decrease (the exact-nearest-neighbor guarantee is gone),
but I've found a big speedup for about the same accuracy,
especially in dimensions above say 20, where "distance whiteout" sets in
(all points start to look about equally far away).
I've added cutoff= to Anne Archibald's nice Cython ckdtree.pyx,
and verbose= to help use it.
I'd like a friendly proofreader for it;
or else post some data and let me try it here.
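
To get a feel for the speed/accuracy trade-off without the patch, here is
a minimal sketch using the eps= argument that stock scipy cKDTree.query
already has. Note this is a different technique from the cutoff= described
above: eps bounds the error of the k-th neighbor distance instead of
stopping after a fixed number of points. The data, sizes and eps value
are made up for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((2000, 20))        # illustrative: 2000 points, 20 dims
tree = cKDTree(points, leafsize=16)

k = 5
# Exact search. Ask for k + 1 because each point's nearest neighbor
# in its own cloud is itself, at distance 0.
d_exact, i_exact = tree.query(points, k=k + 1)

# Approximate search: the k-th returned neighbor may be up to
# (1 + eps) times farther away than the true k-th neighbor.
eps = 0.5
d_approx, i_approx = tree.query(points, k=k + 1, eps=eps)

# Fraction of the true neighbors that the approximate search also found.
recall = np.mean([len(np.intersect1d(a, b)) / (k + 1)
                  for a, b in zip(i_exact, i_approx)])
print("recall with eps=%.1f: %.3f" % (eps, recall))
```

Timing both calls (e.g. with time.perf_counter) on your own data shows
how much speed a given eps buys and what it costs in recall.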

What's your metric?
(Choice of metric is *really* important for clustering; see Hastie p. 506.)
FLANN does Euclidean (L2) only, and returns squared distances;
ANN can be compiled three ways, for L1, L2 or Lmax;
cKDTree does any Lp metric.
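
For cKDTree the metric is chosen per query with the p= argument.
A small sketch, with random data and a query point that are purely
illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
points = rng.random((1000, 3))         # illustrative point cloud
tree = cKDTree(points)
query = np.array([0.5, 0.5, 0.5])

# p=1 is L1 (Manhattan), p=2 is L2 (Euclidean), p=inf is Lmax (Chebyshev).
results = {}
for p in (1, 2, np.inf):
    dist, idx = tree.query(query, k=3, p=p)
    results[p] = (dist, idx)
    print("p =", p, "neighbors", idx, "distances", dist)

# cKDTree returns plain distances; square the L2 ones to compare
# with a library that returns dist^2.
dist_squared = results[2][0] ** 2
```

The same tree serves all three metrics, so it is cheap to check whether
the neighbor sets change much with p on your data.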

cheers
  -- denis

On May 28, 1:13 am, Christian Jauvin <cjau... at gmail.com> wrote:
> Hi,
>
> I need to compute the k nearest neighbors of every point in a point
> cloud of at least a million points.


