Minimum distance between 2 paths in 3D

Hi All, I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them. Any suggestion is welcome :-) Thank you. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/

Hi Andrea,
I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them.
In 2D there would be a few tricks you could use, but in higher dimensions anything smart that you could attempt might cost you more computation time than just comparing the points (unless N is huge). At least make sure to put the "looping" over points into a vectorized form to avoid python for loops. e.g. two curves given by 3xN arrays c and d: from numpy import concatenate, argmin from numpy.linalg import norm distvec = concatenate([c[:,i]-d.T for i in range(N)]) # all N**2 distance vectors ns = [norm(a) for a in distvec] # all N**2 norms of the distance vectors cix, dix = divmod(argmin(ns), N) # find the index of the minimum norm from [0 .. N**2] and decode which point this corresponds to The minimum is between the points c[:,cix] and d[:,dix]. A numpy wonk might be able to squeeze a bit more optimization out of this, but I think this code works OK. Unfortunately, unless you know more mathematical properties of your curves in advance (info about their maximum curvature, for instance) you'll end up needing to check every pair of points. If N is really big, like over 10**4 maybe, it might be worth trying to break the curves up into pieces contained in bounding boxes which you can eliminate from a full search if they don't intersect. -Rob

Hi Rob & All, On Sat, Sep 27, 2008 at 4:05 PM, Rob Clewley wrote:
Hi Andrea,
I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them.
In 2D there would be a few tricks you could use, but in higher dimensions anything smart that you could attempt might cost you more computation time than just comparing the points (unless N is huge). At least make sure to put the "looping" over points into a vectorized form to avoid python for loops. e.g. two curves given by 3xN arrays c and d:
from numpy import concatenate, argmin from numpy.linalg import norm
distvec = concatenate([c[:,i]-d.T for i in range(N)]) # all N**2 distance vectors ns = [norm(a) for a in distvec] # all N**2 norms of the distance vectors cix, dix = divmod(argmin(ns), N) # find the index of the minimum norm from [0 .. N**2] and decode which point this corresponds to
The minimum is between the points c[:,cix] and d[:,dix]. A numpy wonk might be able to squeeze a bit more optimization out of this, but I think this code works OK.
Unfortunately, unless you know more mathematical properties of your curves in advance (info about their maximum curvature, for instance) you'll end up needing to check every pair of points. If N is really big, like over 10**4 maybe, it might be worth trying to break the curves up into pieces contained in bounding boxes which you can eliminate from a full search if they don't intersect.
Thank you very much for your kind suggestions and the code snippet. I think it will do just fine, as my well trajectories may have from 20 to 500 points maximum. I'll try it tomorrow at work and see how it goes. Thank you again. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/

2008/9/27 Andrea Gavana <andrea.gavana@gmail.com>:
I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them.
If you treat the points as simply two clouds of points (that is, ignore the fact that they represent points on paths), spatial data structures can make this kind of question faster. For example a kd-tree can be constructed in something like O(n log n) time, and you can answer questions like "what is the closest point in this set to (a,b,c)?" in something like O(log n) time. So you could do this by putting one set of points in a kd-tree and just running queries against each point in the other. There exist other spatial data structures - octrees, voxel grids, bounding-volume hierarchies, other binary space partitioning trees, as well as various more specialized ones. As the number of dimensions becomes large they become substantially more difficult to work with, and you do need to balance construction time against lookup time, but they are definitely worth thinking about in your problem. Sadly, no such data structure exists in either numpy or scipy, though biopython apparently has an implementation of kd-trees. Good luck, Anne

On Sep 27, 2008, at 10:23 PM, Anne Archibald wrote:
Sadly, no such data structure exists in either numpy or scipy, though biopython apparently has an implementation of kd-trees.
It's not part of scipy but I saw there was a scikit which supported ANN: http://scipy.org/scipy/scikits/wiki/AnnWrapper scikits.ann exposes a single class, kdtree that wraps the Approximate Nearest Neighbor library's kd-tree implementation. kdtree has a single (non-constructor) method, knn that finds the indecies (of the points used to construct the kdtree) of the k-nearest neighbors and the squared distances to those points Does anyone have any experience with it? Though perhaps I should ask on the scipy list instead of numpy .... Andrew dalke@dalkescientific.com

On Sat, Sep 27, 2008 at 15:23, Anne Archibald <peridot.faceted@gmail.com> wrote:
2008/9/27 Andrea Gavana <andrea.gavana@gmail.com>:
I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them.
If you treat the points as simply two clouds of points (that is, ignore the fact that they represent points on paths), spatial data structures can make this kind of question faster. For example a kd-tree can be constructed in something like O(n log n) time, and you can answer questions like "what is the closest point in this set to (a,b,c)?" in something like O(log n) time. So you could do this by putting one set of points in a kd-tree and just running queries against each point in the other. There exist other spatial data structures - octrees, voxel grids, bounding-volume hierarchies, other binary space partitioning trees, as well as various more specialized ones. As the number of dimensions becomes large they become substantially more difficult to work with, and you do need to balance construction time against lookup time, but they are definitely worth thinking about in your problem. Sadly, no such data structure exists in either numpy or scipy, though biopython apparently has an implementation of kd-trees.
Sturla Molden has just contributed a pure Python+numpy implementation on the Scipy Cookbook. http://www.scipy.org/Cookbook/KDTree -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

2008/9/27 Robert Kern <robert.kern@gmail.com>:
On Sat, Sep 27, 2008 at 15:23, Anne Archibald <peridot.faceted@gmail.com> wrote:
2008/9/27 Andrea Gavana <andrea.gavana@gmail.com>:
I was wondering if someone had any suggestions/references/snippets of code on how to find the minimum distance between 2 paths in 3D. Basically, for every path, I have I series of points (x, y, z) and I would like to know if there is a better way, other than comparing point by point the 2 paths, to find the minimum distance between them.
If you treat the points as simply two clouds of points (that is, ignore the fact that they represent points on paths), spatial data structures can make this kind of question faster. For example a kd-tree can be constructed in something like O(n log n) time, and you can answer questions like "what is the closest point in this set to (a,b,c)?" in something like O(log n) time. So you could do this by putting one set of points in a kd-tree and just running queries against each point in the other. There exist other spatial data structures - octrees, voxel grids, bounding-volume hierarchies, other binary space partitioning trees, as well as various more specialized ones. As the number of dimensions becomes large they become substantially more difficult to work with, and you do need to balance construction time against lookup time, but they are definitely worth thinking about in your problem. Sadly, no such data structure exists in either numpy or scipy, though biopython apparently has an implementation of kd-trees.
Sturla Molden has just contributed a pure Python+numpy implementation on the Scipy Cookbook.
I think a kd-tree implementation would be a valuable addition to scipy, perhaps in a submodule scipy.spatial that might eventually contain other spatial data structures and algorithms. What do you think? Should we have one? Should it be based on Sturla Molden's code, if the license permits? I am willing to contribute one, if not. Anne

On Sat, Sep 27, 2008 at 11:18 PM, Anne Archibald <peridot.faceted@gmail.com> wrote:
I think a kd-tree implementation would be a valuable addition to scipy, perhaps in a submodule scipy.spatial that might eventually contain other spatial data structures and algorithms. What do you think? Should we have one? Should it be based on Sturla Molden's code, if the license permits? I am willing to contribute one, if not.
+1 If you're implementing one, I would highly recommend the "left-balanced" kd-tree. http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/2535/pdf/imm2535.pdf Here's one implementation (undetermined license): http://www2.imm.dtu.dk/~jab/software.html#kdtrees I have a C++ implementation based on the one above that does a few more operations (like nearest-n neighbors) that you're welcome to use. The use of n_th() in the STL makes left-balancing fairly straightforward. FWIW I also have a pure python implementation here: http://code.google.com/p/pydec/source/browse/trunk/pydec/math/kd_tree.py -- Nathan Bell wnbell@gmail.com http://graphics.cs.uiuc.edu/~wnbell/
participants (6)
-
Andrea Gavana
-
Andrew Dalke
-
Anne Archibald
-
Nathan Bell
-
Rob Clewley
-
Robert Kern