Hi guys, I want to add a couple routines to the HaloFinder having to do with halo nearest neighbor stuff. In particular, I'd like to add something like: halo_list.nearest_neighbor_3D(haloID,N) where it finds the Nth nearest neighbor to halo 'haloID' in three dimensions. halo_list.nearest_neighbor_2D(haloID,N,[x,y,z]) where it finds the Nth nearest neighbor in two dimensions, projected along vector [x,y,z]. The obvious way to go about this is a kdTree. I know this is something that Matt has put a lot of thought into. I've been playing around with this one <http://sites.google.com/site/mikescoderama/Home/kd-tree-knn> simply because it's in pure Python, not for any other reason. scipy comes with the spatial package, but it appears that some recent changes have caused its install to become tied to the rabbit hole that is the scipy package. Ideally, the scipy spatial would be better because it's surely faster, but scipy is a hassle. Before I put time into this, what thoughts do people have? The python kdTree works, and it can probably be included in yt. A kdTree will probably have uses elsewhere in yt, so if I write routines that use it, I'd like the kdTree I use to be useful elsewhere. Thanks! _______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________
Hi Stephen, If we have to install any of SciPy to get access to spatial, I am in complete agreement about avoiding it. The pure python implementation is fine with me, but keep in mind it won't scale to a huge number of points -- it might only be useful for large spheres, and we won't be able to use it for things like data points or particles unless the number considered is very low. Our long term strategy for such a thing would be to convert the code to optimized Cython and include that. Alternatively, the kD trees in HOP and FOF could be wrapped, which all-in-all would not be such a bad thing. Including this code is also okay with me. It will get shuffled around a bit after the reorganization, but for now tossing it into its own file in yt/lagos/ is fine. I would really appreciate it, however, if you could clean it up so that it's more in compliance with PEP-8. Thanks very much, Stephen, keeping us looking forward with geometric analysis would be great -- and distance measures for halos will be super useful. -Matt On Mon, Jun 15, 2009 at 10:49 PM, Stephen Skory<stephenskory@yahoo.com> wrote:
Hi guys,
I want to add a couple routines to the HaloFinder having to do with halo nearest neighbor stuff. In particular, I'd like to add something like:
halo_list.nearest_neighbor_3D(haloID,N) where it finds the Nth nearest neighbor to halo 'haloID' in three dimensions.
halo_list.nearest_neighbor_2D(haloID,N,[x,y,z]) where it finds the Nth nearest neighbor in two dimensions, projected along vector [x,y,z].
The obvious way to go about this is a kdTree. I know this is something that Matt has put a lot of thought into. I've been playing around with this one <http://sites.google.com/site/mikescoderama/Home/kd-tree-knn> simply because it's in pure Python, not for any other reason. scipy comes with the spatial package, but it appears that some recent changes have caused its install to become tied to the rabbit hole that is the scipy package. Ideally, the scipy spatial would be better because it's surely faster, but scipy is a hassle.
Before I put time into this, what thoughts do people have? The python kdTree works, and it can probably be included in yt. A kdTree will probably have uses elsewhere in yt, so if I write routines that use it, I'd like the kdTree I use to be useful elsewhere.
Thanks!
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________
_______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Hi,
The pure python implementation is fine with me, but keep in mind it won't scale to a huge number of points
I ran some tests and of course you are correct, the Python implementation is vastly slower than the C code in FOF/HOP. I can compile it with Cython, which speeds things up by about 30%, without doing any actual optimization by hand. I'm not sure how much optimization can be done, this kd tree code passes a lot of python objects around.
Alternatively, the kD trees in HOP and FOF could be wrapped, which all-in-all would not be such a bad thing.
The only drawback to this is that they are (as coded) explicitly 3D. The python code is not. Do we think that higher dimension kd tree stuff will ever be needed? Another bonus to the python is the kd tree can be kept in memory between searches, while it would be more difficult to do that with a C module. Granted, building the kd tree typically takes less time than the nearest neighbor searches, but it would save some time. Thanks! _______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________
I ran some tests and of course you are correct, the Python implementation is vastly slower than the C code in FOF/HOP. I can compile it with Cython, which speeds things up by about 30%, without doing any actual optimization by hand. I'm not sure how much optimization can be done, this kd tree code passes a lot of python objects around.
Okay. I wouldn't worry about it as long as it works for your purposes (halo searches.)
The only drawback to this is that they are (as coded) explicitly 3D. The python code is not. Do we think that higher dimension kd tree stuff will ever be needed? Another bonus to the python is the kd tree can be kept in memory between searches, while it would be more difficult to do that with a C module. Granted, building the kd tree typically takes less time than the nearest neighbor searches, but it would save some time.
I don't think we'll need >3, and if we do we'll deal with that as it arises. I don't mind writing the wrapper code for the C code to present the full tree as a single object. I don't think I'm convinced that exposing each node would be necessary. For now, putting in the python version and then in the future writing the wrapper code for the hop kD-tree seems to be to be the best way forward. Thanks, Stephen! -Matt
participants (2)
-
Matthew Turk
-
Stephen Skory