Hi all,
in the spirit started by Britton of announcing active development, some of you may not know that I have been working on a highly parallel halo finder based on the HOP method.
The current method of parallelizing HOP that is in 1.5 and svn-trunk essentially runs HOP serially on each subregion and glues the results back together at the very end. It works wonderfully when it works. Its biggest constraint is that every halo must exist fully inside at least one of the subregions. For high-particle-count, small-cosmology simulations, one can end up with too many particles in one region, and this method breaks down.
The new halo finder, called 'chainHOP' (for now) for lack of a better term, takes a different approach to parallelizing halo finding. Without getting into the heavy details, chainHOP parallelizes and glues haloes together based on the membership of particles in the chains that make up the haloes. A single halo may exist in several subregions.
Unfortunately, the results are not identical to good old HOP, and never will be. A primary reason is that I'm using a different kd tree from HOP, and the kd tree HOP uses gives wrong answers (it calculates the distances between particles incorrectly, starting at something like the 6th decimal place, but it's been a while since I compared the two). However, the haloes found are very similar in what counts: size, center of mass and number of haloes.
The upshot of all of this is that I can, for example, find the haloes in the z=0 L7 (512**3 particles) dataset in about 16 minutes on 128 cores on Triton. I have done the same for a 1024**3 unigrid dataset on Triton (large nodes) on 512 cores in about 6 hours, and since then I have made a couple of improvements, so it should be more like 5 hours now. Much of that increase in time comes from a 100x increase in the number of 'chains' that need to be merged.
My work is not ready for prime-time yet, but I thought you might be interested to know!
Stephen Skory
Graduate Student
sskory@physics.ucsd.edu
http://physics.ucsd.edu/~sskory/
Hi Stephen,
This is great work! Congratulations on those benchmarks, they're really something to be proud of. I'm sure I speak for everyone when I say we look forward to seeing more from you along these lines. It's this kind of work that'll keep making us better and better.
I personally side with you on the issue of breaking backwards compatibility with original HOP -- it's better to be faster and more accurate than to preserve compatibility. The issue of defining a halo is sticky at best anyway, so I think we all welcome your new approach with open arms.
-Matt
On Mon, Aug 24, 2009 at 3:42 PM, Stephen Skory <stephenskory@yahoo.com> wrote:
Hi all,
in the spirit started by Britton of announcing active development, some of you may not know that I have been working on a highly parallel halo finder based on the HOP method.
The current method of parallelizing HOP that is in 1.5 and svn-trunk essentially runs HOP serially on each subregion and glues the results back together at the very end. It works wonderfully when it works. Its biggest constraint is that every halo must exist fully inside at least one of the subregions. For high-particle-count, small-cosmology simulations, one can end up with too many particles in one region, and this method breaks down.
The new halo finder, called 'chainHOP' (for now) for lack of a better term, takes a different approach to parallelizing halo finding. Without getting into the heavy details, chainHOP parallelizes and glues haloes together based on the membership of particles in the chains that make up the haloes. A single halo may exist in several subregions.
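For readers following the thread, here is a minimal sketch of what "gluing by chain membership" could look like. It is not the chainHOP code. It assumes, purely for illustration, that chains from different subregions are linked whenever they share a particle ID (for example, a particle in an overlapping boundary layer), and it ignores the density criteria HOP uses when deciding which chains belong to one halo. The `(subregion, chain_id)` keys and the function names are hypothetical.

```python
# Illustrative only: NOT the chainHOP implementation.
# Assumption: two chains belong together if they share at least one particle ID.

def find(parent, x):
    """Return the representative of x's group, compressing the path as we go."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def merge_chains(chains):
    """Group chains that share particles.

    chains maps a hypothetical (subregion, chain_id) key to a set of particle IDs.
    Returns a sorted list of groups, each a sorted list of chain keys.
    """
    parent = {key: key for key in chains}
    owner = {}  # particle ID -> first chain seen that contains it
    for key in sorted(chains):
        for pid in chains[key]:
            if pid in owner:
                a, b = find(parent, owner[pid]), find(parent, key)
                if a != b:
                    parent[b] = a  # union the two groups
            else:
                owner[pid] = key
    groups = {}
    for key in chains:
        groups.setdefault(find(parent, key), []).append(key)
    return sorted(sorted(group) for group in groups.values())


# Made-up toy input: three chains linked through particles 3 and 5, one isolated.
chains = {
    (0, 0): {1, 2, 3},
    (1, 0): {3, 4, 5},
    (1, 1): {8, 9},
    (2, 0): {5, 6},
}
merged = merge_chains(chains)
print(merged)
```

In this toy input the first group spans subregions 0, 1 and 2, which is the situation the serial-per-subregion approach cannot handle.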
Unfortunately, the results are not identical to good old HOP, and never will be. A primary reason is that I'm using a different kd tree from HOP, and the kd tree HOP uses gives wrong answers (it calculates the distances between particles incorrectly, starting at something like the 6th decimal place, but it's been a while since I compared the two). However, the haloes found are very similar in what counts: size, center of mass and number of haloes.
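A comparison like the one described can be sketched as follows: compute nearest-neighbour distances by brute force in double precision as a reference, then report the first decimal place at which another implementation's distance disagrees. This is only a rough sketch; the function names are hypothetical, the sample values are made up for illustration, and nothing here is taken from either kd tree.

```python
import math


def brute_force_nn_distances(points):
    """Nearest-neighbour distance for each point, O(N^2), as a reference answer."""
    result = []
    for i, p in enumerate(points):
        best = math.inf
        for j, q in enumerate(points):
            if i != j:
                d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
                best = min(best, d)
        result.append(best)
    return result


def first_differing_decimal(a, b, places=12):
    """First decimal place (1-based) where a and b disagree.

    Returns 0 if the integer parts differ and None if they agree to `places`.
    This is a crude digit-by-digit check, so values straddling a digit
    boundary (0.1999999 vs 0.2000000) report an early place.
    """
    int_a, frac_a = f"{a:.{places}f}".split(".")
    int_b, frac_b = f"{b:.{places}f}".split(".")
    if int_a != int_b:
        return 0
    for place, (ca, cb) in enumerate(zip(frac_a, frac_b), start=1):
        if ca != cb:
            return place
    return None


reference = brute_force_nn_distances([(0, 0, 0), (3, 4, 0), (3, 4, 12)])
print(reference)

# Made-up pair of values that agree to five decimal places.
place = first_differing_decimal(0.123456789, 0.123457012)
print(place)
```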
The upshot of all of this is that I can, for example, find the haloes in the z=0 L7 (512**3 particles) dataset in about 16 minutes on 128 cores on Triton. I have done the same for a 1024**3 unigrid dataset on Triton (large nodes) on 512 cores in about 6 hours, and since then I have made a couple of improvements, so it should be more like 5 hours now. Much of that increase in time comes from a 100x increase in the number of 'chains' that need to be merged.
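To put those timings on a common footing, the arithmetic below converts them to core-hours using only the numbers quoted in the paragraph above. The two runs used different node types and different kinds of dataset, so this is a rough comparison, not a scaling measurement.

```python
# Figures quoted in the message; the 5-hour value is the author's own estimate.
small_core_hours = 128 * 16 / 60   # 512**3 particles, 128 cores, ~16 minutes
large_core_hours_6h = 512 * 6.0    # 1024**3 particles, 512 cores, ~6 hours
large_core_hours_5h = 512 * 5.0    # same run with the estimated ~5 hours

particle_ratio = 1024**3 / 512**3  # how many times more particles

cost_ratio_6h = large_core_hours_6h / small_core_hours
cost_ratio_5h = large_core_hours_5h / small_core_hours

print(f"512**3 run:  {small_core_hours:.1f} core-hours")
print(f"1024**3 run: {large_core_hours_6h:.0f} core-hours (6 h), "
      f"{large_core_hours_5h:.0f} core-hours (5 h)")
print(f"particles: {particle_ratio:.0f}x, "
      f"core-hours: {cost_ratio_6h:.0f}x (6 h) or {cost_ratio_5h:.0f}x (5 h)")
print(f"cost per particle: {cost_ratio_6h / particle_ratio:.2f}x (6 h) "
      f"or {cost_ratio_5h / particle_ratio:.2f}x (5 h)")
```

So 8x the particles cost roughly 75x to 90x the core-hours, which is consistent with the remark that most of the extra time goes into merging the much larger number of chains instead of scaling with particle count alone.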
My work is not ready for prime-time yet, but I thought you might be interested to know!
participants (2): Matthew Turk, Stephen Skory