Hi Brian & Eric,
As you know (since we discussed it off-list), I'm the reason for this being mentioned to you. I had some pretty horrible problems with the various incarnations of HOP in yt being excruciatingly slow and consuming huge amounts of memory for a 1024^3 unigrid dataset, to the point where my grad student and I ended up just using P-GroupFinder, the standalone halo finder that comes with week-of-code enzo. Note that when I say "excruciatingly slow" and "consuming huge amounts of memory", I mean that when we used 256 nodes on Ranger, with 2 cores/node (so 512 cores total) for the 1024^3 dataset, it still ran Ranger out of memory, or, alternately, didn't finish in 24 hours.
A few notes in response:
- Recently I ran a 2048^3 dataset on 264 cores. It took about 2 hours and averaged about 8.5 GB per task, with a peak of 10 GB on the largest task. Your job is 1/8 the size and should have run, and I don't know why it didn't.
- If I weren't trying to graduate, I would have had more time to assist when your student (Brian) asked me for help. I'm sorry so much of your time was wasted.
- As a public tool, my code is no good unless other people can use it too. Clearly I need to do some work on that.
- It *does* use much more memory than it needs to, you are right. I know where the problems are, and whoo-boy they are there, but they are not easy to fix.
- Speed could be better, but some of this has to do with how HOP itself works. For example, it needs to run the kD tree twice, unlike FOF, which needs to run it only once. The final group-building step is a "global" operation, so that's slow as well. On 128^3 particles, (normal) HOP takes about 75 seconds and FOF about 25. The C HOP and FOF in yt both use the same kD tree and the same data I/O methods, so that ratio is a fair measure of HOP's increased workload.
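
To put rough numbers on the "1/8 the size" point, here is a back-of-envelope sketch in Python. It assumes, purely for illustration, that total memory scales linearly with particle count and spreads evenly over tasks, which real runs will not do exactly. The function name is made up for this example and is not part of yt.

```python
# Back-of-envelope scaling from the 2048^3 run to the 1024^3 run.
# ASSUMPTION (illustrative only): total memory scales linearly with
# particle count and is divided evenly among tasks.

def scaled_gb_per_task(ref_particles, ref_tasks, ref_gb_per_task,
                       new_particles, new_tasks):
    """Estimate GB per task for a new run from a reference run."""
    ref_total_gb = ref_tasks * ref_gb_per_task
    new_total_gb = ref_total_gb * (new_particles / ref_particles)
    return new_total_gb / new_tasks

ref_particles = 2048 ** 3   # the run on 264 cores
new_particles = 1024 ** 3   # the run on 512 cores

# Ratio of dataset sizes: (1024/2048)^3 = 1/8
size_ratio = new_particles / ref_particles

# Scale the reported average (8.5 GB) and peak (10 GB) per-task figures.
avg_gb = scaled_gb_per_task(ref_particles, 264, 8.5, new_particles, 512)
peak_gb = scaled_gb_per_task(ref_particles, 264, 10.0, new_particles, 512)

# HOP vs. FOF wall-clock ratio on 128^3 particles (75 s vs. 25 s).
hop_to_fof = 75 / 25

print(f"size ratio:        {size_ratio}")
print(f"avg GB per task:   {avg_gb:.2f}")
print(f"peak-based GB/task:{peak_gb:.2f}")
print(f"HOP/FOF time:      {hop_to_fof:.1f}x")
```

Under those assumptions the 1024^3 job comes out well under 1 GB per task on 512 cores, which is why the out-of-memory failure is surprising. The peak-based figure treats every task as if it hit the 10 GB peak, so it is a loose upper bound and not a prediction.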
_______________________________________________________
firstname.lastname@example.org           o__  Stephen Skory
http://physics.ucsd.edu/~sskory/        _.>/ _Graduate Student
________________________________(_)_\(_)_______________