Hi guys,

I brought this up a while ago and I'm not sure where we left it. I thought I needed it earlier, but it turned out I didn't, so I forgot about it. Now, however, I definitely need to get virial masses for L7 haloes, and a parallel halo profiler would make this a much more reasonable task.

I think I can probably figure out a way to do it myself (which would be a good exercise for me), but I don't want to step on any toes or do it in a bad or inextensible way. With that in mind, do any of you have suggestions?

I think the best approach is to parallelize similarly to HOP, using sub-volumes to keep memory usage low on each processor. Each process would then run only on the haloes in its sub-volume. This would also be convenient when parallel HOP is run before the profiler: the data wouldn't have to be read multiple times.

Thanks!

_______________________________________________________
sskory@physics.ucsd.edu           o__   Stephen Skory
http://physics.ucsd.edu/~sskory/  _.>/ _Graduate Student
________________________________(_)_\(_)_______________
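The sub-volume idea above could be sketched roughly like this. This is only an illustration of the bookkeeping, not yt code; the function names and the unit-cube domain are assumptions made for the example:

```python
# Hypothetical sketch: assign each halo to the processor whose sub-volume
# contains its center, on an n_per_axis^3 grid covering the [0, 1)^3 domain.
# Each rank would then profile only the haloes in its own sub-volume.

def assign_to_subvolume(center, n_per_axis):
    """Map a halo center in [0, 1)^3 to a flat processor index."""
    ix = min(int(center[0] * n_per_axis), n_per_axis - 1)
    iy = min(int(center[1] * n_per_axis), n_per_axis - 1)
    iz = min(int(center[2] * n_per_axis), n_per_axis - 1)
    return ix + n_per_axis * (iy + n_per_axis * iz)

def partition_halos(halo_centers, n_per_axis):
    """Group halo ids by owning processor index."""
    owned = {}
    for hid, center in enumerate(halo_centers):
        proc = assign_to_subvolume(center, n_per_axis)
        owned.setdefault(proc, []).append(hid)
    return owned
```

The appeal, as noted above, is memory locality: each processor only ever touches data in its own sub-volume.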
Hi Stephen,
I just chatted with Britton on the phone about this, and we came to a
couple conclusions.
* Right now, the halo profiler is about 10% away from running in
parallel on *each* halo; i.e., each operation on the halos can be
parallelized with a few more lines of work.
* This method of parallelization is a bad way to go.
* Volume decomposition is also not likely to be terribly useful, with
the current means of IO. At some point in the future this could and
will change, but for now a round-robin approach is the way to go.
* To round robin this, two things need to happen --
* When projecting, we need to be able to set a flag to disable
parallelization, à la '_processing'. I'll handle this; it should
be 1-2 lines of code.
* The ParallelAnalysisInterface needs to be generalized to provide
iterators over any object, not just _grids. Only a couple lines of
code, but annoying.
At that point, HaloProfiler can subclass ParallelAnalysisInterface and
use that as the iterator over halos. The IO needs to be looked at
briefly, but it should be mostly okay as is.
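The round-robin generalization described above could look roughly like the following. This is a sketch, not the actual ParallelAnalysisInterface code; the class name, the explicit rank/size arguments, and the simulate helper are all illustrative assumptions:

```python
# Hedged sketch of a round-robin object iterator: rather than decomposing
# the volume, each rank simply visits every size-th object in the list.

class RoundRobinIterator:
    def __init__(self, objs, rank, size):
        self.objs = objs   # any objects, e.g. halos, not just grids
        self.rank = rank   # this processor's index
        self.size = size   # total number of processors

    def __iter__(self):
        # Rank r visits objs[r], objs[r + size], objs[r + 2*size], ...
        for obj in self.objs[self.rank::self.size]:
            yield obj

def simulate(objs, size):
    """Show which objects each of `size` ranks would process."""
    return [list(RoundRobinIterator(objs, r, size)) for r in range(size)]
```

Striding by rank like this keeps the iterator trivial and needs no knowledge of where the objects live in the volume, which is why it sidesteps the IO concerns that make volume decomposition unattractive for now.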
I think this can be done in a couple hours.
However, I think the real issue is ensuring that the serial halo
profiler doesn't break. To that end, we should plan to apply
Britton's sample parameter files to the RD0005-mine dataset and run
that both in serial and in parallel as our testing mechanism.
My inclination is that we start working on this in isolation; I'll
commit some changes I mention above tomorrow to the bitbucket yt repo,
which I am also mirroring on hg.enzotools.org.
-Matt
On Fri, Mar 13, 2009 at 2:05 PM, Stephen Skory
Matt,
* Volume decomposition is also not likely to be terribly useful, with the current means of IO. At some point in the future this could and will change, but for now a round-robin approach is the way to go.
Out of general curiosity, what aspect of the IO would have to change?
My inclination is that we start working on this in isolation; I'll commit some changes I mention above tomorrow to the bitbucket yt repo, which I am also mirroring on hg.enzotools.org.
Thank you very much, Matt! While I was certainly not requesting you do my work for me, I had the feeling that what I wanted to do would be related to (or impinge upon) overarching plans you already had for yt. Let me know what I can do for you.
Out of general curiosity, what aspect of the IO would have to change?
Well, there are a couple of problems. The one I'll present as most problematic is the copying of arrays. We construct spheres to analyze the halos, which then load data on demand. If you either ran these as field cuts (in-line extracted regions) from the tiles, or enabled preloading of data and disabled the 'pop' method in the DataQueue object, you could get around this. It's something to think about for later, but I'm not yet sure it's an issue right now. I think we should implement the round robin first, then run cProfile on it to see whether IO is the limiting factor.
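Checking whether IO dominates with cProfile, as suggested above, might look like this. The functions here are dummy stand-ins for the halo profiler run and its reads, used only to show the profiling mechanics:

```python
# Run the workload under cProfile and pull out the cumulative-time report
# for the IO-like calls. read_data/do_profiling are illustrative stand-ins.
import cProfile
import io
import pstats

def read_data():
    return sum(range(1000))  # stand-in for a disk read

def do_profiling():
    return [read_data() for _ in range(10)]  # stand-in for profiling halos

profiler = cProfile.Profile()
profiler.enable()
do_profiling()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats("read_data")  # restrict the report to the IO stand-in
report = stream.getvalue()
```

If the read calls sit at the top of the cumulative-time listing for a real run, IO is the limiting factor and the copying/preloading ideas above become worth pursuing.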
Thank you very much, Matt! While I was certainly not requesting you do my work for me, I had the feeling that what I wanted to do would be related to (or impinge upon) overarching plans you already had for yt. Let me know what I can do for you.
Will do. This is in support of a ticket I opened maybe a month or two ago, 192, which is really two-in-one. -Matt
I've made my first pass at parallelizing the halo profiler. The
writing out stage still probably won't work, as all the procs will
still try to write out at the same time. You can see what I did in
these two changesets:
http://bitbucket.org/MatthewTurk/yt/changeset/556490d86872/
http://bitbucket.org/MatthewTurk/yt/changeset/91933c49f17f/
(Note that I pushed a bunch more changesets where I fixed bugs I
missed... oops! The main stuff is in these two.)
I changed the GridIterator to be a general object iterator, and the
halo profiler now runs with this. The halo profiler won't work just
yet; what needs to be done is to change hopHalos into a dict keyed by
the halo ids, and then it'll be a lot closer. Maybe Britton could
take a look at where it is and suggest whether it's almost ready? I
think it was nearly there already. Specifically, I suspect we'll need
multiple round-robins over the halos with the current workflow in the
HaloProfiler, so we'll have to use arguments to initialize_parallel
and finalize_parallel to govern what gets done. (We probably only
need a finalize_parallel.)
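The hopHalos change mentioned above amounts to re-keying a list of halo records by id. A minimal sketch, with entirely made-up record fields since the actual hopHalos layout isn't shown here:

```python
# Hypothetical sketch: turn a list of (id, center, mass) halo records into
# a dict keyed by halo id, so each rank can look up just the halos it owns.

def halos_to_dict(halo_list):
    """Re-key halo records by their halo id."""
    return {hid: {"center": center, "mass": mass}
            for hid, center, mass in halo_list}
```

With a dict keyed by id, a round-robin over halo ids can fetch each halo's data directly instead of scanning a list, which is what makes the iterator approach convenient.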
-Matt
participants (2)
- Matthew Turk
- Stephen Skory