
Looks mostly okay, but I think we need to rethink the initial mass calculation before it gets committed.
http://codereview.appspot.com/59041/diff/1/2 File HaloFinding.py (right):
http://codereview.appspot.com/59041/diff/1/2#newcode197 Line 197: Not sure we need this?
http://codereview.appspot.com/59041/diff/1/2#newcode433 Line 433: padded, LE, RE, self._data_source = self._partition_hierarchy_3d(padding=self.padding) We should get rid of this initial step of partitioning the entire hierarchy, and instead move to using a DerivedQuantity. I'd say something like
all_data = self.pf.h.all_data() all_data.quantities["TotalQuantity"]("ParticleMassMsun", lazy_reader=True)
which will automatically parallelize. This would get rid of most of the following lines, and we could avoid the mpi_allsum, too. I'll need to add in the preloading to the DQ object, however.
http://codereview.appspot.com/59041/diff/1/2#newcode450 Line 450: self._data_source.hierarchy.queue) Should be fine.

http://codereview.appspot.com/59041/diff/1/2#newcode197 Line 197: Not sure we need this?
Yup. I commented it out and it was ok on a test run.
I'd say something like
all_data = self.pf.h.all_data() all_data.quantities["TotalQuantity"]("ParticleMassMsun", lazy_reader=True)
which will automatically parallelize. This would get rid of most of the following lines, and we could avoid the mpi_allsum, too. I'll need to add in the preloading to the DQ object, however.
Let me know when you do this to DerivedQuantities, and I'll modify HaloFinding to reflect this idea above.
Awesome! Thanks!
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

We should get rid of this initial step of partitioning the entire hierarchy, and instead move to using a DerivedQuantity.
Also, is there a DerivedQuantity for what FOF needs, the total number of particles? I'm looking at the file and I don't think that capability is in there...
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

We can add it no problem... I'll take a whack today.
On Tue, May 5, 2009 at 9:53 AM, Stephen Skory stephenskory@gmail.com wrote:
We should get rid of this initial step of partitioning the entire hierarchy, and instead move to using a DerivedQuantity.
Also, is there a DerivedQuantity for what FOF needs, the total number of particles? I'm looking at the file and I don't think that capability is in there...
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Actually.
I don't know why I said that. We don't need to. The hierarchy has a gridNumberOfParticles array, so calling sum on that should work just fine -- and I believe the mixin that is applied to the AMR3DData objects does as well, so we can just call a sum on that array and it'll be fine.
On Tue, May 5, 2009 at 9:54 AM, Matthew Turk matthewturk@gmail.com wrote:
We can add it no problem... I'll take a whack today.
On Tue, May 5, 2009 at 9:53 AM, Stephen Skory stephenskory@gmail.com wrote:
We should get rid of this initial step of partitioning the entire hierarchy, and instead move to using a DerivedQuantity.
Also, is there a DerivedQuantity for what FOF needs, the total number of particles? I'm looking at the file and I don't think that capability is in there...
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

I don't know why I said that. We don't need to. The hierarchy has a gridNumberOfParticles array, so calling sum on that should work just fine -- and I believe the mixin that is applied to the AMR3DData objects does as well, so we can just call a sum on that array and it'll be fine.
Will gridNumberOfParticles differentiate between kinds of particles? Stars/DM/etc...?
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

Hi Matt,
I'm trying to use alldata.quantities to find the total number of particles for FOF, as we've discussed before. I can't get this to work:
alldata = self.pf.h.all_data() n_parts = alldata.quantities["TotalQuantity"](["gridNumberOfParticles"], lazy_reader=True, preload=True)[0]
Am I doing something wrong? Thanks!
Traceback (most recent call last): File "hop.py", line 7, in <module> hop_list_2 = FOFHaloFinder(pf,padding=0.05) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/HaloFinding.py", line 458, in __init__ preload=True)[0] File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/DerivedQuantities.py", line 72, in __call__ return self._call_func_lazy(args, kwargs) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/DerivedQuantities.py", line 79, in _call_func_lazy rv = self.func(GridChildMaskWrapper(g, self._data_source), *args, **kwargs) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/DerivedQuantities.py", line 424, in _TotalQuantity if data[field].size < 1: File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/DerivedQuantities.py", line 43, in __getitem__ return self.data_source._get_data_from_grid(self.grid, item) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/BaseDataTypes.py", line 39, in save_state tr = func(self, grid, field) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/BaseDataTypes.py", line 1290, in _get_data_from_grid if grid[field].size == 1: # dx, dy, dz, cellvolume File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/BaseDataTypes.py", line 197, in __getitem__ self.get_data(key) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/BaseGridType.py", line 96, in get_data self._generate_field(field) File "/nics/c/home/sskory/yt-cnl/lib/python2.6/site-packages/yt-1.5dev-py2.6-linux-x86_64.egg/yt/lagos/BaseGridType.py", line 66, in _generate_field raise exceptions.KeyError, field KeyError: 'gridNumberOfParticles'
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

alldata = self.pf.h.all_data() n_parts = alldata.quantities["TotalQuantity"](["gridNumberOfParticles"], lazy_reader=True, preload=True)[0]
Am I doing something wrong? Thanks!
Yes. TotalQuantity works for field quantities, which gridNumberOfParticles is not.
n_parts = self.pf.h.gridNumberOfParticles.sum()
-Matt

Matt,
n_parts = self.pf.h.gridNumberOfParticles.sum()
Perfect, thanks.
I moved a few of the preload()s in HaloFinding around since the patch I last sent and I ran tests on both FOF and HOP. I have it such that there are no calls to HDF5LightReader.ReadData() for parallel runs of either.
I've also been testing .write_particle_lists(), which I've got working using preloading. The only problem with that right now is in a parallel run, it makes multiple h5 files, as it should, but there's no output file listing which h5 file has what halo. I'm thinking of either outputting a simple text file, or outputting a modified HopAnalysis.out that lists the h5 file each halo is in using an additional column. Which is preferable?
Thanks! _______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

I'm thinking of either outputting a simple text file, or outputting a modified HopAnalysis.out that lists the h5 file each halo is in using an additional column. Which is preferable?
The more I think about it, adding a column to HopAnalysis.out is preferable. So unless someone speaks up, I'll implement that.
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

I'd prefer to keep the HopAnalysis.out file as static as possible and instead output a mapping file for the halos during writing of particle lists. Not everyone will write out the particle lists. (In fact, I suspect it may be a minority of people.)
Britton, you have done a lot with HopAnalysis.out parsing; how do you feel?
-Matt
On Sat, May 9, 2009 at 11:55 AM, Stephen Skory stephenskory@yahoo.com wrote:
I'm thinking of either outputting a simple text file, or outputting a modified HopAnalysis.out that lists the h5 file each halo is in using an additional column. Which is preferable?
The more I think about it, adding a column to HopAnalysis.out is preferable. So unless someone speaks up, I'll implement that.
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

The HaloProfiler is wired to read in the HopAnalysis.out file in its current form. The name isn't the most important thing, but I would still be in favor of leaving that alone and using new file names for new systems and functionality.
Britton
On Sun, May 10, 2009 at 11:04 PM, Matthew Turk matthewturk@gmail.comwrote:
I'd prefer to keep the HopAnalysis.out file as static as possible and instead output a mapping file for the halos during writing of particle lists. Not everyone will write out the particle lists. (In fact, I suspect it may be a minority of people.)
Britton, you have done a lot with HopAnalysis.out parsing; how do you feel?
-Matt
On Sat, May 9, 2009 at 11:55 AM, Stephen Skory stephenskory@yahoo.com wrote:
I'm thinking of either outputting a simple text file, or outputting a modified HopAnalysis.out that lists the h5 file each halo
is in
using an additional column. Which is preferable?
The more I think about it, adding a column to HopAnalysis.out is
preferable. So unless someone speaks up, I'll implement that.
sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ http://physics.ucsd.edu/%7Esskory/_.>/ _Graduate Student ________________________________(_)_(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Hi Britton & Matt,
The HaloProfiler is wired to read in the HopAnalysis.out file in its current form. The name isn't the most important thing, but I would still be in favor of leaving that alone and using new file names for new systems and functionality.
The way I see it is it's one fewer file that needs to be kept track of, and it's relevant information to add to each halo line. The way I have it now is it adds a column to the end of each line with the h5 file name if and only if .write_out is called by .write_particle_lists. At the top of the file, it now looks like:
# HALOS FOUND WITH (HOP/FOF). h5 FILENAME COLUMN: (True/False)
A file created by calling .write_out, like is done now, is unchanged from before except for the line above. Because the h5 filename is the last column, reading in the halo values using .split() means that HaloProfiler will work just fine with a file that has the h5 filenames anyway. The haloes file created by .write_particle_lists will have the same prefix name as the h5 files, making it easy to relate the files together.
I really think this is the more elegant solution. I can change it back and add a separate file if both of you still feel that that's the better solution. Of course I haven't comitted any of this, and I'm in no hurry to do so.
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

Hi Stephen,
I think adding a new field to be parsed, as well as a new column, is worse than just adding a new column. However, again I reiterate my opposition -- the people who will be outputting the final field will be in the vast minority, and I think really we should just output a map file *as needed*.
When YT moves to HDF1.8, we can use the symlink capabilities of HDF5 1.8 to avoid the mapping file. But I think a single map file, catering to the specific problem of outputting multiple particle files, is the way to go. I think we are verging on bike shedding at this point.
-Matt
On Mon, May 11, 2009 at 7:42 AM, Stephen Skory stephenskory@yahoo.com wrote:
Hi Britton & Matt,
The HaloProfiler is wired to read in the HopAnalysis.out file in its current form. The name isn't the most important thing, but I would still be in favor of leaving that alone and using new file names for new systems and functionality.
The way I see it is it's one fewer file that needs to be kept track of, and it's relevant information to add to each halo line. The way I have it now is it adds a column to the end of each line with the h5 file name if and only if .write_out is called by .write_particle_lists. At the top of the file, it now looks like:
# HALOS FOUND WITH (HOP/FOF). h5 FILENAME COLUMN: (True/False)
A file created by calling .write_out, like is done now, is unchanged from before except for the line above. Because the h5 filename is the last column, reading in the halo values using .split() means that HaloProfiler will work just fine with a file that has the h5 filenames anyway. The haloes file created by .write_particle_lists will have the same prefix name as the h5 files, making it easy to relate the files together.
I really think this is the more elegant solution. I can change it back and add a separate file if both of you still feel that that's the better solution. Of course I haven't comitted any of this, and I'm in no hurry to do so.
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_(_)_______________
Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Matt,
I think we are verging on bike shedding at this point.
I've never heard that term before. I understand how you're using it in context, but it's kind of strange.
I'll concede on the .write_out modifications.
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

Okay, the preload=True option to DerivedQuantities should work. I've posted an example script here:
http://hg.enzotools.org/test_scripts/file/7e6f48c3afe8/test_preload_dq.py
-Matt
On Tue, May 5, 2009 at 9:54 AM, Matthew Turk matthewturk@gmail.com wrote:
We can add it no problem... I'll take a whack today.
On Tue, May 5, 2009 at 9:53 AM, Stephen Skory stephenskory@gmail.com wrote:
We should get rid of this initial step of partitioning the entire hierarchy, and instead move to using a DerivedQuantity.
Also, is there a DerivedQuantity for what FOF needs, the total number of particles? I'm looking at the file and I don't think that capability is in there...
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Hi Matt,
I replaced in the HOP the first partition_heirarchy_3d stuff with this:
alldata = self.pf.h.all_data() total_mass = alldata.quantities["TotalQuantity"](["ParticleMassMsun"], lazy_reader=True, preload=True)[0]
(http://paste.enzotools.org/show/118/)
but I'm getting this error in a parallel run:
http://paste.enzotools.org/show/117/
It works running in serial from the command line. Any ideas? Is there something I forgot?
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________

Fixed in r1292.
On Tue, May 5, 2009 at 12:39 PM, Stephen Skory stephenskory@yahoo.com wrote:
Hi Matt,
I replaced in the HOP the first partition_heirarchy_3d stuff with this:
alldata = self.pf.h.all_data() total_mass = alldata.quantities["TotalQuantity"](["ParticleMassMsun"], lazy_reader=True, preload=True)[0]
(http://paste.enzotools.org/show/118/)
but I'm getting this error in a parallel run:
http://paste.enzotools.org/show/117/
It works running in serial from the command line. Any ideas? Is there something I forgot?
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_(_)_______________ _______________________________________________ Yt-dev mailing list Yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Fixed in r1292.
That appears to have fixed it. Thanks!
_______________________________________________________ sskory@physics.ucsd.edu o__ Stephen Skory http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student ________________________________(_)_\(_)_______________
participants (5)
-
Britton Smith
-
Matthew Turk
-
matthewturk@gmail.com
-
Stephen Skory
-
Stephen Skory