sorted particle indices when loading halos from disk
Recently I've been working with the HOP halo finder in yt 3.0. In particular I've been looking at star particles from Enzo simulations in halos of different sizes. I've been running into strange results with particle fields that are stored in the halo hdf5 files vs particle fields that have to be retrieved from the original simulation data.

In particular, if I create a mask for star particles from a field saved to disk (creation_time prior to 3.0 or ParticleMassMsun now) then I get the correct values for other fields when I use this mask if they were also saved to disk (so particle positions or velocities) but not for fields that were retrieved from the simulation (such as dynamical time). Similarly, if I identify stars by creation_time in 3.0 (when it isn't saved in the hdf5 file) then I get the correct dynamical_times, but incorrect particle masses.

I think I've identified the source of this problem. When the "particle_index" field is read from the halo hdf5 files, it is then sorted into ascending order. In particular, in __getitem__ in the LoadedHalo class there is the following (this is line ~867 in halo_objects in the 3.0 experimental branch):

    field_data = self._get_particle_data(self.id, self.fnames, self.size, key)
    if field_data is not None:
        if key == 'particle_index':
            field_data = field_data[field_data.argsort()]

These sorted particle indices are then used when retrieving fields from the simulation data, so the fields end up being sorted in a different order than the ones that are retrieved directly from the halo hdf5 files. As a result, masks created from one set of fields don't work properly on the other set.

I think that I can fix this, but before I do I want to make sure I'm not going to be breaking anything else in the process. Does anyone know why the particle_index field was being sorted? If so, do you happen to know whether it would make more sense to sort the other particle fields from disk or leave particle_index unsorted?
Thanks in advance for any help. - Josh
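For readers following along, here is a minimal NumPy sketch of the mismatch described above. It does not use yt; the arrays and the `sim_dynamical_time` lookup are hypothetical stand-ins for the halo hdf5 contents and the simulation data, chosen only to show how a mask built in on-disk order can pick the wrong entries from a field that was fetched in sorted-index order.

```python
import numpy as np

# Hypothetical halo-file contents, in the order they are stored on disk.
disk_index = np.array([42, 7, 19, 3])        # stand-in for particle_index
disk_mass = np.array([4.0, 1.0, 3.0, 0.5])   # stand-in for a field saved with it

# Hypothetical simulation-side field, looked up by particle index.
sim_dynamical_time = {42: 40.0, 7: 10.0, 19: 30.0, 3: 5.0}

# Same operation as the quoted branch: only particle_index gets sorted.
sorted_index = disk_index[disk_index.argsort()]

# A field retrieved with the sorted indices comes back in sorted-index order.
sim_field = np.array([sim_dynamical_time[i] for i in sorted_index])

# Mask built from a field read straight from disk (on-disk order).
mask = disk_mass > 2.0

# Values the mask should select, found by going through the particle IDs.
selected_ids = disk_index[mask]
expected = np.array([sim_dynamical_time[i] for i in selected_ids])

# Values the mask actually selects from the differently ordered field.
got = sim_field[mask]

print("expected:", expected)
print("got:     ", got)
```

The two printed arrays differ, because position k in `disk_mass` and position k in `sim_field` no longer refer to the same particle.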
Hi Josh,

On Tue, Apr 8, 2014 at 7:17 PM, Josh Moloney <Joshua.Moloney@colorado.edu> wrote:
> Does anyone know why the particle_index field was being sorted? If so, do you happen to know whether it would make more sense to sort the other particle fields from disk or leave particle_index unsorted?
My inclination is that we should fix the behavior -- which I believe means not sorting the particles. That being said, I am not familiar with where this gets used, so perhaps Britton or someone else can chime in? I believe Britton has envisioned a teardown of the existing functionality. -Matt
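As a rough illustration of the two options raised in the question (leave particle_index unsorted, or sort the other on-disk fields to match), the NumPy sketch below shows that either choice keeps each particle ID paired with its own values. The arrays and the `id_to_mass` helper are hypothetical and are not part of yt.

```python
import numpy as np

# Hypothetical halo-file contents, in on-disk order.
disk_index = np.array([42, 7, 19, 3])
disk_mass = np.array([4.0, 1.0, 3.0, 0.5])

# Option A: do not sort particle_index; every field stays in on-disk order.
index_a, mass_a = disk_index, disk_mass

# Option B: keep the sort, but apply the same permutation to every disk field.
order = disk_index.argsort()
index_b, mass_b = disk_index[order], disk_mass[order]

def id_to_mass(index, mass):
    """Map each particle ID to its mass, independent of array order."""
    return dict(zip(index.tolist(), mass.tolist()))

# Both options pair each ID with the same mass.
same_pairing = id_to_mass(index_a, mass_a) == id_to_mass(index_b, mass_b)

# A mask built within either option selects the same set of particles.
ids_a = set(index_a[mass_a > 2.0].tolist())
ids_b = set(index_b[mass_b > 2.0].tolist())

print(same_pairing, ids_a == ids_b)
```

Option A involves less work per field, which fits the suggestion above to simply stop sorting. Option B only stays consistent if every field read from disk goes through the same `order` array.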
_______________________________________________ yt-dev mailing list yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Hi Josh,

I agree with Matt. I don't see any reason to sort. I'm not terribly familiar with this area of the code, but please take a shot at fixing this and issue a PR if it works. This area of the code will eventually get redesigned, but I'm not sure when, so let's at least get this working right for now.

Britton

On Wed, Apr 9, 2014 at 4:06 PM, Matthew Turk <matthewturk@gmail.com> wrote:
> My inclination is that we should fix the behavior -- which I believe means not sorting the particles.
participants (3)
- Britton Smith
- Josh Moloney
- Matthew Turk