Small correction: the old and new halo catalogs aren't strictly identical. The differences between the two are very small, though, typically less than 1e-5. Only the center-of-mass positions and the max_r field occasionally show differences at the 1e-2 level, and even those fields are usually within 1e-5. I think this is consistent with roundoff error.

On Thu, Feb 23, 2012 at 11:51 PM, Michael Kuhlen <mqk@astro.berkeley.edu> wrote:
Hi Matt,
I cloned your repository and gave it a whirl. I can confirm that the memory usage is now as expected. Also, the halo catalog is identical to the one the earlier version produced, so it seems to me that everything is good.
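For anyone wanting to repeat this kind of catalog comparison, here is a minimal sketch. The 1e-5 tolerance follows the numbers in this thread; the arrays and field names below are made-up stand-ins for real catalog columns, not yt output:

```python
import numpy as np

def catalogs_agree(old, new, rtol=1e-5):
    """Compare two halo catalogs field by field.

    `old` and `new` are dicts mapping field names to arrays.
    Returns the list of fields that differ by more than rtol.
    """
    bad = []
    for field in old:
        if not np.allclose(old[field], new[field], rtol=rtol, atol=0.0):
            bad.append(field)
    return bad

# Stand-in data: identical up to ~1e-8 relative noise (roundoff scale).
rng = np.random.default_rng(0)
masses = rng.uniform(1e10, 1e14, size=100)
old = {"mass": masses}
new = {"mass": masses * (1.0 + 1e-8)}
print(catalogs_agree(old, new))  # [] -- nothing exceeds the 1e-5 tolerance
```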
Cheers, Mike
On Thu, Feb 23, 2012 at 7:31 PM, Matthew Turk <matthewturk@gmail.com> wrote:
Hi Mike,
Thanks for letting me take a look at the data. I have identified the problem. To convert from code units to physical units, yt calculates the conversion factor. However, it also batches the grids to convert, and to do so it calculates -- in this case -- CellVolume for every grid. (Your 512^3 topgrid exacerbates the problem.) Because I did not use the functionality in yt that flushes a supplemental field from memory once a grid is done with it (and this was most definitely my fault), the CellVolume fields were all retained. So CellVolume -- along with maybe one or two other fields -- was being generated and kept for every grid.
Having fixed this, I see about what I would expect for memory use on this dataset.
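The pattern Matt describes can be illustrated outside of yt. The class and method names below are hypothetical, not yt's actual API; the point is only that a per-grid derived field must be released after use, or memory grows with the number of grids:

```python
import numpy as np

class Grid:
    """Hypothetical stand-in for a grid object with a field cache."""
    def __init__(self, shape=(16, 16, 16)):
        self.shape = shape
        self.field_data = {}  # cache of generated fields

    def get_field(self, name):
        # Generate the derived field on demand and cache it, mimicking
        # how a supplemental field like CellVolume can end up retained.
        if name not in self.field_data:
            self.field_data[name] = np.ones(self.shape)
        return self.field_data[name]

    def flush_field(self, name):
        # The missing step: release the supplemental field once the
        # batch conversion is done with it.
        self.field_data.pop(name, None)

grids = [Grid() for _ in range(100)]
total = 0.0
for g in grids:
    total += g.get_field("CellVolume").sum()
    g.flush_field("CellVolume")  # without this, 100 arrays stay resident

retained = sum(len(g.field_data) for g in grids)
print(retained)  # 0 -- no fields left cached after the loop
```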
I've issued a pull request to fix this problem, and I would appreciate testing from both you and Stephen, since it touches the way particles are read and converted. I am leery of changes like this without a few more sets of eyes. Additionally, I have tested it, and while it gives the same answer to very good precision, the result differs enough (likely because of concatenation order and floating-point roundoff; for moving7, the relative difference in a sum is ~1e-8) that the gold standard will have to be re-generated.
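The ~1e-8 figure above is specific to moving7, but the mechanism is generic: summing the same numbers in a different order shifts the result at the roundoff level. A stdlib-only illustration (the data here is random, not simulation output):

```python
import math
import random

random.seed(42)
masses = [random.uniform(1e-8, 1.0) for _ in range(100000)]

s1 = sum(masses)            # one concatenation order
shuffled = masses[:]
random.shuffle(shuffled)
s2 = sum(shuffled)          # another order, same values

# math.fsum gives a high-accuracy reference for the denominator.
rel_diff = abs(s1 - s2) / abs(math.fsum(masses))
print(rel_diff < 1e-10)  # True -- tiny, but generally nonzero
```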
The PR is here:
https://bitbucket.org/yt_analysis/yt/pull-request/105/particle-io-fix
-Matt
On Wed, Feb 22, 2012 at 9:16 PM, Stephen Skory <s@skory.us> wrote:
Hi Mike,
Yes, that does the trick. However, self._data_source.quantities["TotalQuantity"]("ParticleMassMsun") returns a list, so I needed to add a '[0]' in order to get just the number.
I'm glad it helped. I will make this change soon to the source. I always forget about that list part!
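The pitfall Mike describes, sketched with a stand-in for the quantities interface (the function below is hypothetical and the masses are made up; the `[0]` indexing is the point):

```python
PARTICLE_MASSES = [1.0, 2.5, 4.0]  # made-up particle masses (Msun)

def total_quantity(field):
    """Hypothetical stand-in for quantities["TotalQuantity"](field);
    like the real derived quantity in this thread, it returns a list."""
    assert field == "ParticleMassMsun"
    return [sum(PARTICLE_MASSES)]

result = total_quantity("ParticleMassMsun")
print(result)      # [7.5] -- a list, not a number
total_mass = total_quantity("ParticleMassMsun")[0]
print(total_mass)  # 7.5
```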
It's not immediately clear to me how to implement this fix for the dm_only=True case, in which you only want the sum over DM particles.
It may be possible to write a special field or something... I'll think about it.
Lastly, does the sub_mass calculation have to be done even when subvolume is None and only a single processor is being used? It seems that in this case sub_mass = total_mass, and the second calculation could be skipped.
I think you're right. I'll make this change too! Thanks for pointing this out.
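Mike's shortcut, as a sketch. The names subvolume, sub_mass, and total_mass come from the discussion, but the function wrapping them is hypothetical, not the actual halo-finder code:

```python
def get_sub_mass(subvolume, num_procs, total_mass, compute_sub_mass):
    """Skip the second mass calculation when it must equal total_mass:
    with no subvolume and a single processor, the 'sub' region is the
    whole dataset."""
    if subvolume is None and num_procs == 1:
        return total_mass
    return compute_sub_mass(subvolume)

# Usage: on one processor with no subvolume, the expensive
# callback is never invoked.
calls = []
def expensive(subvol):
    calls.append(subvol)
    return 42.0

print(get_sub_mass(None, 1, 100.0, expensive))  # 100.0
print(len(calls))                               # 0
```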
--
Stephen Skory
s@skory.us
http://stephenskory.com/
510.621.3687 (google voice)
_______________________________________________
yt-users mailing list
yt-users@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
--
Dr. Michael Kuhlen
Theoretical Astrophysics Center, UC Berkeley
B-116 Hearst Field Annex # 3411, Berkeley, CA 94720
email: mqk@astro.berkeley.edu
cell phone: (831) 588-1468
skype username: mikekuhlen