Hi Nathan

Thanks for that suggestion. I can confirm that adding the old radius fields to my_plugins.py fixes the excessive memory issue. I've just filed an issue about it.

Btw, I think the problem arises primarily in the use of the 'sphere' data container in the halo profiling. The first time you access any dataset from a sphere, AMRSphereBase._get_cut_mask() has to create a RadiusCode field for the top grid, which in my case is 256^3. With your current _Radius definition, you first create a position field with shape (3,256,256,256) and then you duplicate that to create the 'center' field. I haven't fully grokked everything that periodic_dist() does, but I think it must be possible to do it without creating this huge and redundant center field, and possibly even without creating one huge position field either. Unfortunately I don't have the time right now to figure this out, especially since I can override the radius field with the old definition, like you suggested.
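
A minimal sketch of the per-axis idea, not yt's actual periodic_dist() or _Radius code. It assumes float64 data and a fully periodic box, and the names periodic_radius, array_mib and domain_width are hypothetical. Under those assumptions a 256^3 array is 128 MiB, so a (3,256,256,256) position array is 384 MiB and a center array of the same shape doubles that, per process. Accumulating the squared distance one axis at a time needs only three grid-sized work arrays:

```python
import numpy as np


def array_mib(shape, dtype="float64"):
    """Size in MiB of a dense array with the given shape and dtype."""
    n_elements = np.prod(shape, dtype=np.int64)
    return n_elements * np.dtype(dtype).itemsize / 2.0**20


def periodic_radius(x, y, z, center, domain_width):
    """Distance of every cell from `center` in a periodic box.

    x, y, z are same-shaped coordinate arrays. The squared distance is
    accumulated axis by axis, so no stacked (3, ...) position array and
    no broadcast copy of the center are ever created.
    """
    radius = np.zeros(np.shape(x), dtype="float64")
    for pos, c, width in zip((x, y, z), center, domain_width):
        d = np.asarray(pos, dtype="float64") - c  # offset along this axis
        np.abs(d, out=d)
        wrapped = width - d                       # offset to the periodic image
        np.abs(wrapped, out=wrapped)
        np.minimum(d, wrapped, out=d)             # keep the nearer of the two
        np.square(d, out=d)
        radius += d                               # running sum of squares
    np.sqrt(radius, out=radius)
    return radius
```

Here x, y and z stand for the per-axis cell coordinates, center for the sphere center and domain_width for the box size along each axis. Whether this also gives correct results on non-periodic datasets, which was the reason for the new definition, would still need checking.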

Mike


On Wed, Feb 27, 2013 at 9:00 PM, Nathan Goldbaum <goldbaum@ucolick.org> wrote:
Hi Mike,

Sorry to hear you're having issues with my changes.  Just to justify myself a little bit: the old way of generating the radius field silently produced incorrect results on non-periodic datasets.  yt is increasingly being used to examine these datasets and I wanted to make sure that the results of a simple radial profile analysis would be correct.

That being said, the memory consumption is clearly not as good now.  When I wrote the new code to handle the radius fields, I didn't realize that the Radius field was so intimately tied to halo finding.

A quick fix to allow you to continue doing your analyses would be to replace the Radius field with the old definition.  This can be done without committing any changes to the yt codebase by using the my_plugins.py file.  Just create a file called my_plugins.py, place it in the .yt folder that lives in your home directory, and enter the old Radius and ParticleRadius field definitions inside of it.  I've pasted an example my_plugins.py file that does this here: http://paste.yt-project.org/show/3212/

Fields defined in my_plugins.py will override definitions in universal_fields.py, so if the new field definitions are the primary cause of the increased memory consumption you're seeing, this should fix it.

This isn't a very good long-term solution and I'd like to work with you and others who deal with large datasets to find a permanent solution.  A good first step would be to file an issue about this.  I don't have a lot of experience working with large datasets so any help figuring out how to reduce the memory needs of the Radius field would be appreciated.

Cheers,

Nathan


On Wed, Feb 27, 2013 at 8:01 PM, Michael Kuhlen <mqk@astro.berkeley.edu> wrote:
> With the last good changeset the total memory usage never gets above ~8.5GB (estimated from top).

Sorry, forgot to say: the last good changeset is 41358eecdad5.


On Wed, Feb 27, 2013 at 7:58 PM, Michael Kuhlen <mqk@astro.berkeley.edu> wrote:
Hi all and Nathan specifically

I've found that the changes having to do with periodic fields are causing excessive memory usage for me when doing parallel halo profiling, to the point where I can no longer do analysis that used to work fine on my 24GB memory workstation.

A simple HaloProfiler script (http://paste.yt-project.org/show/3211/), run in parallel on 8 processors on the halo catalog obtained from one of my cosmology simulations, currently runs out of memory almost immediately, while it used to complete without problems.

I tried to use hg bisect to find the problematic changeset but had to skip the testing of several revisions because of runtime yt errors, so in the end bisect only gave me a range of potentially bad revisions: http://paste.yt-project.org/show/3210/. As you can see, they're all related to the periodic radius mods from about a month ago.

I tried replicating this with the standard Enzo_64 example dataset from http://yt-project.org/data/, but I guess that one is too small to produce this problem on my machine. If you want to see the problem for yourself, then you can download this tarball (http://astro.berkeley.edu/~mqk/transfer/RD0003.tar 4.5GB) and run this script (http://paste.yt-project.org/show/3211/) on it, like so:

$ mpirun -np 8 python ./profile_halos.py --parallel

When I run this with the current tip (ccfe34e70803) the total memory usage grows to >24GB. With the last good changeset the total memory usage never gets above ~8.5GB (estimated from top).

It'd be great if we could find out what the problem is and fix it, because this is a major performance regression for me that, as I already said, is seriously impacting my ability to do analysis with the current yt tip. Let me know if I should file a BB issue about this, and/or if there's some way I can assist with fixing this.

Cheers,
Mike

--
*********************************************************************
*                                                                   *
*  Dr. Michael Kuhlen              Theoretical Astrophysics Center  *
*  email: mqk@astro.berkeley.edu   UC Berkeley                      *
*  cell phone: (831) 588-1468      B-116 Hearst Field Annex # 3411  *
*  skype username: mikekuhlen      Berkeley, CA 94720               *
*                                                                   *
*********************************************************************




_______________________________________________
yt-dev mailing list
yt-dev@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org


