Re: [yt-dev] profiles on non-prgio sets

Hi Dave,
Are you running in parallel? Which fields are you profiling? Do they have lots of dependencies, or require ghost zones? Is yt using lazy_reader? Does your data have many grids?
Matt

On Mar 14, 2012 12:37 AM, "david collins" antpuncher@gmail.com wrote:
I should add that this was done on 64 cores-- in serial it works fine, just slow.
On Tue, Mar 13, 2012 at 9:35 PM, david collins antpuncher@gmail.com wrote:
Hi, all--
I have an old dataset that I'm trying to make profiles on. It's a 512^3 root grid, but it was written with ParallelRootGridIO off. I find that it's using strange amounts of memory, more than 12 GB. Is this a known problem with a straightforward workaround?
d.
-- Sent from my computer.

Hi--
Sorry for being sparse with the information; I should have been much clearer. This was running in parallel, 64 cores on Nautilus. The field was Density vs. MassFraction, MassFraction being a field I wrote that, in this instance, simply returns CellMass. lazy_reader is True. The run is 512^3 with 4 levels by 4, with only about 100 subgrids, but it was written in 2007 (I think) without parallel root grid IO on. My suspicion (without really understanding how yt parallelism works) is that each task needed to read in the root grid, rather than using the domain decomposition.
Would turning lazy_reader off change things?
Thanks, d.
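For context, a derived field like the MassFraction described above could be written against the yt 2.x add_field API roughly as below. This is only a sketch: the dependency on CellMassMsun and the units string are assumptions, not Dave's actual code.

    from yt.mods import *  # yt 2.x style import; provides add_field

    def _MassFraction(field, data):
        # In this instance the field simply returns the cell mass.
        return data["CellMassMsun"]

    add_field("MassFraction", function=_MassFraction, units=r"M_{\odot}")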

Hi Dave,
64 cores, using 12 GB on each? That's a problem. It sounds like MassFraction has a handful of dependencies -- CellMassMsun, which depends on Density and dx, but dx should be used as a single value, not as a field. So that gives 3 * 512^3 * 8 bytes, roughly 3 GB, which is still a far cry from 12 GB per core. Plus, the domain decomposition for profiles is per-grid -- so only one of your cores should get assigned the root grid. This is very bizarre.
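For reference, the arithmetic behind that estimate written out as plain Python (nothing yt-specific here):

    # Back-of-the-envelope memory for three 512^3 float64 fields.
    cells = 512 ** 3          # root-grid cells
    bytes_per_value = 8       # 64-bit floats
    n_fields = 3              # e.g. Density, CellMassMsun, MassFraction
    total_gib = n_fields * cells * bytes_per_value / 1024.0 ** 3
    print("%.1f GiB" % total_gib)  # -> 3.0 GiB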
Would turning lazy_reader off change things?
Yes, but it would make them worse. :)
-Matt

64 cores, using 12 GB on each? That's a problem.
12 GB total, but still... I'll grab some more failure info and get back to you. In serial it ran fine, and took about an hour for both [density, cellmass] and [density, cellvolume] within the default head-node allowance on Nautilus, which is 12 GB. Timing-wise that's longer than I want to deal with, but it does run.
d.

Hi Dave,
512^3 with a bunch of subgrids, with 64 cores totalling 12 GB (i.e., just shy of 200 MB/core), sounds okay to me. I would guess you are seriously disk-limited for the analysis itself, but you can verify this by running:
python2.7 -m cProfile -o ytprof.cprof my_script.py
and then using either pyprof2html (requires Jinja2, sold separately) or the pstats module to examine it.
-Matt
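A minimal way to inspect the resulting profile with the standard-library pstats module, assuming the output file name from the command above:

    import pstats

    # Load the cProfile output and show the 20 most expensive calls
    # by cumulative time.
    p = pstats.Stats("ytprof.cprof")
    p.sort_stats("cumulative").print_stats(20)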

512^3 with a bunch of subgrids, with 64 cores totalling 12 GB (i.e., just shy of 200 MB/core), sounds okay to me.
Ok, I've pulled together some more details.
If I run this on 8 cores running a [Density,Density] profile (so it should only have a single field), it fails claiming I'm using 31 GB, which I have a hard time understanding -- that's nearly 4 GB per core.
I did the same analysis with a more recent (PRGIO) 512^3 dataset with thousands of grids (4 levels by 2, many more grids than in the first set), using the same job script, and it runs fine.
Anyhow, this isn't a huge deal, since non-prgio datasets are far from the norm these days, and it works fine in serial.
Does cProfile work in parallel, and does it do memory profiling?
d.

If I run this on 8 cores running a [Density,Density] profile (so it should only have a single field), it fails claiming I'm using 31 GB, which I have a hard time understanding -- that's nearly 4 GB per core.
Yeah, puzzling.
Does cProfile work in parallel, and does it do memory profiling?
You have to call it manually to keep it from overwriting its profile files; I've been meaning to add something to yt to make this easier, but have not yet. It doesn't do memory profiling.
-Matt
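A sketch of the kind of manual, per-rank invocation described above; it assumes mpi4py for the rank number, and run_my_analysis() is a hypothetical stand-in for the body of the script:

    import cProfile
    from mpi4py import MPI

    rank = MPI.COMM_WORLD.rank
    profiler = cProfile.Profile()
    profiler.enable()
    run_my_analysis()  # hypothetical stand-in for the yt analysis script body
    profiler.disable()
    # One output file per MPI rank, so parallel runs don't clobber each other.
    profiler.dump_stats("ytprof_%04d.cprof" % rank)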

Hi Dave,
For what it's worth, in the PR I just issued that includes beta-status Rockstar interfacing, I've added a parallel profiler. You use it like:
    with parallel_profile("some_prefix"):
        do_something()
        something_else()
and it will write out .cprof files, one per processor in the top communicator.
-Matt
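The per-processor .cprof files can then be combined and examined with pstats; the file-name pattern below is an assumption based on the prefix argument:

    import glob
    import pstats

    # Merge every per-processor profile into one report and show the
    # 15 most expensive calls by cumulative time.
    stats = pstats.Stats(*glob.glob("some_prefix*.cprof"))
    stats.sort_stats("cumulative").print_stats(15)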
Participants (2): david collins, Matthew Turk