
Hello!
I'm using the sph-viz branch of yt and I'm trying to make a density projection plot from a Gadget-2 snapshot. This works like a charm when I use a snapshot with 428^3 particles in a 15 Mpc box. However, when I try it on a snapshot with 1024^3 particles in a 40 Mpc box, my job keeps getting killed because it tries to exceed the available system memory. I'm submitting it to a node with 256 GB of RAM and requesting all of it, which means the job is trying to use more than that. The snapshot itself is ~40 GB.
The memory usage appears to spike when allocating for the KDTree. Before that, it appears to generate the .ewah file without issue. I was wondering if anyone had any thoughts on why this large spike in memory might be occurring and how I might go about fixing it? If not, no worries, I can try to investigate it myself, but it's a pain to do with pdb. My Gadget simulation used a maximum of ~173 GB of RAM, so I feel like the KDTree shouldn't be using as much memory as it's trying to.
Thanks, and sorry for the bother!
Sincerely,
-Jared
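For a sense of scale, the particle counts quoted above already imply large per-particle arrays. The following is a rough back-of-the-envelope sketch, not a description of what yt actually allocates: the per-particle byte counts (three float64 coordinates and one int64 index per particle) are assumptions chosen purely for illustration.

```python
# Rough memory estimate for arrays sized by particle count.
# Assumption (not taken from yt's source): positions are stored as three
# float64 values per particle, and a tree needs one int64 index per particle.

GIB = 1024 ** 3


def array_gib(n_particles, bytes_per_particle):
    """Size in GiB of one array with the given per-particle footprint."""
    return n_particles * bytes_per_particle / GIB


n_small = 428 ** 3    # the snapshot that works
n_large = 1024 ** 3   # the snapshot that gets killed

for label, n in (("428^3", n_small), ("1024^3", n_large)):
    positions = array_gib(n, 3 * 8)  # x, y, z as float64
    indices = array_gib(n, 8)        # one int64 per particle
    print(f"{label}: {n:.3e} particles, "
          f"positions {positions:.2f} GiB, indices {indices:.2f} GiB")

print(f"particle count ratio: {n_large / n_small:.1f}x")
```

Under these assumptions, a handful of temporary copies of arrays this size would add up quickly. That is consistent with the spike described above, but it does not by itself explain it.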

Hi Jared,
What operation are you doing? We don’t actually need to generate the kdtree for many operations on Gadget-2 data. We do need it to load tipsy data, though. We may be generating it here unnecessarily, depending on what you’re doing.
In general, though, I think we’re going to need to make it so we can use cykdtree’s MPI-parallelized kdtree for really big datasets like yours.
Thanks for the feedback! It’s good to hear about scaling issues like this.
Nathan
On Tue, Aug 14, 2018 at 11:44 AM Jared Coughlin <Jared.W.Coughlin.29@nd.edu> wrote:
> The memory usage appears to spike when allocating for the KDTree.
_______________________________________________
yt-users mailing list -- yt-users@python.org
To unsubscribe send an email to yt-users-leave@python.org

Hi Nathan,
I'm simply trying to make a projection plot:
    ds = GadgetDataset(file_name)
    p = yt.ProjectionPlot(ds, axis = 'z', fields = ('Gas', 'Density')).save(file_name + '.png')
Thanks!
-Jared
On Tue, Aug 14, 2018 at 1:09 PM Nathan Goldbaum <nathan12343@gmail.com> wrote:
> What operation are you doing?

Hmm, I don’t think that should need to create a kdtree by default unless you explicitly ask to do gather smoothing. Let me see if I can reproduce.
On Tue, Aug 14, 2018 at 12:29 PM Jared Coughlin <Jared.W.Coughlin.29@nd.edu> wrote:
> I'm simply trying to make a projection plot:

Hi Jared,
I've been working on the sph-viz branch a lot recently. It's exciting to see the branch being used so much already!
I have opened an issue on GitHub for this bug: https://github.com/yt-project/yt/issues/1973
Unfortunately, the KDTree and a few features that rely on it can use a lot of memory at the moment. However, in your case you shouldn't require the KDTree, as Nathan said. I am trying to address your issue, but I don't seem to have a dataset to hand that works with yt.GadgetDataset. Would you be able to show me what yt prints out when you run your failing script?
Ash
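When collecting output from a failing run like this, it can help to record how much memory the process has used at each stage, so the log shows which step precedes the kill. The following is a minimal sketch using only the Python standard library on a Unix system; the stage names and the `report` helper are hypothetical and are not part of yt.

```python
import resource
import sys


def peak_rss_gib():
    """Peak resident set size of this process so far, in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return peak * scale / 1024 ** 3


def report(stage):
    """Print the peak memory reached by the end of the named stage."""
    # flush=True so the line reaches the job log even if the process
    # is killed shortly afterwards.
    print(f"[{stage}] peak RSS so far: {peak_rss_gib():.2f} GiB", flush=True)


report("start")
# ... load the dataset here ...
report("after load")
# ... build and save the plot here ...
report("after projection")
```

If the job is killed partway through, the last stage printed narrows down where the spike happened without stepping through the run in pdb.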

Hi Ash,
I think any Gadget HDF5 dataset would work.
Nathan
On Sat, Aug 18, 2018 at 8:33 AM Ashley Kelly <a.j.kelly@durham.ac.uk> wrote:
> I don't seem to have a dataset to hand that works with yt.GadgetDataset

Apologies, I had it backwards: GadgetDataset is for Gadget binary outputs.
Jared, this is a separate point, but for a while now it's been possible to load Gadget binary outputs using yt.load(); there's no need anymore to explicitly use GadgetDataset.
On Sat, Aug 18, 2018 at 2:56 PM Nathan Goldbaum <nathan12343@gmail.com> wrote:
> I think any Gadget HDF5 dataset would work.
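As a sketch of the yt.load() suggestion, the projection script from earlier in the thread could be written as follows. This is illustrative only: the helper names and the example file name are hypothetical, and the default field tuple is simply the one used in the original script, which may need adjusting for other datasets.

```python
def projection_image_name(snapshot_path):
    """Name of the PNG to write: the snapshot name with '.png' appended."""
    return snapshot_path + ".png"


def make_projection(snapshot_path, field=("Gas", "Density"), axis="z"):
    """Load a snapshot with yt.load() and save a projection of `field`."""
    # Imported inside the function so the helper above can be used
    # even where yt is not installed.
    import yt

    ds = yt.load(snapshot_path)  # no explicit GadgetDataset needed
    plot = yt.ProjectionPlot(ds, axis, field)
    plot.save(projection_image_name(snapshot_path))


# Example call (hypothetical file name):
# make_projection("snapshot_010")
```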

Hi Nathan and Ashley,
Thank you for your help! The sph-viz branch has been really great so far for my smaller simulations. I've attached the job output file. The point where it says "killed" is where the memory spike occurred and the job was automatically killed by the system.
-Jared
On Sat, Aug 18, 2018 at 11:41 AM Nathan Goldbaum <nathan12343@gmail.com> wrote:
> Apologies, I had it backwards, GadgetDataset is for Gadget binary outputs.

I've opened a pull request which I believe should fix this issue:
https://github.com/yt-project/yt/pull/1980
You can test it out locally with the following incantations in the yt git repo:
    $ git fetch https://github.com/ngoldbaum/yt proj-no-kdtree:proj-no-kdtree
    $ git checkout proj-no-kdtree
    $ pip install -e .
-Nathan
On Sat, Aug 18, 2018 at 1:00 PM Jared Coughlin <Jared.W.Coughlin.29@nd.edu> wrote:
> I've attached the job output file.
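After an editable install like the one above, it can be worth confirming that Python will actually import yt from the git checkout and not from an older copy elsewhere on the path. The following is a minimal standard-library sketch; `module_origin` is a hypothetical helper name, not a yt or pip feature.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# After `pip install -e .`, this path is expected to point inside the
# yt git checkout; None means yt is not importable at all.
print("yt would be imported from:", module_origin("yt"))
```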

Hi Nathan,
Thank you for the help! That seems to have worked!
-Jared
On Mon, Aug 20, 2018 at 3:04 PM Nathan Goldbaum <nathan12343@gmail.com> wrote:
> I've opened a pull request which I believe should fix this issue:
> https://github.com/yt-project/yt/pull/1980

Hi Nathan and Ashley,
Thank you for your help! The sph-viz branch has been really great so far for my smaller simulations. Here's the output: https://pastebin.com/a70qJbwj
The point where it says "killed" is where the memory spike occurred and the job was automatically killed by the system.
-Jared
On Sat, Aug 18, 2018 at 11:41 AM Nathan Goldbaum <nathan12343@gmail.com> wrote:
> Apologies, I had it backwards, GadgetDataset is for Gadget binary outputs.
participants (3)
- Ashley Kelly
- Jared Coughlin
- Nathan Goldbaum