Interesting crash when using ("deposit", "all_cic")
Hi folks,
I have been trying to make some dark matter density movies using the "all_cic" field, and have discovered that using "all_cic" can reliably cause yt to seg fault on the supercomputer I'm using (x86_64 linux cluster with kernel 2.6.32-504.8.1.el6.x86_64) using the tip of yt-dev (changeset fa08e386d0da). The script that causes the crash is here:
http://paste.yt-project.org/show/5774/
After I turned up the log level and started using pdb to debug, I discovered that the line in the script that causes the seg fault is line 88:
dm_dens_x = my_reg[("deposit", "all_cic")].value
where my_reg is defined as (on line 82):
my_reg = ds.arbitrary_grid(left, right, dims=[1,800, 800])
After stepping through with pdb, it seems that the code dies in this function:
/mnt/home/oshea/yt-3/lib/python2.7/site-packages/sympy/core/basic.py(83)__new__()
with the traceback shown here:
http://paste.yt-project.org/show/5775/
Based on this traceback, it seems that there are problems with the unit system.
Interestingly enough, though, this does not happen on every dataset - it seems to happen occasionally, but predictably. For example, I can generate two images using this script, but it will seg fault on the third. On a different dataset in the time series, it will produce one image and then seg fault on the second. If I run the script over and over, it will eventually produce quite a few images, until it runs into a situation where it seg faults on the first image I produce (at which point I went to the debugger).
I don't have enough experience with the guts of yt - and the units section of yt in particular - to have a sense of what might be happening here based on the traceback. Does anybody have any suggestions?
Thank you!
--Brian
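For anyone trying to reproduce this, the two lines quoted above can be wrapped in a small helper. This is only a minimal sketch, not the script from the paste: the function name, the dataset path, and the left/right edges in the usage note are assumptions for illustration, and yt is imported inside the function so the helper can be defined even where yt is not installed.

```python
def deposit_all_cic_slab(dataset_path, left, right, dims=(1, 800, 800)):
    """Deposit all particles onto a thin arbitrary grid and return the raw array.

    dataset_path, left, and right are placeholders chosen by the caller;
    they are not taken from the original script.
    """
    import yt  # lazy import: only needed when the helper is actually called

    ds = yt.load(dataset_path)
    # A slab one cell thick along x, 800 x 800 cells in y and z,
    # matching the dims quoted in the message above.
    my_reg = ds.arbitrary_grid(left, right, dims=list(dims))
    # This field access is the step reported to seg fault.
    return my_reg[("deposit", "all_cic")].value
```

A call might look like deposit_all_cic_slab("DD0013/DD0013", [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]), where the path and the code-unit edges are hypothetical values. Calling it repeatedly in a fresh interpreter is one way to check whether the crash depends on how many images were produced earlier in the same process.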
Any chance you can trigger this crash using a script that relies on a public dataset? I'm happy to debug, but need to be able to reproduce this locally to see what's going wrong.
Nathan
Absolutely. The dataset that it's crashing on is here:
http://galactica.pa.msu.edu/~bwoshea/data/datasets/DD0013.tar
This particular script crashes immediately upon trying to generate the first image on this dataset.
--Brian
_______________________________________________ yt-users mailing list yt-users@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
Hey Brian,
I'm unfortunately not able to reproduce this locally. I'm using yt changeset fffa77d2fdc2 and have made two minor changes to your script to get it to run in my test environment:
http://paste.yt-project.org/show/5776/
I'll also note that I needed to make a directory named "DD_zoom_images" in the same directory as DD0013 to get the script to run.
Are you able to reproduce this issue on another system besides the cluster using DD0013? Is this cluster Blue Waters?
Unfortunately, I don't really understand how an issue with the unit system would lead to a seg fault. If there really is an AttributeError being raised, even in Cython code or code called by Cython code, the error should be raised to the user level and there shouldn't be a seg fault. Very strange.
Hopefully we can iterate on this and figure out what's going wrong.
-Nathan
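Since pdb cannot report anything once the interpreter itself dies, one possible way to narrow down a seg fault like this is Python's faulthandler module, which writes the Python-level traceback when a fatal signal such as SIGSEGV arrives. The sketch below is a suggestion, not something either participant ran; the helper name is an assumption. faulthandler is in the standard library from Python 3.3 onward, while the Python 2.7 install mentioned in this thread would need the backport from PyPI.

```python
import faulthandler


def enable_crash_traceback(log_file):
    """Ask faulthandler to dump tracebacks of all threads to log_file on a fatal signal.

    log_file must be a real file object with a file descriptor, and it has
    to stay open for as long as the handler is enabled.
    """
    faulthandler.enable(file=log_file, all_threads=True)
    return faulthandler.is_enabled()
```

Calling enable_crash_traceback(open("crash_traceback.log", "w")) at the top of the plotting script (the log file name is hypothetical) would leave a record of which Python frame was active when the process died, which could help confirm whether the crash really originates in the unit handling or in compiled code called around it.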
participants (2)
- Brian O'Shea
- Nathan Goldbaum