Hi all,

Nathan and Matt were both right about the cause of the issue: the integer arrays containing the absolute indices were only 32-bit, and this rather deep hierarchy was exceeding that range. I have issued a pull request that fixes this, which you can see here:
https://bitbucket.org/yt_analysis/yt/pull-request/713/changing-index-arrays-...

Matt and Nathan, thanks again for pointing me in the right direction. Please enjoy a very preliminary piece of fruit of this labor:
http://i.imgur.com/PeCBjRS.png

Britton

On Thu, Feb 27, 2014 at 7:55 PM, Matthew Turk <matthewturk@gmail.com> wrote:
Hi Britton,
On Thu, Feb 27, 2014 at 12:24 PM, Britton Smith <brittonsmith@gmail.com> wrote:
Hi Nathan, Matt,
I'm working on getting some more debugging information with your suggestions. So far, I've been able to track it to the loop inside QuadTree.initialize_chunk. This appears to loop over all the points within the chunk, calling add_to_position for each point. The loop looks like this:
    for p in range(num):
        pos[0] = pxs[p]
        pos[1] = pys[p]
        self.add_to_position(level[p], pos, NULL, 0.0, 1)
If I print the values of p, level[p], pos[0], pos[1] inside this loop, I see the following (with a few extra lines leading up):
    1893689 22 1148578047 1106259970
    1893690 22 1148578047 1106259971
    1893691 22 1148578047 1106259972
    1893692 22 1148578047 1106259973
    1893693 22 1148578047 1106259979
    1893694 23 -1997811214 -2082447348
So, somehow, starting on level 23, the x and y positions are messed up in some way. Is this a precision issue? How are these positions calculated?
Yeah, this looks like the problem. These positions are computed by doing:
cell_integer_index + grid.get_global_index()
If you're on 3.0, this is all done implicitly inside icoords. One way to avoid the segfault and determine the specific place it fails is to go into grid_object.py and inside icoords, assert that the values are all positive -- no matter the domain, this is always the case. If you're using octrees, this would go into octree_subset.py.
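A minimal sketch of the check described above, written against a plain list of integer triples rather than yt's real array. The names `first_negative_icoord` and `check_icoords` are hypothetical and not part of yt; in practice the assertion would sit inside the icoords code in grid_object.py or octree_subset.py. The sample values are the ones printed earlier in this thread, with the unprinted third column filled with 0.

```python
def first_negative_icoord(icoords):
    """Return (row, axis, value) for the first negative index, or None.

    `icoords` is any sequence of integer triples. It stands in for the
    integer coordinate array that yt builds (an assumption for illustration).
    """
    for row, coords in enumerate(icoords):
        for axis, value in enumerate(coords):
            if value < 0:
                return row, axis, value
    return None


def check_icoords(icoords):
    """Fail early with a readable message instead of segfaulting later."""
    bad = first_negative_icoord(icoords)
    assert bad is None, "negative integer coordinate (row, axis, value): %r" % (bad,)


# x and y come from the debug output in this thread; z = 0 is a placeholder.
good = [(1148578047, 1106259979, 0)]
wrapped = [(1148578047, 1106259979, 0), (-1997811214, -2082447348, 0)]

check_icoords(good)  # passes silently
print(first_negative_icoord(wrapped))
```

Calling `check_icoords(wrapped)` raises an AssertionError that names the first offending row, which narrows down where the bad values first appear.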
It may be that you're on a machine where the default int is 32-bit rather than 64-bit, and there is a careless assumption about this somewhere. If you try on a different machine it might work, which would help track all of this down.
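To illustrate the 32-bit hypothesis, here is a small sketch showing that the level-23 values printed above are consistent with signed 32-bit wraparound. It assumes a refinement factor of 2 between levels (not stated in this thread), so a level-23 index should be roughly twice the neighbouring level-22 index. The helper names `wrap_int32` and `fits_int32` are made up for this example.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit value (two's-complement wraparound)."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def fits_int32(value):
    """True if the value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


# Last level-22 position and first level-23 position printed in the thread.
level22_x, level22_y = 1148578047, 1106259979
level23_x, level23_y = -1997811214, -2082447348

# Assumption: refinement factor 2, so a child index is about twice its parent's.
expected_x = 2 * level22_x
expected_y = 2 * level22_y

# Undo the wraparound: the true values would be the printed ones plus 2**32.
recovered_x = level23_x + 2**32
recovered_y = level23_y + 2**32

print("x: expected about", expected_x, "recovered", recovered_x)
print("y: expected about", expected_y, "recovered", recovered_y)
print("recovered x fits in int32:", fits_int32(recovered_x))
```

The recovered values land within a few cells of twice the level-22 indices and no longer fit in 32 bits, which is what one would expect if the index arithmetic overflowed rather than lost floating-point precision.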
-Matt
Britton
On Thu, Feb 27, 2014 at 5:54 PM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
Hi Britton,
On my machine it will tell me line numbers in the .c file if a crash happens inside a .so file, even if it's called from Python. I'm not sure how to get that information on your system without knowing more about your setup.
PDB doesn't know about C extensions so that won't be helpful unfortunately.
If you're running serially you should be able to run python under gdb and get a traceback that way. I'm not sure how to do that for parallel runs.
This page might be helpful: http://docs.cython.org/src/userguide/debugging.html
Nathan
On Thursday, February 27, 2014, Britton Smith <brittonsmith@gmail.com> wrote:
Hi Nathan,
I'm having a hard time getting a traceback that goes into the QuadTree source. The seg fault I get stops at QuadTree.so. Is there a way to recompile this in debug mode to get some more information? It doesn't look like pdb is able to step into QuadTree either.
Britton
On Thu, Feb 27, 2014 at 5:22 PM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
Hi Britton,
Can you get a traceback from the seg fault? It would help to see the line number in the autogenerated QuadTree.c where the crash happens. Autogenerated C files produced by Cython reproduce the original .pyx files line by line as comments, so it's usually pretty easy to back out where the crash is happening in the original Pyrex file.
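As a rough sketch of that back-out step: given a line number in the generated C file, scan backwards for the nearest source marker comment. The marker format in the regular expression is an assumption about what Cython writes (a comment opening with the quoted .pyx path followed by a colon and a line number), and `pyx_location` plus the sample lines are invented for illustration, not real yt output.

```python
import re

# Assumed format of the source markers in generated C, for example:
#   /* "QuadTree.pyx":123
PYX_MARKER = re.compile(r'/\*\s*"(?P<path>[^"]+\.pyx)":(?P<line>\d+)')


def pyx_location(c_lines, c_line_number):
    """Map a 1-based line number in generated C to the nearest preceding marker.

    Returns (pyx_path, pyx_line), or None if no marker precedes that line.
    """
    start = min(c_line_number, len(c_lines)) - 1
    for index in range(start, -1, -1):
        match = PYX_MARKER.search(c_lines[index])
        if match:
            return match.group("path"), int(match.group("line"))
    return None


# Tiny made-up excerpt in the style of a generated file.
sample_c = [
    '  /* "QuadTree.pyx":10',
    ' *         pos[0] = pxs[p]',
    ' */',
    '  generated_statement_for_line_10();',
    '  /* "QuadTree.pyx":11',
    ' *         pos[1] = pys[p]',
    ' */',
    '  generated_statement_for_line_11();',
]

print(pyx_location(sample_c, 8))
```

With a real file you would read QuadTree.c into a list of lines and pass the line number reported by the debugger.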
Nathan
On Thursday, February 27, 2014, Britton Smith <brittonsmith@gmail.com> wrote:
Hi all,
I'm trying to make projections of a rather large Enzo dataset and getting a segfault somewhere in QuadTree.so. This dataset is ~230 GB in size with 27 levels of AMR. The only hard-coded limit I could find in QuadTree.pyx is for 80 levels, which I am clearly below. Does anyone familiar with this part of the code have any idea if there are any other hard-coded limits in here that I might be exceeding? If not, does anyone have any advice for how I might debug this? I'm seeing this behavior in both yt-2.x and yt-3.0, so it does seem to be something intrinsic to the quadtree code.
Thanks! Britton
_______________________________________________ yt-users mailing list yt-users@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org