I figured out what was going wrong: I submitted my job script from the $SCRATCH folder. Apparently jobs can only be submitted from the $WORK folder on Stampede. Thanks for your prompt response, though.

I tried with one file in my IPython notebook, and it seems to work there.

Yes, it's being run in parallel. I haven't checked with one core yet; I will let you know what happens then.

Is the script being run in parallel? If so, does it crash if you run it on only one core?
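
One way to run that one-core check outside of both MPI and yt is a serial loop that repeats the call failing in the traceback (listing the keys of the root group) on each checkpoint file. This is only a minimal sketch, assuming h5py is importable from the same environment; `find_unreadable` and its `opener` argument are hypothetical names made up for illustration and are not part of yt or h5py.

```python
def find_unreadable(filenames, opener=None):
    """Return (filename, error message) pairs for files whose root group cannot be listed."""
    if opener is None:
        # Assumption: h5py is installed; imported lazily so the default is only needed when used.
        import h5py

        def opener(path):
            return h5py.File(path, "r")

    failures = []
    for name in filenames:
        try:
            with opener(name) as fileh:
                # Same access that raises in the traceback: fileh["/"].keys()
                list(fileh["/"].keys())
        except Exception as exc:
            # Record the failure and keep going so every file gets checked.
            failures.append((name, str(exc)))
    return failures
```

Calling `find_unreadable(filenames)` from a single process with the same list the script passes to `yt.load` would show whether particular files fail on their own. An empty list would suggest the files are readable serially and point toward the parallel run or the job environment instead.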


Hi people,

I am using yt 3.3.1 and submitting my SLURM script on Stampede.
I was using this script to find the abundances of C, O, etc. from FLASH checkpoint files. While my script worked fine with the old yt (3.1), it suddenly crashed today and returned the following error:

yt : [INFO     ] 2016-10-17 23:22:24,295 Parameters: current_time              = 28.1847530806
yt : [INFO     ] 2016-10-17 23:22:24,295 Parameters: domain_dimensions         = [128 128 128]
yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: domain_left_edge          = [ -2.80000000e+10  -2.80000000e+10  -2.80000000e+10]
yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: domain_right_edge         = [  2.80000000e+10   2.80000000e+10   2.80000000e+10]
yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: cosmological_simulation   = 0.0
Executin lessg abundance.py
Traceback (most recent call last):
  File "abundance2.py", line 304, in <module>
  File "abundance2.py", line 59, in main
    pf = yt.load(filenames[n])
  File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/convenience.py", line 79, in load
    if c._is_valid(*args, **kwargs): candidates.append(n)
  File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/frontends/flash/data_structures.py", line 478, in _is_valid
    if "bounding box" not in fileh["/"].keys() \
  File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/base.py", line 368, in keys
    return list(self)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
  File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/group.py", line 303, in __len__
    return self.id.get_num_objs()
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
  File "h5py/h5g.pyx", line 321, in h5py.h5g.GroupID.get_num_objs (/home/ilan/minonda/conda-bld/work/h5py/h5g.c:4194)
RuntimeError: Can't determine (Bad symbol table node signature)
[c560-102.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 22137) exited with status 1
TACC: MPI job exited with code: 1

TACC: Shutdown complete. Exiting.

I was wondering whether something is wrong with my code or with the new yt. I am also attaching my code here for you to look at. Thanks in advance.


