I typed

$ module avail HDF5

----- /opt/apps/intel15/mvapich2_2_1/modulefiles -----
   phdf5/1.8.16

----- /opt/apps/intel15/modulefiles -----
   hdf5/1.8.16 (m,L)

  Where:
   L:  Module is loaded
   m:  built for host and native MIC

$ module avail h5py

returns blank, if that helps.

On Tue, Oct 18, 2016 at 4:08 PM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
Hi Tazkera,
When I tried googling your error message last night, I found that it's associated with more than one MPI process trying to access an HDF5 file at the same time, presumably using a serial version of the HDF5 library.
I just did a quick test and I'm unable to reproduce it here on my laptop. Unfortunately I don't have access to Stampede, so I can't reproduce it there.
Can you share exactly which h5py and HDF5 library versions you're using?
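One way to collect this information is to ask h5py directly from the same Python environment the job uses. The snippet below is only a sketch: the function name report_hdf5_versions is made up for illustration, and it assumes an h5py recent enough to provide h5py.get_config(). The mpi_enabled value shows whether h5py was built against a parallel HDF5 library.

```python
def report_hdf5_versions():
    """Return h5py/HDF5 build info as a dict (hypothetical helper).

    All values stay None if h5py cannot be imported in this environment.
    """
    info = {"h5py": None, "hdf5": None, "mpi_enabled": None}
    try:
        import h5py
    except ImportError:
        return info
    info["h5py"] = h5py.version.version        # h5py package version
    info["hdf5"] = h5py.version.hdf5_version   # HDF5 library h5py was built against
    info["mpi_enabled"] = h5py.get_config().mpi  # True only for a parallel build
    return info


if __name__ == "__main__":
    for key, value in sorted(report_hdf5_versions().items()):
        print("%s: %s" % (key, value))
```

Running it inside the batch job, rather than on a login node, makes sure the reported versions are the ones the failing script actually picks up.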
-Nathan
On Tue, Oct 18, 2016 at 3:05 PM, tazkera haque <h.tazkera@gmail.com> wrote:
Hi Nathan,
Sorry to bother you again, but the problem persists even when working from the $WORK directory. I got the same error message again this morning with a different script. Do you see anything wrong with the script I attached? I have used it for a long time without any errors.
Thanks
On Tue, Oct 18, 2016 at 2:34 AM, tazkera haque <h.tazkera@gmail.com> wrote:
Hi Nathan,
I figured out what was going wrong: I submitted my job script from the $SCRATCH folder. Apparently jobs can only be submitted from the $WORK folder on Stampede. Thanks for your prompt response, though.
On Tue, Oct 18, 2016 at 1:22 AM, tazkera haque <h.tazkera@gmail.com> wrote:
Hi Nathan,
I tried with one file in my IPython notebook, and it seems to work there.
On Tue, Oct 18, 2016 at 12:54 AM, tazkera haque <h.tazkera@gmail.com> wrote:
Yes, it's being run in parallel. I didn't check with one core yet; I will let you know what happens then.
On Tue, Oct 18, 2016 at 12:50 AM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
Is the script being run in parallel? If so, does it crash if you run it on only one core?
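A quick serial check, independent of yt, is to open each checkpoint file with h5py in a single process and list its root group, which is the same call that fails in the traceback. This is a rough sketch, not part of yt: check_hdf5_files is a hypothetical helper, and the file names in the usage line are placeholders.

```python
def check_hdf5_files(paths):
    """Try to list the root group of each file in a single process.

    Returns a dict mapping each path to None on success, or to an
    error string on failure (hypothetical helper for debugging).
    """
    try:
        import h5py
    except ImportError:
        return {path: "h5py is not installed" for path in paths}

    results = {}
    for path in paths:
        try:
            with h5py.File(path, "r") as fileh:
                # Same access pattern as the failing line in the traceback.
                list(fileh["/"].keys())
            results[path] = None
        except Exception as err:  # h5py may raise OSError or RuntimeError here
            results[path] = "%s: %s" % (type(err).__name__, err)
    return results


if __name__ == "__main__":
    import sys
    # Usage (placeholder file names): python check_files.py chk_0001 chk_0002
    for path, error in sorted(check_hdf5_files(sys.argv[1:]).items()):
        print("%s: %s" % (path, "OK" if error is None else error))
```

If every file opens cleanly on one core but the parallel job still fails, that points toward concurrent access rather than a damaged file.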
Nathan
On Monday, October 17, 2016, tazkera haque <h.tazkera@gmail.com> wrote:
> Hi people,
>
> I am using yt 3.3.1 and submitting my SLURM script to Stampede. I was using this script to find abundances of C, O, etc. from checkpoint files in FLASH. While my script worked fine with the old yt (3.1), it suddenly crashed today and returned the following error:
>
> yt : [INFO ] 2016-10-17 23:22:24,295 Parameters: current_time = 28.1847530806
> yt : [INFO ] 2016-10-17 23:22:24,295 Parameters: domain_dimensions = [128 128 128]
> yt : [INFO ] 2016-10-17 23:22:24,296 Parameters: domain_left_edge = [ -2.80000000e+10 -2.80000000e+10 -2.80000000e+10]
> yt : [INFO ] 2016-10-17 23:22:24,296 Parameters: domain_right_edge = [ 2.80000000e+10 2.80000000e+10 2.80000000e+10]
> yt : [INFO ] 2016-10-17 23:22:24,296 Parameters: cosmological_simulation = 0.0
> Executin lessg abundance.py
> Traceback (most recent call last):
>   File "abundance2.py", line 304, in <module>
>     main(chkFilenames_own)
>   File "abundance2.py", line 59, in main
>     pf = yt.load(filenames[n])
>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/convenience.py", line 79, in load
>     if c._is_valid(*args, **kwargs): candidates.append(n)
>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/frontends/flash/data_structures.py", line 478, in _is_valid
>     if "bounding box" not in fileh["/"].keys() \
>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/base.py", line 368, in keys
>     return list(self)
>   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
>   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/group.py", line 303, in __len__
>     return self.id.get_num_objs()
>   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
>   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
>   File "h5py/h5g.pyx", line 321, in h5py.h5g.GroupID.get_num_objs (/home/ilan/minonda/conda-bld/work/h5py/h5g.c:4194)
> RuntimeError: Can't determine (Bad symbol table node signature)
> [c560-102.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 22137) exited with status 1
> TACC: MPI job exited with code: 1
>
> TACC: Shutdown complete. Exiting.
>
> I was wondering if there is something wrong with my code or the new yt. I am also attaching my code here to look at. Thanks in advance.
>
> Best
> Tazkera
_______________________________________________ yt-users mailing list yt-users@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org