error while submitting development queue job to STAMPEDE supercomputer
Hi people,
I am not sure if this is the right place to ask, but I submitted a development queue job to the STAMPEDE supercomputer, using the yt toolkit in my script, and got the following error after around half an hour of running:
[c557-702.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 26868) terminated with signal 9 -> abort job
[c557-702.stampede.tacc.utexas.edu:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node c557-702 aborted: MPI process error (1)
TACC: MPI job exited with code: 1
Can anyone please shed light on this error? With the development queue, the maximum code runtime is 2 hours. Thanks in advance.
-Turhan
On Thursday, May 26, 2016, turhan nasri wrote:
> Hi people,
> though I am not sure if this is the right place to ask this, but I have submitted a development queue job to STAMPEDE supercomputer and I was using yt toolkit in my script, I got the following error after around half an hour of running the job.
> [c557-702.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 26868) terminated with signal 9 -> abort job
This is the important bit. Signal 9 is SIGKILL, so if I had to guess, I'd say that your job ran out of memory and the operating system on the compute node killed it. Can you request a node with more RAM?
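One way to test the out-of-memory guess before asking for a bigger node is to have the script report its own peak memory as it runs. The following is only a minimal sketch using the Python standard library, not anything provided by yt or TACC; the helper names `describe_signal`, `peak_rss_mib`, and `log_memory` are made up for illustration, and the list allocation is a stand-in for whatever your script actually loads.

```python
import resource
import signal
import sys


def describe_signal(signum):
    """Return the symbolic name of a signal number, e.g. 9 -> 'SIGKILL'."""
    return signal.Signals(signum).name


def peak_rss_mib():
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(label):
    """Print a labelled peak-memory reading.

    flush=True matters here: a process killed by SIGKILL gets no chance
    to flush buffered output, so unflushed readings would be lost.
    """
    print(f"[mem] {label}: peak RSS {peak_rss_mib():.1f} MiB", flush=True)


if __name__ == "__main__":
    print(describe_signal(9))
    log_memory("start")
    data = [0.0] * 5_000_000  # hypothetical stand-in for loading a dataset
    log_memory("after allocation")
```

If you call something like `log_memory` after each expensive step, the last reading in the job's output before the kill shows how close the process was to the node's memory limit. In an MPI run each rank is a separate process, so each one would report its own figure.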
> [c557-702.stampede.tacc.utexas.edu:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node c557-702 aborted: MPI process error (1)
> TACC: MPI job exited with code: 1
> can anyone please shed light on the error here? with the development queue the maximum code runtime is 2 hours. thanks in advance.
> -Turhan
participants (2)
- Nathan Goldbaum
- turhan nasri