TimeSeriesData.piter() exiting early

Hi all,

I'm processing some simulations that I have stored on Pleiades at NASA Ames. This machine uses SGI's version of MPI, so it's possible that this issue is isolated to that MPI.

When I run ts.piter() on this machine, it will sometimes die early, before all of the parameter files have been processed, with the following error:

    MPI: MPI_COMM_WORLD rank 7 has terminated without calling MPI_Finalize()
    MPI: aborting job

This particular script saves a bunch of plots to disk. Taking a look at the list of plots that have been saved, it seems that it got about halfway through before dying. The first ~third of the plots are all on disk, most of the middle ~third are saved although some are missing, and the last ~third haven't been saved at all.

Any ideas what's going wrong here? Is there something simple I can do to fix it? This isn't game-ending, since I can always process my data in serial, but it is a bit annoying.

Cheers,
Nathan
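For reference, the script follows the usual time-series pattern, roughly like this (the filename glob and the field are simplified stand-ins for what I actually use):

    from yt.mods import *

    # Build a time series from a set of outputs and iterate over it in
    # parallel; each MPI rank gets a share of the parameter files.
    ts = TimeSeriesData.from_filenames("DD????/DD????")
    for pf in ts.piter():
        p = SlicePlot(pf, "z", "Density")
        p.save()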

Hi Nathan,

I've run into a problem like this before on Pleiades. I've solved it by manually calling MPI.Finalize() at the end of the script:

    from mpi4py import MPI

    [...script...]

    MPI.Finalize()

This will allow all of the processors to finish before exiting the script. If you don't call MPI_Finalize, the job will exit once processor 0 exits. I've only seen this behavior with SGI's MPI.

John
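(The one thing to watch is placement: MPI.Finalize() should sit at module scope, after the piter loop, so that every rank reaches it only once its share of the plots has been written out.)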
--
John Wise
Assistant Professor of Physics
Center for Relativistic Astrophysics, Georgia Tech

Hi John and Nathan,

There's a barrier-at-end option with parallel_objects; could that also fix this, if we had it for piter as well?

-Matt
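For comparison, using parallel_objects for the same job looks roughly like this (I'm writing the import path and the filename list from memory, so treat those details as unverified):

    from yt.mods import *
    from yt.utilities.parallel_tools.parallel_analysis_interface import \
        parallel_objects

    fns = ["DD%04d/DD%04d" % (i, i) for i in range(50)]  # placeholder list

    # barrier=True makes every rank wait at the end of the loop until
    # all ranks have finished their share of the work.
    for fn in parallel_objects(fns, barrier=True):
        pf = load(fn)
        SlicePlot(pf, "z", "Density").save()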

Hi Matt,

I think a barrier in piter() would solve this problem.

Thanks,
John
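In the meantime, an explicit barrier before exiting should have the same effect; a minimal sketch using mpi4py directly:

    from mpi4py import MPI

    # [...piter loop that makes and saves the plots...]

    MPI.COMM_WORLD.Barrier()  # each rank blocks here until all arrive
    MPI.Finalize()            # then shut MPI down cleanly on every rank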
--
John Wise
Assistant Professor of Physics
Center for Relativistic Astrophysics, Georgia Tech
Participants (3):
- John Wise
- Matthew Turk
- Nathan Goldbaum