Rockstar on Blue Waters
Folks,

I am trying to run Rockstar on Blue Waters, and I am getting an error even for a very simple setup: 1 writer and 1 reader running on a single node on a small test run. The same job runs just fine on a workstation.

The script is simple:

---------------
import yt, yt.funcs
import glob, sys, os.path

yt.enable_parallelism()
from yt.analysis_modules.halo_finding.rockstar.api import RockstarHaloFinder

root = "../B20"
n = 1

dirs = ( "D" )

for dir in dirs:
    sets = glob.glob(root+"/"+dir+"/rei20c1_a0.*/*.art")
    sets.sort()
    ts = yt.DatasetSeries(sets)
    rh = RockstarHaloFinder(ts, num_readers=n, num_writers=n, outbase=root+"/"+dir+"/rs", dm_only=True, particle_type='N-BODY')
    rh.run()
---------------

and the batch script is trivial:

--------
#!/bin/csh
#PBS -l nodes=1:ppn=32:xe
#PBS -l walltime=30:00,vmem=64gb
#PBS -q debug
#PBS -N rshfc.1
#PBS -e $PBS_JOBID.err
#PBS -o $PBS_JOBID.out
#PBS -m be

cd $PBS_O_WORKDIR

module load bwpy
module load bwpy-mpi

setenv APRUN_XFER_LIMITS 1
limit stacksize unlimited

aprun -n 3 -N 3 python3 rs-hfc.py >& std.rshfc1
--------

The error I get is not very informative:

[Error] Couldn't open ^A:0! (Err: Name or service not known)

and the ^A string varies between runs, cf.

[Error] Couldn't open 344ei20:0! (Err: Name or service not known)

which may happen if a null or corrupted pointer is printed with a %s specification, but that's all I can guess. Is there a way to obtain more information and localize the offending code segment?
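One cheap way to get a little more information, offered only as a sketch: on Linux, "Name or service not known" is the message the C library reports when name resolution fails for a host string, so it can be worth confirming from Python that the compute node's own hostname resolves before Rockstar starts. The helper name `can_resolve` is hypothetical, and this checks only the system resolver, not whatever host string Rockstar builds internally.

```python
import socket


def can_resolve(host, port=0):
    """Return True if the system resolver accepts `host` (name or address)."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver rejected the name.
        # UnicodeError: the name could not even be encoded for lookup.
        return False


if __name__ == "__main__":
    # Run this under aprun so that it executes on the compute node.
    name = socket.gethostname()
    print("hostname:", repr(name), "resolves:", can_resolve(name))
```

If the hostname resolves cleanly here, the garbage in the error message more likely originates in how the host string is passed into Rockstar than in the node's network configuration.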
Hi Nick,

I think you're running into a bug related to running rockstar with Python3. This has been fixed as of the last major yt release and the release of the yt_astro_analysis package. I just tried a small rockstar test on Blue Waters using yt_astro_analysis and it worked for me. If you want to go that route, I recommend following the install instructions here:
https://yt-astro-analysis.readthedocs.io/en/latest/
and using the --user flag when you install with pip.

Britton

On Wed, Jan 23, 2019 at 8:50 AM Nick Gnedin <ngnedin@gmail.com> wrote:
Folks,
I am trying to run Rockstar on Blue Waters, and I am getting an error even for a very simple setup: 1 writer and 1 reader running on a single node on a small test run. The same job runs just fine on a workstation.
[...]
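For anyone moving a script like this over to yt_astro_analysis, the main change is likely to be the import line. The sketch below tries a list of candidate module paths and reports which one provided RockstarHaloFinder. Both paths are assumptions about package layout (the first is where yt_astro_analysis 1.0 is expected to expose the finder through yt.extensions, the second is the yt 3.x path from the original script), and the helper name `load_rockstar_finder` is hypothetical; please check the documentation for the installed version.

```python
import importlib

# Candidate module paths, newest first. Both are ASSUMPTIONS about where the
# finder lives; adjust them to match the installed yt / yt_astro_analysis.
CANDIDATES = (
    "yt.extensions.astro_analysis.halo_finding.rockstar.api",
    "yt.analysis_modules.halo_finding.rockstar.api",
)


def load_rockstar_finder(candidates=CANDIDATES):
    """Return (RockstarHaloFinder, module_path), or (None, None) if not found."""
    for path in candidates:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # this layout is not installed; try the next one
        finder = getattr(module, "RockstarHaloFinder", None)
        if finder is not None:
            return finder, path
    return None, None


if __name__ == "__main__":
    finder, path = load_rockstar_finder()
    print("RockstarHaloFinder loaded from:", path)
```

Printing the path that succeeded makes it easy to confirm in the job output whether the run used the older bundled module or the separately installed package.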
_______________________________________________
yt-users mailing list -- yt-users@python.org
To unsubscribe send an email to yt-users-leave@python.org
participants (2)
- Britton Smith
- Nick Gnedin