Here's a list of my loaded modules:

 1) modules/3.1.6.5
 2) torque/2.4.8
 3) moab/5.4.3.s16991
 4) /opt/cray/xt-asyncpe/default/modulefiles/xtpe-istanbul
 5) tgusage/3.0-r2
 6) altd/1.0
 7) DefApps
 8) xtpe-target-cnl
 9) xt-service/2.2.74
10) xt-os/2.2.74
11) xt-boot/2.2.74
12) xt-lustre-ss/2.2.74_1.6.5
13) cray/job/1.5.5-0.1_2.0202.21413.56.7
14) cray/csa/3.0.0-1_2.0202.21426.77.7
15) cray/account/1.0.0-2.0202.19482.49.18
16) cray/projdb/1.0.0-1.0202.19483.52.1
17) Base-opts/2.2.74
18) PrgEnv-gnu/2.2.74
19) xt-asyncpe/4.9
20) xt-pe/2.2.74
21) xt-mpt/5.2.3
22) pmi/2.1.2-1.0000.8396.13.1.ss
23) xt-libsci/10.5.02
24) gcc/4.5.3
25) cray/MySQL/5.0.64-1.0202.2899.21.1
26) xt-mpt/5.0.0
27) yt/2.1
On Tue, Jun 7, 2011 at 10:59 AM, Matthew Turk wrote:
Hi Anthony,
Stephen and I have chatted about this -- he brought up that some MPI implementations are more tolerant than others. What are the contents of your module list?
It's also possible that we need to specify types in the Allreduce call. I am not sure why that would cause problems for you and not me, however.
-Matt
On Mon, Jun 6, 2011 at 7:40 PM, Matthew Turk wrote:
Hi Anthony,
Using the grid cutting method, which I thought might be causing problems, I was again unable to reproduce the issue. Would you mind running with --detailed and sending me the log file off-list, so that I can examine the problematic output?
-Matt
On Mon, Jun 6, 2011 at 7:17 PM, Anthony Harness wrote:
The array shouldn't be too small. The data contain 1024^3 cells (20 million of which fall within the cut_region), and I am running on 60 processors (120 doesn't work either). This is my script:

from yt.mods import *
from yt.analysis_modules.api import EnzoSimulation
import numpy as na
import time
from krakenPlugins import *  # assumed to provide saveDir, used below
from mpi4py import MPI

###########################################################

simName = '50Mpc_1024unigrid.par'
dataDir = '/lustre/scratch/britton/box_size_study/50Mpc_1024/run_17f_cl_5D'

es = EnzoSimulation('%s/%s' % (dataDir, simName), get_redshift_outputs=False)

dataCntr = 0
numBins = 1000
allHisty = na.array([na.zeros(numBins+1)])
allHistx = na.array([na.zeros(numBins+1)])

es = es.allOutputs[:85]
for output in es:
    pf = load('%s%s' % (dataDir, output['filename'][1:]))
    dd = pf.h.all_data()
    pc = PlotCollection(pf)
    cut = dd.cut_region(["grid['Metallicity'] <= 1.e-6",
                         "grid['Temperature'] <= 10.**5.",
                         "grid['Temperature'] >= 300.",
                         "grid['Baryon_Overdensity'] >= 1.",
                         "grid['Baryon_Overdensity'] <= 100."])
    pc.add_profile_object(cut, ['Density', 'Ones'], weight=None,
                          x_bins=numBins, x_log=True)
    ones = pc.plots[-1].data["Ones"]
    bod = pc.plots[-1].data["Density"]
    allHisty = na.concatenate((allHisty, [ones]))
    allHistx = na.concatenate((allHistx, [bod]))
    dataCntr += 1
    del pf, dd, pc, cut, ones, bod

if MPI.COMM_WORLD.rank == 0:
    # the %s needs an argument; filling in UTC time here
    print '***Saving to .npy file. UTC Time: %s***' % time.asctime(time.gmtime())
    na.save('%s/histograms_y.npy' % saveDir, allHisty)
    na.save('%s/histograms_x.npy' % saveDir, allHistx)
On Mon, Jun 6, 2011 at 6:24 PM, Matthew Turk wrote:
Hi Anthony,
I tried it on a small dataset and I was unable to reproduce it. Do you think that the array is small enough that some of the processors aren't getting any data? I was able to get the profile command to work all the way down to arrays of size 19, run on 20 processors.
Could you post the entirety of your script?
-Matt
On Mon, Jun 6, 2011 at 5:15 PM, Anthony Harness wrote:
Hello,
I am trying to add a profile object to a PlotCollection (via pc.add_profile_object(data, fields)) while running in parallel on Kraken. I get the following error:

TypeError: message: expecting a list or tuple

which ultimately comes from mpi4py.MPI.Comm.Allreduce, which is called by ParallelAnalysisInterface._mpi_allsum(). In ._mpi_allsum() there is the following comment: "# We use old-school pickling here on the assumption the arrays are relatively small ( < 1e7 elements )". The dataset I am working with is larger than 1e7 elements, so is _mpi_allsum unable to pass such a large array to Comm.Allreduce?
Thanks, Anthony
Here is the traceback:
File "/yt-2.1stable-py2.7-linux-x86_64.egg/yt/data_objects/profiles.py", line 146, in add_fields self._lazy_add_fields(fields, weight, accumulation) File "/yt-2.1stable-py2.7-linux-x86_64.egg/yt/data_objects/profiles.py", line 94, in _lazy_add_fields for gi,grid in enumerate(self._get_grids(fields)): File
"/yt-2.1stable-py2.7-linux-x86_64.egg/yt/utilities/parallel_tools/parallel_analysis_interface.py",
line 134, in __iter__ if not self.just_list: self.pobj._finalize_parallel() File "/yt-2.1stable-py2.7-linux-x86_64.egg/yt/data_objects/profiles.py", line 122, in _finalize_parallel self.__data[key] = self._mpi_allsum(self.__data[key]) File
"/yt-2.1stable-py2.7-linux-x86_64.egg/yt/utilities/parallel_tools/parallel_analysis_interface.py",
line 185, in passage return func(self, data) File
"/yt-2.1stable-py2.7-linux-x86_64.egg/yt/utilities/parallel_tools/parallel_analysis_interface.py",
line 1124, in _mpi_allsum MPI.COMM_WORLD.Allreduce(data, tr, op=MPI.SUM) File "Comm.pyx", line 530, in mpi4py.MPI.Comm.Allreduce (src/mpi4py_MPI.c:43646) File "message.pxi", line 426, in mpi4py.MPI._p_msg_cco.for_allreduce (src/mpi4py_MPI.c:14446) File "message.pxi", line 33, in mpi4py.MPI.message_simple (src/mpi4py_MPI.c:11108) TypeError: message: expecting a list or tuple
_______________________________________________
yt-users mailing list
yt-users@lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org