You may find this interesting.
Yesterday I did two runs of the halo profiler on L7. I took the full HopAnalysis.out file and split it into 16 files in two ways: once in round-robin fashion, and once by spatial decomposition. For each setup I then ran with 16 separate parameter files, one per HOP file, like this:
from mpi4py import MPI
if MPI.COMM_WORLD.rank == 0:
    q = HP.HaloProfiler(blah, 'round-robin01.par')
    q.makeProfiles()
if MPI.COMM_WORLD.rank == 1:
    ...'round-robin02.par'
    q.makeProfiles()
...
...
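For reference, the round-robin split and the per-rank choice of parameter file can be sketched in plain Python. This is only an illustration: the helper names round_robin_split and par_file_for_rank are made up here, the header handling of HopAnalysis.out is not shown, and the file name pattern simply follows the 'round-robin01.par' naming above.

```python
def round_robin_split(lines, n_parts=16):
    """Deal the lines out to n_parts lists, one at a time (hypothetical helper)."""
    return [lines[i::n_parts] for i in range(n_parts)]


def par_file_for_rank(rank, prefix="round-robin"):
    """Map rank 0 -> 'round-robin01.par', rank 1 -> 'round-robin02.par', etc.

    The naming pattern is assumed from the snippet above.
    """
    return "%s%02d.par" % (prefix, rank + 1)


# Toy input: 40 placeholder halo entries, not real HOP output.
halos = ["halo %d" % i for i in range(40)]
parts = round_robin_split(halos)
```

With a helper like par_file_for_rank, each process could look up its own file from MPI.COMM_WORLD.rank instead of going through a chain of if statements.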
The round-robin run took a little under ten hours to finish, while the spatial decomposition took closer to eleven hours. This is because the number of haloes per task wasn't as well balanced in the spatial case. The rate of haloes per minute per process was more or less identical for both methods.
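To illustrate why the balance matters: if every process works through haloes at about the same rate, the total run time is set by whichever process has the most haloes. The sketch below uses made-up toy counts and a made-up rate, not the actual L7 numbers, and wall_time_minutes is a hypothetical helper.

```python
def wall_time_minutes(halo_counts, haloes_per_minute):
    """Estimate run time when all tasks run concurrently at the same rate.

    The busiest task finishes last, so it sets the total time.
    """
    return max(halo_counts) / float(haloes_per_minute)


# Toy numbers only: same total number of haloes, different balance.
balanced = [100, 100, 100, 100]
unbalanced = [70, 90, 110, 130]
rate = 2.0  # haloes per minute per process (assumed)

t_balanced = wall_time_minutes(balanced, rate)
t_unbalanced = wall_time_minutes(unbalanced, rate)
```

In this toy case the unbalanced split takes longer even though the total work and the per-process rate are the same.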
_______________________________________________________
firstname.lastname@example.org           o__  Stephen Skory
http://physics.ucsd.edu/~sskory/    _.>/ _Graduate Student
________________________________(_)_\(_)_______________