Fwd: [mpi4py] Python on 10K of cores on BG/P
Just as a note, moving forward.

---------- Forwarded message ----------
From: Brian Granger <ellisonbg@gmail.com>
Date: Wed, Feb 10, 2010 at 11:34 AM
Subject: Re: [mpi4py] Python on 10K of cores on BG/P
To: mpi4py <mpi4py@googlegroups.com>
We have been developing an electronic structure simulation software, GPAW (https://wiki.fysik.dtu.dk/gpaw/). The software is written mostly in Python, with the core computational routines in C extensions. For parallel calculations we use MPI, which is called both from C and from Python (through our own Python interfaces for the MPI calls we need).
Nice!
We have run the code successfully on different supercomputing architectures such as Cray XT5 and Blue Gene. However, as we move to thousands or tens of thousands of processes, one limitation of the current approach has become evident: at start-up time, the imports of Python modules take an increasing amount of time as a huge number of processes try to read the same .py/.pyc files, and the filesystem cannot handle this efficiently.
Yes, I can imagine that if the .py files are on a shared filesystem, things would grind to a halt. The best way to fix this is to simply install all the .py files on the local disks of the compute nodes... assuming the compute nodes have local disks :-). If they don't have local disks, you are in a really tough situation.

In some cases it is feasible to think about saving the state of the Python interpreter (along with imported modules), but in this case I am doubtful that will work. If you are importing Python modules that link to C/C++/Fortran code, this will be very difficult. Furthermore, if your Python code is calling MPI, you will also have to handle the fact that you have a live MPI universe with open sockets and so on. Separating out the parts that you can/want to send from the parts you can't/don't want to send will be quite a mess. AND, even if you are able to serialize the entire state of the Python interpreter, you will still have to scatter it to all compute nodes (and deserialize it), which is what the shared filesystem is doing to begin with. While this scatter may take place over a faster interconnect, you won't be able to get rid of it.

Thus, in my mind, using a local disk is the only reasonable way to go. I realize it is likely that the local-disk solution is not an option for you. In that case, I think you should go back to Cray and ask for an upgrade ;-)

Cheers,

Brian
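If the compute nodes do happen to have some node-local storage, a minimal sketch of the "install on local disk" suggestion with mpi4py could look like the following. The archive path and /tmp target are made-up examples, the MPI-3 Split_type call is used only to pick one rank per node, and the interpreter plus mpi4py itself still have to come from the shared filesystem; this is an illustration of the idea, not a tested recipe.

    import os
    import sys
    import tarfile

    from mpi4py import MPI

    SHARED_ARCHIVE = "/shared/project/python-env.tar"  # hypothetical shared-FS path
    LOCAL_DIR = "/tmp/python-env"                      # hypothetical node-local path

    world = MPI.COMM_WORLD

    # Group ranks by node (MPI-3 shared-memory split) so exactly one rank
    # per node copies the packed Python environment to local storage.
    node = world.Split_type(MPI.COMM_TYPE_SHARED)
    if node.rank == 0 and not os.path.isdir(LOCAL_DIR):
        with tarfile.open(SHARED_ARCHIVE) as tar:
            tar.extractall(LOCAL_DIR)
    node.Barrier()  # wait until the local copy exists on this node

    # From here on, imports of the staged packages hit local disk,
    # not the shared filesystem.
    sys.path.insert(0, LOCAL_DIR)

The point of the per-node split is that the shared filesystem is read once per node rather than once per rank, which is where the start-up contention comes from.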
Is it possible to modify the Python interpreter in order to have a single process do the import and then broadcast the data to the rest of the tasks?
--
Nichols A. Romero, Ph.D.
Argonne Leadership Computing Facility
Argonne, IL 60490
(630) 252-3441 (O)
(630) 470-0462 (C)
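The single-reader idea in that last question can be sketched with mpi4py and a custom import hook: rank 0 locates and reads each plain .py module, then broadcasts the source so no other rank touches the shared filesystem. The sketch below uses today's importlib meta-path API (which postdates this thread) and is only an illustration of the idea, not GPAW's actual solution; packages and C extension modules deliberately fall through to the normal machinery, and every rank must perform the same imports in the same order or the collective calls will hang.

    import importlib.abc
    import importlib.machinery
    import importlib.util
    import sys

    from mpi4py import MPI


    class BroadcastLoader(importlib.abc.Loader):
        """Execute a module from source bytes that rank 0 broadcast."""

        def __init__(self, source, origin):
            self.source = source
            self.origin = origin

        def create_module(self, spec):
            return None  # use the default module object

        def exec_module(self, module):
            exec(compile(self.source, self.origin, "exec"), module.__dict__)


    class BroadcastFinder(importlib.abc.MetaPathFinder):
        """Only rank 0 searches the filesystem; other ranks receive the source.

        All ranks must import the same modules in the same order, otherwise
        the collective bcast calls will not match up.
        """

        def __init__(self, comm):
            self.comm = comm

        def find_spec(self, fullname, path=None, target=None):
            if self.comm.rank == 0:
                spec = importlib.machinery.PathFinder.find_spec(fullname, path)
                if (spec is not None and spec.origin
                        and spec.origin.endswith(".py")
                        and not spec.origin.endswith("__init__.py")):
                    with open(spec.origin, "rb") as f:
                        payload = (spec.origin, f.read())
                else:
                    payload = None  # packages, extension modules, not found
            else:
                payload = None
            payload = self.comm.bcast(payload, root=0)
            if payload is None:
                return None  # fall back to the normal import machinery
            origin, source = payload
            return importlib.util.spec_from_loader(
                fullname, BroadcastLoader(source, origin), origin=origin)


    # Install the hook; plain-module imports are now read once, on rank 0 only.
    sys.meta_path.insert(0, BroadcastFinder(MPI.COMM_WORLD))

The key design constraint is that find_spec becomes a collective operation, so this only works when all ranks follow the same import sequence, which is usually the case for SPMD codes of this kind.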
--
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu
ellisonbg@gmail.com