[IPython-dev] Performance sanity check: 7.21s to scatter 5000X1000 float array to 7 engines

Fernando Perez fperez.net at gmail.com
Fri Jan 11 14:09:54 EST 2008


Hi Anand,

I'm keeping the list CC'd here so that this is archived for the
benefit of others.

On Jan 11, 2008 11:43 AM, Anand Patil <anand.prabhakar.patil at gmail.com> wrote:
> Thanks Fernando,
>
>
> > One question: is it possible for you to organize your code runs so
> > that the engines get their local data via other small parameters?
> > Sometimes, instead of generating a large random array and
> > scattering it, you can seed the local RNGs differently and generate
> > the data locally (on the engines), or you can have each engine read
> > its data over a network filesystem, etc.
> >
>
> Unfortunately I'm using the engines to fill in the entries of a matrix,
> which I then need to gather for some linear algebra that would be really
> hard to do in pieces without passing around large amounts of data.
>
>
> > Another possibility is to start your engine group via mpirun and then
> > have say engine 0 do an *MPI* scatter of an array, which is then used
> > by the others.
>
> This would solve my problem. I actually tried doing this in two
> different ways (with OpenMPI):
>
> 1) I started the IPengines using mpirun, then started an IPython session and
> imported mpi4py. MPI.COMM_WORLD.size was 1, which I took to mean that mpi4py
> couldn't see the IPengines, although the IPython1 remote controller instance
> was able to work with them.
>
> 2) I started several instances of IPython with mpirun:
>
> mpirun -n 4 ipython
>
> However, the IPython shell acted really weird, to the point of being
> unusable. I can make a video of it if it would help.
>
> What would be the correct way to use mpi4py with IPython and/or the
> IPengines?

I'm a bit swamped right now so I'll only give you a very rough idea.
I'll try, tonight, to put together a small document/mini tutorial on
how to do this type of work.

The basic idea is to start your engine group as an MPI world, then
have the engines connect to the controller, and from the controller,
tell engine 0 to create the array and scatter it over MPI.  So the
engines are an MPI world, but the controller and client don't need to
be.  Is that clear enough?  If not, I'll provide step-by-step
instructions (with code) later...
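
To make the idea a bit more concrete, here is a minimal sketch of what
each engine would run. It assumes the engines were started under mpirun
and that a current mpi4py and NumPy are installed on them. The names
row_counts and scatter_rows are made up for illustration, and the random
array is only a stand-in for real data. How the call is pushed to the
engines from the controller depends on the IPython version, so that part
is left out.

```python
def row_counts(n_rows, n_engines):
    """Split n_rows as evenly as possible across n_engines.

    The first (n_rows % n_engines) ranks each get one extra row.
    """
    base, extra = divmod(n_rows, n_engines)
    return [base + 1 if rank < extra else base for rank in range(n_engines)]


def scatter_rows(comm, n_rows, n_cols, root=0):
    """Create an (n_rows, n_cols) float array on `root` and scatter it by rows.

    Every rank in `comm` must call this; each gets back its own block of rows.
    mpi4py and NumPy are imported here (an illustrative choice) so that
    row_counts stays usable on machines without MPI.
    """
    import numpy as np
    from mpi4py import MPI

    rank = comm.Get_rank()
    size = comm.Get_size()

    counts = row_counts(n_rows, size)
    # Scatterv wants counts and displacements in elements, not rows.
    sendcounts = [c * n_cols for c in counts]
    displs = [sum(sendcounts[:i]) for i in range(size)]

    if rank == root:
        # Placeholder data: only the root rank ever holds the full array.
        full = np.random.rand(n_rows, n_cols)
        sendbuf = [full, sendcounts, displs, MPI.DOUBLE]
    else:
        sendbuf = None

    local = np.empty((counts[rank], n_cols), dtype=np.float64)
    comm.Scatterv(sendbuf, local, root=root)
    return local


# On each engine (hypothetical usage, inside the mpirun-started world):
#   from mpi4py import MPI
#   local = scatter_rows(MPI.COMM_WORLD, 5000, 1000)
```

With 7 engines and 5000 rows, row_counts gives two engines 715 rows and
the other five 714. The large array then travels over MPI between the
engines and never passes through the controller or the client.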

regards,


f


