[IPython-dev] IPython1 parallel questions
ellisonbg.net at gmail.com
Sat Dec 16 12:10:28 EST 2006
On 12/15/06, Douglas Jones <dfj225 at gmail.com> wrote:
> Thank you for the response.
> I've been moving forward with the integration of my library with
> IPython. Unfortunately, I've come across an issue that has prevented
> our library from functioning correctly.
> First, let me give a little background. Right now, our library is
> loaded in each engine by importing it as a python module. There are
> cases in our library when a function will be executed on each node,
> but one node may take slightly longer to process. For instance, when
> reading input node 0 is responsible for scanning the file and
> distributing data to each process. Code of this fashion takes
> advantage of MPI_Win_lock and MPI_Win_unlock when sending data to
> other processes.
I have never used these MPI calls, but hopefully I can help clarify
what ipython1 does and the issues to watch out for (see below).
> The issue I've come across is that at times these calls to MPI code
> will deadlock in certain instances. What I think is going on is that
> the functions in our library that do relatively little will return,
> causing the IPEngine to continue to its input handler code (in effect
> blocking on the engine's socket). If, after this has happened, another
> node tries to do MPI_Win_lock or unlock the call will deadlock because
> MPI is unable to communicate with the other nodes that are now blocked
> in IPEngine code.
> At least, this is my educated guess as to what is happening.
This is a very plausible scenarrio.
> Is this an issue that you've had to work out before?
At some level yes. I think it is best to describe the execution flow
in each engine and how that relates to the synchronization issues of
MPi. Each engine repeats the following loop:
1. Wait for the controller to send it a command. During this period,
it doesn't have anything to do so it is idle. But, it doesn't "block"
on the socket as it uses non-blocking sockets. Underneath it is
iterating through a select/poll loop that is watching for incoming
2. When the controller send the engine a command to execute, the
engine's event loop begins the execution of the command. While the
command is running, the event loop that is watching for network
traffic effectively stops.
3. When the command is done, the engine sends the results back to the engine.
4. The engine waits for further commands.
The one important point is that each user command can be many atomic
commands underneath. For instance, you might make one python function
call that in turns calls a C++ class method, which in turn makes
dozens of MPi calls.
In terms of synchronizing the different engines for "safe" MPI usage,
ipython1 makes the same guarentees that you get in regular MPi
programs: you must use explicit MPI Barrier calls to enforce
synchronization. At that level the only difference between using
ipython1 compared with a normal MPI code is where the commands are
coming from (over the network vs from compiled code).
> Is there some way to ensure that each engine will not block on its
> socket until every engine has had a chance to execute the user code
Each command that is sent to an engine is always completed before the
engine begins to get another command.
> I guess what I'm thinking of is similar to an MPI_Barrier such that
> each engine would
> #once each engine is at this point, the engines can return to their internal
> #communication code
I would instead, put the MPI_Barrier() in the user code at the right places.
> I greatly appreciate your feedback.
Let me know if this solves some of your problems. If the problems
persist, you might want to email the mpi4py mailing list. Lisandro
Dalcin is a true MPI pro.
> Thank you,
> On 12/14/06, Brian Granger <ellisonbg.net at gmail.com> wrote:
> > Douglas,
> > See my comments inline below.
> > > I recently discovered IPython1 and its parallel capabilities. Once I
> > > read about the feature set, I became extremely excited as IPython1
> > > seems to solve many of the interactive parallel computing problems
> > > that I had hoped to solve for a project that I am a developer on.
> > >
> > > Let me start by framing my project. Our code currently exists as
> > > library written in C++. Our library does parallel computation using
> > > MPI. We currently have code that exposes this to python in an
> > > interactive manner, and have been looking for ways to expand this to a
> > > fuller more robust implementation. I'm hoping that IPython1 will be
> > > the perfect piece to create our collaborative, interactive
> > > environment.
> > Nice, this is one of the main usage cases we had in mind when
> > designing ipython1. It should work well for this.
> > > That said, as I investigate IPython further and start to develop some
> > > prototype code, I have some questions for the community and
> > > developers.
> > >
> > > 1) Are there any outstanding issues or problems with the IPython1 code
> > > base that inhibits parallel computation? I've been experimenting with
> > > the code and its features for the past few days in addition to looking
> > > through the bug list, and from what I can tell there are none. Are
> > > there any features mentioned in IPython's documentation or the
> > > presentations on the site that haven't been implemented yet?
> > Not really. IPython1 is already being used for real science and
> > people don't seem to be limited in any significant way. With that
> > said, there are a number of things we are still working on. Here is a
> > taste:
> > 1. Optimizing how large objects are send between the client and
> > engines. In our current approach, the controller become a bottleneck
> > when you try to use push/pull to send really big things (> 100's of
> > MBs). With that said, if you are wanting to send large objects
> > around, you might want to rethink how you are parallelizing you
> > algorithm.
> > 2. Task farming. While the architecture is setup for task farming,
> > we have not implemented a few parts of it. We are actively (as in
> > this week) working on this.
> > 3. Security.
> > 4. Scalability. Currently all the engines connect to a single
> > controller. We have tested this on 128 processors and it works fine.
> > The problem is that most systems have a per process file-descriptor
> > limit that is not much higher than that (like 256). We would like to
> > explore ways (multiple controllers?) of getting it to scale beyond
> > 128-256.
> > > 2) Is there any authentication or security between the client shell
> > > and the IPython controller?
> > Not yet, but we are working on it.
> > > 3) What are the proper ways to configure the operation of the client
> > > and the controller or the engines? From what I've seen it looks like
> > > there is an API that allows the user to set certain options. Can
> > > configuration files be used?
> > Yes, we have a very powerful configuration system. For examples see
> > the ipython1/configfiles directory. You can simply copy these over to
> > your ~/.ipython directory and start playing around.
> > The documentation on this part of things is still not great though.
> > Please let us know if you have questions.
> > > Some configuration issues I'm thinking of right now involve output.
> > > For instance, is there a way to turn off the feedback from each node
> > > for every line of code? I would imagine that if you had a cluster with
> > > many nodes running, you would not want to see this feedback. Another
> > > issue is how output on stdout from C++ code is handled. I noticed that
> > > right now, any output on stdout for a simple C++ module that I have
> > > exposed to python and loaded in each engine simply is simply written
> > > by the engine to the console. Is there some way to redirect this
> > > output?
> > There are two ways of executing code: blocking and non-blocking:
> > rc.executeAll('a=10', block=True)
> > This mode will wait until the command has been run and it will bring
> > back the stdout/stderr and print it to the screen. If the remote
> > command is long running, your local ipython session will remain
> > blocked until the command is complete.
> > rc.executeAll('time.sleep(1000000)', block=False)
> > In non-blocking mode, execute returns immediately after _submitting_
> > the command. Furthermore, it won't automatically bring back and print
> > the stdout/stderr of the command.
> > You can also set all command to block or not by using the block
> > attribute of the RemoteController object:
> > rc.block = False
> > I usually use block=True for debugging, but then set block=False for
> > long running things. Also, you can always get the stdout/stderr of a
> > previously run command by using the %result magic:
> > %result # print the stdout/stderr of the
> > last remote commad
> > %result 10 # print the stdout/stderr of the 10th
> > remote command
> > > 4) When objects are sent from the kernel to the client, is the only
> > > prerequisite to this that the object being sent is able to be pickled?
> > > I suppose the client would also have to have the class code for any
> > > objects imported?
> > Yes. Numpy arrays are sent using their raw buffers so for that case
> > you don't have to pay the price of pickling.
> > > 5) Are there any known projects that use IPython1 for interactive
> > > scientific computing? I'd be really interested to see ones that also
> > > support visualization of distributed data.
> > Yes. Two examples:
> > 1. At my company (Tech-X), there are a group of people using ipython1
> > to do parallel data analysis on supercomputers. They start with
> > 50-100 GBs of data in 1000-2000 hdf5 data files and need to do a bunch
> > of calculations that involve data from multiple files. They use
> > ipython1 to first reduce the data they want (in parallel) to a single
> > file and then then run an algorithm (again in parallel) over a set of
> > parameters. There were able to parallelize this code in 2 days and
> > its shows nice linear scaling.
> > 2. Fernando Perez (the creator of ipython) is using ipython1 in his
> > research in applied mathematics. His algorithm uses multiresolution
> > analysis to solve high dimensional partial differential equations. He
> > has just begun (in the last 2 weeks) to parallelize his code using
> > ipython1, so I don't think he is in production mode. His case is also
> > non-trivial as it 1) needs automatic load balancing and 2) has lots of
> > communications.
> > The reason we started working on this, is that both Fernando and I are
> > scientists (both got our Ph.D.'s in Theoretical Physics) and we wanted
> > these tools to exist so we could use them for our own research.
> > There are some other folks on the list that have playing around with
> > ipython1, but I am not sure if anyone else has moved into production
> > mode yet.
> > > Well, I think that covers all my initial questions. Thank you for
> > > bearing this long post and my newbie questions. I'm really looking
> > > forward to working more with IPython as it seems like an amazing piece
> > > of software.
> > Thanks! Please let us know if you have more questions/comments or ideas.
> > Brian
> > > Thank you,
> > > ~doug
> > > _______________________________________________
> > > IPython-dev mailing list
> > > IPython-dev at scipy.org
> > > http://projects.scipy.org/mailman/listinfo/ipython-dev
> > >
More information about the IPython-dev