[IPython-dev] Parallel map
Fernando Perez
fperez.net at gmail.com
Sat Mar 8 05:28:02 EST 2008
On Sat, Mar 8, 2008 at 2:03 AM, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:
> I succeeded (I had a good night's sleep in between) by piggybacking
> the ipcluster script. It is a bit ugly, but I post the code here for
> future reference.
>
> What made my task hard was both the fact that there is no obvious way of
> creating a cluster from Python, and the fact that ipython1.kernel.api was
> removed, while all the information I can find on the web uses
> ipython1.kernel.api.RemoteController.
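For reference, the "piggybacking" Gael describes can be approximated by shelling out to the ipcluster script from a Python script. This is only a sketch: the `-n` flag and the fixed sleep are assumptions about the script's interface, not a documented API.

```python
# Hedged sketch: start a controller plus engines by shelling out to the
# ipcluster script.  The "-n" flag and the fixed sleep are assumptions
# about the script's interface, not a documented API.
import subprocess
import time

def start_cluster(n_engines, dry_run=False):
    """Launch ipcluster with n_engines engines; return the process handle.

    With dry_run=True, only return the command line that would be run.
    """
    cmd = ["ipcluster", "-n", str(n_engines)]
    if dry_run:
        return cmd
    proc = subprocess.Popen(cmd)
    time.sleep(2)  # crude wait for the engines to register
    return proc

print(start_cluster(4, dry_run=True))  # -> ['ipcluster', '-n', '4']
```

The dry-run path exists only so the command construction can be inspected without actually spawning processes.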
We need to summarize the recent work done at the sage/scipy/ipython
sprint. In particular, Min did a lot of excellent work *precisely* on
this issue, most (if not all) of which is already committed, to
provide a full ipython daemon for process control. This allows you to
do exactly that, to create/control/destroy engines and/or controllers
from within python scripts.
The good thing is that you're already a bzr launchpad team member, so
I'm sure you'll soon be contributing this code and docs :)
> Now the irony is that I ended up not being able to use ipython1 for the
> problem I was interested in, as the objects I wanted to send to my
> parallel map were not picklable. I wrote a small hack using threading
> and os.system to do the work. I suspect this is a limitation people are
> going to bump into quite often. Ideas to make a workaround more or less
> part of ipython1 natively would be great. In my case, the objects I had to
> scatter were directly imported from a module, so scattering a module
> path as a string (e.g. 'ipython1.kernel.client.MultiEngineClient') would
> have been an option. I have no perspective on these problems, so I don't
> pretend to suggest a good solution.
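One way to flesh out the module-path idea from the quoted paragraph: ship the dotted path as a plain string (which always pickles) and re-import the object on the receiving side. The helper below is purely illustrative, not part of ipython1.

```python
# Illustrative helper, not part of ipython1: resolve a dotted path such
# as 'ipython1.kernel.client.MultiEngineClient' back into a live object.
import importlib

def load_by_path(dotted_path):
    """Turn 'package.module.name' into the object it names."""
    module_path, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr)

# The string pickles trivially, so it can be scattered to the engines,
# which then re-import the object locally:
sqrt = load_by_path("math.sqrt")
print(sqrt(16.0))  # -> 4.0
```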
We finally figured out a trick using the 'with' statement to allow you
to write code like
    with all_engines:
        do_in_remote_engines(..)

or

    with task_controller_using_engines(1,3,5):
        do_this()
        and_that()

etc...
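The snippet below is only a toy sketch of how such a context manager might dispatch work: while the block is active, functions look up the current target and record which engines they would run on. The real code Fernando mentions is not published here, and every name below is an illustrative assumption.

```python
# Toy sketch (all names here are assumptions, not the real ipython1 code):
# a context manager that marks which engines a block should target.
class EngineTarget:
    """While active as a context manager, tags calls with engine ids."""
    _active = None  # the currently entered target, if any

    def __init__(self, engine_ids):
        self.engine_ids = list(engine_ids)
        self.calls = []  # (function name, engine ids) pairs, for the demo

    def __enter__(self):
        EngineTarget._active = self
        return self

    def __exit__(self, exc_type, exc, tb):
        EngineTarget._active = None
        return False  # never swallow exceptions from the block

def do_this():
    # A real version would push this call out to the remote engines;
    # here we just record where it would have gone.
    target = EngineTarget._active
    target.calls.append(("do_this", target.engine_ids))

with EngineTarget([1, 3, 5]) as t:
    do_this()

print(t.calls)  # -> [('do_this', [1, 3, 5])]
```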
so that the control scripts can read much more naturally. This is
something you may recall Eric Jones and I started thinking about at
Scipy'07, and finally on the plane back from Sage/Scipy Days 8 I was
able to understand how it needed to be structured. That code isn't
ready for inclusion anywhere yet, but the proof of concept is in my
local copy and hopefully at the Paris sprint we can test it a bit and
put it in. Being able to write pure-python (instead of the ugly
code-inside-strings) parallel code will make the experience much more
natural, I think, especially coupled with the new facilities for
pushing function objects and the daemon-based ability to do clean
multi-process interactive control.
The pieces are falling in place, now we just need a breather to
document all this and gradually communicate things better to everyone.
I apologize if the process has been a bit opaque lately, but it's
been just a very hectic time for me. After the Paris IPython sprint
(March 22-23) I hope to catch my breath a bit, sort out the
distributed VCS situation (bzr/launchpad is looking pretty good so
far), and communicate a clearer set of plans so that all those
who so kindly expressed interest and willingness to pitch in can do so
with an understanding of where the various pieces of the puzzle are.
Cheers,
f