[IPython-dev] iPython cluster on multiple Windows servers, without Windows HPC Server

Tue May 6 15:36:31 EDT 2014

An important thing to note about ipcluster is that it’s a very complicated
way to do something that’s not very complicated. All it sets out to do is:

   1. start a controller with ipcontroller
   2. start 0-many engines with ipengine

All of the complexity comes from abstracting how processes actually start,
including where machines are, batch systems, etc. But in the end, it’s just
doing:

$> ipcontroller
$> for i in {1..n}; do ipengine; done

ipcluster makes some simple cases easier, but if it doesn’t do what you
want, you can always start the controller and engines yourself, with no
loss of functionality. Plus, a tool that only deploys a cluster on your own
system is much simpler than one that, like ipcluster, tries to work in a
wide variety of contexts.
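
As a rough illustration of "start the controller and engines yourself," here
is a minimal Python sketch for a single machine. The helper names
(cluster_commands, launch) are made up for this example; the only real
commands involved are ipcontroller and ipengine, which are assumed to be on
the PATH.

```python
import subprocess


def cluster_commands(n_engines):
    """Return argv lists: one controller followed by n_engines engines."""
    return [["ipcontroller"]] + [["ipengine"] for _ in range(n_engines)]


def launch(commands, popen=subprocess.Popen):
    """Start each command without waiting for it to exit.

    `popen` is injectable so the function can be exercised without
    actually spawning processes.
    """
    return [popen(argv) for argv in commands]
```

Calling launch(cluster_commands(4)) would start one controller and four
engines as child processes. Depending on timing, you may want a short pause
after the controller starts so that its connection files exist before the
engines look for them.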

The basic steps in getting a cluster up and running:

   1. configure the controller to listen on an IP visible to the other
   machines (c.HubFactory.ip = '1.2.3.4' in ipcontroller_config.py on the
   controller machine).
   2. start the controller with ipcontroller
   3. copy .ipython/profile_default/security/ipcontroller-*.json to all of
   the various machines on which you plan to start engines.
   4. start ipengine as many times as is appropriate on each machine.

Step 3 is unnecessary if your machines share a filesystem.
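
If you script step 1, the config file only needs the one setting. The sketch
below writes it from Python; the function names are hypothetical, and
'1.2.3.4' stands in for whatever address your other machines can reach. Note
that it overwrites any existing ipcontroller_config.py in the profile
directory, so treat it as a starting point rather than a finished tool.

```python
from pathlib import Path

# The two lines that end up in ipcontroller_config.py.
CONFIG_TEMPLATE = "c = get_config()\nc.HubFactory.ip = {ip!r}\n"


def render_controller_config(ip):
    """Return config text that makes the controller listen on `ip`."""
    return CONFIG_TEMPLATE.format(ip=ip)


def write_controller_config(profile_dir, ip):
    """Write ipcontroller_config.py into profile_dir (overwrites it)."""
    path = Path(profile_dir) / "ipcontroller_config.py"
    path.write_text(render_controller_config(ip))
    return path
```

For the default profile, profile_dir would be ~/.ipython/profile_default on
the controller machine.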

For instance, here is a simple version that starts a controller and engines
with ssh on Linux or OS X machines, putting processes in the background
with screen:

$> ssh controller_host screen -dmS ipcontroller ipcontroller
$> for host in host1 host2; do
> scp ~/.ipython/profile_default/security/ipcontroller-*.json $host:.ipython/profile_default/security/
> ssh $host 'for n in {1..3}; do screen -dmS ipengine ipengine; done'
> done

That is *a lot* simpler than the hundreds of lines of ipcluster and,
frankly, better behaved than the SSH launchers that ship with IPython.

If you have Windows analogues for ‘tell machine X to run command Y,’ you
can make a similar script, tailored to your use.
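
One way to structure such a script is to keep the "copy this file to machine
X" and "run command Y on machine X" pieces pluggable, so only those two
functions change per platform. The sketch below is only an outline under
that assumption: deployment_plan, scp_copy and ssh_run are names invented
for this example, and the ssh/scp versions simply mirror the commands above.
On Windows you would substitute whatever remote-execution and file-copy
tools you have.

```python
def deployment_plan(hosts, engines_per_host, copy_to, run_on):
    """Return the argv lists to execute, in order, for every host.

    copy_to(host) -> argv that copies the connection files to `host`
    run_on(host, command) -> argv that runs `command` on `host`
    """
    plan = []
    for host in hosts:
        plan.append(copy_to(host))
        for _ in range(engines_per_host):
            plan.append(run_on(host, ["ipengine"]))
    return plan


# Illustrative ssh/scp versions; the source path is left as a placeholder.
def scp_copy(host, source="ipcontroller-engine.json"):
    return ["scp", source, host + ":.ipython/profile_default/security/"]


def ssh_run(host, command):
    return ["ssh", host] + list(command)
```

Each entry in the returned list can then be handed to subprocess.call, or
printed first as a dry run to check what would be executed.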

-MinRK

On Tue, May 6, 2014 at 11:35 AM, Jason Roberts <jason.roberts at duke.edu>
wrote:

> I have a situation where I have to use MS Windows for a big parallel
> processing job, due to Windows dependencies on some steps in the job. I
> have successfully used iPython on a single 16-processor machine for this
> purpose. Thank you very much for making this so easy to use! It has saved
> me a huge amount of time.
>
> Now, if possible, I would like to set up a cluster that has multiple
> Windows servers (Windows Server 2008 R2 Standard). The iPython
> documentation (
> http://ipython.org/ipython-doc/dev/parallel/parallel_process.html)
> describes several options. The one that seems best oriented for Windows, at
> least under the assumption that Microsoft technologies are the best choice
> for Windows, is to use Microsoft HPC Pack 2008 (
> http://ipython.org/ipython-doc/dev/parallel/parallel_winhpc.html). I
> tried this. Unfortunately HPC Pack appears to require Active Directory to
> be deployed. My shop runs a mixture of different operating systems, and
> while we have LDAP, we do not have a full-blown deployment of Active
> Directory. This appears to rule out the HPC Pack option.
>
> Are there other alternatives for running an iPython cluster composed of
> multiple Windows servers, and which is best? Should I look at mpiexec with
> Open MPI? Is there some way to do it with SSH, despite the iPython
> documentation saying not?
>
> Thanks for any advice you can provide, and thanks again for iPython’s
> parallel processing infrastructure. It truly is a time saver.
>
> Jason