[IPython-dev] iPython cluster on multiple Windows servers, without Windows HPC Server

John Gill jgill at tokiomillennium.com
Tue May 6 16:47:41 EDT 2014

Thanks MinRK – that is a very helpful explanation + suggestions.

This question was pretty timely. I actually had IPython 1.x running fine with HPC, but was unable to get it to work under IPython 2.0. I am working towards using ssh to start the cluster up, but as you point out, with a shared file system that is a whole lot easier. I also think HPC is a pretty big hammer to crack a tiny nut – and it tends to make a mess of the nut ;)

As mentioned in a previous post, I am also interested in creating non-homogeneous clusters, e.g. engines with different CPU and memory resources, on different OSes and with different software installed, and in having the scheduler deal with task dependencies whilst at the same time respecting any restrictions tasks have as to what kind of engine they need to run on.


From: ipython-dev-bounces at scipy.org [mailto:ipython-dev-bounces at scipy.org] On Behalf Of MinRK
Sent: Tuesday, May 06, 2014 4:37 PM
To: IPython developers list
Subject: Re: [IPython-dev] iPython cluster on multiple Windows servers, without Windows HPC Server

An important thing to note about ipcluster is that it’s a very complicated way to do something that’s not very complicated. All it sets out to do is:
1.      start a controller with ipcontroller
2.      start 0-many engines with ipengine

All of the complexity comes from abstracting how processes actually start, including where machines are, batch systems, etc. But in the end, it’s just doing:

$> ipcontroller

$> for i in {1..n}; do ipengine; done

ipcluster makes some simple cases easier, but if it doesn’t do what you want, you can always start the controller and engines yourself, with no loss of functionality. Plus, a tool that only deploys a cluster on your own system is much simpler than one that, like ipcluster, tries to work in a wide variety of contexts.
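
As a rough Python equivalent of the two commands above, the sketch below starts one controller and n engines as child processes on the local machine. It assumes ipcontroller and ipengine are on the PATH. The function name start_local_cluster, the launch parameter and the fixed delay are illustrative choices for this sketch, not part of IPython.

```python
import subprocess
import time


def start_local_cluster(n_engines, launch=subprocess.Popen, delay=0.0):
    """Start one ipcontroller and n_engines ipengine processes.

    `launch` is any callable that takes an argv list and returns a process
    handle. It defaults to subprocess.Popen and is a parameter only so the
    function can be exercised without IPython installed (an assumption of
    this sketch, not an IPython feature).
    """
    controller = launch(["ipcontroller"])
    # Crude pause so the controller can write its connection files
    # before the engines go looking for them.
    time.sleep(delay)
    engines = [launch(["ipengine"]) for _ in range(n_engines)]
    return controller, engines
```

Calling start_local_cluster(4, delay=5.0) would then leave you with five process handles that you can wait on or terminate yourself.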

The basic steps in getting a cluster up and running:
1.      configure the controller to listen on an IP visible to the other machines (c.HubFactory.ip = '*' in ipcontroller_config.py on the controller machine).
2.      start the controller with ipcontroller
3.      copy .ipython/profile_default/security/ipcontroller-*.json to all of the various machines on which you plan to start engines.
4.      start ipengine as many times as is appropriate on each machine.

Step 3 is unnecessary if your systems are on a shared filesystem.
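
Where the filesystem is not shared, step 3 can be scripted. The following is a minimal Python 3 sketch, assuming the destination is a directory you can write to from the controller machine (for example a mounted network share that maps to an engine machine's security directory). The function name copy_connection_files is made up for illustration.

```python
import glob
import os
import shutil


def copy_connection_files(security_dir, dest_dir):
    """Copy ipcontroller-*.json from security_dir into dest_dir.

    Returns the sorted list of file names that were copied.
    """
    os.makedirs(dest_dir, exist_ok=True)
    pattern = os.path.join(security_dir, "ipcontroller-*.json")
    copied = []
    for path in sorted(glob.glob(pattern)):
        shutil.copy(path, dest_dir)
        copied.append(os.path.basename(path))
    return copied
```

The source would typically be os.path.expanduser("~/.ipython/profile_default/security") on the controller machine; the destination depends entirely on how your machines expose their disks to each other.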

For instance, here is a simple version that starts a controller and engines with ssh on Linux or OS X machines, putting processes in the background with screen:

$> ssh controller_host screen -dmS ipcontroller ipcontroller
$> for host in host1 host2; do
>   scp ~/.ipython/profile_default/security/ipcontroller-*.json $host:.ipython/profile_default/security/
>   ssh $host 'for n in {1..3}; do screen -dmS ipengine ipengine; done'
> done

Which is a lot simpler than the hundreds of lines of ipcluster, and, frankly, better behaved than the SSH launchers that ship with IPython.

If you have Windows analogues for ‘tell machine X to run command Y,’ you can make a similar script, tailored to your use.
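
One way to keep such a script independent of the remote-execution tool is to separate "which commands run where" from "how to run a command on machine X". The Python sketch below only builds the command lines and hands them to a launcher. Here remote_prefix is a hypothetical stand-in for whatever tool is available in your environment (ssh appears purely as an example default), and the function names are invented for this sketch.

```python
import subprocess


def build_commands(controller_host, engine_hosts, engines_per_host,
                   remote_prefix=("ssh",)):
    """Return a list of argv lists: the controller first, then the engines.

    Each command is remote_prefix + [host, program], i.e. "tell machine X
    to run command Y". The prefix is an assumption of this sketch;
    substitute the tool you actually have.
    """
    commands = [list(remote_prefix) + [controller_host, "ipcontroller"]]
    for host in engine_hosts:
        for _ in range(engines_per_host):
            commands.append(list(remote_prefix) + [host, "ipengine"])
    return commands


def launch_all(commands, launch=subprocess.Popen):
    """Start every command without waiting for any of them to finish."""
    return [launch(argv) for argv in commands]
```

This covers only process start-up. The connection files still have to reach each engine host, by copying them or through a shared filesystem, before the engines can connect.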


On Tue, May 6, 2014 at 11:35 AM, Jason Roberts <jason.roberts at duke.edu> wrote:
I have a situation where I have to use MS Windows for a big parallel processing job, due to Windows dependencies on some steps in the job. I have successfully used iPython on a single 16-processor machine for this purpose. Thank you very much for making this so easy to use! It has saved me a huge amount of time.

Now, if possible, I would like to set up a cluster that has multiple Windows servers (Windows Server 2008 R2 Standard). The iPython documentation (http://ipython.org/ipython-doc/dev/parallel/parallel_process.html) describes several options. The one that seems best oriented for Windows, at least under the assumption that Microsoft technologies are the best choice for Windows, is to use Microsoft HPC Pack 2008 (http://ipython.org/ipython-doc/dev/parallel/parallel_winhpc.html). I tried this. Unfortunately HPC Pack appears to require Active Directory to be deployed. My shop runs a mixture of different operating systems, and while we have LDAP, we do not have a full-blown deployment of Active Directory. This appears to rule out the HPC Pack option.

Are there other alternatives for running an iPython cluster composed of multiple Windows servers, and which is best? Should I look at mpiexec with Open MPI? Is there some way to do it with SSH, despite the iPython documentation saying not?

Thanks for any advice you can provide, and thanks again for iPython’s parallel processing infrastructure. It truly is a time saver.

