[IPython-dev] Using IPython Cluster with SGE -- help needed
Abraham D. Flaxman
abie at uw.edu
Tue Sep 10 20:15:23 EDT 2013
Have you had more success with this, Andreas? I tried to use my SGE cluster today as well, with moderate success. I started where you were:
ipcluster start --profile=sge -n 12 # starts one engine, sometimes
which I can confirm via:
In [48]: len(p.Client(profile='sge'))
Out[48]: 1
Then I qsub more engines with:
for i in {1..100}; do qsub sge.engine.template; done # starts 100 more
and this gives me more remote engines:
In [49]: len(p.Client(profile='sge'))
Out[49]: 101
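Since the engines register one by one as their SGE jobs start, it can help to wait until the expected number has shown up before sending work. The following is only a minimal sketch: `wait_for_engines` and its parameters are hypothetical names, not part of IPython, and the engine count is supplied by the caller as a callable so that the helper makes no assumption about the client API.

```python
import time


def wait_for_engines(count_engines, expected, timeout=300.0, poll=5.0):
    """Poll until at least `expected` engines are registered or `timeout` passes.

    count_engines: zero-argument callable returning the current engine count
                   (hypothetical hook; the caller decides how to count).
    Returns the last observed count, which may be below `expected` on timeout.
    """
    deadline = time.monotonic() + timeout
    seen = count_engines()
    while seen < expected and time.monotonic() < deadline:
        time.sleep(poll)          # give queued SGE jobs time to start
        seen = count_engines()
    return seen
```

With the session above, this could be called as `wait_for_engines(lambda: len(p.Client(profile='sge')), 101)`, reusing the same expression that produced the counts shown.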
To shut down the cluster:
qstat | awk 'NR > 2 {print $1}' | xargs qdel # skips the two qstat header lines; deletes every job listed
Also, the cluster shut down on its own one time, which I appreciated, but which was perhaps not intended.
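The qstat pipeline above deletes every job that qstat lists, not only the engines. A more selective variant could pick out the engine jobs by name first. This is a sketch under assumptions: it takes the default SGE qstat layout (two header lines, job ID in column 1, job name in column 3), and it assumes the engine jobs are named with the prefix `ipengine`, which depends on the `#$ -N` line in the engine template. `engine_job_ids` is a hypothetical helper name.

```python
def engine_job_ids(qstat_text, name_prefix="ipengine"):
    """Return job IDs from `qstat` output whose job name starts with `name_prefix`.

    Assumes the default SGE layout: two header lines, then one row per job
    with the job ID in column 1 and the job name in column 3.
    """
    ids = []
    for line in qstat_text.splitlines()[2:]:   # skip header and separator
        fields = line.split()
        if len(fields) >= 3 and fields[2].startswith(name_prefix):
            ids.append(fields[0])
    return ids
```

The returned IDs could then be handed to `qdel`, leaving any unrelated jobs in the queue untouched.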
Any tips on how I can make this work more smoothly?
--Abie
Abraham D. Flaxman
Assistant Professor
Institute for Health Metrics and Evaluation | University of Washington
2301 5th Avenue, Suite 600 | Seattle, WA 98121| USA
Tel: +1-206-897-2800 | Fax: +1-206-897-2899 UW
abie at uw.edu | http://healthmetricsandevaluation.org | http://healthyalgorithms.com
-----Original Message-----
From: ipython-dev-bounces at scipy.org [mailto:ipython-dev-bounces at scipy.org] On Behalf Of Andreas Hilboll
Sent: Monday, August 05, 2013 7:19 AM
To: Matthieu Brucher
Cc: IPython developers list
Subject: Re: [IPython-dev] Using IPython Cluster with SGE -- help needed
Thanks, Matthieu,
answers inline:
On 05.08.2013 16:02, Matthieu Brucher wrote:
> Hi,
>
> I don't know why the registration was not complete. Is your home
> folder the same on all nodes and on the login node?
Yes, it is. Could this be some firewall issue?
> You won't see 12 jobs. You asked for 12 engines, and they will all be
> submitted in one job and the 12 engines will be started by mpiexec -n
> 12. This is the standard way of using batch schedulers. Ask for some
> cores, run an mpi application on these cores.
Well, then I guess our IT department doesn't like "the standard way". We have a multi-node cluster comprising 12 nodes: one 'management' node and 11 'computing' nodes. And we don't usually have or use MPI.
To use our multi-node cluster the way our sysadmins want us to, I would need to submit a total of {n} ipengines via {n} calls to ``qsub``.
Any idea how I can accomplish this?
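One way this might be done, mirroring the qsub loop shown earlier in this thread, is to call qsub once per engine from a small script. This is only a sketch: `submit_engines` is a hypothetical helper, the template file name `sge.engine.template` is taken from the loop above, and the `run` parameter is an assumption added so the function can be exercised without a real scheduler.

```python
import subprocess


def submit_engines(n, template="sge.engine.template", run=subprocess.check_output):
    """Call `qsub <template>` n times, one engine job per call.

    Returns the text qsub printed for each submission.
    `run` is a hypothetical hook; by default it executes the real command.
    """
    outputs = []
    for _ in range(n):
        out = run(["qsub", template])
        if isinstance(out, bytes):
            out = out.decode()
        outputs.append(out.strip())
    return outputs
```

Calling `submit_engines(11)` would then issue 11 separate qsub calls, which the scheduler is free to place on different nodes.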
Thanks for your help!
Andreas.