[IPython-dev] ipcluster (LSF) timing (check if all engines are running)
MinRK
benjaminrk at gmail.com
Mon Aug 19 10:28:38 EDT 2013
Something like this should work:
import sys
import time

from IPython import parallel
from IPython.parallel import error


def wait_for_cluster(engines=1, **kwargs):
    """Wait for an IPython cluster to start up and register a minimum
    number of engines."""
    # wait for the controller to come up
    while True:
        try:
            client = parallel.Client(**kwargs)
        except IOError:
            # connection file has not been written yet
            print("No ipcontroller-client.json, waiting...")
            time.sleep(10)
        except error.TimeoutError:
            # connection file exists, but the controller is not answering
            print("No controller, waiting...")
            time.sleep(10)
        else:
            # connected, stop retrying
            break

    if not engines:
        return

    # wait for engines to register, printing one dot per engine
    sys.stdout.write("waiting for %i engines " % engines)
    running = len(client)
    sys.stdout.write('.' * running)
    sys.stdout.flush()
    while running < engines:
        time.sleep(1)
        previous = running
        running = len(client)
        sys.stdout.write('.' * (running - previous))
        sys.stdout.flush()
    sys.stdout.write('\n')
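
One caveat: the engine-wait loop above has no upper bound, so a batch job would hang if LSF never schedules all the engines. Below is a minimal sketch of the same polling idea with a deadline. It is not part of IPython; wait_for_count, get_count, target, timeout and poll are hypothetical names chosen for illustration, and the clock/sleep arguments exist only so the loop can be exercised without really waiting.

```python
import time


def wait_for_count(get_count, target, timeout=600.0, poll=1.0,
                   clock=time.time, sleep=time.sleep):
    """Poll get_count() until it reaches target or timeout seconds pass.

    Hypothetical helper (not an IPython API). Returns the last observed
    count, or raises RuntimeError if the deadline passes first.
    """
    deadline = clock() + timeout
    while True:
        count = get_count()
        if count >= target:
            return count
        if clock() >= deadline:
            raise RuntimeError(
                "only %i of %i engines registered after %.0f s"
                % (count, target, timeout))
        sleep(poll)
```

With a connected client this could presumably be called as wait_for_count(lambda: len(client), 128), and the calling script can then exit non-zero on RuntimeError instead of starting the main computation on a partial cluster.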
On Mon, Aug 19, 2013 at 6:34 AM, Florian M. Wagner <wagnerfl at student.ethz.ch> wrote:
> Hey all,
>
> I am using IPython.parallel on a large cluster, where controller and
> engines are launched via LSF. My current workflow is as follows:
>
> #!/bin/bash
> python pre_processing.py
> ipcluster start --profile=cluster --n=128 > ipcluster.log 2>&1
> sleep 120
> python main_computation.py
> python post_processing.py
>
>
> I am not entirely happy with this, since the 2 minutes are not always
> enough depending on the load of the cluster. I believe that there is a much
> more elegant way to launch the cluster and check if all the engines are
> running, before proceeding with the main computation. I would highly
> appreciate any help.
>
> Best regards
> Florian