[execnet-dev] remote process not terminating..
holger krekel
holger at merlinux.eu
Sat Jun 23 15:34:15 CEST 2012
On Thu, Jun 21, 2012 at 13:58 -0700, solr nps wrote:
> I am using execnet to run a java job on a remote server, the process runs
> fine and exits cleanly under normal circumstances, however when I hit
> Ctrl-C on the python process the remote java job keeps on running, the
> remote python dies in a few seconds. I am pasting all relevant code below,
> there are some self explanatory functions that I have left out. What am I
> doing wrong? I have read the documentation at
> http://codespeak.net/execnet/example/test_group.html#robust-termination-of-ssh-popen-processes and
> I think I am doing the same thing in my code.
>
> Thanks!
>
> ****************
>
> def poll_data(channel, host, output_file, service, threads):
>     """ Function that is run remotely on the servers on a python
>     interpreter, by virtue of
>     the way we execve() the child process, if we go away, it goes away
>     """
>
>     import subprocess
>
>     java_location = '/home/user1/app/jdk/bin/java'
>     jar_name = '/home/user1/data-poller.jar'
>
>     command = [java_location, "-jar", jar_name, "-s", host, "-b", \
>                service, "-o", output_file, "-t", threads]
>
>     channel.send("Running command: %s" % ' '.join(command))
>     p = subprocess.Popen(command, stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT)
>
>     while True:
>         retcode = p.poll()
>         channel.send(p.stdout.readline())
>         if retcode is not None and retcode != 0:
>             raise OSError('Unable to run child process: %s' % str(retcode))
>
> def printData(data):
>     """Simply prints anything it is given"""
>     print data
>
> if __name__ == '__main__':
>     group = execnet.Group()
>
>     # Rsync can be used to move the jar to the right place, which makes keeping
>     # up to date easier
>     rsync = execnet.RSync('./libs')
>
>     # Install signal handlers that will kill things if this process dies
>     terminate = lambda: group.terminate(timeout=5.0)
>
>     signal.signal(signal.SIGTERM, terminate)
>     signal.signal(signal.SIGQUIT, terminate)
>     signal.signal(signal.SIGHUP, terminate)
This should not be necessary, as groups terminate by default
through Python's atexit mechanism.
Instead of trying to handle termination yourself - what traces do
you see if you leave all termination code out and run your code with
the environment variable EXECNET_DEBUG=2 (which will send traces
to stderr) or EXECNET_DEBUG=1 (which will write traces to
/tmp/execnet-debug-PID files)?
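
A minimal sketch of switching the tracing on from Python, in case that is
more convenient than exporting the variable in the shell. The helper names
debug_env and run_with_trace are made up for illustration and are not part
of execnet; the script path you pass in is whatever file holds your code.

```python
import os
import subprocess
import sys


def debug_env(level=2, base=None):
    """Return a copy of the environment with EXECNET_DEBUG set to `level`."""
    # Copy so the calling process's own environment is left untouched.
    env = dict(os.environ if base is None else base)
    env["EXECNET_DEBUG"] = str(level)
    return env


def run_with_trace(script, level=2):
    """Run `script` with the current interpreter and tracing enabled.

    Returns the exit code of the child process.
    """
    return subprocess.call([sys.executable, script], env=debug_env(level))
```

With level 2 the traces should show up on stderr of the script, with level 1
in the /tmp/execnet-debug-PID files mentioned above.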
To help me or others try things ourselves, it would be great if
you could strip down the (remaining) problem so that I only need Jython,
Python and execnet to reproduce it.
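
As a rough idea of what such a stripped-down script might look like, here is
a sketch with all termination code left out. It is built on assumptions: a
sleeping Python child stands in for the java job, the gateway spec defaults
to "popen" (an "ssh=host" spec would need an interpreter path that is valid
on the remote side), and the names run_child and main are hypothetical.
Whether it shows the same behaviour as your setup is not something I have
checked.

```python
import sys


def run_child(channel, command):
    """Executed on the remote side: start `command` and relay its output."""
    # Imported here because only this function's source is sent across.
    import subprocess

    p = subprocess.Popen(command, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    # Forward output line by line until the child closes its stdout.
    for line in iter(p.stdout.readline, b""):
        channel.send(line)
    # Finally report the child's exit code.
    channel.send(p.wait())


def main(spec="popen"):
    # execnet is third-party; importing it here keeps the module loadable
    # on machines where it is not installed.
    import execnet

    group = execnet.Group()
    gw = group.makegateway(spec)
    # Assumption: a long-sleeping Python process stands in for the java job.
    command = [sys.executable, "-c", "import time; time.sleep(600)"]
    channel = gw.remote_exec(run_child, command=command)
    # No signal handlers and no explicit terminate(): this relies on the
    # group's default atexit termination only.
    for item in channel:
        print(item)
```

Calling main() and pressing Ctrl-C while it waits, then checking whether the
sleeping child is still alive, would give a small test case to pair with the
EXECNET_DEBUG traces.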
best & thanks,
holger
>     servers,external_service,output_file,threads,port,endpoint = process_conf_file()
>     try:
>         channels = []
>         for server in servers:
>             gw = group.makegateway('ssh=%s' %(server))
>             rsync.add_target(gw, '/tmp/libs')
>
>         rsync.send()
>
>         for gw, server in zip(group, servers):
>             host = "http://%s:%s/%s" %(server, port, endpoint)
>             channel = gw.remote_exec(poll_data, \
>                                      host = host, \
>                                      output_file = output_file, \
>                                      service = service, \
>                                      threads = threads)
>
>             channel.setcallback(printData, endmarker=None)
>             channels.append(channel)
>
>         mch = execnet.MultiChannel(channels)
>         mch.waitclose()
>
>     except KeyboardInterrupt:
>         terminate()
> _______________________________________________
> execnet-dev mailing list
> execnet-dev at codespeak.net
> http://codespeak.net/mailman/listinfo/execnet-dev