[execnet-dev] remote process not terminating..

holger krekel holger at merlinux.eu
Sat Jun 23 15:34:15 CEST 2012


On Thu, Jun 21, 2012 at 13:58 -0700, solr nps wrote:
> I am using exec net to run a java job on a remote server, the process runs
> fine and exits cleanly under normal circumstances, however when I hit
> Ctrl-C on the python process the remote java job keeps on running, the
> remote python dies in a few seconds. I am pasting all relevant code below,
> there are some self explanatory functions that I have left out. What am I
> doing wrong? I have read the documentation at
> http://codespeak.net/execnet/example/test_group.html#robust-termination-of-ssh-popen-processes and
> I think I am doing the same thing in my code.
> 
> Thanks!
> 
> ****************
> 
> def poll_data(channel, host, output_file, service, threads):
>     """ Function that is run remotely on the servers on a python
>         interpreter, by virtue of the way we execve() the child process,
>         if we go away, it goes away """
> 
>     import subprocess
> 
>     java_location = '/home/user1/app/jdk/bin/java'
>     jar_name = '/home/user1/data-poller.jar'
> 
>     command = [java_location, "-jar", jar_name, "-s", host, "-b", \
>                service, "-o", output_file, "-t", threads]
> 
>     channel.send("Running command: %s" % ' '.join(command))
>     p = subprocess.Popen(command, stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT)
> 
>     while True:
>         retcode = p.poll()
>         channel.send(p.stdout.readline())
>         if retcode is not None and retcode != 0:
>             raise OSError('Unable to run child process: %s' % str(retcode))
> 
> def printData(data):
>     """Simply prints anything it is given"""
>     print data
> 
> if __name__ == '__main__':
>     group = execnet.Group()
> 
>     # Rsync can be used to move the jar to the right place, which makes
>     # keeping up to date easier
>     rsync = execnet.RSync('./libs')
> 
>     # Install signal handlers that will kill things if this process dies
>     terminate = lambda: group.terminate(timeout=5.0)
> 
>     signal.signal(signal.SIGTERM, terminate)
>     signal.signal(signal.SIGQUIT, terminate)
>     signal.signal(signal.SIGHUP, terminate)


This should not be necessary as groups terminate by default
through Python's atexit mechanism.

Instead of trying to handle termination yourself, what traces do
you see if you leave all termination code out and run your code with
the environment variable EXECNET_DEBUG=2 (which will send traces
to stderr) or EXECNET_DEBUG=1 (which will write traces to
/tmp/execnet-debug-PID files)?
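
If setting the variable in the shell is inconvenient, a small wrapper can
set it for the child process only. This is just a sketch using the standard
library: the helper names debug_env and run_with_traces and the script name
poller.py are made up for illustration and are not part of execnet.

```python
import os
import subprocess
import sys


def debug_env(level, base=None):
    """Return a copy of the environment with EXECNET_DEBUG set.

    As described above: 2 sends traces to stderr, 1 writes them to
    /tmp/execnet-debug-PID files.
    """
    env = dict(os.environ if base is None else base)
    env["EXECNET_DEBUG"] = str(level)
    return env


def run_with_traces(script, level=2):
    """Run `script` with the current interpreter and tracing switched on."""
    return subprocess.call([sys.executable, script], env=debug_env(level))
```

Calling run_with_traces("poller.py") should then behave like running the
script by hand with EXECNET_DEBUG=2 in the environment.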

To help me or others try things ourselves it would be great if
you could strip down the (remaining) problem such that I only need Jython
and Python and execnet to reproduce it.
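
A stripped-down starting point could look roughly like the sketch below.
It is only an assumption about what a reproduction might be, not your
setup: it targets Python 3, uses a local "popen" gateway instead of ssh,
and stands in a slow Python child for the java job. The names relay_child
and main are made up for illustration, and a popen gateway may well behave
differently from ssh on Ctrl-C.

```python
import sys


def relay_child(channel, command):
    """Remote side: start `command` and forward its output line by line."""
    # Imported inside the function because only this function's source
    # is sent to the remote interpreter.
    import subprocess

    child = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # readline() returns "" at end of output, which ends the loop.
    for line in iter(child.stdout.readline, ""):
        channel.send(line.rstrip("\n"))
    channel.send("exit status: %d" % child.wait())


def main():
    import execnet  # third-party; imported here so relay_child can be tried alone

    group = execnet.Group()
    gw = group.makegateway("popen")  # local subprocess instead of ssh
    # Stand-in for the java job: prints a counter once per second for a minute.
    command = [
        sys.executable, "-u", "-c",
        "import time\nfor i in range(60): print(i); time.sleep(1)",
    ]
    channel = gw.remote_exec(relay_child, command=command)
    channel.setcallback(print)
    # No signal handlers and no explicit terminate(): the point is to see
    # what the default termination does when Ctrl-C is pressed here.
    channel.waitclose()
```

Run main() with EXECNET_DEBUG=2 set, press Ctrl-C after a few seconds, and
check whether the child process is still alive afterwards.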

best & thanks,
holger


>     servers,external_service,output_file,threads,port,endpoint = \
>         process_conf_file()
>     try:
>         channels = []
>         for server in servers:
>             gw = group.makegateway('ssh=%s' %(server))
>             rsync.add_target(gw, '/tmp/libs')
> 
>         rsync.send()
> 
>         for gw, server in zip(group, servers):
>             host = "http://%s:%s/%s" %(server, port, endpoint)
>             channel = gw.remote_exec(poll_data, \
>                     host = host, \
>                     output_file = output_file, \
>                     service = service, \
>                     threads = threads)
> 
>             channel.setcallback(printData, endmarker=None)
>             channels.append(channel)
> 
>         mch = execnet.MultiChannel(channels)
>         mch.waitclose()
> 
>     except KeyboardInterrupt:
>         terminate()




