[Python-Dev] test_multiprocessing: test_listener_client flakiness

Trent Nelson tnelson at onresolve.com
Wed Jun 18 18:20:14 CEST 2008

I gave my Windows buildbots a little bit of TLC last night.  This little chestnut in test_multiprocessing.py, around line 1346, is causing them to wedge more often than not:

    def test_listener_client(self):
        for family in self.connection.families:
            l = self.connection.Listener(family=family)
            p = self.Process(target=self._test, args=(l.address,))
            conn = l.accept()
            self.assertEqual(conn.recv(), 'hello')

The wedging appears to be a result of that accept() call.  Not knowing anything about the module or the test suite, I can only assume that there's a race between the subprocess attempting to connect to the listener and the l.accept() call actually being entered.  (A race condition would explain why it sometimes wedges and sometimes doesn't.)
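
As a rough illustration of how a hang like this could be bounded (a sketch only, not code from the test suite): the hypothetical helper accept_with_timeout() below runs accept() in a daemon thread and gives up after a timeout, so a connection that never arrives becomes a reportable failure rather than a wedge. The helper name, the timeout values and the ('localhost', 0) address are assumptions made for the example, and it is written against Python 3's multiprocessing.connection and threading modules rather than the trunk of the time.

```python
import threading
from multiprocessing.connection import Client, Listener


def accept_with_timeout(listener, timeout):
    """Call listener.accept() in a daemon thread.

    Returns the accepted connection, or None if nothing connected
    within `timeout` seconds.  (Hypothetical helper for illustration.)
    """
    result = []

    def worker():
        # Blocks until a client connects; may never return.
        result.append(listener.accept())

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)  # wait at most `timeout` seconds for accept()
    return result[0] if result else None


if __name__ == "__main__":
    # Port 0 lets the OS pick a free port; listener.address reports it.
    listener = Listener(("localhost", 0))
    client = Client(listener.address)
    client.send("hello")
    conn = accept_with_timeout(listener, 5)
    if conn is None:
        raise SystemExit("accept() did not return within 5 seconds")
    print(conn.recv())
```

Note that this does not unblock accept() itself: on timeout the worker thread is simply abandoned and stays blocked until the process exits, so it only turns a silent hang into something the test can report.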

Just FYI, the error in the buildbot log (http://www.python.org/dev/buildbot/all/x86%20W2k8%20trunk/builds/810/step-test/0) when this occurs is as follows:


command timed out: 1200 seconds without output
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1
remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process

(The fact that it can't be killed cleanly is a bug in Twisted's signalProcess('KILL') method, which doesn't work against Python processes that have entered accept() calls on Windows.  Such processes exhibit the 'wedged' behaviour and have to be forcibly killed with OpenProcess/TerminateProcess.)
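
For reference, the OpenProcess/TerminateProcess route can be sketched with ctypes roughly as follows. This is a minimal sketch under stated assumptions, not what buildbot or Twisted does: the function name force_kill() and the default exit code are made up for the example, and the code only does anything on Windows (elsewhere it raises OSError).

```python
import sys

PROCESS_TERMINATE = 0x0001  # Win32 access right required by TerminateProcess


def force_kill(pid, exit_code=1):
    """Forcibly terminate process `pid` on Windows.

    Hypothetical helper: opens the process with PROCESS_TERMINATE
    access, calls TerminateProcess, then closes the handle.
    """
    if sys.platform != "win32":
        raise OSError("force_kill() is Windows-only")

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.OpenProcess.argtypes = [wintypes.DWORD, wintypes.BOOL, wintypes.DWORD]
    kernel32.OpenProcess.restype = wintypes.HANDLE
    kernel32.TerminateProcess.argtypes = [wintypes.HANDLE, wintypes.UINT]
    kernel32.TerminateProcess.restype = wintypes.BOOL
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]
    kernel32.CloseHandle.restype = wintypes.BOOL

    handle = kernel32.OpenProcess(PROCESS_TERMINATE, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        if not kernel32.TerminateProcess(handle, exit_code):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        # Always release the process handle, even if termination failed.
        kernel32.CloseHandle(handle)
    return True
```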


