[Python-Dev] socket.SOL_REUSEADDR: different semantics between Windows vs Unix (or why test_asynchat is sometimes dying on Windows)
tnelson at onresolve.com
Thu Apr 3 23:40:04 CEST 2008
I started looking into this:
command timed out: 1200 seconds without output
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1
remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
I tried to replicate it on the buildbot in order to debug, which, surprisingly, I could do consistently by just running rt.bat -q -d -uall test_asynchat. As the log above indicates, the python process becomes completely and utterly wedged, to the point that I can't even attach a remote debugger and step into it.
Digging further, I noticed that if I ran the following code in two different python consoles, EADDRINUSE was *NOT* being raised by socket.bind():
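import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('127.0.0.1', 54321))  # same address/port in both consoles; these values are only illustrative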
However, take out the setsockopt line, and voilà, the second s.bind() will raise EADDRINUSE, as expected. This manifests as a really bizarre issue with test_asynchat in particular, as subsequent sock.accept() calls on the socket put python into the uber wedged state (can't even ctrl-c out at the console, need to kill the process directly).
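For anyone who wants to poke at this without juggling two consoles, here is a rough single-process sketch of the same experiment. It assumes Python 3 (where socket errors are OSError); the helper name second_bind_succeeds is made up for illustration, and it binds to an OS-assigned port on 127.0.0.1 rather than a fixed one. The first socket is put into the listening state so that Unix-like systems refuse the second bind even when SO_REUSEADDR is set on both sockets.

```python
import socket


def second_bind_succeeds(reuse_addr):
    """Bind two TCP sockets to the same address; return True if the second bind works.

    Hypothetical helper for illustration only.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            first.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            second.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Port 0 asks the OS for any free port, so the sketch does not
        # depend on a particular port being available.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        addr = first.getsockname()
        try:
            second.bind(addr)
        except OSError:
            # Typically EADDRINUSE: the address is already taken.
            return False
        return True
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    print("without SO_REUSEADDR:", second_bind_succeeds(False))
    print("with SO_REUSEADDR:   ", second_bind_succeeds(True))
```

Going by the behaviour described above, the second line should report True on Windows and False on Unix, while the first line should report False on both.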
Have to leave the office and head home so I don't have any more time to look at it tonight -- just wanted to post here for others to mull over.