[issue11812] transient socket failure to connect to 'localhost'

Charles-François Natali report at bugs.python.org
Tue Nov 8 09:15:10 CET 2011


Charles-François Natali <neologix at free.fr> added the comment:

> The server thread only waits 3 seconds for the connection. If no connection is made within 3 seconds, the server thread exits, and the connection attempt that follows fails. This probably explains why the problem is sporadic and seems to depend on name resolution. If the DNS resolver is "slow", we have a problem.

Indeed, but taking 3 seconds to resolve localhost is not "slow": it
means the name lookup service is broken.

> So, I would propose:
>
> 1. Use 127.0.0.1 instead of "localhost".
>

As noted, this might break on IPv6-only hosts. Not sure this will be a
problem, though. Another, less intrusive solution has been suggested
by Victor: use the address returned by getsockname() instead of
support.HOST when connecting.
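
A minimal sketch of the getsockname() suggestion, not the actual test
code. The function name echo_once, its parameters, and the "ping"
payload are hypothetical; the host argument stands in for support.HOST.
The server binds once, and the client then connects to the numeric
address the socket reports rather than to the host name.

```python
import socket
import threading

def echo_once(host="localhost", timeout=30.0):
    """Bind to `host`, then connect via getsockname() instead of `host`."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(timeout)
    server.bind((host, 0))           # port 0: let the OS pick a free port
    server.listen(1)
    address = server.getsockname()   # numeric (ip, port) actually bound

    received = []

    def serve():
        # Accept a single connection and record what the client sent.
        conn, _ = server.accept()
        with conn:
            received.append(conn.recv(1024))

    thread = threading.Thread(target=serve)
    thread.start()
    try:
        # The client uses the numeric address, so the host name is not
        # looked up a second time.
        with socket.create_connection(address, timeout=timeout) as client:
            client.sendall(b"ping")
    finally:
        thread.join()
        server.close()
    return address, received
```

This sketch uses AF_INET for brevity, so it does not address the
IPv6-only case by itself; the point is only that the connecting side
reuses the bound address.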

> 2. Delete the timeout in the server. I don't see the purpose of it, except to be sure the server thread dies eventually. Let's configure the thread as "daemon" and not bother with the thread join.
>

Sounds like a recipe for masking bugs. Having a dangling thread is
probably not a good idea. If 1) is not enough to fix this failure,
then you can first try increasing the timeout to a "huge" value (e.g.
30 seconds).
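
One way to picture this, as a sketch under assumptions rather than the
real test: keep a non-daemon server thread, give accept() a generous
timeout, and always join the thread so a hang or a timeout is visible
instead of hidden. The name run_with_timeout and its parameters are
hypothetical.

```python
import socket
import threading

def run_with_timeout(accept_timeout=30.0, connect=True):
    """Run a one-shot server thread; report if it hung or timed out."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server.settimeout(accept_timeout)   # generous, but still bounded
    errors = []

    def serve():
        try:
            conn, _ = server.accept()
            conn.close()
        except socket.timeout as exc:
            # Record the timeout so the caller can report it.
            errors.append(exc)

    thread = threading.Thread(target=serve)   # deliberately not a daemon
    thread.start()
    try:
        if connect:
            socket.create_connection(server.getsockname(),
                                     timeout=accept_timeout).close()
    finally:
        # Bounded join: a thread that is still alive afterwards is a bug
        # worth reporting, not something to ignore.
        thread.join(accept_timeout + 5)
        server.close()
    return thread.is_alive(), errors
```

With a client, the call should return (False, []). Calling it with
connect=False and a short accept_timeout shows the other path: the
thread still terminates, and the timeout is recorded in errors.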

> 3. Cleanup the Event signaling.
>

OK.

> 4. "time.sleep(0.1)?"... Please... :-)
>

Yeah, this should be removed.
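
The message does not say what the sleep was for. Assuming it was there
to give the server thread time to get ready, a threading.Event can make
that handshake explicit. This is a sketch only; start_worker and the log
entries are hypothetical stand-ins for the real server setup.

```python
import threading

def start_worker(wait_timeout=30.0):
    """Wait for an explicit readiness signal instead of sleeping."""
    ready = threading.Event()
    log = []

    def worker():
        log.append("listening")   # stand-in for bind()/listen()
        ready.set()               # tell the main thread we are ready

    thread = threading.Thread(target=worker)
    thread.start()
    # Blocks until set() is called; returns False if the timeout expires.
    signalled = ready.wait(timeout=wait_timeout)
    log.append("connecting")      # stand-in for the client connect
    thread.join()
    return signalled, log
```

Unlike a fixed sleep, the wait returns as soon as the worker signals,
and its return value says whether the signal arrived at all.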

> I have seen this issue too in 2.7, in my buildbots (OpenIndiana).
>

Please check your name resolution settings :-)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11812>
_______________________________________

