[ python-Bugs-1001018 ] setdefaulttimeout causes unnecessary timeouts on connect err

SourceForge.net noreply at sourceforge.net
Tue Aug 3 03:18:12 CEST 2004


Bugs item #1001018, was opened at 2004-07-31 11:07
Message generated for change (Comment added) made by mhammond
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1001018&group_id=5470

Category: Windows
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Guido van Rossum (gvanrossum)
Summary: setdefaulttimeout causes unnecessary timeouts on connect err

Initial Comment:
This looks like a bug to me:

>>> import socket, httplib
>>> socket.setdefaulttimeout(10)
>>> httplib.HTTPConnection("www.python.org", 9999).connect()
[.... 10 second delay ....]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "e:\src\python-cvs\lib\httplib.py", line 548, in connect
    raise socket.error, msg
socket.timeout: timed out
>>>

On Linux, there is no significant delay, and the
traceback reads:
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/httplib.py", line 548, in connect
    raise socket.error, msg
socket.error: (111, 'Connection refused')

The Linux result is what I expected on Windows. 
Sockets aren't my strong point, so I'd prefer someone
confirming it is a real bug before I burn too much time
on it.

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2004-08-03 11:18

Message:
Logged In: YES 
user_id=14198

It is true select does return 1 for either the "can't
connect" or "connected" cases.  In the "connected" case, the
getsockopt() returns 0 - hence the function returns 0, and
WSASetLastError(0) has been called.  i.e., as far as I can see
and test, the code works for both success and failure.

However, I do agree that the MS docs don't explicitly state
anywhere that what I am doing is OK, so I'm attaching a new
patch as you suggest, and which also seems to work <wink>


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-08-03 00:29

Message:
Logged In: YES 
user_id=31435

I suspect the patch is close but not quite there yet.  I 
believe select will return 1 now if the socket is in *either* of 
the writable or exception sets upon select's return, so that 
the patch loses the distinction between "ok, we finally 
connected" and "oops -- we can't connect".  If so, to 
untangle that we need to pass in *distinct* sets to select, 
and when the return is > 0 it's an error case if and only if 
FD_ISSET then says the socket is in the exception set.  If the 
socket is in the writable set instead, then the connect 
succeeded.
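
A minimal Python sketch of that membership test, for illustration only.
It assumes a modern Python 3 socket module rather than the C code the
patch touches, and connect_outcome is a hypothetical helper name, not
part of any patch attached to this item.  The socket is passed in both
the writable and the exception lists, and the two returned lists are
inspected separately.  Because some platforms (Linux, for one) report a
failed connect through the writable set, the sketch also consults
SO_ERROR before declaring success:

```python
import select
import socket


def connect_outcome(sock, timeout):
    """Classify a pending non-blocking connect.

    Returns 'connected', 'failed' or 'timeout'.  Hypothetical helper,
    assuming connect() has already been started on a non-blocking socket.
    """
    # Distinct writable and exception lists, so the two cases stay apart.
    _, writable, failed = select.select([], [sock], [sock], timeout)

    if sock in failed:
        # Windows reports a failed non-blocking connect here.
        return "failed"
    if sock in writable:
        # Writable normally means connected, but some platforms flag a
        # failed connect as writable too; SO_ERROR settles it.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return "failed" if err else "connected"
    # Neither list contains the socket: select ran out of time.
    return "timeout"
```

Checking the exception list first mirrors the "if and only if" rule
above; the extra SO_ERROR check is a portability assumption added for
the sketch.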

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2004-08-02 20:07

Message:
Logged In: YES 
user_id=14198

Thanks Tim!  It looks like this patch works.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-08-02 10:31

Message:
Logged In: YES 
user_id=31435

I can confirm that Guido certainly didn't intend for a refused 
connection to wait for the timeout on Windows.  A problem is 
that the attempt to connect here isn't returning 
WSAECONNREFUSED on Windows, it's returning 
WSAEWOULDBLOCK.

If you set the default timeout back to None, the attempt to 
connect *does* return WSAECONNREFUSED on Windows.  
But for whatever reason, the Windows implementation of 
sockets appears to turn that into WSAEWOULDBLOCK if (and 
only if) the socket is in non-blocking mode.

The problem then is trying to guess some way to figure out 
whether WSAEWOULDBLOCK on a Windows non-blocking 
socket connect *means* "there's no chance this will ever 
succeed" or "I can't connect immediately, but maybe I can 
later".  It appears to mean both things <grrrrr>.

Note this:

>>> s = socket.socket()
>>> s.setblocking(0)
>>> s.connect(("www.python.org", 9999))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1, in connect
socket.error: (10035, 'The socket operation could not complete without blocking')

Now at this point, the code essentially does this:

>>> select.select([], [s], [], 10.0)
([], [], [])
>>>

and select waits 10 seconds before returning.

However, if we do this instead (I'm adding a non-empty 
"error/exception" list argument):

>>> select.select([], [s], [s], 10.0)
([], [], [<socket._socketobject object at 0x008EBA80>])
>>>

then it returns immediately, with the socket in the exception 
list.

So that's a clue.  How can we tell *what* error occurred?  
Hmm.  For the exception list, MS select docs say a socket will 
appear there when:

"If processing a connect call (nonblocking), connection 
attempt failed"

So the behavior so far matches the docs.  Later it says

"""
If a socket is processing a connect call (nonblocking), failure 
of the connect attempt is indicated in exceptfds (application 
must then call getsockopt SO_ERROR to determine the error 
value to describe why the failure occurred). This document 
does not define which other errors will be included.
"""

So there you go <wink>:  we have to add the socket to the 
select call's exception set.  Then the select call won't wait 
out the whole timeout when the connection is refused.  When it 
comes back, and there is an exception, we 
have to call getsockopt() with SO_ERROR to determine the 
cause.
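
The recipe above can be sketched in Python roughly as follows.  This is
an illustration under stated assumptions, not the patch itself: it
targets a modern Python 3 socket module, connect_with_timeout is a
hypothetical name, and it uses connect_ex() so the "would block" result
comes back as an error code instead of an exception.

```python
import errno
import os
import select
import socket

# Error codes meaning "connect started, not finished yet".
# WSAEWOULDBLOCK only exists in errno on Windows, hence the getattr.
_IN_PROGRESS = {
    errno.EINPROGRESS,
    errno.EWOULDBLOCK,
    getattr(errno, "WSAEWOULDBLOCK", errno.EWOULDBLOCK),
}


def connect_with_timeout(address, timeout):
    """Connect a TCP socket, failing fast on refusal (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        err = s.connect_ex(address)
        if err in _IN_PROGRESS:
            # Watch the socket in the exception set as well as the
            # writable set, so a refused connect wakes select early.
            _, writable, failed = select.select([], [s], [s], timeout)
            if not writable and not failed:
                raise socket.timeout("timed out")
            # SO_ERROR says why the connect failed (0 means it succeeded).
            err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err:
            raise OSError(err, os.strerror(err))
    except BaseException:
        s.close()
        raise
    s.setblocking(True)
    return s
```

With this shape, a refused connection surfaces as an OSError carrying
the refusal code, and only a genuinely unresponsive peer uses up the
timeout.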

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2004-08-02 09:38

Message:
Logged In: YES 
user_id=14198

Guido - it looks like this change was made by you in Rev
1.257.  Can you please confirm that the new behaviour is not
intended?  If so, I will try to dig a little deeper.

----------------------------------------------------------------------
