[Python-bugs-list] [ python-Bugs-591349 ] httplib throws a TypeError when the target host disconnects

noreply@sourceforge.net noreply@sourceforge.net
Wed, 25 Sep 2002 21:40:56 -0700


Bugs item #591349, was opened at 2002-08-05 19:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=591349&group_id=5470

Category: Python Library
Group: Python 2.1.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Rob Green (rbgrn)
Assigned to: Jeremy Hylton (jhylton)
Summary: httplib throws a TypeError when the target host disconnects

Initial Comment:
This output occurs on about 1 in 500 hits to any particular 
URL. So far I've only seen it when hitting servers running 
Apache 1.3.20, but I don't have enough data to limit it to 
that.

Python 2.1.2 (#1, Mar 16 2002, 18:24:08)
[GCC 2.95.3 [FreeBSD] 20010315 (release)] on freebsd4

h = httplib.HTTPConnection(host)
response = h.getresponse()
data = response.read()
  File "/usr/local/lib/python2.1/httplib.py", line 246, in read
    value = value + self._safe_read(chunk_left)
  File "/usr/local/lib/python2.1/httplib.py", line 314, in _safe_read
    chunk = self.fp.read(amt)
TypeError: an integer is required



----------------------------------------------------------------------

>Comment By: Rob Green (rbgrn)
Date: 2002-09-25 23:40

Message:
Logged In: YES 
user_id=590105

I have not tried the latest version, but I defined a test 
case that reproduces the problem. It takes under a minute 
to try, so please try it on your version, as I do not currently 
have time to check out and build it myself. Thank you.

Here's the test case again:

Run `nc -l -p 8888` on some terminal.

Then, in Python, have your test program connect to the IP of 
that terminal on port 8888 and send GET / HTTP/1.0 or whatever.

Then, on the terminal, hit Ctrl-C instead of handing back a 
response.

The test program should throw the exception if affected.
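
For anyone without netcat handy, the same steps can be sketched in a single script. This is only an illustration under some assumptions: it uses http.client (the Python 3 name for httplib) rather than the 2.1.2 module from the report, a local thread stands in for `nc` plus Ctrl-C, and the function names are made up for this example.

```python
import http.client
import socket
import threading


def serve_once_and_hang_up(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    conn.recv(65536)  # consume the request before closing
    conn.close()


def reproduce():
    """Return the exception the client sees, or None if nothing was raised."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(10)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=serve_once_and_hang_up, args=(listener,))
    server.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    try:
        client.request("GET", "/")
        client.getresponse().read()
    except Exception as exc:
        return exc
    finally:
        client.close()
        server.join()
        listener.close()
    return None


if __name__ == "__main__":
    print(repr(reproduce()))
```

On a recent Python 3 the result is expected to be a connection-level error (for example http.client.RemoteDisconnected) rather than the TypeError reported here against 2.1.2.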

Thanks

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-09-03 17:20

Message:
Logged In: YES 
user_id=31392

A bunch of questions.  You seem to have narrowed this
problem down a lot, but I'm not sure I understand the diagnosis.

First off, have you tried the latest version of the code
from CVS?  It has changed in several ways, so it would be
helpful if you could test with that version.  I've been
running a test driver for about an hour now without seeing
any errors caused by httplib.

The real problem I have is understanding what path through
the code leaves chunked set to a true value and chunk_left
set to something invalid.
The begin() method of an HTTPResponse always sets chunked to
1 or 0.  If it sets chunked to 1, it sets chunk_left to None.


----------------------------------------------------------------------

Comment By: Rob Green (rbgrn)
Date: 2002-08-07 23:14

Message:
Logged In: YES 
user_id=590105

Line 245 should be "if chunk_left is not _UNKNOWN:"
And the next block up to line 259 should be indented. This 
causes an IncompleteRead exception to be thrown, which is 
IMO the correct one in this case.

I attached a diff that patches the Python 2.1.2 httplib.
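
The effect of such a guard can be shown with a toy model. This is not the httplib source and not the attached diff: ToyResponse, its fields, and the IncompleteRead class below are hypothetical stand-ins, and only the _UNKNOWN name comes from the comment above. The sketch assumes chunk_left is still at the sentinel when read() is called, and it inverts the test ("is _UNKNOWN" instead of "is not _UNKNOWN") to keep the example short.

```python
import io

_UNKNOWN = "UNKNOWN"  # sentinel name from the comment; the value is an assumption


class IncompleteRead(Exception):
    """Hypothetical stand-in for httplib's IncompleteRead."""


class ToyResponse:
    """Toy model of a response whose state was never filled in."""

    def __init__(self, fp, guarded):
        self.fp = fp
        self.guarded = guarded
        self.chunk_left = _UNKNOWN  # the peer hung up before sending anything

    def read(self):
        if self.guarded and self.chunk_left is _UNKNOWN:
            # Guarded path: report the truncated response explicitly.
            raise IncompleteRead("no chunk size was ever received")
        # Unguarded path: the sentinel string is passed to fp.read().
        return self.fp.read(self.chunk_left)


def outcome(guarded):
    """Return the name of the exception raised by read(), if any."""
    try:
        ToyResponse(io.BytesIO(b""), guarded).read()
    except Exception as exc:
        return type(exc).__name__
    return "no error"
```

Without the guard, the file object receives a string where it wants an integer and raises TypeError; with the guard, the caller gets IncompleteRead instead.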

----------------------------------------------------------------------

Comment By: Rob Green (rbgrn)
Date: 2002-08-07 22:51

Message:
Logged In: YES 
user_id=590105

OK, I figured out what causes this. It's not a threading issue 
or anything like that. Basically, what happens is this:

When httplib connects to a server whose hostname is good and 
whose port is open and accepts the connection, but the server 
then disconnects immediately without sending any data, this 
exception is thrown.

I was able to reproduce it by running netcat -l -p <port> 
locally, starting an HTTPConnection to that port, and then 
killing netcat, which causes Python to throw the exception.

----------------------------------------------------------------------

Comment By: Rob Green (rbgrn)
Date: 2002-08-07 01:16

Message:
Logged In: YES 
user_id=590105

Ok I put together some test code that reproduced the bug in 
under an hour on my machine. Here it is...

----------------------------------------------------------------------

Comment By: Rob Green (rbgrn)
Date: 2002-08-07 00:14

Message:
Logged In: YES 
user_id=590105

I'd give you the URLs, but I don't think it matters that much: 
I've now seen this problem hitting 4 different machines, all 
running Linux/Apache. It's not easy for me to reproduce, 
because I only see the exception thrown once or maybe twice 
on a day when my daemon makes 20,000 hits. I suppose the way 
to reproduce it would be to take a list of URLs and have an 
app cycle through them, doing a GET every 5 seconds or so; 
eventually it should show up. To be more accurate, the app 
would probably have to be threaded as well, maybe with a 
thread for each URL that just opens an HTTPConnection every 
5 seconds.
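
A rough sketch of the polling harness described above might look like the following. It is written against http.client (the Python 3 name for httplib), and the function names, the (host, port, path) target format, and the failure record are all assumptions made for this illustration, not code from the original report.

```python
import http.client
import threading
import time


def poll(host, port, path, rounds, interval, failures):
    """GET one URL repeatedly and record any exception that is raised."""
    for _ in range(rounds):
        conn = http.client.HTTPConnection(host, port, timeout=10)
        try:
            conn.request("GET", path)
            conn.getresponse().read()
        except Exception as exc:
            failures.append((host, port, path, repr(exc)))
        finally:
            conn.close()
        time.sleep(interval)


def run(targets, rounds, interval=5.0):
    """Start one polling thread per (host, port, path) target and wait."""
    failures = []
    threads = [
        threading.Thread(
            target=poll, args=(host, port, path, rounds, interval, failures)
        )
        for host, port, path in targets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return failures
```

Calling run() with a list of targets and a large rounds value, then inspecting the returned failures, would give a record of which requests raised and with what exception.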

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2002-08-06 07:42

Message:
Logged In: YES 
user_id=31392

Can you provide any more information about what URLs cause
problems?  You could call set_debuglevel(1) to enable output
of all HTTP traffic.  Or just send a list of some of the URLs
that failed.
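
For reference, enabling the debug output looks roughly like this. The example uses http.client (the Python 3 name for httplib), and the host name is a placeholder, not one of the affected servers.

```python
import http.client

# "example.invalid" is a placeholder host; building the object does not connect.
conn = http.client.HTTPConnection("example.invalid", timeout=10)

# With a non-zero debug level, request and response traffic is printed
# to stdout once a request is actually made on this connection.
conn.set_debuglevel(1)
```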


----------------------------------------------------------------------
