[ python-Bugs-1486335 ] httplib: read/_read_chunked fails with ValueError sometimes

SourceForge.net noreply at sourceforge.net
Tue Aug 8 02:23:35 CEST 2006


Bugs item #1486335, was opened at 2006-05-11 10:14
Message generated for change (Comment added) made by jjlee
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1486335&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: kxroberto (kxroberto)
Assigned to: Greg Ward (gward)
Summary: httplib: read/_read_chunked fails with ValueError sometimes

Initial Comment:
This occasionally shows up in a logged trace when an
application crashes with ValueError on
http(s)_response.read():

(Python 2.3.5, but the relevant code is unchanged in the
current httplib.)


....
  File "socket.pyo", line 283, in read
  File "httplib.pyo", line 389, in read
  File "httplib.pyo", line 426, in _read_chunked
ValueError: invalid literal for int():

It's the line:

chunk_left = int(line, 16)


I don't know what this line is for. Still, it should be
protected: http_response.read() should not fail with
ValueError, only with IOError/EnvironmentError or
socket.error. Otherwise exception handling becomes
guesswork.

-Robert


Side note regarding IO exception handling: see also FR
#1481036 (IOBaseError): why is socket.error.__bases__
(<class exceptions.Exception at 0x011244E0>,)?


----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-08-08 01:23

Message:
Logged In: YES 
user_id=261020

I think it's only worth worrying about bad chunking that a)
has been observed in the wild (though not necessarily by us)
and b) popular browsers can cope with.

Greg: """If there is an error here, it's at EOF, so it's not
that big a deal."""

That's only if the response will be closed at the end of the
current transaction.  Quoting from 1411097:

"""if the connection will not close at the end of the
transaction, the behaviour should not change from what's
currently in SVN (we should not assume that the chunked
response has ended unless we see the proper terminating
CRLF)."""

Perhaps we don't need to be quite as strict as that, but the
point is that otherwise, how do we know the server hasn't
already sent that last CRLF, and that it will turn up in
three weeks' time?-)  If that happens, not sure exactly how
httplib will treat the CRLF and possible chunked encoding
trailers, but I suspect something bad happens.  Perhaps we
could just always close the connection in this case?
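
One way to picture the "always close" idea is the sketch below. It is
only an illustration written for a current Python interpreter, not
httplib's code: the function name read_chunked_lenient and the
must_close flag are hypothetical. It keeps whatever body was decoded
when a chunk header is missing or malformed, and reports that the
connection should not be reused.

```python
import io


def read_chunked_lenient(fp):
    """Return (body, must_close) for a chunked body read from fp.

    Hypothetical helper, not part of httplib. must_close is True when
    the body ended without a valid terminating chunk, so the caller
    should close the connection instead of reusing it.
    """
    parts = []
    while True:
        line = fp.readline()
        # Ignore any chunk extension after ';' before parsing the size.
        size_field = line.split(b";", 1)[0].strip()
        try:
            chunk_left = int(size_field, 16)
            if chunk_left < 0:
                raise ValueError("negative chunk size")
        except ValueError:
            # Missing or malformed chunk header: keep the data read so
            # far, but flag the connection state as unknown.
            return b"".join(parts), True
        if chunk_left == 0:
            # Skip optional trailer lines up to the blank line (or EOF).
            while fp.readline() not in (b"\r\n", b"\n", b""):
                pass
            return b"".join(parts), False
        parts.append(fp.read(chunk_left))
        fp.readline()  # consume the CRLF that follows the chunk data


# A body that ends with a blank line instead of a "0" chunk.
body, must_close = read_chunked_lenient(io.BytesIO(b"5\r\nhello\r\n\r\n"))
print(body, must_close)
```

Whether a caller may ever leave the connection open after must_close
is the open question raised above; the sketch simply surfaces the
decision instead of making it.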

I'm not confident I know yet how best to fix these issues.
I just tried reading curl's transfer.c and http_chunks.c.  I
discovered only that I have to be fully awake to read a
1200-line function :-/


----------------------------------------------------------------------

Comment By: Greg Ward (gward)
Date: 2006-07-26 03:13

Message:
Logged In: YES 
user_id=14422

OK, I've been working on this some more and I have a very
crude addition to test_httplib.py.  I'm going to attach it
here and solicit feedback on python-dev: I'm not sure how
many kinds of bad response chunking I really want to worry
about.  

----------------------------------------------------------------------

Comment By: Greg Ward (gward)
Date: 2006-07-24 20:38

Message:
Logged In: YES 
user_id=14422

I'm seeing this with Python 2.3.5 and 2.4.3 hitting a PHP
app and getting a large error page.  It looks as though the
server is incorrectly chunking the response: lwp-request at
least gives a better error message than httplib.py:

  $ GET "http://..."
  500 EOF when chunk header expected

I'm unclear on precisely what the server is doing wrong. 
The response looks like this:

HTTP/1.1 200 OK
Date: Mon, 24 Jul 2006 19:18:47 GMT
Server: Apache/2.0.54 (Fedora)
X-Powered-By: PHP/4.3.11
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

2169\r\n
\r\n
[...first 0x2169 bytes of response...]\r\n
20b2\r\n
[...next 0x20b2 bytes...]
[...repeat many times...]
20b2\r\n
[...the last 0x20b2 bytes...]
\r\n

The blank line at EOF appears to be confusing httplib.py: it
bombs because

  int('', 16)

raises ValueError.
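
The failure is easy to reproduce in isolation. A minimal sketch,
assuming a current Python interpreter; the helper name
parse_chunk_size is hypothetical and only wraps the statement from the
traceback:

```python
def parse_chunk_size(line):
    """Parse a chunk-size line the same way the failing statement does."""
    return int(line, 16)


# A well-formed size line such as the "20b2" in the response above.
print(parse_chunk_size("20b2"))  # 8370

# The blank line at EOF, after its CRLF has been stripped.
try:
    parse_chunk_size("")
except ValueError as exc:
    print("ValueError:", exc)
```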

Observation #1: if this is indeed a protocol error (ie. the
server is in the wrong), httplib.py should turn the
ValueError into an HTTPException.  Perhaps it should define
a new exception class for low-level protocol errors (bad
chunking).  Maybe it should reuse IncompleteRead.
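
A rough sketch of what Observation #1 could look like, written for a
current Python interpreter (where the module is http.client rather than
httplib). The exception class BadChunkHeader and the function
read_chunked are hypothetical names for illustration only; this is not
the code in httplib, and it ignores trailers.

```python
import io
from http.client import HTTPException  # httplib.HTTPException on Python 2


class BadChunkHeader(HTTPException):
    """Hypothetical exception for a malformed chunk-size line."""


def read_chunked(fp):
    """Decode a chunked body from a binary file-like object (sketch)."""
    parts = []
    while True:
        line = fp.readline()
        # Ignore any chunk extension after ';' before parsing the size.
        size_field = line.split(b";", 1)[0].strip()
        try:
            chunk_left = int(size_field, 16)
        except ValueError:
            # Report a protocol-level error instead of a bare ValueError.
            raise BadChunkHeader("bad chunk size line: %r" % (line,))
        if chunk_left == 0:
            break  # last chunk; trailers are not handled in this sketch
        parts.append(fp.read(chunk_left))
        fp.readline()  # consume the CRLF that follows the chunk data
    return b"".join(parts)


print(read_chunked(io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n")))

# A body that ends with a blank line where a chunk header is expected.
try:
    read_chunked(io.BytesIO(b"5\r\nhello\r\n\r\n"))
except HTTPException as exc:
    print("HTTPException:", exc)
```

With this shape, callers that already catch HTTPException around
read() would not need a separate except clause for ValueError.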

Observation #2: gee, my web browser doesn't barf on this
response, so why should httplib.py?  If there is an error
here, it's at EOF, so it's not that big a deal.

----------------------------------------------------------------------
