httplib raises ValueError reading chunked content

philip20060308 at gmail.com
Thu Mar 9 00:21:11 CET 2006


Hi all,
Has anyone ever seen Python 2.4.1's httplib choke when reading chunked
content? I'm using it via urllib2, and I ran into a particular server
that returns something that httplib doesn't expect. Specifically, at
the point in the traceback below where the error occurs, line == ''.

Python 2.4.1 (#2, Oct 12 2005, 01:36:32)
[GCC 3.4.4 [FreeBSD] 20050518] on freebsd6
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> req = urllib2.Request("http://www.mistyshaven.com/")
>>> f = urllib2.urlopen(req)
>>> content = f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/socket.py", line 285, in read
    data = self._sock.recv(recv_size)
  File "/usr/local/lib/python2.4/httplib.py", line 456, in read
    return self._read_chunked(amt)
  File "/usr/local/lib/python2.4/httplib.py", line 495, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int():
>>>

I'm running Python 2.4.1 under FreeBSD 6.0. Interestingly, I can't
recreate the problem using Python 2.3 under OS X.

I've done a little digging for clues. First, the response headers
include:
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322

I reckon that if a server as popular as ASP.NET were sending out broken
chunked content, it'd be a well-known problem, but that doesn't seem to
be the case. So I assume (big assumption) that it is sending correct
responses. Another clue is that the content fits all in one chunk.
Under my 2.3 installation (where I can fetch the content successfully),
len(content) == 0x303. The first chunk size reported by the server is
0x311, so I guess that adds up when one adds a fudge factor for \r\n
and so forth.
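
To make the size comparison concrete, here is a quick check of the two
hex values quoted above. Only the two numbers come from the session;
the variable names are made up for illustration. Note that in the HTTP/1.1
chunked coding the chunk size counts only the chunk data, not the CRLF
that follows it, so what the 14-byte gap actually consists of is
still an open question.

```python
# Sizes quoted above: body length seen under Python 2.3, and the first
# chunk size reported by the server (both hexadecimal).
content_len = 0x303
first_chunk = 0x311

# Show both values in decimal along with the gap between them.
print(content_len, first_chunk, first_chunk - content_len)
# prints: 771 785 14
```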

My guess is that httplib is somehow reading the blank line that
signifies the end of chunked content as part of the content. I don't
know enough about debugging HTTP conversations to go any further. Can
anyone at least confirm the problem elsewhere?
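
For anyone who wants to poke at this without the live server, below is
a minimal sketch of a chunked-body reader, written for Python 3 rather
than 2.4 and not taken from httplib itself. The function name
read_chunked and the sample byte strings are hypothetical. It shows one
way (an assumption, not a diagnosis) to end up with an empty size line:
the stream ends after a chunk without the terminating zero-size chunk.

```python
import io


def read_chunked(fp):
    """Decode a chunked body from a binary file-like object (sketch only)."""
    parts = []
    while True:
        line = fp.readline()
        # Drop any chunk extension after ';' and surrounding whitespace.
        size_field = line.split(b";", 1)[0].strip().decode("ascii")
        # An empty size line (e.g. end of stream) raises ValueError here.
        chunk_size = int(size_field, 16)
        if chunk_size == 0:
            break
        parts.append(fp.read(chunk_size))
        fp.read(2)  # consume the CRLF that follows the chunk data
    # Skip optional trailer lines up to the closing blank line.
    while fp.readline() not in (b"\r\n", b"\n", b""):
        pass
    return b"".join(parts)


# Well-formed body: one 5-byte chunk, then the zero-size terminator.
print(read_chunked(io.BytesIO(b"5\r\nhello\r\n0\r\n\r\n")))

# Hypothetical bad case: the stream stops right after the first chunk.
try:
    read_chunked(io.BytesIO(b"5\r\nhello\r\n"))
except ValueError as exc:
    print("ValueError:", exc)
```

The second call fails at the int(..., 16) step because readline()
returns an empty string, which matches the line == '' symptom in the
traceback. Whether this server really omits the terminator would still
need a packet capture to confirm.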

Thanks
Philip

PS - The email address with which this was posted is live; you can also
email Philip Semanchuk: my first name @ my last name .com



