[Python-Dev] Python 3.0 urllib fails with chunked HTTP responses

Jeremy Hylton jeremy at alum.mit.edu
Thu Dec 18 14:22:29 CET 2008


On Wed, Dec 17, 2008 at 1:05 PM, Guido van Rossum <guido at python.org> wrote:
> The inheritance from io.RawIOBase seems fine.

There is a small problem with the interaction between HTTPResponse and
RawIOBase, but I think the problem is more on the http side.  You may
recall that the HTTP code has a habit of closing the connection for
you.  In a variety of cases, once you've read the last bytes of the
response, the HTTPResponse object calls its own close() method.  This
interacts poorly with RawIOBase, because it raises a ValueError for
any operation on a closed io object.  This prevents iterators from
working correctly.  The iterator implementation expects the final call
to readline() to return an empty string and converts that to a
StopIteration.  Instead, it's seeing a ValueError that propagates out.
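
A minimal sketch of that failure mode, not the actual http.client code: SelfClosingBody and drain are hypothetical names, and an io.BytesIO stands in for the socket file. The body closes itself as soon as the last bytes have been read, so the extra readline() call the iterator makes to detect the end of data hits a closed object.

```python
import io


class SelfClosingBody:
    """Hypothetical stand-in for a response that closes itself at end of data."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self._size = len(data)

    def readline(self):
        # io.BytesIO raises ValueError here once it has been closed.
        line = self._buf.readline()
        if self._buf.tell() == self._size:
            # Mimic the self-closing habit: close right after the last bytes.
            self._buf.close()
        return line

    def __iter__(self):
        return self

    def __next__(self):
        # The iterator expects a final empty result to signal the end.
        line = self.readline()
        if not line:
            raise StopIteration
        return line


def drain(body):
    """Collect lines and report any ValueError that escapes the iteration."""
    lines = []
    try:
        for line in body:
            lines.append(line)
    except ValueError as exc:
        return lines, exc
    return lines, None


lines, error = drain(SelfClosingBody(b"a\nb\n"))
```

In this sketch both lines are delivered, but the loop ends with a ValueError rather than a clean StopIteration, because the call that should have returned an empty result runs after the object has closed itself.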

It's always been odd to me that the connection closed itself.  It's
going to be tricky to fix the current bug (chunked responses) and keep
the self-closing behavior, but I worry that changing the self-closing
behavior too dramatically isn't appropriate for a bug fix.  Will look
some more at this tomorrow.

Jeremy

> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>
> On Mon, Dec 15, 2008 at 11:19 AM, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>> I have a patch that appears to fix this bug
>> http://bugs.python.org/file12361/urllib-chunked.diff
>> but I'm not sure about its interaction with the io module and
>> RawIOBase.  Is there a new IO expert who could take a look at it for
>> me?
>>
>> Jeremy
>>
>> On Sun, Dec 14, 2008 at 11:06 PM, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>>> This bug is pretty serious, because urllib will insert garbage into
>>> the application-visible data for a chunked response.  It simply
>>> ignores the fact that it's reading a chunked response and includes the
>>> chunked header data as payload data.  The original bug was reported in
>>> September, but no one noticed it.  It was reported again recently.
>>>
>>> http://bugs.python.org/issue3761
>>> http://bugs.python.org/issue4631
>>>
>>> I suspect we'd want to get a 3.0.1 out as soon as this is fixed, but
>>> that's not my call.
>>>
>>> Jeremy
>>>