[Python-Dev] Python 3.0 urllib fails with chunked HTTP responses
Guido van Rossum
guido at python.org
Thu Dec 18 18:27:42 CET 2008
It sounds like the self-closing is an implementation detail, meant to
make sure the socket is closed as early as possible (which I suppose
is a good thing if there's a server waiting for the final ACK on the
other side). Perhaps it should not use close() but something slightly
lower level that affects the socket directly?
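One way to picture that idea, as a rough sketch only and not the actual http.client code: a reader that drops its underlying file object once the body is exhausted, without marking itself closed. The names ReleasingReader, fp and length are hypothetical, and an in-memory io.BytesIO stands in for the socket file.

```python
import io


class ReleasingReader(io.RawIOBase):
    """Hypothetical reader: releases its underlying file at end of body,
    but does not call its own close()."""

    def __init__(self, fp, length):
        super().__init__()
        self._fp = fp              # stand-in for the socket's file object
        self._remaining = length   # body bytes still expected

    def readable(self):
        return True

    def readinto(self, b):
        if self._fp is None:
            return 0               # already released: report EOF, do not raise
        want = min(len(b), self._remaining)
        data = self._fp.read(want)
        n = len(data)
        b[:n] = data
        self._remaining -= n
        if n < want or self._remaining == 0:
            self._release()
        return n

    def _release(self):
        # Close only the lower-level object; this reader itself stays "open".
        fp, self._fp = self._fp, None
        fp.close()


raw = io.BytesIO(b"a\nb\n")
reader = ReleasingReader(raw, 4)
lines = list(reader)
```

Here the final readline() returns an empty bytes object, so iteration ends normally even though the underlying file has already been closed.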
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, Dec 18, 2008 at 5:22 AM, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On Wed, Dec 17, 2008 at 1:05 PM, Guido van Rossum <guido at python.org> wrote:
>> The inheritance from io.RawIOBase seems fine.
> There is a small problem with the interaction between HTTPResponse and
> RawIOBase, but I think the problem is more on the http side. You may
> recall that the HTTP code has a habit of closing the connection for
> you. In a variety of cases, once you've read the last bytes of the
> response, the HTTPResponse object calls its own close() method. This
> interacts poorly with RawIOBase, because it raises a ValueError for
> any operation on a closed io object. This prevents iterators from
> working correctly. The iterator implementation expects the final call
> to readline() to return an empty string and converts that to a
> StopIteration. Instead, it's seeing a ValueError that propagates out.
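The interaction described above can be reproduced in miniature. This is a sketch under stated assumptions, not the HTTPResponse code: SelfClosingReader and collect_lines are hypothetical names, the data comes from an in-memory buffer, and the closed check in readinto() is written out explicitly to mimic the closed-object error.

```python
import io


class SelfClosingReader(io.RawIOBase):
    """Hypothetical reader that calls its own close() as soon as the
    last byte has been handed out."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)
        self._size = len(data)

    def readable(self):
        return True

    def readinto(self, b):
        if self.closed:
            # Assumed behavior: any read on a closed object raises.
            raise ValueError("I/O operation on closed file.")
        n = self._buf.readinto(b)
        if self._buf.tell() == self._size:
            self.close()           # self-closing after the final bytes
        return n


def collect_lines(reader):
    """Iterate over reader; return the lines seen and any ValueError."""
    lines = []
    try:
        for line in reader:
            lines.append(line)
    except ValueError as exc:
        return lines, exc
    return lines, None


lines, error = collect_lines(SelfClosingReader(b"a\nb\n"))
```

Both lines are delivered, but the loop ends with a ValueError instead of a clean StopIteration, because the extra readline() that should signal end of data hits the closed object.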
> It's always been odd to me that the connection closed itself. It's
> going to be tricky to fix the current bug (chunked responses) and keep
> the self-closing behavior, but I worry that changing the self-closing
> behavior too dramatically isn't appropriate for a bug fix. Will look
> some more at this tomorrow.
>> On Mon, Dec 15, 2008 at 11:19 AM, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>>> I have a patch that appears to fix this bug,
>>> but I'm not sure about its interaction with the io module and
>>> RawIOBase. Is there a new IO expert who could take a look at it for me?
>>> On Sun, Dec 14, 2008 at 11:06 PM, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
>>>> This bug is pretty serious, because urllib will insert garbage into
>>>> the application-visible data for a chunked response. It simply
>>>> ignores the fact that it's reading a chunked response and includes the
>>>> chunked header data as payload data. The original bug was reported in
>>>> September, but no one noticed it. It was reported again recently.
>>>> I suspect we'd want to get a 3.0.1 out as soon as this is fixed, but
>>>> that's not my call.
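For readers unfamiliar with the framing involved, here is a minimal sketch of decoding a chunked body, assuming a well-formed response and a blocking file-like object. decode_chunked is a hypothetical helper for illustration, not the patch under discussion; it ignores chunk extensions and discards trailer headers.

```python
import io


def decode_chunked(fp):
    """Return the payload of a chunked body read from file-like fp."""
    body = bytearray()
    while True:
        size_line = fp.readline()
        # The chunk size is hexadecimal; anything after ';' is an extension.
        size_field = size_line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break                  # last-chunk marker
        body += fp.read(size)      # the payload bytes of this chunk
        fp.read(2)                 # CRLF that terminates the chunk data
    # Skip optional trailer headers up to the terminating blank line.
    while fp.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
payload = decode_chunked(io.BytesIO(raw))
```

A reader that skips this step and returns raw unchanged would hand the size lines ("4", "5", "0") to the application mixed in with the payload, which is the garbage described above.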