Re: [Python-Dev] how to debug httplib slowness

4 Sep 2009

Simon Cross wrote:
> ...
> Well, since the source for _read_chunked includes the comment
>
>     # XXX This accumulates chunks by repeated string concatenation,
>     # which is not efficient as the number or size of chunks gets big.
>
> you might gain some speed improvement with minimal effort by gathering
> the read data chunks into a list and then returning "".join(chunks) at
> the end.

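For illustration, here is a minimal sketch of the list-and-join approach suggested above. It is not httplib's actual _read_chunked: the function name `read_chunked` and the sample body are hypothetical, it is written for Python 3 (bytes) rather than the Python 2 httplib discussed in this thread, and it ignores trailer headers.

```python
import io


def read_chunked(fp):
    """Decode a chunked HTTP body from a binary file-like object.

    Hypothetical helper for illustration only; not httplib's code.
    """
    parts = []  # collect chunks here instead of concatenating strings
    while True:
        size_line = fp.readline()
        if not size_line:
            raise EOFError("stream ended before the terminating 0-size chunk")
        # Drop any chunk extension after ';' and parse the hex chunk size.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        parts.append(fp.read(size))
        fp.read(2)  # discard the CRLF that terminates each chunk
    # Trailer headers, if any, are left unread in this sketch.
    return b"".join(parts)  # one join at the end instead of repeated +=


# Made-up sample body with two chunks followed by the terminating chunk.
sample = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
body = read_chunked(sample)
```

The point of the design is that each chunk is appended to a list in constant time and the full body is assembled by a single join, rather than building a new, longer string for every chunk.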
True, I'll try that and report back. More interestingly, though, I did
some analysis with Wireshark (only 200MB-odd of .pcap logs, that was
fun ;-) to see how the two HTTP conversations differ, and noticed
something else interesting...

So, httplib does this:

GET /<blah> HTTP/1.1
Host: <blah>
Accept-Encoding: identity
Authorization: Basic <blah>

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:44:22 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Transfer-Encoding: chunked

While wget does this:

<snip 401 conversation>
GET /<blah> HTTP/1.0
User-Agent: Wget/1.11.4
Accept: */*
Host: <blah>
Connection: Keep-Alive
Authorization: Basic <blah>

HTTP/1.1 200 OK
Date: Fri, 04 Sep 2009 11:35:19 GMT
Server: Apache-Coyote/1.1
ContentLength: 116245504
Content-Type: application/vnd.excel
Connection: close

Interesting points:

- Apache in this instance responds with HTTP/1.1, even though the wget
request was HTTP/1.0. Is that allowed?

- Apache responds with a chunked response only to httplib. Why is that?

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

Chris Withers