[python-uk] urllib latency
matth at netsight.co.uk
Mon Dec 20 15:41:12 CET 2010
On 20 Dec 2010, at 14:15, Alex Willmer wrote:
> On 20 December 2010 13:54, Matt Hamilton <matth at netsight.co.uk> wrote:
> Anyone know why urllib.urlopen() can be so much slower than using ab to do the same thing? I seem to be getting an extra 100ms latency on a simple HTTP GET request of a static, small image.
> Just some possibles:
> - How many DNS lookups is each doing? Have you timed it by IP address, or with example.com in the hosts file?
Neither is doing DNS lookups; the DNS entry is cached. Adding it to /etc/hosts makes no difference either.
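For anyone wanting to rule name resolution out in isolation, here is a minimal sketch in modern Python 3 (not the Python 2 urllib discussed in this thread). The helper name time_lookup is hypothetical; it only times the standard socket.getaddrinfo() call, so it shows resolver cost as seen by Python and nothing else.

```python
import socket
import time


def time_lookup(host, port=80):
    """Return the seconds spent resolving host via getaddrinfo()."""
    start = time.perf_counter()
    # Restrict to TCP so the resolver does not return duplicate entries
    # for other socket types.
    socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return time.perf_counter() - start
```

Calling time_lookup() once with the hostname and once with the literal IP address, a few times each, should give near-identical numbers if the entry really is cached or present in /etc/hosts.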
> - I understand ab isn't reusing the HTTP connection - could it be reusing the TCP connection or avoiding the 3-way handshake? (My understanding of repeated HTTP requests is sketchy)
Indeed, neither is using HTTP keepalive, and from what I can see in Wireshark, both are creating and closing a TCP connection for each request.
> - Are ab and urllib transferring the same number of bytes? Is one or the other (not) using compression?
Yes, same number of bytes and same number of packets: 1514+67+1514+356 (bytes).
> - Is ab somehow causing fewer, larger packets to be used in either the request or the response?
Nope, same number.
> I'd probably be reaching for a packet capture about now.
That is what I'm doing... but it is getting slightly beyond me now in terms of how TCP closes connections. I'm not really sure how best to share the capture data, but let me try to summarize the flow (s = server, c = client):
s -> c last packet (shows as HTTP 200)
c -> s ack
s -> c fin,ack
c -> s ack
c -> s fin,ack
s -> c ack
Actually, I was wrong in my previous post... the close goes quite quickly. It is *the following SYN* from the client, opening the next TCP connection, that seems to take 100ms.
I'm off to go dig in the urllib code and see if I can see anything there. I'm wondering if urllib is taking some time to process the data after it receives it before doing anything.
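One way to test that hypothesis without a packet capture is to time the phases of a request from inside Python and compare a bare-socket GET with the library call. The following is a rough sketch under stated assumptions: it is written for Python 3 (urllib.request rather than the Python 2 urllib.urlopen used in this thread), the function names time_raw_get and time_urlopen are hypothetical, and proxies are disabled explicitly so that only the direct path is measured.

```python
import socket
import time
from urllib.request import ProxyHandler, build_opener


def time_raw_get(host, port, path="/"):
    """Time connect, transfer, and close for one HTTP/1.0 GET on a raw socket."""
    t0 = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=10)
    t1 = time.perf_counter()
    request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
    sock.sendall(request.encode("ascii"))
    received = b""
    while True:
        # HTTP/1.0 without keepalive: the server closes when it is done,
        # which shows up here as an empty read.
        chunk = sock.recv(65536)
        if not chunk:
            break
        received += chunk
    t2 = time.perf_counter()
    sock.close()
    t3 = time.perf_counter()
    return {
        "connect": t1 - t0,
        "transfer": t2 - t1,
        "close": t3 - t2,
        "bytes": len(received),  # headers plus body
    }


def time_urlopen(url):
    """Time one complete fetch through urllib; returns (seconds, body length)."""
    opener = build_opener(ProxyHandler({}))  # empty mapping: ignore any proxy settings
    t0 = time.perf_counter()
    with opener.open(url, timeout=10) as response:
        body = response.read()
    return time.perf_counter() - t0, len(body)
```

Running each function in a loop against the same URL and comparing the totals should show whether the extra time sits in the library or lower down. If a delayed SYN is the cause, it would be expected to appear in the "connect" figure of the raw-socket version as well, which would point away from urllib's own processing.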
This is on OS X, but I'm going to try it on a FreeBSD box, as I can then use ktrace to see what might be happening.
Matt Hamilton matth at netsight.co.uk
Netsight Internet Solutions, Ltd. Business Vision on the Internet
http://www.netsight.co.uk +44 (0)117 9090901
Web Design | Zope/Plone Development and Consulting | Co-location | Hosting