Simple Python web proxy stalls for some web sites

bryanjugglercryptographer at bryanjugglercryptographer at
Thu Oct 7 23:10:35 CEST 2004

Carl Waldbieser wrote:
> I have written a simple web proxy using the Python standard library
> BaseHTTPRequestHandler.
> Some web sites work fine (e.g.  However, some web
> sites simply seem to stall indefinitely (e.g.  If I use
> the same browser to connect directly to the Internet, the site comes
> up almost immediately.
> If anybody has any ideas about why this happens, or any coding
> mistakes I may have made, I would appreciate the feedback.

>             f = urllib2.urlopen(request)
>             print "Reading..."
>             s =

This is trying to read until the connection closes, but it's an
HTTP/1.1 connection (and Google usually even sends "Connection:
Keep-Alive"), so the server won't close it after responding to this
one request, and the read blocks.  When the response carries a
"Content-Length" header, that header tells you how many bytes of
body to read.
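
As a rough illustration of reading by length instead of reading to EOF,
here is a minimal sketch. It assumes Python 3, a binary file-like object
for the connection (for example from socket.makefile("rb")), and headers
already parsed into a dict with lower-case keys. The name read_body and
the 8192-byte block size are hypothetical choices, not anything from the
original proxy.

```python
import io


def read_body(stream, headers, block_size=8192):
    """Read exactly Content-Length bytes from a binary file-like object.

    `headers` is assumed to be a dict with lower-case header names.
    """
    remaining = int(headers["content-length"])
    pieces = []
    while remaining > 0:
        data = stream.read(min(remaining, block_size))
        if not data:
            # The peer closed the connection before sending the full body.
            raise EOFError("%d bytes still expected" % remaining)
        pieces.append(data)
        remaining -= len(data)
    return b"".join(pieces)


# Usage with an in-memory stand-in for the connection.
demo = io.BytesIO(b"hello worldNEXT RESPONSE")
body = read_body(demo, {"content-length": "11"})
```

The point is that the read stops after the declared number of bytes, so
a keep-alive connection that stays open no longer stalls the proxy.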

Google HTTP/1.1 query results are even trickier; they typically
come back with "Transfer-Encoding: chunked", and, if you sent
the right Accept-Encoding header, will also usually have
"Content-Encoding: gzip".
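
To give an idea of what handling such a response involves, here is a
hedged sketch that reads a chunked body and then undoes the gzip coding.
It assumes Python 3 and a binary file-like object positioned at the start
of the body. read_chunked is a hypothetical helper name, and the sketch
discards any trailer headers rather than processing them. A real proxy
might forward the chunks unchanged instead of decoding them, but it still
has to parse the chunk sizes to know where the response ends.

```python
import gzip
import io


def read_chunked(stream):
    """Read a body sent with "Transfer-Encoding: chunked"."""
    pieces = []
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("connection closed inside chunked body")
        # The chunk size is hexadecimal; ignore any ";extension" part.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("connection closed inside a chunk")
        pieces.append(data)
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip optional trailer headers up to the terminating blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return b"".join(pieces)


# Usage: build a two-chunk gzip response body in memory and decode it.
payload = gzip.compress(b"hello proxy")
first, second = payload[:5], payload[5:]
wire = (b"%x\r\n" % len(first) + first + b"\r\n" +
        b"%x\r\n" % len(second) + second + b"\r\n" +
        b"0\r\n\r\n")
raw = read_chunked(io.BytesIO(wire))
text = gzip.decompress(raw)  # only when Content-Encoding is gzip
```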

See RFC 2616 for the requirements to be a true HTTP/1.1 proxy.
BaseHTTPRequestHandler and urllib2 are not really up to the job.
