Trouble with proxies

Jeremy Hylton jeremy at cnri.reston.va.us
Mon May 3 22:07:00 EDT 1999


So here's the problem: there is a bug in WinProxy.  It returns
the 403 error you are getting whenever the initial packet from the
client does not contain a full HTTP request.  I installed a copy
locally and could easily reproduce the problem; you can check yourself
by telnetting directly to the proxy server and typing an HTTP
request -- 'GET http://www.dejanews.com/ HTTP/1.0'.  As soon as you
hit the carriage return, you'll get the error.
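
The telnet check can also be scripted.  What follows is only a
sketch in present-day Python 3, not part of the original setup: the
name probe_partial_request is made up for illustration, and so is
the proxy address in the note after the code.  It writes the request
line alone, with no blank line after it, and returns whatever comes
back.

```python
import socket

REQUEST_LINE = b"GET http://www.dejanews.com/ HTTP/1.0\r\n"

def probe_partial_request(sock, request_line=REQUEST_LINE, timeout=5.0):
    """Send only the request line on an already connected socket and
    return the peer's reply, or b"" if it stays silent until the timeout."""
    sock.settimeout(timeout)
    sock.sendall(request_line)   # no headers, no blank line: request is incomplete
    try:
        return sock.recv(4096)
    except socket.timeout:
        return b""               # peer is still waiting for the rest
```

To try it against a real proxy you would pass in something like
socket.create_connection(("proxy.example", 8080)), where the host and
port are placeholders.  A proxy that waits for the full request should
give an empty result; going by the behavior described above, the buggy
one should answer with the 403 right away.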

This is definitely a problem with the proxy, and they ought to fix it.
On my machine, Netscape sends the whole request in the first packet,
so it doesn't have a problem.  Python triggers the bug because it
sends each line of the request separately.  Sending the request that
way is inefficient, but not incorrect.  (I think there was a thread
about this behavior in the newsgroup a while back, but I can't think
of the right search terms to turn it up.)
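
To make the difference concrete, here is a small Python 3 sketch of
the two sending patterns.  RecordingSocket is a hypothetical stand-in
that only records what it is handed, and the User-Agent header is
invented for the example.  Keep in mind that one send() call is not
guaranteed to become exactly one packet on the wire; the sketch only
shows how many separate writes the client makes.

```python
class RecordingSocket:
    """Stand-in for a socket: remembers each chunk passed to send()."""
    def __init__(self):
        self.writes = []

    def send(self, data):
        self.writes.append(data)
        return len(data)

REQUEST = [
    b"GET http://www.dejanews.com/ HTTP/1.0\r\n",
    b"User-Agent: example\r\n",   # made-up header for illustration
    b"\r\n",                      # blank line that ends the headers
]

def send_line_by_line(sock, lines):
    # One write per line: the first write holds only the request line.
    for line in lines:
        sock.send(line)

def send_in_one_write(sock, lines):
    # One write holding the complete request.
    sock.send(b"".join(lines))
```

With the first pattern the initial write never contains the
'\r\n\r\n' terminator, which is the situation the proxy mishandles.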

You can work around the bug, if you must, by modifying httplib.  There
isn't any particularly clean solution, but here's an example of an
httplib.HTTP subclass that delays sending a request until one of the
following happens: (1) it sees the '\r\n\r\n' that ends the headers,
(2) it has buffered at least 1K of data, or (3) the send is
explicitly forced.

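import httplib
import string

class HTTP(httplib.HTTP):
    def __init__(self, host='', port=0):
        httplib.HTTP.__init__(self, host, port)
        self.__buf = ''         # data accepted but not yet written

    def send(self, str, force=0):
        # Buffer the data; write it out once the buffer holds the
        # end-of-headers marker, has reached 1K, or the caller forces it.
        # Note that data sent after the headers (a POST body, say) is
        # held back by the same rules unless force is true.
        self.__buf = self.__buf + str
        if (string.find(self.__buf, '\r\n\r\n') != -1) \
           or (len(self.__buf) >= 1024) \
           or force:
            # Log what actually goes out, not just the latest piece.
            if self.debuglevel > 0: print 'send:', `self.__buf`
            self.sock.send(self.__buf)
            self.__buf = ''

    def endheaders(self):
        # The blank line ends the headers; flush everything now.
        self.send('\r\n', 1)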
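
The class above is written for the httplib of the day.  In Python 3
httplib became http.client and the old httplib.HTTP class is gone, so
here is a rough sketch of the same buffering rule on its own, for
anyone who wants to experiment with it.  BufferedSender, HEADER_END
and LIMIT are names invented for this sketch; it wraps any object
that has a sendall() method.

```python
class BufferedSender:
    """Hold back writes until the headers are complete, at least
    LIMIT bytes have piled up, or the caller forces a flush."""

    HEADER_END = b"\r\n\r\n"
    LIMIT = 1024

    def __init__(self, sock):
        self.sock = sock        # anything with a sendall(bytes) method
        self._buf = b""

    def send(self, data, force=False):
        self._buf += data
        if (self.HEADER_END in self._buf
                or len(self._buf) >= self.LIMIT
                or force):
            self.sock.sendall(self._buf)   # one write for everything buffered
            self._buf = b""
```

As in the original, anything written after the headers stays in the
buffer until one of the three conditions holds, so a short request
body needs an explicit send(b"", force=True) to go out.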


If you wire your urllib to use this http implementation, the proxy
bug should remain safely hidden.

Jeremy







