Making a socket connection via a proxy server

Alan Kennedy alanmk at hotmail.com
Fri Jul 30 19:28:09 CEST 2004


[Fuzzyman]

> In a nutshell - the question I'm asking is, how do I make a socket
> connection go via a proxy server?
> All our internet traffic has to go through a proxy-server at location
> 'dav-serv:8080' and I need to make a socket connection through it.

> I am hacking "Tiny HTTP Proxy" by SUZUKI Hisao to make an http proxy
> that modifies URLs. I haven't got very far - having started from zero
> knowledge of the 'hyper text transfer protocol'.
> 
> It looks like the Tiny HTTP Proxy (using BaseHTTPServer as its
> foundation) intercepts all requests to local addresses and then
> re-implements the request (whether it is CONNECT, GET, PUT or
> whatever). It logs everything that goes through it - I will simply
> edit it to amend the URL that is being asked for.

Yes, that is exactly what the proxy should do. It relays requests 
between client and server. However, there is one vital detail you're 
probably missing that is preventing you from chaining client + proxy*N 
+ server together.

When sending an HTTP GET request directly to a server, a client sends a
request line containing a URI without a server component. This is
because the socket connection to the server is already formed, so the
server connection details do not need to be repeated. A standard GET
will look like this:

GET /index.html HTTP/1.1

However, it's different when a client connects to a proxy, because the 
socket no longer connects directly to the server, but to the proxy 
instead. The proxy still needs to know to which server it should send 
the request. So the correct format for sending requests to a proxy is 
to use the "absoluteURI" form, which includes the server details, e.g.

GET http://www.python.org:80/index.html HTTP/1.1

Any proxy that receives such a request now knows that the server to 
forward to is "www.python.org:80". It will open a connection to 
www.python.org:80, and send it a GET request for the URI.
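
As a rough sketch of that step, this is how a proxy might pull the 
target server out of an absoluteURI request line. It is written for 
Python 3, where urlsplit lives in urllib.parse (in Python 2 it is in the 
urlparse module). The function name target_from_request_line is made up 
for illustration and is not part of TinyHTTPProxy.

```python
from urllib.parse import urlsplit


def target_from_request_line(request_line):
    """Return (host, port, path) for an absoluteURI request line.

    Hypothetical helper, for illustration only.
    """
    method, uri, version = request_line.split()
    parts = urlsplit(uri)
    if not parts.hostname:
        # A request such as "GET /index.html HTTP/1.1" names no server.
        raise ValueError("no server in request line: %r" % request_line)
    # Fall back to the default HTTP port when the URI gives none.
    port = parts.port if parts.port is not None else 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, port, path


host, port, path = target_from_request_line(
    "GET http://www.python.org:80/index.html HTTP/1.1")
```

The returned path is what the last proxy in the chain would put in the 
request line it sends to the server itself.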

Since you want your proxy to forward to another proxy, i.e. your proxy 
is a client from your external-access-proxy's point of view, you 
should also use the absoluteURI form when making requests from your 
python proxy to your external proxy.
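
To make the rule concrete, here is a small sketch of choosing the 
request URI according to what sits at the other end of the socket. It 
uses Python 3's urllib.parse.urlunparse (urlparse.urlunparse in Python 
2). The function request_uri and its next_hop_is_proxy flag are 
assumptions for illustration only.

```python
from urllib.parse import urlunparse


def request_uri(scm, netloc, path, params, query, next_hop_is_proxy):
    """Build the URI for the request line (hypothetical helper)."""
    if next_hop_is_proxy:
        # A proxy needs the absoluteURI form so it knows where to forward.
        return urlunparse((scm, netloc, path, params, query, ""))
    # Talking to the server itself: leave out scheme and server details.
    return urlunparse(("", "", path, params, query, ""))


to_proxy = request_uri("http", "www.python.org:80", "/index.html",
                       "", "", next_hop_is_proxy=True)
to_server = request_uri("http", "www.python.org:80", "/index.html",
                        "", "", next_hop_is_proxy=False)
```

Every hop in a client + proxy*N + server chain would use the first form, 
except the final hop from the last proxy to the server.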

> It looks like the CONNECT and GET requests are just implemented using
> simple socket commands. (I say simple because there isn't a lot of
> code - I'm not familiar with the actual behaviour of sockets, but it
> doesn't look too complicated).
> 
> What I need to do is rewrite the soc.connect(host_port) line in the
> following example so that it connects *via* my proxy-server. (which it
> doesn't by default).
> 
> I think the current format of host_port is a tuple : (host_domain,
> port_no)
> 
> Below is a summary of the GET command (I've inlined all the method
> calls - this example starts from the do_GET method) :
> 
> soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> soc.connect(host_port)

What is the value of host_port at this point? It *should* be the 
address of your external access proxy, i.e. dav-serv:8080
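
Note that soc.connect() expects a (host, port) tuple rather than a 
single "host:port" string. If the proxy address is held as a string, a 
minimal sketch of the conversion could look like the following. The 
helper name parse_host_port and its default port are assumptions, and 
it does not try to handle IPv6 literals.

```python
def parse_host_port(text, default_port=8080):
    """Split "host:port" into a (host, port) tuple (hypothetical helper)."""
    host, sep, port = text.rpartition(":")
    if not sep:
        # No colon present, so the whole string is the host name.
        return text, default_port
    return host, int(port)


host_port = parse_host_port("dav-serv:8080")
```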

> soc.send("%s %s %s\r\n" % (
>     self.command,
>     urlparse.urlunparse(('', '', path, params, query, '')),
>     self.request_version))

And you're not sending an absoluteURI: this should be amended to 
contain the server details of the server that is finally going to 
service the request. For the python.org example above, this code would be

soc.send("%s %s %s\r\n" % (
     self.command,
     urlparse.urlunparse(('http', 'www.python.org:80', path, params,
                          query, '')),
     self.request_version))

though of course, these values should be made available to you by 
TinyHTTPProxy. Taking a brief look at the code, these values should be 
available through the variables "scm" and "netloc". So your outgoing 
connection code from TinyHTTPProxy should look something like this:

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
soc.connect( ('dav-serv', 8080) )
soc.send("%s %s %s\r\n" % (
     self.command,
     urlparse.urlunparse((scm, netloc, path, params, query, '')),
     self.request_version))
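
For anyone trying this on Python 3, the same idea might be sketched as 
below. Sockets there carry bytes rather than str, and sendall() is used 
because send() may transmit only part of the data. The names 
build_request_line and open_via_proxy are invented for this sketch and 
are not TinyHTTPProxy functions. The proxy address is the one from the 
question.

```python
import socket
from urllib.parse import urlunparse

PROXY_ADDR = ("dav-serv", 8080)  # external access proxy from the question


def build_request_line(command, scm, netloc, path, params, query, version):
    """Return the absoluteURI request line as bytes (hypothetical helper)."""
    uri = urlunparse((scm, netloc, path, params, query, ""))
    return ("%s %s %s\r\n" % (command, uri, version)).encode("ascii")


def open_via_proxy(request_line, proxy_addr=PROXY_ADDR, timeout=10.0):
    """Connect to the proxy and send the request line.

    The caller still has to send the headers and the blank line that
    ends them, then read the response from the returned socket.
    """
    soc = socket.create_connection(proxy_addr, timeout=timeout)
    try:
        soc.sendall(request_line)
    except OSError:
        soc.close()
        raise
    return soc


line = build_request_line("GET", "http", "www.python.org:80",
                          "/index.html", "", "", "HTTP/1.1")
```

Calling open_via_proxy(line) only succeeds on a network where dav-serv 
resolves, so it is left out of the example run here.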

HTH,

-- 
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/contact/alan



More information about the Python-list mailing list