[Python-Dev] Re: Patched Transport
Bill Bumgarner
bbum@codefab.com
Fri, 6 Dec 2002 11:04:31 -0500
On Friday, December 6, 2002, at 04:21 AM, Fredrik Lundh wrote:
> hi bill,
>
>
>> As I move to a Transport subclass that uses urllib2, I first created a
>> Transport subclass whose request() method is patched to delegate all
>> interaction with the connection to methods on the Transport class.
>
> looks fine.
>
> just one nit: changing the make_connection signature may break
> existing programs. how about adding yet another hook:
>
>     h = self.make_connection(host)
>     if verbose:
>         self.set_debuglevel(h, verbose)
>
>     ...
>
> def set_debuglevel(self, connection, verbosity):
>     ...
>
> cheers /F
I made the change as recommended -- I was being lazy...
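
The hook approach can be sketched in isolation. This is only an illustration, not the xmlrpclib source: BaseTransport, LegacyTransport, and the dict standing in for a connection are hypothetical names. It shows why a separate set_debuglevel() hook keeps working for subclasses that override make_connection(host) with the old one-argument signature.

```python
class BaseTransport(object):
    """Hypothetical stand-in for a transport base class."""

    def request(self, host, verbose=0):
        # make_connection keeps its original one-argument signature.
        h = self.make_connection(host)
        if verbose:
            # Verbosity travels through a separate hook instead.
            self.set_debuglevel(h, verbose)
        return h

    def make_connection(self, host):
        # A dict stands in for a real connection object.
        return {"host": host, "debug": 0}

    def set_debuglevel(self, connection, verbosity):
        connection["debug"] = verbosity


class LegacyTransport(BaseTransport):
    """Subclass written against the old make_connection(host) signature."""

    def make_connection(self, host):
        connection = BaseTransport.make_connection(self, host)
        connection["legacy"] = True
        return connection


conn = LegacyTransport().request("example.com", verbose=1)
```

Had verbose been added as a second parameter to make_connection(), the override in LegacyTransport would have raised a TypeError as soon as the base class passed the extra argument.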
XMLRPC proxying now works and, along with it, so does ProxyHandler
ordering. The fix I have does not modify urllib2 -- for my current
needs, I'm trying to stick with stock python as much as possible. As
such, I created a _fixUpHandlers() that will fix the handler order of
any Opener instance -- likely, it should be eliminated and a similar
fix integrated into add_handler() on Opener directly.
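
The reordering itself amounts to a stable partition of the handler list. A minimal sketch, assuming the only goal is "proxy handlers first, everything else in its existing order"; proxy_handlers_first is a hypothetical helper name, and the import fallback is only there so the sketch also runs on Pythons where urllib2 lives in urllib.request:

```python
try:
    import urllib2 as urlreq          # Python 2, as used in this post
except ImportError:
    import urllib.request as urlreq   # later home of the same classes


def proxy_handlers_first(handlers):
    """Return a new list with ProxyHandler instances moved to the front.

    The relative order inside each group is preserved.
    """
    proxies = [h for h in handlers if isinstance(h, urlreq.ProxyHandler)]
    others = [h for h in handlers if not isinstance(h, urlreq.ProxyHandler)]
    return proxies + others


# An empty proxy mapping avoids reading proxy settings from the environment.
handlers = [urlreq.HTTPHandler(),
            urlreq.ProxyHandler({}),
            urlreq.HTTPDefaultErrorHandler()]
ordered = proxy_handlers_first(handlers)
```

Unlike _fixHandlerArrayOrder() in the attached module, this returns a new list rather than mutating in place; a fix inside add_handler() would more likely do the insertion at the right position directly.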
I will test to make sure it works with a mix of HTTPS and with proxy
servers that require authentication [I'm currently testing without a
network connection on my laptop while on the train -- thankfully, OS
X's Apache configuration includes mod_proxy... just have to turn it
on].
Caveats are noted in the code. In particular, urllib2 really
wants the full URL so that it can do protocol resolution internally.
I also ended up with a dummy class (_TransportConnection) that can
cache-and-carry the hunks of information passed to the various send_*
and get_* methods as the granularity of those methods simply does not
match the calling semantics of the urllib2 library.
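
The cache-and-carry idea can be shown without any networking. A rough sketch under stated assumptions: Carrier, SketchTransport, and finish() are hypothetical names, and the URL is a placeholder. Each fine-grained hook stashes its piece on a plain object, and a single call at the end consumes everything at once, which is the shape a one-shot API like urllib2's expects.

```python
class Carrier(object):
    """Empty holder; the hooks attach attributes to it as they run."""
    pass


class SketchTransport(object):
    def __init__(self, uri):
        self.uri = uri

    def make_connection(self, host):
        # Nothing is opened yet; the carrier just collects state.
        return Carrier()

    def send_request(self, connection, handler, request_body):
        connection.url = self.uri
        connection.body = request_body
        connection.headers = {}

    def send_host(self, connection, host):
        connection.headers["Host"] = host

    def send_content(self, connection, request_body):
        connection.headers["Content-Type"] = "text/xml"

    def finish(self, connection):
        # Everything gathered so far is handed over in one step.
        return connection.url, connection.body, connection.headers


t = SketchTransport("http://example.com/RPC2")
c = t.make_connection("example.com")
t.send_request(c, "/RPC2", "<methodCall/>")
t.send_host(c, "example.com")
t.send_content(c, "<methodCall/>")
result = t.finish(c)
```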
To initialize a ServerProxy with HTTPTransport, I use the following
function. Note that the arguments passed in are identical to those of the
initializer for ServerProxy. proxyUrl, proxyUser, and proxyPass are
currently stored as module variables -- this will change in the very
near future, but is largely irrelevant in the context of xmlrpclib /
urllib2. More relevant: I believe that proxyUrl could be of the form
http://<user>:<pass>@host:port/path/ and it would be no different than
creating an HTTPTransport by passing in the user/pass separately.
def serverProxyForUrl(uri, transport=None, encoding=None, verbose=0):
    if not transport:
        transport = HTTPTransport.HTTPTransport(uri, proxyUrl,
                                                proxyUser, proxyPass)
    return xmlrpclib.ServerProxy(uri, transport, encoding, verbose)
(4 space indent, not tabs-- I believe this is the recommended standard
for the python library?)
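
The equivalence between a proxyUrl with embedded credentials and separately passed user/pass comes down to how the proxy URL is assembled. A minimal sketch, assuming plain credentials with no characters that need percent-encoding; proxy_url_with_credentials is a hypothetical helper, proxy.example.com is a placeholder host, and it uses urlsplit/urlunsplit rather than the splittype/splithost calls in the attached module:

```python
try:
    from urlparse import urlsplit, urlunsplit      # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit  # later Pythons


def proxy_url_with_credentials(proxy_url, user=None, password=None):
    """Fold user/password into proxy_url as scheme://user:pass@host:port/path."""
    if not user:
        return proxy_url
    parts = urlsplit(proxy_url)
    if password:
        userinfo = "%s:%s" % (user, password)
    else:
        userinfo = user
    # Drop any userinfo already present so it is not duplicated.
    hostport = parts.netloc.rsplit("@", 1)[-1]
    netloc = "%s@%s" % (userinfo, hostport)
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


combined = proxy_url_with_credentials(
    "http://proxy.example.com:8080/", "bob", "secret")
```

Passing the result of this helper as proxyUrl with no proxyUser should then hand ProxyHandler the same URL as passing the three values separately.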
In operation, HTTPTransport could completely replace the Transport and
SafeTransport classes found within xmlrpclib. However, the one
incompatible change is in the way the transport class is instantiated.
The __init__ method requires a single argument: the full URI of the
XML-RPC server. ServerProxy's __init__() could easily be modified to
address this issue.
Next, I'll probably add compressed content support -- in my app, I can
reduce the bytecount sent between client and server by upwards of 90%
by simply gzip'ing the XML prior to dropping it on the wire. I haven't
even remotely looked at how to do this. The actual
compression/decompression is trivial, but there are a lot of possible
spots to stick the compression/decompression mechanism.
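
For the compression step alone, something like the following would do. This is a sketch of the byte-level round trip only, not a decision about where it belongs in the transport; gzip_bytes, gunzip_bytes, and the demo.echo payload are made up for illustration, and a real client would also need the server to agree on the encoding (for example through a Content-Encoding header).

```python
import gzip
import io


def gzip_bytes(data):
    """Compress a byte string into gzip format in memory."""
    buf = io.BytesIO()
    f = gzip.GzipFile(fileobj=buf, mode="wb")
    try:
        f.write(data)
    finally:
        f.close()
    return buf.getvalue()


def gunzip_bytes(data):
    """Inverse of gzip_bytes."""
    f = gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb")
    try:
        return f.read()
    finally:
        f.close()


# Repetitive XML, as XML-RPC payloads tend to be, compresses well.
payload = (b"<?xml version='1.0'?><methodCall>"
           b"<methodName>demo.echo</methodName><params>"
           + b"<param><value><string>hello</string></value></param>" * 200
           + b"</params></methodCall>")
packed = gzip_bytes(payload)
restored = gunzip_bytes(packed)
```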
b.bum
---
"""
HTTPTransport provides a urllib2-based communications channel for use
in xmlrpclib.
Created by Bill Bumgarner <bbum@mac.com>.
Using HTTPTransport allows the XML-RPC library to take full advantage
of the features provided by urllib2, including HTTP proxy support
and a number of different authentication schemes. If urllib2 does not
provide a handler capable of meeting the developer's need, the
developer can create a custom Handler without requiring any changes to
the XML-RPC code.
Example usage:

    def serverProxyForUrl(uri, transport=None, encoding=None, verbose=0):
        if not transport:
            transport = HTTPTransport.HTTPTransport(uri, proxyUrl,
                                                    proxyUser, proxyPass)
        return xmlrpclib.ServerProxy(uri, transport, encoding, verbose)
"""
import xmlrpclib
from xmlrpclib import ProtocolError
from urllib import splittype, splithost
import urllib2
import sys
class _TransportConnection:
    pass

def _fixHandlerArrayOrder(handlers):
    insertionPoint = 0
    for handlerIndex in range(0, len(handlers)):
        aHandler = handlers[handlerIndex]
        if isinstance(aHandler, urllib2.ProxyHandler):
            del handlers[handlerIndex]
            handlers.insert(insertionPoint, aHandler)
            insertionPoint = insertionPoint + 1

def _fixUpHandlers(anOpener):
    ### Moves proxy handlers to the front of the handlers in anOpener
    #
    # This function preserves the order of multiple proxyhandlers, if present.
    # This appears to be wasted effort in that build_opener() chokes if there
    # is more than one instance of any given handler class in the arglist.
    _fixHandlerArrayOrder(anOpener.handlers)
    map(lambda x: _fixHandlerArrayOrder(x), anOpener.handle_open.values())
class HTTPTransport(xmlrpclib.Transport):
    """Handles an HTTP transaction to an XML-RPC server using urllib2
    [eventually]."""

    def __init__(self, uri, proxyUrl=None, proxyUser=None,
                 proxyPass=None):
        ### this is kind of nasty.  We need the full URI for the
        # host/handler we are connecting to to properly use urllib2 to
        # make the request.  This does not mesh completely cleanly
        # with xmlrpclib's initialization of ServerProxy.
        self.uri = uri
        self.proxyUrl = proxyUrl
        self.proxyUser = proxyUser
        self.proxyPass = proxyPass

    def request(self, host, handler, request_body, verbose=0):
        # issue XML-RPC request
        h = self.make_connection(host)
        self.set_verbosity(h, verbose)
        self.send_request(h, handler, request_body)
        self.send_host(h, host)
        self.send_user_agent(h)
        self.send_content(h, request_body)

        errcode, errmsg, headers = self.get_reply(h)

        if errcode != 200:
            raise ProtocolError(
                host + handler,
                errcode, errmsg,
                headers
                )

        self.verbose = verbose

        return self.parse_response(self.get_file(h))

    def make_connection(self, host, verbose=0):
        return _TransportConnection()

    def set_verbosity(self, connection, verbose):
        connection.verbose = verbose

    def send_request(self, connection, handler, request_body):
        connection.request = urllib2.Request(self.uri, request_body)

    def send_host(self, connection, host):
        connection.request.add_header("Host", host)

    def send_user_agent(self, connection):
        # There is no way to override the 'user-agent' sent by the
        # UrlOpener.  This will cause a second User-agent header to be
        # sent.  This is both different from the urllib2 documentation
        # of add_header() and would seem to be a bug.
        #
        # connection.request.add_header("User-agent", self.user_agent)
        pass

    def send_content(self, connection, request_body):
        connection.request.add_header("Content-Type", "text/xml")

    def get_reply(self, connection):
        proxyHandler = None
        if self.proxyUrl:
            if self.proxyUser:
                type, rest = splittype(self.proxyUrl)
                host, rest = splithost(rest)
                if self.proxyPass:
                    user = "%s:%s" % (self.proxyUser, self.proxyPass)
                else:
                    user = self.proxyUser
                uri = "%s://%s@%s%s" % (type, user, host, rest)
            else:
                uri = self.proxyUrl
            proxies = {'http':uri, 'https':uri}
            proxyHandler = urllib2.ProxyHandler(proxies)

        opener = urllib2.build_opener(proxyHandler)
        _fixUpHandlers(opener)
        try:
            connection.response = opener.open(connection.request)
        except urllib2.HTTPError, c:
            return c.code, c.msg, c.headers

        return 200, "OK", connection.response.headers

    def get_file(self, connection):
        return connection.response