[Python-Dev] Re: Patched Transport

Bill Bumgarner bbum@codefab.com
Fri, 6 Dec 2002 11:04:31 -0500


On Friday, December 6, 2002, at 04:21 AM, Fredrik Lundh wrote:

> hi bill,
>
>
>> As I move to a Transport subclass that uses urllib2, I first created a
>> Transport subclass whose request() method is patched to delegate all
>> interaction with the connection to methods on the Transport class.
>
> looks fine.
>
> just one nit: changing the make_connection signature may break
> existing programs.  how about adding yet another hook:
>
>     h = self.make_connection(host)
>     if verbose:
>         self.set_debuglevel(h, verbose)
>
>     ...
>
>     def set_debuglevel(self, connection, verbosity):
>         ...
>
> cheers /F

I made the change as recommended -- I was being lazy...

XML-RPC proxying now works and, along with it, so does ProxyHandler
ordering.  The fix I have does not modify urllib2 -- for my current
needs, I'm trying to stick with stock python as much as possible.  As
such, I created a _fixUpHandlers() that will fix the handler order of
any opener instance -- likely, it should be eliminated and a similar
fix integrated into add_handler() on OpenerDirector directly.
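
The reordering idea can be sketched without urllib2 at all.  This is only
an illustration, not the code in the attached module: FakeHTTPHandler,
FakeProxyHandler and move_to_front are hypothetical names standing in for
real handlers.  Proxy handlers move to the front, and the relative order
inside each group is kept.

```python
# Stand-in handler classes; hypothetical, used only to illustrate ordering.
class FakeHTTPHandler(object):
    pass


class FakeProxyHandler(object):
    def __init__(self, name):
        self.name = name


def move_to_front(handlers, handler_class):
    """Reorder `handlers` in place so handler_class instances come first.

    Relative order is preserved within both groups (a stable partition).
    """
    front = [h for h in handlers if isinstance(h, handler_class)]
    rest = [h for h in handlers if not isinstance(h, handler_class)]
    handlers[:] = front + rest


handlers = [FakeHTTPHandler(), FakeProxyHandler("a"),
            FakeHTTPHandler(), FakeProxyHandler("b")]
move_to_front(handlers, FakeProxyHandler)

# Proxy handlers are labelled by name, everything else as "http".
ordered_names = [getattr(h, "name", "http") for h in handlers]
```

Assigning through handlers[:] changes the existing list object, which
matters when something else (an opener, say) holds a reference to it.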

I will test to make sure it works with a mix of HTTPS and with proxy
servers that require authentication [I'm currently testing without a
network connection on my laptop while on the train -- thankfully, OS
X's apache configuration includes mod_proxy... just have to turn it
on].

Caveats are noted in the code.   In particular, using urllib2 really 
wants the full URL so that it can do protocol resolution internally.   
I also ended up with a dummy class (_TransportConnection) that can 
cache-and-carry the hunks of information passed to the various send_* 
and get_* methods as the granularity of those methods simply does not 
match the calling semantics of the urllib2 library.

To initialize a ServerProxy with HTTPTransport, I use the following
function.  Note that the arguments passed in are identical to the
initializer for ServerProxy.   proxyUrl, proxyUser, proxyPass are
currently stored as module variables -- this will change in the very
near future, but is largely irrelevant in the context of xmlrpclib /
urllib2.  More relevant: I believe that proxyUrl could be of the form
http://<user>:<pass>@host:port/path/ and it would be no different than
creating an HTTPTransport by passing in the user/pass separately.

def serverProxyForUrl(uri, transport=None, encoding=None, verbose=0):
     if not transport:
         transport = HTTPTransport.HTTPTransport(uri, proxyUrl,
                                                 proxyUser, proxyPass)
     return xmlrpclib.ServerProxy(uri, transport, encoding, verbose)

(4 space indent, not tabs-- I believe this is the recommended standard 
for the python library?)
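
For the http://<user>:<pass>@host:port/path/ form mentioned above, here
is a rough sketch of how such a URL can be put together from separate
pieces.  proxy_url_with_credentials is a hypothetical helper, not part of
the module or of urllib2, and it assumes the user name and password
contain no characters that would need escaping in a URL.  Whether a given
proxy handler honors embedded credentials is a separate question that
this sketch does not answer.

```python
def proxy_url_with_credentials(proxy_url, user=None, password=None):
    """Return proxy_url with user[:password]@ inserted before the host.

    Hypothetical helper for illustration; performs no URL escaping.
    """
    if not user:
        # Nothing to embed; hand the URL back unchanged.
        return proxy_url
    scheme, sep, rest = proxy_url.partition("://")
    if not sep:
        raise ValueError("expected a URL of the form scheme://host[:port]/path")
    credentials = user
    if password:
        credentials = "%s:%s" % (user, password)
    return "%s://%s@%s" % (scheme, credentials, rest)


# proxy.example.com is a placeholder host.
with_both = proxy_url_with_credentials(
    "http://proxy.example.com:3128/", "bob", "secret")
user_only = proxy_url_with_credentials(
    "http://proxy.example.com:3128/", "bob")
unchanged = proxy_url_with_credentials("http://proxy.example.com:3128/")
```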

In operation, HTTPTransport could completely replace the Transport and
SafeTransport classes found within xmlrpclib.  However, the one
incompatible change is in the way the transport class is instantiated.
The __init__ method requires a single argument: the full URI of the
XML-RPC server.   ServerProxy's __init__() could easily be modified to
address this issue.

Next, I'll probably add compressed content support -- in my app, I can
reduce the bytecount sent between client and server by upwards of 90%
by simply gzip'ing the XML prior to dropping it on the wire.  I haven't
even remotely looked at how to do this.  The actual
compression/decompression is trivial, but there are a lot of possible
spots to stick the compression/decompression mechanism.
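
As a minimal sketch of the "trivial" part only: the round trip below
gzips a request body in memory and restores it.  gzip_bytes, gunzip_bytes
and the sample payload are made up for illustration.  Where this would
hook into the transport, and how client and server would agree on it
(for instance via a Content-Encoding header), is left open, and actual
savings depend entirely on the payload.

```python
import gzip
import io


def gzip_bytes(data):
    """Compress a byte string into gzip format, entirely in memory."""
    buf = io.BytesIO()
    f = gzip.GzipFile(fileobj=buf, mode="wb")
    try:
        f.write(data)
    finally:
        # Closing the GzipFile flushes the gzip trailer into buf.
        f.close()
    return buf.getvalue()


def gunzip_bytes(data):
    """Decompress a gzip-format byte string."""
    f = gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb")
    try:
        return f.read()
    finally:
        f.close()


# Made-up, highly repetitive payload; real XML-RPC bodies will differ.
xml_body = (b"<params>"
            + b"<param><value><int>1</int></value></param>" * 200
            + b"</params>")
compressed = gzip_bytes(xml_body)
restored = gunzip_bytes(compressed)
```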

b.bum

---

"""
HTTPTransport provides a urllib2-based communications channel for use
in xmlrpclib.

Created by Bill Bumgarner <bbum@mac.com>.

Using HTTPTransport allows the XML-RPC library to take full advantage 
of the features provided by urllib2, including HTTP proxy support
and a number of different authentication schemes.  If urllib2 does not 
provide a handler capable of meeting the developer's need, the 
developer can create a custom Handler without requiring any changes to 
the XML-RPC code.

Example usage:

def serverProxyForUrl(uri, transport=None, encoding=None, verbose=0):
     if not transport:
         transport = HTTPTransport.HTTPTransport(uri, proxyUrl,
                                                 proxyUser, proxyPass)
     return xmlrpclib.ServerProxy(uri, transport, encoding, verbose)
"""
import xmlrpclib
from xmlrpclib import ProtocolError
from urllib import splittype, splithost
import urllib2
import sys

class _TransportConnection:
     pass

def _fixHandlerArrayOrder(handlers):
     insertionPoint = 0
     for handlerIndex in range(0, len(handlers)):
         aHandler = handlers[handlerIndex]
         if isinstance(aHandler, urllib2.ProxyHandler):
             del handlers[handlerIndex]
             handlers.insert(insertionPoint, aHandler)
             insertionPoint = insertionPoint + 1


def _fixUpHandlers(anOpener):
     ### Moves proxy handlers to the front of the handlers in anOpener
     #
     # This function preserves the order of multiple proxyhandlers, if
     # present.  This appears to be wasted effort in that build_opener()
     # chokes if there is more than one instance of any given handler
     # class in the arglist.
     _fixHandlerArrayOrder(anOpener.handlers)
     for handlerList in anOpener.handle_open.values():
         _fixHandlerArrayOrder(handlerList)

class HTTPTransport(xmlrpclib.Transport):
     """Handles an HTTP transaction to an XML-RPC server using urllib2
     [eventually]."""
     def __init__(self, uri, proxyUrl=None, proxyUser=None,
                  proxyPass=None):
         ### this is kind of nasty.  We need the full URI for the
         # host/handler we are connecting to to properly use urllib2 to
         # make the request.  This does not mesh completely cleanly
         # with xmlrpclib's initialization of ServerProxy.
         self.uri = uri
         self.proxyUrl = proxyUrl
         self.proxyUser = proxyUser
         self.proxyPass = proxyPass

     def request(self, host, handler, request_body, verbose=0):
         # issue XML-RPC request

         h = self.make_connection(host)
         self.set_verbosity(h, verbose)

         self.send_request(h, handler, request_body)
         self.send_host(h, host)
         self.send_user_agent(h)
         self.send_content(h, request_body)

         errcode, errmsg, headers = self.get_reply(h)

         if errcode != 200:
             raise ProtocolError(
                 host + handler,
                 errcode, errmsg,
                 headers
                 )

         self.verbose = verbose

         return self.parse_response(self.get_file(h))

     def make_connection(self, host, verbose=0):
         return _TransportConnection()

     def set_verbosity(self, connection, verbose):
         connection.verbose = verbose

     def send_request(self, connection, handler, request_body):
         connection.request = urllib2.Request(self.uri, request_body)

     def send_host(self, connection, host):
         connection.request.add_header("Host", host)

     def send_user_agent(self, connection):
         # There is no way to override the 'user-agent' sent by the
         # UrlOpener.  Calling add_header() here will cause a second
         # User-agent header to be sent.  This is both different from
         # the urllib2 documentation of add_header() and would seem to
         # be a bug.
         #
         # connection.request.add_header("User-agent", self.user_agent)
         pass

     def send_content(self, connection, request_body):
         connection.request.add_header("Content-Type", "text/xml")

     def get_reply(self, connection):
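         proxyHandler = None
         if self.proxyUrl:
             if self.proxyUser:
                 # Embed the credentials in the proxy URL:
                 # scheme://user[:pass]@host/path
                 scheme, rest = splittype(self.proxyUrl)
                 host, rest = splithost(rest)

                 if self.proxyPass:
                     user = "%s:%s" % (self.proxyUser, self.proxyPass)
                 else:
                     user = self.proxyUser

                 uri = "%s://%s@%s%s" % (scheme, user, host, rest)
             else:
                 uri = self.proxyUrl
             proxies = {'http':uri, 'https':uri}
             proxyHandler = urllib2.ProxyHandler(proxies)

         # Only hand build_opener() a real handler; passing None would
         # ask it to register None as a handler.
         if proxyHandler is not None:
             opener = urllib2.build_opener(proxyHandler)
         else:
             opener = urllib2.build_opener()
         _fixUpHandlers(opener)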
         try:
             connection.response = opener.open(connection.request)
         except urllib2.HTTPError, c:
             return c.code, c.msg, c.headers

         return 200, "OK", connection.response.headers

     def get_file(self, connection):
         return connection.response