gzip HTTP results problem

John J. Lee jjl at pobox.com
Tue Jul 29 17:39:36 EDT 2003


"Fredrik Lundh" <fredrik at pythonware.com> writes:

> Bill Loren wrote:
> 
> > I've encountered a problem trying to decode gzip data
> > returned from an HTTP server I communicate with (I use urllib2).
> > I've tried both the gzip and zlib libraries that come with Python,
> > but alas, no luck.
> >
> > Has any of you succeeded in talking gzip with an HTTP
> > server?
> 
> this might help:
> 
>     http://effbot.org/zone/consumer-gzip.htm
> 
> (that piece of code is used in production code, so it should
> work...)

That would go nicely with my unreleased latest version of
ClientCookie, which is plug-compatible with urllib2 but makes it
easier to add new functionality like this (I submitted a patch to
the Python library based on this a while back, and am hoping for
comments).

The idea is that you pass processor objects to build_opener just as if
they were handler objects.  Processors pre-process requests and
post-process responses.  This stops you having to subclass things like
AbstractHTTPHandler.  Ask if you want a copy.
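
Roughly, the flow is: every processor sees the request before it goes
out, and every processor sees the response on the way back.  Here is a
toy model of that, with no network access.  All the names in it
(ToyOpener, UpperCaseProcessor, fake_fetch, the dict-based request and
response) are made up for illustration and are not ClientCookie's
actual classes or internals.

```python
# Toy model of the processor idea; every name here is hypothetical.

class UpperCaseProcessor:
    def http_request(self, request):
        # pre-process: add a header before the request is "sent"
        request["headers"]["Accept-Encoding"] = "gzip"
        return request

    def http_response(self, request, response):
        # post-process: hand back a modified (or wrapped) response
        response["body"] = response["body"].upper()
        return response


class ToyOpener:
    def __init__(self, fetch, processors):
        self.fetch = fetch              # callable standing in for the network
        self.processors = processors

    def open(self, url):
        request = {"url": url, "headers": {}}
        for p in self.processors:
            request = p.http_request(request)
        response = self.fetch(request)
        for p in self.processors:
            response = p.http_response(request, response)
        return response


def fake_fetch(request):
    # echo the headers back so the pre-processing step is visible
    return {"body": "hello", "seen_headers": dict(request["headers"])}


opener = ToyOpener(fake_fetch, [UpperCaseProcessor()])
result = opener.open("http://www.example.com/")
```

The point of the design is that the opener needs no knowledge of what
any processor does, so adding gzip support means adding one object
rather than subclassing a handler.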

(code below is completely untested, unworking, purely illustrative!)

import ClientCookie
from GzipConsumer import GzipConsumer
from cStringIO import StringIO

class stupid_gzip_consumer:
    # collects the decoded chunks that GzipConsumer feeds it
    def __init__(self): self.data = []
    def feed(self, data): self.data.append(data)
    def close(self): pass

class stupid_gzip_wrapper:
    def __init__(self, response):
        self._response = response

        c = stupid_gzip_consumer()
        gzc = GzipConsumer(c)
        gzc.feed(response.read())
        gzc.close()  # flush whatever the decoder is still holding
        self.__data = StringIO("".join(c.data))

    def __getattr__(self, key):
        # delegate unknown methods/attributes
        if key in ("read", "readline", "readlines"):
            return getattr(self.__data, key)
        else:
            return getattr(self._response, key)

class HTTPGzipProcessor(ClientCookie.BaseProcessor):
    def http_response(self, request, response):
        # post-process response
        enc = response.info().get("content-encoding", "").lower()
        # only gzip: "compress" is LZW, which GzipConsumer can't decode
        if "gzip" in enc:
            return stupid_gzip_wrapper(response)
        else:
            return response

    https_response = http_response


opener = ClientCookie.build_opener(HTTPGzipProcessor)
ClientCookie.install_opener(opener)

response = ClientCookie.urlopen("http://www.example.com/")
print response.read()
response.close()
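
If you only need the decoding step and already have the whole body in
memory, the standard zlib module may be enough on its own.  The sketch
below assumes a reasonably recent Python and zlib, where passing
16 + zlib.MAX_WBITS tells the decoder to expect a gzip header and
trailer.  gunzip_body is a made-up helper name, and the compressed
data is built locally to stand in for an HTTP response body.

```python
import gzip
import io
import zlib


def gunzip_body(data):
    # wbits = 16 + MAX_WBITS makes zlib expect the gzip wrapper
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return decoder.decompress(data) + decoder.flush()


# Build a gzip-compressed body in memory instead of fetching one.
original = b"hello, gzip\n" * 3
buf = io.BytesIO()
f = gzip.GzipFile(fileobj=buf, mode="wb")
f.write(original)
f.close()
compressed = buf.getvalue()

decoded = gunzip_body(compressed)
```

Unlike the consumer approach, this does not decode incrementally, so
it suits small responses better than large streamed ones.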


John



