zlib, gzip and HTTP compression.

Fri Jan 11 14:38:29 EST 2002

jj <jj at void.si> wrote in message news:<svct3uck4rce72mdl7drndt5v2a3965k45 at 4ax.com>...
> Don't use zlib.
> HTTP requires data in gzip format; zlib.compress returns quite a
> different structure.
> So import gzip ...

Thanks JJ.

OK, I understand the problem better now. I read RFC 1952, which
explains the structure of gzip files. And looking at
python/Lib/gzip.py, it appears to construct exactly the structure
required.
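
To make the difference between the two formats concrete, here is a
minimal sketch that compresses the same bytes both ways and looks at
the framing. It is written for a current Python 3 interpreter rather
than the 2.1 used in the script further down, and the sample input and
the helper name gzip_bytes are made up for illustration.

```python
import gzip
import io
import struct
import zlib


def gzip_bytes(data, level=9):
    """Compress data into a single in-memory gzip member."""
    out = io.BytesIO()
    zfile = gzip.GzipFile(fileobj=out, mode="wb", compresslevel=level)
    zfile.write(data)
    zfile.close()  # flushes the deflate stream and writes the trailer
    return out.getvalue()


sample = b"<html><body><p>hello</p></body></html>\n" * 20  # made-up input

gz = gzip_bytes(sample)
zl = zlib.compress(sample, 9)

# gzip member: magic bytes 0x1f 0x8b, then compression method 8 (deflate).
print("gzip starts with:", gz[:3])
# zlib stream: a different two-byte header, with no gzip magic.
print("zlib starts with:", zl[:2])

# gzip trailer: CRC-32 of the uncompressed data, then its length
# modulo 2**32, both stored little-endian.
crc, isize = struct.unpack("<II", gz[-8:])
print("trailer matches input:",
      crc == (zlib.crc32(sample) & 0xFFFFFFFF) and isize == len(sample))
```

Both outputs wrap a deflate stream, but a client expecting gzip looks
for the gzip header and trailer, which zlib.compress does not produce.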

So I reworked my code to use gzip, and I'm almost there. The first
~200 bytes of the HTML file now appear exactly as they should, but
the output is corrupted after that. The mechanism for telling the
client that I am sending gzipped data is obviously working, but it
looks like either the gzipped data I send is slightly corrupt, or I'm
telling the client the wrong length, or some such. Close, but no
cigar!

There is obviously some small detail that I am missing, such as
character translation during the print statement(?), one extra byte
needed somewhere, etc.
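
One way to test the character translation guess offline is sketched
below. It assumes (this is a hypothesis, not a confirmed diagnosis)
that stdout is a text-mode stream that writes every LF byte as CR LF,
and that the client reads exactly Content-Length bytes. It is written
for a current Python 3 interpreter, and the sample page and helper
names are made up for illustration.

```python
import gzip
import io
import zlib


def gzip_bytes(data, level=9):
    """Compress data into a single in-memory gzip member."""
    out = io.BytesIO()
    # mtime=0 keeps the gzip header identical from run to run
    zfile = gzip.GzipFile(fileobj=out, mode="wb", compresslevel=level, mtime=0)
    zfile.write(data)
    zfile.close()
    return out.getvalue()


def translate_newlines(data):
    """Mimic a text-mode stream that writes each LF byte as CR LF."""
    return data.replace(b"\n", b"\r\n")


def try_gunzip(data):
    """Return the decompressed bytes, or None if the stream is rejected."""
    try:
        return gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return None


sample = b"<html><body><p>hello</p></body></html>\n" * 20  # made-up page
gz = gzip_bytes(sample)

lf_count = gz.count(b"\n")          # LF bytes inside the compressed data
on_wire = translate_newlines(gz)    # what a text-mode stream would emit
received = on_wire[:len(gz)]        # client stops at the declared length

print("declared Content-Length:", len(gz))
print("bytes actually written: ", len(on_wire))
print("offset of first LF byte (-1 if none):", gz.find(b"\n"))
print("client can decompress:", try_gunzip(received) == sample)
```

Any LF byte in the compressed buffer would shift everything after it
by one byte, so the declared length and the bytes written disagree
from that offset on. Separately, print appends its own newline after
the buffer, which is one more byte than Content-Length announces.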

Any hints anyone?

The new code is presented below.

TIA,

Alan.

------------------------------------------
#! C:/python21/python.exe

import string
import os
import gzip
import StringIO

# Use any old HTML file (which displays fine standalone)

f = open("test.html")
buf = f.read()
f.close()

def compressBuf(buf):
	zbuf = StringIO.StringIO()
	zfile = gzip.GzipFile(None, 'wb', 9, zbuf)
	zfile.write(buf)
	zfile.close()
	return zbuf.getvalue()

acceptsGzip = 0
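# The header may be absent (e.g. when run outside a web server), so
# fall back to an empty string instead of raising KeyError.
if string.find(os.environ.get("HTTP_ACCEPT_ENCODING", ""), "gzip") != -1: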
	acceptsGzip = 1
	zbuf = compressBuf(buf)

print "Content-type: text/html"
if acceptsGzip:
	print "Content-Encoding: gzip"
	print "Content-Length: %d" % (len(zbuf))
	print                                     # end of headers
	print zbuf                                # and then the buffer
else:
	print                                     # end of headers
	print buf                                 # and then the buffer

# end of script
------------------------------------------------------
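
If the translation guess is right, a variant that builds the whole
response as bytes and writes it through a binary stream should avoid
both the newline translation and the extra trailing newline. The
sketch below is written for a current Python 3 interpreter, not the
2.1 used above, and is untested against a real server; the names
gzip_bytes and build_response are hypothetical.

```python
import gzip
import io


def gzip_bytes(data, level=9):
    """Compress data into a single in-memory gzip member."""
    out = io.BytesIO()
    zfile = gzip.GzipFile(fileobj=out, mode="wb", compresslevel=level)
    zfile.write(data)
    zfile.close()
    return out.getvalue()


def build_response(body, accept_encoding=""):
    """Return headers plus body as one bytes object for a CGI response."""
    headers = [b"Content-Type: text/html"]
    # Same simple substring test as the script above; q-values are ignored.
    if "gzip" in accept_encoding:
        body = gzip_bytes(body)
        headers.append(b"Content-Encoding: gzip")
    # Content-Length counts the bytes actually sent after the blank line.
    headers.append(b"Content-Length: %d" % len(body))
    return b"\r\n".join(headers) + b"\r\n\r\n" + body


html = b"<html><body><p>hello</p></body></html>\n" * 20  # made-up page
response = build_response(html, accept_encoding="gzip, deflate")

# In a real CGI script the bytes would go to the binary layer of stdout,
# e.g. sys.stdout.buffer.write(response); a BytesIO stands in for it here.
out = io.BytesIO()
out.write(response)

head, _, payload = out.getvalue().partition(b"\r\n\r\n")
print(head.decode("ascii"))
print("payload round-trips:", gzip.decompress(payload) == html)
```

The point of the design is that nothing between the gzip buffer and
the socket gets a chance to reinterpret bytes as text, and the length
header is computed from the exact bytes written.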