New subject: [Twisted-Python] Asynchronous gzipped content decompression: best approach

30 Jul 2010

Hi,

I have written a small utility function to replace
"twisted.web.client.getPage", to be able to read the response header.

I have to say that the ever-improving documentation made it quite easy for
me to do it using the new twisted.web.client.Agent, so well done to all!

Since my wrapper works quite well, I decided to add gzip response support,
as it's another feature lacking from the original getPage. Again, it was
quite simple and it looks like it works well, at least in a proof-of-concept
scenario.

Then came my dilemma. What I'm doing now is a synchronous decompression,
as shown below:

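import gzip
import StringIO

def gunzip(inzip):  # function name is illustrative
    # Blocking: inflates the whole body in a single call.
    compressedstream = StringIO.StringIO(inzip)
    gzipper = gzip.GzipFile(fileobj=compressedstream)
    return gzipper.read()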

This works quite well, but I wanted to add support for arbitrarily large
compressed responses, and I wanted to ask your opinion on the best approach
for this:

-a separate thread? It has its limits, as it doesn't scale well at all, but
in the likely scenario of getPage-style usage that shouldn't be a big issue
(i.e. not many concurrent calls)

-a Producer/consumer? That sounded like the modern Twisted way of doing it,
but I didn't manage to implement it properly, as I couldn't create a
proper "consumer" class by looking at the example in the documentation...

-twisted.python.zipstream.DeflatedZipFileEntry?
I found this and it seemed a potential way of doing it, maybe with the use
of inline generators?
But then I thought: is this too complex an approach for a simple problem?
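
For reference, the incremental idea behind the producer/consumer option can
be sketched with the standard library alone. This is only a minimal sketch,
not a tested Twisted integration: the names IncrementalGunzip, feed, finish
and make_gzip_body are hypothetical, and the 64-byte slices merely stand in
for the chunks a body protocol would receive in dataReceived.

```python
import gzip
import io
import zlib


class IncrementalGunzip(object):
    """Hypothetical helper: inflates a gzip stream one chunk at a time."""

    def __init__(self):
        # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        self._inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)

    def feed(self, chunk):
        # Return whatever output can be produced from the bytes seen so far.
        return self._inflater.decompress(chunk)

    def finish(self):
        # Return any output still buffered inside zlib.
        return self._inflater.flush()


def make_gzip_body(payload):
    # Hypothetical test helper: builds a gzip-compressed body in memory.
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as f:
        f.write(payload)
    return buf.getvalue()


original = b"hello twisted " * 1000
compressed = make_gzip_body(original)

# Simulate the body arriving in small pieces, as it would over the network.
gunzip = IncrementalGunzip()
pieces = []
for start in range(0, len(compressed), 64):
    pieces.append(gunzip.feed(compressed[start:start + 64]))
pieces.append(gunzip.finish())
result = b"".join(pieces)
```

In a real client, feed() would presumably be called from the dataReceived
method of the protocol passed to response.deliverBody(), and finish() from
connectionLost. Each call is still synchronous, but it only handles one
chunk, so the reactor is never blocked for the whole response and the
compressed body never has to sit in memory all at once.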

I guess that decompressing data in twisted should be a fairly common task,
but I have not found a sample that looked like the "best" way for doing it,
so... here is this email

Thanks for your help, and I'll be happy to post the final code for future
reference if anyone is interested.

Michele
