Gzipped Response with web.client.Agent
Hello, I've put together examples from the web (see below) to replace getPage with web.client.Agent, and it works fine. Now I'd like to get a gzipped response, but I can't get gunzipping to work before returning the result. Thanks for help!

class StringGzipReceiver(Protocol):
    def __init__(self):
        self.string = None
        self.deferred = defer.Deferred()

    def dataReceived(self, bytes):
        print "dataReceived"
        print type(bytes)
        if self.string:
            self.string += bytes
        else:
            self.string = bytes

    def connectionLost(self, reason):
        if reason.check(ResponseDone) or reason.check(PotentialDataLoss):
            gzipper = gzip.GzipFile(fileobj=self.string)
            gz = gzipper.read()
            result = unicode(gz, 'UTF-8')
            self.deferred.callback(result)
        else:
            self.deferred.errback(reason)

class StringReceiver(Protocol):
    def __init__(self):
        self.string_io = codecs.getwriter('utf_8')(StringIO())
        self.deferred = defer.Deferred()

    def dataReceived(self, bytes):
        self.string_io.write(bytes)

    def connectionLost(self, reason):
        if reason.check(ResponseDone) or reason.check(PotentialDataLoss):
            self.deferred.callback(self.string_io.getvalue())
        else:
            self.deferred.errback(reason)

class StringProducer(object):
    implements(IBodyProducer)

    def __init__(self, body):
        self.body = body
        self.length = len(body)

    def startProducing(self, consumer):
        consumer.write(self.body)
        return succeed(None)

    def pauseProducing(self):
        pass

    def stopProducing(self):
        pass

def SearchHotelsByID():
    host = 'demo.com'
    postdata = 'some data'
    headers = {
        'Host' : [host],
        'Accept-Encoding' : ['gzip']
    }

    def cbRequest(response):
        stringReceiver = StringGzipReceiver()
        response.deliverBody(stringReceiver)
        return stringReceiver.deferred

    def _noPage(failure):
        print "Error: %s" % failure.getErrorMessage()
        print failure.getTraceback()
        return failure

    agent = Agent(reactor)
    d = agent.request(
        'POST',
        url,
        headers=Headers(headers),
        bodyProducer=StringProducer(postdata)
    )
    d.addCallback(cbRequest)
    d.addErrback(_noPage)
    d.addBoth(finish)
    return d
On 9 Aug, 11:34 am, sergei.vokdin@yandex.ru wrote:
> Hello,
> I've put together examples from the web (see below) to replace getPage with web.client.Agent, and it works fine. Now I'd like to get a gzipped response, but I can't get gunzipping to work before returning the result.
> Thanks for help!
What are you having trouble with? Your code looks okay, more or less. If I were writing it, I'd try to make the gzip support more transparent, but the way you've done it seems like it should probably work.

A few simple things I notice that could cause problems, but won't necessarily...

* Using repeated string concatenation to buffer the response is going to be extremely slow for responses of any significant size.
* In general, there's no guarantee you'll be able to decode the un-gzipped bytes using utf-8. You could easily have downloaded a gzipped TIFF image.
* Similarly, there's no guarantee that the un-gzipping will succeed if you got a truncated response (represented by the PotentialDataLoss failure).
* The server might have sent back un-gzipped contents. You have to check one of the response headers to see if it's appropriate to do the decompression.

Jean-Paul
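The four points above can be sketched with the standard library alone. This is only an illustration, written in Python 3 rather than the Python 2 of the original post. `BodyBuffer`, `feed`, and `finish` are hypothetical names, not part of Twisted, and the sketch assumes the header to check is Content-Encoding. The idea is to call `feed` from `dataReceived` and `finish` from `connectionLost`.

```python
import gzip
import zlib


class BodyBuffer:
    """Hypothetical helper: collects body chunks, gunzipping only if told to."""

    def __init__(self, content_encoding=None):
        # Decompress only when the server says the body is gzipped.
        encoding = (content_encoding or "").strip().lower()
        self._inflater = None
        if encoding == "gzip":
            # 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
            self._inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
        # Chunks are kept in a list and joined once, instead of using +=.
        self._chunks = []

    def feed(self, data):
        """Accept one chunk of raw bytes as it arrives."""
        if self._inflater is not None:
            data = self._inflater.decompress(data)
        self._chunks.append(data)

    def finish(self):
        """Return the whole body as bytes; the caller decides how to decode."""
        if self._inflater is not None and not self._inflater.eof:
            # The gzip stream never reached its end marker: truncated body.
            raise EOFError("gzip stream ended early")
        return b"".join(self._chunks)


# Usage: feed a gzipped payload in two pieces, then decode explicitly.
payload = gzip.compress("hello".encode("utf-8"))
buf = BodyBuffer("gzip")
buf.feed(payload[:10])
buf.feed(payload[10:])
text = buf.finish().decode("utf-8")
```

Returning bytes from `finish` leaves the choice of character set to the caller, so a gzipped image is not forced through a UTF-8 decode.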
_______________________________________________
Twisted-web mailing list
Twisted-web@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
10.08.10, 16:13, exarkun@twistedmatrix.com:
> On 9 Aug, 11:34 am, sergei.vokdin@yandex.ru wrote:
>> Hello,
>> I've put together examples from the web (see below) to replace getPage with web.client.Agent, and it works fine. Now I'd like to get a gzipped response, but I can't get gunzipping to work before returning the result.
>> Thanks for help!
> What are you having trouble with? Your code looks okay, more or less. If I were writing it, I'd try to make the gzip support more transparent, but the way you've done it seems like it should probably work.
Hi, the returned string has to be wrapped in a file object (e.g. StringIO) before it is passed to the gzip module; that was the problem.
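A minimal sketch of that fix, assuming Python 3, where the in-memory file object for bytes is `io.BytesIO` (the Python 2 code in this thread would use `StringIO` the same way). The function name `gunzip_bytes` and the sample payload are made up for illustration.

```python
import gzip
import io


def gunzip_bytes(raw):
    """Decompress a complete gzip payload held in memory as bytes."""
    # GzipFile's fileobj must be a file-like object, not a plain string,
    # so the buffered bytes are wrapped in BytesIO first.
    with gzip.GzipFile(fileobj=io.BytesIO(raw)) as gz:
        return gz.read()


# Usage: round-trip a small payload.
compressed = gzip.compress(b"<hotels/>")
body = gunzip_bytes(compressed)
```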
> A few simple things I notice that could cause problems, but won't necessarily...
> * Using repeated string concatenation to buffer the response is going to be extremely slow for responses of any significant size.
Nice catch, I forget this over and over again.
> * In general, there's no guarantee you'll be able to decode the un-gzipped bytes using utf-8. You could easily have downloaded a gzipped TIFF image.
> * Similarly, there's no guarantee that the un-gzipping will succeed if you got a truncated response (represented by the PotentialDataLoss failure).
> * The server might have sent back un-gzipped contents. You have to check one of the response headers to see if it's appropriate to do the decompression.
Thanks for the hints. That was the next step in the implementation. One issue I still have is how to set or implement a timeout, so that if the remote host does not answer within some period of time, the request is cancelled. Thanks a lot!
participants (2)
- exarkun@twistedmatrix.com
- Vokdin Sergei