[Twisted-Python] web2: http Content-Length header

Hi all,

I'm using twisted.web2.client.HTTPClientProtocol to implement an HTTP downloader object. To show the download progress, I need the Content-Length header from the server response. However, this header is removed from the response because it is listed in connHeaderNames (twisted/web2/channel/http.py, line 217). Leaving 'content-length' out of this list gives me a valid Content-Length header in the response.

Is it really necessary to remove the Content-Length header from the response?

Regards,
Pieter

On Monday 04 December 2006 22:16, Pieter Grimmerink wrote:
> Is it really necessary to remove the content-length header from the response?

Correction: it turns out it does have to be removed. If it isn't stripped from the response, the download somehow stops after a single block of data has been received.

Since the content length is saved in self.length, the following could be a workaround (and it works fine here). Add the Content-Length header again after calling setConnectionParams, at the bottom of allHeadersReceived (line 195 twisted/web2/channel/http.py):

    self.inHeaders.setHeader('Content-Length', str(self.length))

It is not entirely clear to me what goes wrong in setConnectionParams when the Content-Length header is not (temporarily) removed, so there are probably better solutions to this problem.

Regards,
Pieter
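The strip-then-restore idea in this workaround can be modelled without Twisted. This is only a rough sketch using plain dicts: CONN_HEADER_NAMES is an illustrative subset (the real connHeaderNames list in twisted/web2/channel/http.py may differ), and split_connection_headers and restore_content_length are hypothetical helper names, not web2 APIs.

```python
# Illustrative subset only; the real connHeaderNames list may contain other names.
CONN_HEADER_NAMES = ("content-length", "connection", "transfer-encoding")


def split_connection_headers(raw_headers):
    """Return (length, remaining_headers) for a dict keyed by lower-case names.

    Hypothetical helper: connection-level headers are removed from the copy
    that is handed on, and the parsed length is kept separately.
    """
    remaining = dict(raw_headers)
    conn = {}
    for name in CONN_HEADER_NAMES:
        if name in remaining:
            conn[name] = remaining.pop(name)
    length = int(conn["content-length"]) if "content-length" in conn else None
    return length, remaining


def restore_content_length(headers, length):
    """Re-add Content-Length after connection handling, as in the workaround."""
    restored = dict(headers)
    if length is not None:
        restored["content-length"] = str(length)
    return restored


length, rest = split_connection_headers(
    {"content-length": "228", "content-type": "text/html", "connection": "close"}
)
headers_for_response = restore_content_length(rest, length)
```

The point of the two-step shape is that the connection logic only ever sees the stripped headers, while the code consuming the response gets the length back afterwards.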

On Dec 5, 2006, at 1:55 PM, Pieter Grimmerink wrote:
Why don't you just use self.length when showing the progress?

-David

David Reid
http://dreid.org/

On Wednesday 06 December 2006 01:54, David Reid wrote:
> Why don't you just use self.length when showing the progress?

The length is not passed to the Response object. twisted/web2/client/http.py, line 168:

    self.response = http.Response(self.code, self.inHeaders, self.stream)

This response object is then passed to the responseDefer callback, which means we only have the response code, headers and stream to work with.

Rgds,
Pieter

On Dec 6, 2006, at 4:35 AM, Pieter Grimmerink wrote:
Sorry for not responding before.

The Content-Length is removed from the set of headers and moved to an attribute of the stream. Among other reasons, this is so that if anybody does content transforms before you get the data (e.g. transparent uncompressing), the length and the data are in the same place.

You're supposed to get the length from the stream object. stream.length is an integer if the content length is known, and None if it is not known.

James
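To illustrate the content-transform argument: once a body is transparently decompressed, the Content-Length sent on the wire no longer matches the number of bytes the consumer reads. The sketch below uses only the standard zlib module; wire_and_decoded_lengths is a hypothetical helper and nothing here is web2 code.

```python
import zlib


def wire_and_decoded_lengths(body):
    """Return (length on the wire, length after decoding) for a compressed body."""
    wire_bytes = zlib.compress(body)       # what a Content-Length header would describe
    decoded = zlib.decompress(wire_bytes)  # what the consumer actually reads
    assert decoded == body
    return len(wire_bytes), len(decoded)


wire_length, decoded_length = wire_and_decoded_lengths(b"hello world " * 50)
# The two numbers differ, so a length taken from the headers would be
# wrong for the decoded data; keeping the length next to the data avoids that.
```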

On Wednesday 06 December 2006 17:23, James Y Knight wrote:
But stream.length is always None, even though the Content-Length is specified. Pasted below is an example which demonstrates what I'm doing.

This is the response I'm getting:

    <twisted.web2.http.Response code=302, streamlen=None>

And this is the response when I add the line

    self.stream.length = self.length

in twisted/web2/client/http.py, line 168:

    <twisted.web2.http.Response code=302, streamlen=228>

Rgds,
Pieter

---------------------------------------------------
from twisted.internet import protocol
from twisted.web2 import stream as stream_mod, http, http_headers, responsecode
from twisted.web2.client.http import ClientRequest, HTTPClientProtocol

def testConn(host):
    from twisted.internet import reactor
    d = protocol.ClientCreator(reactor, HTTPClientProtocol).connectTCP(host, 80)

    def gotResp(resp):
        # Called for every block of data read from the response stream.
        def print_(n):
            print "DATA"

        # Called once the whole stream has been read.
        def printdone(n):
            print "DONE"

        print "GOT RESPONSE %s" % resp
        stream_mod.readStream(resp.stream, print_).addCallback(printdone)

    def sendReqs(proto):
        proto.submitRequest(ClientRequest("GET", "/index.html", {'Host':host}, None)).addCallback(gotResp)

    d.addCallback(sendReqs)
    del d
    reactor.run()

testConn("www.google.com")
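Whether or not stream.length gets populated, a progress display has to cope with both cases James describes: an integer when the length is known and None when it is not. A minimal sketch, assuming a stand-in object in place of a real web2 stream (FakeStream and progress_messages are hypothetical names, and real web2 streams are read asynchronously rather than iterated like this):

```python
class FakeStream(object):
    """Hypothetical stand-in for a web2 stream; only .length matters here."""

    def __init__(self, chunks, length=None):
        self.chunks = list(chunks)
        self.length = length


def progress_messages(stream):
    """Yield one progress string per chunk, using stream.length when known."""
    received = 0
    for chunk in stream.chunks:
        received += len(chunk)
        if stream.length is None:
            # Length unknown: only the running byte count can be reported.
            yield "%d bytes" % received
        else:
            # Guard against a zero-length body to avoid dividing by zero.
            percent = 100 * received // stream.length if stream.length else 100
            yield "%d/%d bytes (%d%%)" % (received, stream.length, percent)


known = list(progress_messages(FakeStream([b"abcd", b"efgh"], length=8)))
unknown = list(progress_messages(FakeStream([b"abc"])))
```

In the test script above, the same logic would live in the per-block callback passed to readStream, with the total taken from resp.stream.length.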

Participants (3): David Reid, James Y Knight, Pieter Grimmerink