Re: [Twisted-Python] WSGI Thread-management strategy

James Y Knight wrote:
(BTW, the correct mailing list for twisted webbish stuff is twisted-web@twistedmatrix.com)
On Dec 15, 2005, at 4:22 PM, Jim Fulton wrote:
The strategy used by twisted WSGI, as I understand it, doesn't meet our needs. Currently, a thread is created for each request. The total number of threads is throttled, I gather using a general Twisted thread limit. WSGI applications are called as soon as input headers have been received completely. An application may be called before all body input is received. We need application calls to be delayed until all request input has been received,
[...]
I propose that the default thread-management strategy should be to delay calling an application until all request input has been received. If this isn't the default, then there should at least be an option to get this behavior. (Of course, the buffering strategy needs to be clever enough to switch to a file when the input gets over some size.)
Sounds sensible, and is doable external to the WSGI wrapper. Here's a little bit I whipped up. (works on the 2.1.x branch and head). Could be smarter, by starting out the buffer in memory and switching to a file if necessary. Also shows off a couple of minor bugs I need to fix. :)
# Imports assumed by the code below
import tempfile
from twisted.web2 import iweb, resource, stream, wsgi

def simple_wsgi_app(environ, start_response):
    print "Starting wsgi app"
    start_response("200 OK", [('Content-type','text/html; charset=ISO-8859-1')])
    data = environ['wsgi.input'].read()
    return ['<pre>', data, '</pre>']

class Prebuffer(resource.WrapperResource):
    def hook(self, ctx):
        req = iweb.IRequest(ctx)
        temp = tempfile.TemporaryFile()
        def done(_):
            temp.seek(0)
            # Replace the request's stream object with the tempfile
            req.stream = stream.FileStream(temp)
            # Hm, this shouldn't be required:
            req.stream.doStartReading = None
        return stream.readStream(req.stream, temp.write).addCallback(done)

    # Oops, fix missing () in lambda in WrapperResource
    def locateChild(self, ctx, segments):
        x = self.hook(ctx)
        if x is not None:
            return x.addCallback(lambda data: (self.res, segments))
        return self.res, segments

if __name__ == '__builtin__':
    from twisted.application import service, strports
    from twisted.web2 import server, channel

    res = Prebuffer(wsgi.WSGIResource(simple_wsgi_app))
    site = server.Site(res)
    application = service.Application("demo")
    s = strports.service('tcp:8080', channel.HTTPFactory(site))
    s.setServiceParent(application)
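The "start in memory, switch to a file if necessary" idea can be sketched with the standard library alone. This is only an illustration, not part of the Twisted code: the helper name prebuffer and the 64K default are made up here, it is written for a current Python rather than the Python 2 of the snippet above, and it leaves out the Deferred plumbing entirely. tempfile.SpooledTemporaryFile keeps data in memory until it grows past max_size and then moves it to a real temporary file.

```python
import tempfile


def prebuffer(chunks, max_in_memory=64 * 1024):
    """Collect an iterable of byte chunks into a seekable file-like object.

    Hypothetical helper: data stays in memory until it exceeds
    max_in_memory bytes, at which point SpooledTemporaryFile rolls it
    over to a temporary file on disk.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)  # rewind so the application reads from the start
    return buf


# A small request body never touches the disk.
small = prebuffer([b"a=1&", b"b=2"])

# A body larger than the threshold ends up in a temporary file.
large = prebuffer([b"x" * 4096] * 4, max_in_memory=8192)
```

The returned object could presumably be handed to the application in place of a plain TemporaryFile, since both are seekable and support read() and readline().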
This doesn't work for me if the input gets over 8K. It turns out the MMap wrapper is broken. When the resulting stream is read, the first line of output seems to be a mmap repr:
<mmap.mmap object at 0xb68ef6c0>
and subsequent calls to readline return '\n'.
Disabling MMap support makes this work.
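For comparison, here is a rough check of what readline on wsgi.input ought to return for a body over 8K, independent of Twisted. It is a sketch under assumptions: the names line_echo_app and call_with_body are invented for illustration, the input is a plain io.BytesIO rather than a Twisted stream, and it targets a current Python.

```python
import io
from wsgiref.util import setup_testing_defaults


def line_echo_app(environ, start_response):
    """Read the request body line by line and return the lines unchanged."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    lines = []
    while True:
        line = environ["wsgi.input"].readline()
        if not line:  # empty bytes means end of input
            break
        lines.append(line)
    return lines


def call_with_body(app, body):
    """Call a WSGI app directly with the given bytes as its request body."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = "POST"
    environ["CONTENT_LENGTH"] = str(len(body))
    environ["wsgi.input"] = io.BytesIO(body)
    status = []

    def start_response(s, headers, exc_info=None):
        status.append(s)

    return status, b"".join(app(environ, start_response))


# 2000 short lines, comfortably more than 8K in total.
body = b"".join(b"line %d\n" % i for i in range(2000))
status, echoed = call_with_body(line_echo_app, body)
```

With a correctly behaving input stream the echoed bytes match the body exactly; the broken case described above would instead start with the mmap repr.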
Debugging this was a bit frightening. To read a line from a temporary file, we end up calling the reactor. Is all this indirection and machinery really necessary to read a temporary file? That question is mostly rhetorical. :)
Jim