[Twisted-Python] http server performance

Hi,

My project involves a lot of network I/O. One part of it is an HTTP server that listens on a port for many clients. The server fetches an image from the web and sends it to the client, and many clients will request images concurrently. To serve clients concurrently I used a threaded HTTP server like this:

    import BaseHTTPServer
    import SocketServer

    class HTTPServer(SocketServer.ThreadingMixIn, BaseHTTPServer.HTTPServer):
        pass

    class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
        def do_GET(self):
            print "received connection from: ", self.client_address
            # image_retrieve() fetches the image from the web and works fine
            image_data = image_retrieve()
            self.send_response(200)
            # format is used here as the Content-type value
            self.send_header("Content-type", format)
            self.send_header("Content-Length", str(len(image_data)))
            self.end_headers()
            self.request.sendall(image_data)

    httpd = HTTPServer(('', port_number), RequestHandler)
    httpd.serve_forever()

This code works, but its performance is very bad. It works fine when clients request small and medium-sized images, since the server sends the response immediately. It also works fine when one client requests a large image (obviously the server takes a while to respond, because fetching the image from the web takes time) while other clients concurrently request small and medium images: those clients are served immediately even though the first client is still waiting.

The problem crops up when two clients concurrently request a large image. While those two clients are waiting for the response from the server, the server doesn't accept any other client request. I can see this because I print the address of the connecting client in the first line of the request handler's do_GET method. If two clients concurrently request a large image, only those two clients' addresses get printed. That means only two clients get a connection to the server, even if other clients are requesting at the same time, and the other clients are served only after those two clients get their response and release the connection. In other words, the server serves only two clients at a time. This is very undesirable: even if a third client requests a very small image while two clients are waiting for large images, the third client won't receive a response until those two are served. To make things worse, my server should serve 10 to 15 clients concurrently.

To solve this I did some searching and found out about CherryPy and Twisted. I also implemented my server in CherryPy like this:

    import sys
    from cherrypy import wsgiserver

    def image_httpserver_app(environ, start_response):
        print >>sys.stdout, "received connection from: (%s : %s ) \nthe image url is: %s " % (
            environ["REMOTE_ADDR"], environ["REMOTE_PORT"], environ["QUERY_STRING"])
        status = '200 OK'
        image_data = image_retrieve()
        # Send both headers; assigning the list twice would drop Content-type
        response_headers = [('Content-type', format),
                            ('Content-Length', str(len(image_data)))]
        start_response(status, response_headers)
        return [image_data]

    mappings = [('/', image_httpserver_app)]
    wsgi_apps = mappings
    server = wsgiserver.CherryPyWSGIServer(('localhost', 8888), wsgi_apps,
                                           server_name='localhost', numthreads=20)

    if __name__ == '__main__':
        try:
            server.start()
        except KeyboardInterrupt:
            server.stop()

This didn't solve the problem at all. The same thing happens: only two clients are served at a time, even with the number of threads set to 20. I have done a lot of searching and reading and am hoping to find a solution. Can anyone make it easier for me? I have heard of Twisted's Deferred object. Will it solve the problem? If not, please suggest an alternative.

On Tue, 4 Mar 2008 20:11:08 +0530, bharath venkatesh <bharathv6.project@gmail.com> wrote:
Deferreds won't directly solve your problem, but using Twisted as your HTTP server (and client) should. It is generally the case that Twisted applications continue to perform well under increasing load - more so than thread-per-connection based systems.
Jean-Paul

I am not aware of the scope of your project, or of your experience with C, but if you are looking for high performance based on asynchronous events you might do well to take a look at the lighttpd web server. In the past I have had a lot of success hacking lighttpd modules (such as the proxy module) for my own particular needs. The code base is small and easy to comprehend when compared to the monolith that is Apache. I'm not saying that Twisted won't be able to cater for your needs (it caters for most of mine). Just pointing you to possible alternatives...
Matt
Matthew Glubb
Technical Partner
email: matthew.glubb@madebykite.com
phone: 44 (0) 7715 754017
skype: mglubb
Kite
http://madebykite.com
--
GPG: 96FF DE0E 0B7B 37F0 7F8D C54C E285 3D8F 5625 9244
On 4 Mar 2008, at 15:41, Jean-Paul Calderone wrote:

Matthew Glubb wrote:
> if you are looking for high performance based on asynchronous events you might do well to take a look at the lighttpd web server.
And to nginx, also asynchronous, and without a history of leaking memory. Notice how I actually said nothing about Lighttpd: you may have imagined it. ;-)
--
Nicola Larosa - http://www.tekNico.net/
The [European] Parliamentary Assembly therefore urges the member states, and especially their education authorities: [...] to firmly oppose the teaching of creationism as a scientific discipline on an equal footing with the theory of evolution and in general resist presentation of creationist ideas in any discipline other than religion.
-- European Parliament, resolution 1580 (2007)

Clearly image_retrieve should also be async. All the beauty of Twisted is destroyed by blocking it for such a retrieve. Almost anything that goes offboard (incl. the filesystem) should be a Deferred. The only exception I've found is memcached, because it's so dang fast ;^) I presume your needs are slightly beyond your code sample, which may have been modified for clarity, but if not, the other web servers and proxies suggested may be a good way to go.
m
On Tue, Mar 4, 2008 at 11:47 AM, Nicola Larosa <nico@teknico.net> wrote:

I loooooove memcache, but I still defer it to a thread :/
Matthew Glubb
On 4 Mar 2008, at 20:22, Marc Byrd wrote:

On 4 Mar, 08:52 pm, matt@madebykite.com wrote:
> I loooooove memcache, but I still defer it to a thread :/
Why? http://twistedmatrix.com/trac/browser/trunk/twisted/protocols/memcache.py

participants (6)
- bharath venkatesh
- glyph@divmod.com
- Jean-Paul Calderone
- Marc Byrd
- Matthew Glubb
- Nicola Larosa