On 10-12-21 10:59 AM, exarkun@twistedmatrix.com wrote:
On 02:34 pm, psanchez@fosstel.com wrote:
On 10-12-21 09:00 AM, exarkun@twistedmatrix.com wrote:
On 05:19 am, psanchez@fosstel.com wrote:
Hello,
Here's a demo HTTP server that returns 10 MB of random data each time a client connects.
import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource

data = os.urandom(10*1024*1024)

class TestPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return data

root = Resource()
root.putChild('test', TestPage())
reactor.listenTCP(8880, Site(root))
reactor.run()
Now, when I run N clients simultaneously from a different host I see that the server's memory consumption increases by N*10 MB. I can't reproduce this example when running the clients from the same host as the server; the test goes so fast that I can't gather any useful data.
I run the test with the following httperf command on a different host and watch the GNOME System Monitor on the server (top will do as well).
httperf --server 192.168.1.10 --port 8880 --uri /test \
        --rate 10 --num-conn 10
When the server is idle memory consumption is 17.1 MB, but during the test it jumps to 117.2 MB. My questions are then:
1. Given that 'data' is a global variable, and effectively read-only as well, why is it replicated for each request? And who is replicating it?
It's copied as part of the process of writing it to the socket. You can't write 10MB at once, and you can't slice a string (to throw away the part that you did manage to write) without making a copy of part of it.
2. What would be the proper way to rewrite this example so that there is one and only one 'data' structure at any time?
Split data up into ~32kB-64kB chunks and write them to the request individually. Then each chunk can just be dropped with no copying.
Jean-Paul
_______________________________________________ Twisted-web mailing list Twisted-web@twistedmatrix.com http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
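The two points above (slicing off the written prefix copies the rest; small chunks can simply be dropped) can be illustrated outside Twisted. The following is a minimal sketch assuming Python 3 on CPython and a 1 MB stand-in payload. The helper names `drop_sent_prefix` and `split_into_chunks` are hypothetical, and this is a simplified model of a write buffer, not Twisted's actual transport code.

```python
from collections import deque

PAYLOAD_SIZE = 1024 * 1024   # 1 MB stand-in for the 10 MB payload
CHUNK_SIZE = 32 * 1024

def drop_sent_prefix(buffer, sent):
    """Model of a single-buffer write: slicing off the prefix that was
    sent allocates a new bytes object holding everything that remains."""
    return buffer[sent:]

def split_into_chunks(buffer, size):
    """Slice the payload once into fixed-size pieces."""
    return [buffer[i:i + size] for i in range(0, len(buffer), size)]

payload = bytes(PAYLOAD_SIZE)

# Single buffer: the remainder is a separate object, nearly as large as
# the original, so a partial write costs a near-full copy.
remainder = drop_sent_prefix(payload, CHUNK_SIZE)
copied_bytes = len(remainder)

# Chunk queue: a chunk that was sent is dropped by popping it; the
# chunks still waiting are the same objects as before, not copies.
queue = deque(split_into_chunks(payload, CHUNK_SIZE))
next_in_line = queue[1]
queue.popleft()
still_same_object = queue[0] is next_in_line
```

In the single-buffer model every partial write copies almost the whole payload again, while in the chunked model the only slicing happens once, up front.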
Thanks Jean-Paul,
Here is my modified example, unfortunately with the same bad results regarding memory consumption.
import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource

CHUNK_SIZE = 32*1024
data = os.urandom(10*1024*1024)

class TestPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        s = 0
        for chunk in iter(lambda: data[s:s+CHUNK_SIZE], ''):
            request.write(chunk)
            s = s + CHUNK_SIZE
You've just moved the copying-by-slicing out of the transport and into your Resource. :) You need to do all that copying/slicing at the beginning, where it only needs to happen once, so that every render can share those allocated strings.
Jean-Paul
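The difference between slicing inside each render and slicing once up front can be checked with object identity. This is a rough sketch assuming Python 3 on CPython; the names `slice_per_request`, `SHARED_CHUNKS`, and `shared_chunks_for_request` are hypothetical. It only shows which objects the handler hands over, and says nothing about what the transport does with the chunks after `request.write()` receives them.

```python
CHUNK_SIZE = 32 * 1024
data = bytes(1024 * 1024)   # 1 MB stand-in for the 10 MB payload

def slice_per_request(payload, size=CHUNK_SIZE):
    """Slice inside the handler: every call allocates new chunk objects."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]

# Sliced once at import time; every request iterates over these objects.
SHARED_CHUNKS = slice_per_request(data)

def shared_chunks_for_request():
    """Hand out the pre-sliced chunks without allocating anything new."""
    return SHARED_CHUNKS

# Two simulated requests that slice on demand get distinct chunk objects.
first, second = slice_per_request(data), slice_per_request(data)
per_request_shares = first[0] is second[0]

# Two simulated requests that use the pre-sliced list get the same objects.
shared_a, shared_b = shared_chunks_for_request(), shared_chunks_for_request()
precomputed_shares = shared_a[0] is shared_b[0]
```

With per-request slicing, N concurrent requests can hold N sets of chunks; with the shared list there is only one set, however many requests are in flight.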
OK, I guess I'm being slow :-( Here's another version, same results.

import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource

CHUNK_SIZE = 32*1024
data = os.urandom(10*1024*1024)
chunks = []

def make_chunks():
    s = 0
    for chunk in iter(lambda: data[s:s+CHUNK_SIZE], ''):
        chunks.append(chunk)
        s = s + CHUNK_SIZE

class TestPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        for chunk in chunks:
            request.write(chunk)

make_chunks()

root = Resource()
root.putChild('test', TestPage())
reactor.listenTCP(8880, Site(root))
reactor.run()

I tried also preparing the chunks in a TestPage.__init__() implementation. Same results.

So, where exactly do I have to put the make_chunks() steps?

Thanks,
--
Pedro