Re: [Twisted-web] render_GET and memory consumption

21 Dec 2010

      On 10-12-21 10:59 AM, exarkun@twistedmatrix.com wrote:
...
On 02:34 pm, psanchez@fosstel.com wrote:
...
On 10-12-21 09:00 AM, exarkun@twistedmatrix.com wrote:
...
On 05:19 am, psanchez@fosstel.com wrote:
...
Hello,
Here's a demo HTTP server that returns 10 MB of random data each time a
client connects.
import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource
data = os.urandom(10*1024*1024)
class TestPage(Resource):
      isLeaf = True
      def render_GET(self, request):
          return data
root = Resource()
root.putChild('test', TestPage())
reactor.listenTCP(8880, Site(root))
reactor.run()
Now, when I run N clients simultaneously from a different host I see
that the server's memory consumption increases by N*10 MB. I can't
reproduce this example when running the clients from the same host as
the server; the test goes so fast that I can't gather any useful
data.
I run the test with the following httperf command from a different host
while watching the GNOME System Monitor on the server (top will do as
well).
httperf --server 192.168.1.10 --port 8880 --uri /test \
          --rate 10 --num-conn 10
When the server is idle memory consumption is 17.1 MB, but during the
test it jumps to 117.2 MB. My questions are then:
1. Given that 'data' is a global variable, and effectively read-only as
well, why is it replicated for each request? And who is replicating it?
It's copied as part of the process of writing it to the socket.  You
can't write 10MB at once, and you can't slice a string (to throw away
the part that you did manage to write) without making a copy of part of
it.
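
A rough illustration of the copying described above. It is written for
current Python 3, which is an assumption: the thread's code is Python 2,
where the payload would be a str and not bytes. The sizes and variable
names are made up for the example, and the memoryview comparison is an
addition that the thread does not mention.

```python
import os

data = os.urandom(1024 * 1024)   # 1 MiB stand-in for the 10 MB payload
sent = 64 * 1024                 # pretend the socket accepted only 64 KiB

# Dropping the part already written by slicing allocates a new bytes
# object that holds a copy of everything that is still unsent.
remaining = data[sent:]
copied_bytes = len(remaining)

# A memoryview slice, by contrast, refers to the original buffer and
# copies nothing.
view = memoryview(data)[sent:]
shares_buffer = view.obj is data
```

If each connection ends up holding its own unsent tail like this, memory
use grows with the number of slow clients.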
...
2. What would be the proper way to rewrite this example so that there
is one and only one 'data' structure at any time?
Split data up into ~32kB-64kB chunks and write them to the request
individually.  Then each chunk can just be dropped with no copying.
Jean-Paul
_______________________________________________
Twisted-web mailing list
Twisted-web@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-web
Thanks Jean-Paul,
Here is my modified example, unfortunately with the same bad results
regarding memory consumption.
import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource
CHUNK_SIZE = 32*1024
data = os.urandom(10*1024*1024)
class TestPage(Resource):
      isLeaf = True
      def render_GET(self, request):
          s = 0
          for chunk in iter(lambda: data[s:s+CHUNK_SIZE], ''):
              request.write(chunk)
              s = s + CHUNK_SIZE
You've just moved the copying-by-slicing out of the transport and into
your Resource. :)  You need to do all that copying/slicing at the
beginning, where it only needs to happen once, so that every render can
share those allocated strings.
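
A minimal sketch of the difference between slicing once and slicing per
render, in plain Python 3 and without Twisted. The names split_chunks,
render_shared and render_sliced are hypothetical helpers invented for
this illustration, and the payload is shrunk to 1 MiB.

```python
import os

CHUNK_SIZE = 32 * 1024
data = os.urandom(1024 * 1024)   # 1 MiB stand-in for the 10 MB payload


def split_chunks(payload, size):
    """Slice payload into size-byte pieces (the last one may be shorter)."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


# Slice once, at import time. Each slice is a copy, but it is made once.
CHUNKS = split_chunks(data, CHUNK_SIZE)


def render_shared():
    # Every call hands out the very same chunk objects.
    return CHUNKS


def render_sliced():
    # Every call allocates a fresh set of chunk copies.
    return split_chunks(data, CHUNK_SIZE)


first, second = render_shared(), render_shared()
fresh = render_sliced()
```

Here first and second contain identical objects, while fresh holds new
copies of the same bytes.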
Jean-Paul
OK, I guess I'm being slow :-( Here's another version, same results.

import os
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource

CHUNK_SIZE = 32*1024
data = os.urandom(10*1024*1024)
chunks = []

def make_chunks():
     s = 0
     for chunk in iter(lambda: data[s:s+CHUNK_SIZE], ''):
         chunks.append(chunk)
         s = s + CHUNK_SIZE

class TestPage(Resource):
      isLeaf = True

      def render_GET(self, request):
          for chunk in chunks:
              request.write(chunk)

make_chunks()
root = Resource()
root.putChild('test', TestPage())
reactor.listenTCP(8880, Site(root))
reactor.run()

I also tried preparing the chunks in a TestPage.__init__()
implementation, with the same results. So where exactly do I have to
put the make_chunks() step?

Thanks,

-- 
Pedro
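
The messages quoted here end without an answer to this question. One
direction that may be worth trying, offered as an assumption and not as
the list's reply: request.write() does not block, so a loop that writes
every chunk inside render_GET hands the whole response over at once, and
it has to be buffered until the client reads it. A pull producer writes
a single chunk each time it is asked for more. The sketch below is plain
Python 3. ChunkProducer and FakeRequest are hypothetical names, and
FakeRequest only imitates the four request methods used here
(registerProducer, unregisterProducer, write, finish).

```python
class ChunkProducer:
    """Pull producer: hands over one pre-split chunk per call."""

    def __init__(self, consumer, chunks):
        self.consumer = consumer
        self.chunks = chunks      # shared between requests, never modified
        self.index = 0

    def resumeProducing(self):
        # Called by the consumer each time it is ready for more data.
        if self.index < len(self.chunks):
            self.consumer.write(self.chunks[self.index])
            self.index += 1
        if self.index >= len(self.chunks):
            self.consumer.unregisterProducer()
            self.consumer.finish()

    def stopProducing(self):
        # Called if the connection goes away; just stop handing out chunks.
        self.index = len(self.chunks)


class FakeRequest:
    """Stand-in for a request object, for demonstration only."""

    def __init__(self):
        self.written = []
        self.finished = False
        self.producer = None

    def registerProducer(self, producer, streaming):
        self.producer = producer

    def unregisterProducer(self):
        self.producer = None

    def write(self, chunk):
        self.written.append(chunk)

    def finish(self):
        self.finished = True


chunks = [b"a" * 4, b"b" * 4, b"c" * 2]
request = FakeRequest()
request.registerProducer(ChunkProducer(request, chunks), False)

# Drive the producer the way a transport would: one call per "ready" event.
while request.producer is not None:
    request.producer.resumeProducing()
```

With a real Twisted request, render_GET would presumably call
request.registerProducer(ChunkProducer(request, chunks), False) and
return NOT_DONE_YET from twisted.web.server. That wiring is untested
here.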

Pedro I. Sanchez