[Twisted-Python] Reducing memory footprint of twisted app?

Not an exact question, but rather a search for ideas. I have a Twisted app which uses more memory than I would like it to. I tried analysing it a bit (mainly using the gc module's object list and enumerating items of different types) and it seems to me that there is something 'twistedish' in it. My application uses the generator idiom in a lot of places (functions/methods which yield, wrapped with defer.deferredGenerator). And, as there seem to be a lot of anonymous functions and tuples allocated, I suspect that those functions, deferreds and related params and closures may live longer than I would like them to.

Any ideas on how I could track this down? In particular, is it possible to somehow use introspection to find which lambdas and deferreds are allocated while the program is running? Are there any suggestions on how to code deferredGenerators to reduce allocated memory (maybe, for instance, I should try to turn local variables into object attributes, or the opposite, or ...)?

Also, if anybody could point me to any interesting resources about tracking Python memory usage, I would be grateful. I tried googling for some time, but apart from Zope's TrackRefs I did not find anything.
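For the introspection question, here is a minimal sketch of the gc-based approach (assuming stock CPython and Twisted's defer module; the helper names are made up for illustration, not from the thread):

    import gc
    import types

    from twisted.internet import defer

    def live_deferreds():
        """Return every Deferred instance the garbage collector knows about."""
        return [o for o in gc.get_objects() if isinstance(o, defer.Deferred)]

    def live_lambdas():
        """Return every live lambda, so we can see where they were created."""
        return [o for o in gc.get_objects()
                if isinstance(o, types.FunctionType) and o.__name__ == "<lambda>"]

    # Print the definition site of each live lambda to spot allocation hotspots.
    for fn in live_lambdas():
        print(fn.__code__.co_filename, fn.__code__.co_firstlineno)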

Marcin Kasperski wrote:
Here are a couple of possibly interesting tools (I don't know whether they work with Twisted):

- PySizer, a memory profiler for Python: http://pysizer.8325.org/
- Heapy: http://guppy-pe.sourceforge.net/#Heapy

-- Nicola Larosa - http://www.tekNico.net/

On Tue, Aug 15, 2006 at 12:36:16PM +0200, Marcin Kasperski wrote:
Well, objects in Python will live as long as they are referenced. If you have large objects (or many objects) referenced from a function scope or object that's still live, then of course the referenced objects will still be live too.
Object attributes would tend to be worse than locals, because typically objects (and thus their attributes) outlive a function's scope.

As a thought experiment: if you transform a generator function into a class, moving the state from locals in the generator to instance variables of the class, what have you changed about the lifetimes of those objects? Answer: nothing. If some of those generator locals become locals in the __next__ and other methods of the class, but *not* instance variables, then those lifetimes will be shorter -- but you can achieve exactly the same effect by adding "del foo" or "foo = None" statements to the original generator function. Thinking about the problem as somehow inherent to generator functions (and by extension, deferredGenerator) is a red herring.

The best idea I can offer you is this: first find out what's taking the memory before you try to change your code to fix it. Blindly rewriting some code in a different style without understanding why (or even if) it's taking up so much memory will get you nowhere. Even if you think you have a pretty good guess, you're probably wrong (at least, I find that's what happens to me when I try to optimise based only on guesses).
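A toy version of that thought experiment (the names are illustrative, not from the original post): the state has the same lifetime whether it lives in generator locals or in instance attributes of an equivalent class, and an explicit "del" releases it early in either form.

    def running_total(numbers):
        data = list(numbers)    # large local; alive as long as the generator is
        total = 0
        for n in data:
            total += n
            yield total
        del data                # explicit early release, as described above

    class RunningTotal:
        """The same iterator written as a class; self.data lives exactly
        as long as the local `data` did in the generator version."""
        def __init__(self, numbers):
            self.data = list(numbers)
            self.index = 0
            self.total = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self.index >= len(self.data):
                self.data = []  # drop the state once exhausted
                raise StopIteration
            self.total += self.data[self.index]
            self.index += 1
            return self.total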
I use http://twistedmatrix.com/users/spiv/countrefs.py occasionally when I'm trying to figure out what's using memory in a Python program. It uses the ref count on class/type objects as an approximation of the number of instances, which is close enough: if there are 100000 references to a class, it's almost certain that at least 99990 of them are instances of that class.

The other thing to do is to reproduce the problem as simply as possible. Do you have a test suite? Does the memory usage get too high during the test run? Also, can you reproduce it just by starting the web server? If so, try running just half the code involved to start it up -- still see it? And so on. Or, if it only consumes unacceptably large amounts of memory after serving 10000 requests, write a script to issue 10000 requests, change the server to only do the first half of the processing, hit it with 10000 requests, and you'll see whether the problem is in the first half or the second half. You get the idea: reproduce your problem, then simplify things as much as possible until you can analyse it.

I hope these ideas help you.

-Andrew.
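A minimal sketch in the spirit of countrefs.py (an assumption based on the description above, not the script itself): every instance holds a reference to its class, so sorting classes by refcount points at the types with the most live instances.

    import gc
    import sys

    def approximate_instance_counts(limit=20):
        # Every instance references its class, so a class's refcount is a
        # cheap upper-bound estimate of its number of live instances.
        classes = [o for o in gc.get_objects() if isinstance(o, type)]
        counts = sorted(((sys.getrefcount(c), c.__name__) for c in classes),
                        reverse=True)
        for refs, name in counts[:limit]:
            print("%8d  %s" % (refs, name))

    if __name__ == "__main__":
        approximate_instance_counts()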

One of the things we did, and saw an approximately 30% REDUCTION in memory footprint from, was adding __slots__ definitions to all the objects we were creating in graphs. This isn't Twisted-specific, so it should apply to any Python application. Granted, we have hundreds of thousands of objects in the graph, but it did make a noticeable change in the footprint.
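A sketch of the technique (the Node class is hypothetical, not from the post): declaring __slots__ removes the per-instance __dict__, which is where most of the per-object overhead goes when you have hundreds of thousands of small objects.

    class Node:
        # Ordinary class: every instance carries a __dict__.
        def __init__(self, value, parent=None):
            self.value = value
            self.parent = parent

    class SlottedNode:
        # __slots__ replaces the per-instance __dict__ with fixed storage.
        __slots__ = ("value", "parent")

        def __init__(self, value, parent=None):
            self.value = value
            self.parent = parent

    # Trade-off: slotted instances cannot grow arbitrary new attributes.
    n = SlottedNode(1)
    try:
        n.extra = 2
    except AttributeError:
        print("no __dict__, as expected")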

On Thu, Aug 17, 2006 at 09:37:39PM -0400, jarrod roberson wrote:
Right, __slots__ can be helpful. Some more advice that isn't Twisted-specific:

It's very helpful to understand which objects are taking up the memory. If you know that, not only can you get a good idea of whether __slots__ will actually help before you clutter your code with them, but you can perhaps realise that you shouldn't even have 100000 simultaneous Request objects when you only have 1000 connections at a time -- in my experience helping people on IRC, it's quite common that a reference is accidentally being kept to every request object (or similar), thus causing memory leaks despite Python's garbage collection (see the sketch after this message). Saving 30% of memory on 100000 objects isn't anywhere near as good as saving 99% of those objects from being needed in the first place!

If you understand what the culprits are, you can decide not only whether __slots__ would help, you can also analyse those objects to figure out if they are keeping more state than they really need. And in fact, you can try speculatively adding __slots__ to a type of object as an indirect way to see whether a particular type is a major contributor to your memory use -- if adding __slots__ to Foo doesn't help, there probably aren't a significant number of instances contributing to the memory use.

Basically, I really strongly think people should *understand* their performance issues so they can fix them better, rather than just blindly doing the equivalent of "gcc -O9" and considering it solved. It depends on the available time and requirements, of course; if a quick band-aid is all that's needed, then fair enough. But I find it usually pays off to thoroughly understand what you're fixing.

That said, if you need hundreds of thousands of objects in memory, __slots__ is one of the simplest ways to improve memory consumption I know of :)

-Andrew.
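A hypothetical illustration of that accidental-reference leak, and one common fix (the names are made up, not code from the thread): a module-level collection that is appended to but never pruned keeps every request alive forever, while a WeakSet lets finished requests be collected.

    import weakref

    # Leaky pattern: this list keeps a strong reference to every request
    # ever handled, so none of them can ever be garbage collected.
    _all_requests = []

    def handle_request_leaky(request):
        _all_requests.append(request)   # nothing ever removes entries

    # One fix: track live requests weakly, so entries vanish as soon as
    # the rest of the program drops its references to them.
    _live_requests = weakref.WeakSet()

    def handle_request(request):
        _live_requests.add(request)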

participants (4)

- Andrew Bennetts
- jarrod roberson
- Marcin Kasperski
- Nicola Larosa