[Twisted-Python] Twisted client memory leak

I have a Twisted client app that makes hundreds of connections per minute. I discovered that I have a memory leak in my app, and I'm almost sure it is related to the ClientFactory-derived class, whose instances are never deleted. I reproduced the problem with a modification of the Echo client example from the Twisted documentation:

    from twisted.internet.protocol import Protocol, ClientFactory
    from twisted.internet import reactor
    from twisted.internet.task import LoopingCall
    from sys import stdout

    class Echo(Protocol):
        def connectionMade(self):
            print 'MADE'
            self.transport.write('XXXX')

        def dataReceived(self, data):
            print 'RECV', data
            self.transport.loseConnection()

        def __del__(self):
            print 'DEL PROTOCOL'

    class EchoClientFactory(ClientFactory):
        def startedConnecting(self, connector):
            print 'Started to connect.'

        def buildProtocol(self, addr):
            print 'Connected.'
            return Echo()

        def clientConnectionLost(self, connector, reason):
            print 'Lost connection. Reason:', reason

        def clientConnectionFailed(self, connector, reason):
            print 'Connection failed. Reason:', reason

        def __del__(self):
            print 'DEL FACTORY'

    def connector():
        print 'CONNECTOR'
        factory = EchoClientFactory()
        reactor.connectTCP('localhost', 7, factory)
        #reactor.callLater(2, connector)

    register_loop = LoopingCall(connector)
    register_loop.start(1)
    reactor.run()

With this code I found that the instances of EchoClientFactory are only deleted when the program shuts down. They are not deleted when the connections finish. I haven't found anything in the documentation about whether I need to do something to get factory instances deleted.

-- Diego Woitasen

Here is the server code if you want to have a test:

    from twisted.internet.protocol import Protocol
    from twisted.internet.protocol import Factory
    from twisted.internet import reactor

    class Echo(Protocol):
        def __init__(self, factory):
            self.factory = factory

        def connectionMade(self):
            print 'MADE'

        def connectionLost(self, reason):
            print 'LOST'

        def dataReceived(self, data):
            self.transport.write(data)

    class EchoFactory(Factory):
        def buildProtocol(self, addr):
            return Echo(self)

    reactor.listenTCP(8007, EchoFactory())
    reactor.run()

Thanks!

On Tue, Jan 22, 2013 at 10:06 AM, Diego Woitasen <diego@woitasen.com.ar> wrote:
-- Diego Woitasen

On Tue, Jan 22, 2013 at 10:38 AM, Marco Giusti <marco.giusti@gmail.com> wrote:
OK, that works. Thanks.

My question now is: why is this done automatically for EchoProtocol() and not for EchoFactory()? It looks like the references are dropped, but Python is taking too long to free the memory. An explanation of this would be welcome :)

Regards,
Diego

-- Diego Woitasen

On Tue, Jan 22, 2013 at 3:06 PM, Diego Woitasen <diego@woitasen.com.ar> wrote:
I reproduce the problem with a modification of Echo client example from Twisted documentation:
Adding __del__ methods causes the object to become uncollectable if it forms part of a reference cycle. Thus it is almost always a bad idea to add __del__ methods when attempting to debug a space leak, as the most likely outcome is that you introduce a _new_, _different_ leak to the one you are trying to debug. -- mithrandi, i Ainil en-Balandor, a faer Ambar
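
A minimal sketch of one way to watch object lifetime without adding a finalizer, assuming CPython: a weakref callback reports when an object is reclaimed and does not change how the cycle collector treats it. The Factory and Protocol classes below are hypothetical stand-ins, not Twisted's, so the snippet runs on its own. (On Python 3.4 and later, PEP 442 allows cycles containing __del__ to be collected; the warning above applies in full to the Python 2 versions discussed in this thread.)

```python
import gc
import weakref


class Factory(object):
    """Hypothetical stand-in for a ClientFactory subclass."""


class Protocol(object):
    """Hypothetical stand-in for a protocol that keeps its factory."""

    def __init__(self, factory):
        self.factory = factory


def make_cycle(freed):
    factory = Factory()
    protocol = Protocol(factory)
    factory.protocol = protocol  # factory <-> protocol reference cycle
    # The callback runs when the factory is reclaimed. The weakref object
    # itself must stay alive for the callback to fire, so it is returned.
    return weakref.ref(factory, lambda _ref: freed.append("factory"))


freed = []
ref = make_cycle(freed)
gc.collect()   # the cycle is unreachable, so the collector reclaims it
print(freed)
```

Neither class defines __del__, so the cycle stays collectable and the callback only observes the collection.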

On Tue, Jan 22, 2013 at 9:15 AM, Tristan Seligmann <mithrandi@mithrandi.net> wrote:
Thanks for pointing this out. I was reading through this whole thread and going "why hasn't anyone else pointed out how wrong it is to add __del__ to debug this yet???". -- Christopher Armstrong http://radix.twistedmatrix.com/ http://planet-if.com/

On 22/01/13 15:15, Tristan Seligmann wrote:
Ah yes, well spotted. Personally I tend to avoid __del__ in almost all circumstances, but particularly using Twisted (not because of anything Twisted-specific, but because my Twisted code tends to be *very* long-running, and because it's got sane "died" callbacks on most interfaces).

On Tue, Jan 22, 2013 at 12:15 PM, Tristan Seligmann <mithrandi@mithrandi.net> wrote:
Yes, but note that without the __del__ it had the same behaviour... -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ Twitter: @facundobatista

On 22/01/13 16:45, Facundo Batista wrote:
Yes, but note that without the __del__ it had the same behaviour...
Not quite. The OP said that:

a) He had a problem with a Twisted app not freeing memory under load
b) He had reproduced that problem with his example, that included __del__

Nowhere did he say "it does the same without __del__". He was in fact not specific about whether the original/real code uses __del__ or not.

On Tue, Jan 22, 2013 at 2:02 PM, Phil Mayers <p.mayers@imperial.ac.uk> wrote:
He didn't say it. I'm saying it, after testing it. -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ Twitter: @facundobatista

On 22/01/13 17:06, Facundo Batista wrote:
When you say "it", you mean his example code as posted to the list, right?

Because having just tested it, I don't see any problem - his example code has stable memory usage for both client/server processes, and with the client making 1, 100 or 1000 connections/sec, and the debug shows protocol and factory instances being freed as I would expect i.e. in a timely fashion, not just at process close.

Tested on both:

Python 2.7.3 / Twisted 11.1.0 / Linux 64-bit
Python 2.6.8 / Twisted 12.2.0 / Linux 64-bit

Odd...

On Tue, Jan 22, 2013 at 2:37 PM, Phil Mayers <p.mayers@imperial.ac.uk> wrote:
I tested the original example, changing the loop interval to 0.1 seconds instead of 1 second, and it showed memory growth, with and without the __del__ in the Factory. This was with Twisted 12.2.0 under Python 2.7.3, on 32-bit Linux. I measured the memory usage with top.

Regards,

-- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ Twitter: @facundobatista
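
Figures from top can include memory that the process has freed internally but not returned to the OS, so they may not show by themselves whether instances are being kept alive. A minimal sketch, assuming CPython, of counting live instances through the gc module instead; live_instance_counts is a hypothetical helper, and EchoClientFactory here is a stand-in class so the snippet runs without Twisted.

```python
import gc
from collections import Counter


def live_instance_counts(*class_names):
    """Count gc-tracked objects whose class name is in class_names."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return dict((name, counts.get(name, 0)) for name in class_names)


class EchoClientFactory(object):
    """Stand-in for the real factory class (assumption for this sketch)."""


factories = [EchoClientFactory() for _ in range(3)]
before = live_instance_counts("EchoClientFactory")

del factories
gc.collect()
after = live_instance_counts("EchoClientFactory")

print(before, after)
```

In the client from this thread, the same helper could be called periodically, for example from a second LoopingCall, to see whether the factory count keeps growing or stays flat.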

participants (6)
- Christopher Armstrong
- Diego Woitasen
- Facundo Batista
- Marco Giusti
- Phil Mayers
- Tristan Seligmann