
On Mon, Dec 26, 2005 at 12:21:29PM -0500, Jean-Paul Calderone wrote:
> Of course, in creating a cycle which contains an object with an implementation of __del__, you have created a leak, since Python's GC cannot collect that kind of graph.
Ah, this explains many things. I didn't realize that having a __del__ callback made any difference from a garbage collection point of view, so while trying to fix memleaks I probably added them ;).
Sorry for posting this here and not on a Python list, but my basic problem is making sure the "protocol" object gets collected, and the protocol object is a very twisted thing, so I thought it would be on topic here, since every one of us needs the protocol object garbage collected properly. It turned out to be more of a language thing than I originally thought...
OK, going back to how this started. I happened to allocate 50M of RAM somehow attached to a protocol object, and then I noticed that the reconnectingclientfactory was leaking memory after a disconnect/reconnect event. Every time I restarted the server, 50M were added to the RSS of the task. That was definitely a memleak, and I never had a __del__ method. Then I started adding debugging aids to figure out what was going wrong. Removing the self and cross references fixed the memleak in the client. I then figured that the same self-references were in the server too, and I added more debugging there as well. That led me to the current situation.
So something was definitely going wrong w.r.t. memleaks even before I started messing with the __del__ methods. But I'm very relieved to know that Python gets it right if __del__ isn't implemented.
> Hopefully the __del__ implementation is only included as an aid to understanding what is going on, and you don't actually need it in any of your actual applications. Once removed, the cycle will be collectable by Python.
Correct, it was only an aid, it didn't exist until today.
> When you have "two cross referenced objects", that's a cycle, and Python will indeed clean it up. The only exception is if there is a
Well, I never cared about cyclic references until today, because I thought Python would handle them automatically, as I think is in fact possible. But then, while trying to debug the 50M leak in the client at every server restart (so very visible), I quickly ran into this:

    http://www.nightmare.com/medusa/memory-leaks.html

    class thing: pass
    a = thing()
    b = thing()
    a.other = b
    b.other = a
    del a
    del b

Code like the above is very common in my Twisted-based server. Note that there's no __del__ method in the class "thing". So what you say seems to disagree with the above URL. Perhaps I got bitten by the common mistake "I found it on the internet so it must be true"... I really hope you're the one who is right; my code was all written with your ideas in mind, but that seems to collide strongly with the above URL. I guess I should have checked the date: it's from '99, so perhaps it was true a long time ago?
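For what it's worth, the claim that such a cycle is collectable can be checked directly. The following is a minimal sketch, assuming CPython with the standard gc and weakref modules; the names Thing, make_cycle and probe are made up for the illustration and are not from the original post.

```python
import gc
import weakref


class Thing:
    """Plain class with no __del__ method, like "thing" in the snippet."""
    pass


def make_cycle():
    a = Thing()
    b = Thing()
    a.other = b
    b.other = a
    # Hand back only a weak reference, so that nothing outside the
    # cycle keeps either object alive once this function returns.
    return weakref.ref(a)


gc.disable()  # keep the automatic collector from running mid-demo
probe = make_cycle()

# Reference counting alone cannot free the pair: each object still
# holds a reference to the other.
alive_before = probe() is not None

gc.collect()  # run the cycle detector explicitly
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

If the second value printed is False, the cycle detector has reclaimed the pair, which is the behavior described in the quoted reply for classes without __del__.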
> __del__ implementation, as I mentioned above. This is a general problem with garbage collection. If you have two objects which refer to each other and which each wish to perform some finalization, which finalizer do you call first?
Why would it matter which one you call first? Just pick one at random, no? Better to call them in random order than to leak memory, no? At the very least Python should emit a gigantic warning that there's a leaking cross reference, instead of silently not calling __del__.
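On the question of getting some kind of notice: the gc module exposes a list, gc.garbage, where the collector places objects it found unreachable but could not free. The sketch below is only an illustration, and its outcome depends on the interpreter version: to my knowledge, interpreters of the era of this thread leave such a pair in gc.garbage without calling __del__, while CPython 3.4 and later call the finalizers and free the cycle. The names Noisy, finalized and uncollectable are hypothetical.

```python
import gc

finalized = []  # names of objects whose __del__ actually ran


class Noisy:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    a, b = Noisy("a"), Noisy("b")
    a.other, b.other = b, a  # cross references, both with __del__


gc.disable()  # avoid an automatic collection in the middle
make_cycle()
gc.collect()
gc.enable()

# Objects the collector gave up on, if any, are parked in gc.garbage.
uncollectable = [o for o in gc.garbage if isinstance(o, Noisy)]

print("finalized:", sorted(finalized))
print("left in gc.garbage:", len(uncollectable))
```

Checking gc.garbage periodically in a long-running server is one way to notice this kind of leak without relying on RSS alone.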
> You might be surprised :) These things tend to build up, if your process is long-running.
I think you're right that there was no memleak generated by self/cross cyclic references, but the load is pretty low at the moment, so I could have overlooked it. I periodically monitor the RSS of all tasks. I never had problems before noticing the reconnectingclientfactory memleak (which, btw, I can't reproduce anymore after removing the cross references).
> (You can probably guess what I'm going to say here. ;) In general, I avoid implementing __del__. My programs may end up with cycles, but as long as I don't have __del__, Python can figure out how to free the objects. Note that it does sometimes take it a while (and this has implications for peak memory usage which may be important to you), but if you find a case that it doesn't handle, then you've probably found a bug in the GC that python-dev will fix.
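The "it does sometimes take it a while" part can be made visible with the gc module. This is a rough sketch, assuming CPython; Node and make_garbage are illustrative names, and the exact count returned by gc.collect() varies between interpreter versions.

```python
import gc


class Node:
    """No __del__, so cycles of these are collectable."""
    pass


def make_garbage(pairs):
    # Build cross-referenced pairs and drop every outside reference.
    for _ in range(pairs):
        a, b = Node(), Node()
        a.other, b.other = b, a


# Allocation thresholds that trigger automatic collections.
print("thresholds:", gc.get_threshold())

gc.disable()   # simulate the window before the collector runs
gc.collect()   # start from a clean slate
make_garbage(1000)

# Until a collection happens, the dead pairs keep occupying memory;
# this is where the peak memory usage comes from.
found = gc.collect()  # returns the number of unreachable objects found
gc.enable()

print("unreachable objects reclaimed:", found)
```

Calling gc.collect() at a convenient point (say, after a disconnect) is one way to bound peak memory if the automatic schedule turns out to be too lazy for a given workload.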
> Hope this helps, and happy holidays,
Thanks a lot, things look much better now. I'm relieved that Python can figure out how to free objects; I always thought it was able to do so, in fact ;). Happy holidays to you too.
So, I'll back out all my latest changes and try to find the real cause of the reconnectingclientfactory memleak, which definitely happened even though there was no __del__ method implemented.