On Mon, 3 Jun 2002, Tim Peters wrote:
I didn't count it as a strike against the patch -- I had just hoped that untracking tuples would result in faster execution than turning GC off and letting my heap swell obscenely.
Kevin, keep in mind that we haven't run your app: the only things we know about it are what you've told us. That your heap swells obscenely when gc is off is news to me. Does your heap swell obscenely when turning gc off, but does not swell obscenely if you keep gc on? If so, you should keep in mind too that the best way to speed gc is not to create cyclic trash to begin with.
This part of my app allocates a mix of several kinds of objects. Some are cyclic tuple trees, many are acyclic tuples, and others are complex class instances. To make things worse, the object lifetimes vary from ephemeral to very long-lived. Due to this mix, turning GC off does cause the heap to swell, and we have to very carefully monitor the algorithm to relieve the swelling frequently enough to prevent eating too much system memory, but infrequently enough that we don't spend all of our time in the GC tracking through the many acyclic tuples. Also, due to the dynamic nature of the algorithm, it is not trivial to avoid creating cyclic trash.
One extreme case could happend if I turn off GC, run my code, and let it fill all of my real memory with tuples, and start swapping to disk. Clearly, keeping GC enabled with the tuple untracking patch would result in huge performance gains.
Sorry, this isn't clear at all. If your tuples aren't actually in cycles, then whether GC is on or off is irrelevant to how long they live, and to how much memory they consume. It doesn't cost any extra memory (not even one byte) for a tuple to live in a gc list; on the flip side, no memory is saved by Neil's patch.
But I do generate cyclic trash, and it does build up when I turn off GC. So, if it runs long enough with GC turned off, the machine will run out of real memory. This is all that I am saying.
This is not the situation I was dealing with, though I was hoping for a relatively smaller improvement from having a more compact and (hopefully) less fragmented heap.
Neil's patch should have no effect on memory use or fragmentation. It's only aiming at reducing the time spent uselessly scanning and rescanning and rescanning and ... tuples in bad cases. So long as your tuples stay alive, they're going to consume the same amount of memory with or without the patch, and whether or not you disable gc.
With Neil's patch and GC turned ON:
Python untracks many, many tuples that give for several generations that would otherwise have to scanned to dispose of the accumulating cyclic garbage. Some GC overhead is observed, due to normal periodic scanning and the extra work needed to untrack acyclic tuples.
without Neil's patch and automatic GC turned OFF:
Python does not spend any time scanning for garbage, and things are very fast until we run out of core memory, or the patterns of allocations result in an extremely fragmented heap due to interleaved allocation of objects with very heterogeneous lifetimes. At certain intervals, GC would be triggered manually to release the cyclic trash.
My hope was that the more compact and hopefully less fragmented heap would more than offset the overhead of automatic GC scans and the tuple untracking sweep.
Hopefully my comments are now a little clearer...
-- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: email@example.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com