
On Tue, Dec 27, 2005 at 01:47:42PM +1100, Christopher Armstrong wrote:
- gc.collect() is the equivalent of what's run occasionally by
IMHO the "occasionally" is the only wrong thing in the whole story. It must not be "occasionally", it must be "occasionally _or_ when the task is growing to an insane size". The total amount of anonymous memory allocated by the interpreter should be tracked in O(1), and a collect() should be invoked at least every time that amount doubles.

The reason it took me so long to suspect the gc is that, coming from a vm kernel background, I couldn't even dream that after the task had grown to >1G and the system was into swap, the python gc still hadn't tried to prune all potentially freeable objects.

The gc should definitely be a function of "size" too, and currently it's not. There is definitely room for improvement in the gc by adding heuristics based on the "size of anonymous memory allocated". It doesn't seem difficult to add, nor should it hurt performance, since the size trigger would only rarely cause extra runs of the deep gc.collect() (the only costly part).
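
To make the "collect every time memory doubles" idea concrete, here is a rough sketch in Python. It is only an illustration of the heuristic, not how the interpreter does or would implement it. The class name SizeTriggeredCollector, the measure callable and the min_bytes floor are all hypothetical; a real version would live inside the allocator and read a byte counter the interpreter maintains itself.

```python
import gc


class SizeTriggeredCollector:
    """Run a full collection whenever tracked memory has doubled
    since the previous collection (hypothetical helper)."""

    def __init__(self, measure, collect=gc.collect, min_bytes=1 << 20):
        # measure: assumed callable returning currently allocated bytes in O(1)
        self._measure = measure
        self._collect = collect
        # Floor so that a tiny heap does not trigger collections constantly.
        self._min_bytes = min_bytes
        self._baseline = max(measure(), min_bytes)
        self.collections = 0

    def check(self):
        """Call periodically; returns True if a collection was run."""
        if self._measure() < 2 * self._baseline:
            return False
        self._collect()
        self.collections += 1
        # Re-measure after collecting: the next trigger is twice what survived.
        self._baseline = max(self._measure(), self._min_bytes)
        return True
```

Because the threshold doubles along with the surviving heap, the number of size-triggered collections grows only logarithmically with the task size, which is why the extra cost should stay small.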