Calling the GC less often when there are lots of long-lived objects
Hello,

There are recurring complaints about the garbage collector degrading performance when lots of objects are created in a row. In issue #4074, I've proposed a patch which basically implements Martin's suggestion in http://mail.python.org/pipermail/python-dev/2008-June/080579.html to base the decision to do a full collection on the ratio between the number of objects surviving the (n-1) generation collection and the number of long-lived objects.

I've also added a condition so that this new behaviour is only triggered when there are more than 10000 long-lived objects -- therefore, cycles will still get collected quickly in lightweight programs.

In Gregory's simple test of storing many tuples in a list, the behaviour has indeed changed from exponential to linear.

Is anybody opposed to the principle of this proposal?

Antoine.
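For readers following along, the decision rule described above could be sketched roughly as below. This is only an illustration in Python, not the patch itself (the real change lives in the C collector). The class and method names are hypothetical, and the 0.25 ratio is a placeholder value that the message does not state; only the 10000 cut-off comes from the message.

```python
from dataclasses import dataclass


@dataclass
class FullCollectionPolicy:
    """Hypothetical model of the proposed heuristic, not the actual patch."""

    ratio: float = 0.25          # ASSUMPTION: placeholder, not given in the message
    min_long_lived: int = 10000  # cut-off mentioned in the message
    long_lived_total: int = 0    # objects that survived the last full collection
    long_lived_pending: int = 0  # survivors of (n-1) collections since then

    def record_middle_collection(self, survivors: int) -> None:
        # Objects surviving the (n-1) generation are waiting for a full pass.
        self.long_lived_pending += survivors

    def allows_full_collection(self) -> bool:
        # With few long-lived objects the ratio test never vetoes, so the
        # existing counter-based policy decides alone.
        if self.long_lived_total <= self.min_long_lived:
            return True
        # Otherwise allow a full pass only once enough new survivors have
        # accumulated relative to the long-lived population.
        return self.long_lived_pending > self.long_lived_total * self.ratio

    def record_full_collection(self, survivors: int) -> None:
        self.long_lived_total = survivors
        self.long_lived_pending = 0
```

The idea is that the cost of a full collection grows with the number of long-lived objects, so the work done between two full collections should grow with it as well.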
Antoine Pitrou wrote:
I've proposed a patch which basically implements Martin's suggestion in http://mail.python.org/pipermail/python-dev/2008-June/080579.html
Is anybody opposed to the principle of this proposal?
Sounds okay to me. -- Greg
Antoine Pitrou schrieb:
Is anybody opposed to the principle of this proposal?
Is it reasonable to implement multiple policies so the user can switch between them? Or is the new algorithm superior in all cases?
Christian Heimes <lists <at> cheimes.de> writes:
Is it reasonable to implement multiple policies so the user can switch between them? Or is the new algorithm superior in all cases?
We could let the user configure the threshold between the old policy and the new policy. Currently it is hard-wired to a value of 10000 (that is, 10000 long-lived objects tracked by the GC).
On Tue, Dec 16, 2008 at 8:00 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Christian Heimes <lists <at> cheimes.de> writes:
Is it reasonable to implement multiple policies so the user can switch between them? Or is the new algorithm superior in all cases?
I'll test your patch, as I currently have to micro-manage the garbage collector in several of my algorithms or else they degenerate into almost continuous collection. Results in a day or two. ~Kevin
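As an illustration of the kind of micro-management mentioned above, one common workaround is to pause the collector around a bulk allocation. This is a generic sketch using the standard `gc` module, not necessarily what Kevin's code does; `gc_paused` and `build_pairs` are hypothetical names.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Temporarily disable the cyclic GC, restoring the previous state on exit."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Only re-enable if the collector was on before we touched it.
        if was_enabled:
            gc.enable()


def build_pairs(n):
    # Many GC-tracked containers are created in a row here; pausing the
    # collector avoids repeated collections that would find nothing to free.
    with gc_paused():
        return [(i, [i]) for i in range(n)]
```

Reference counting still frees acyclic garbage while the collector is paused; only cycle detection is deferred.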
Antoine Pitrou <solipsis <at> pitrou.net> writes:
We could let the user configure the threshold between the old policy and the new policy. Currently it is hard-wired to a value of 10000 (that is, 10000 long-lived objects tracked by the GC).
I've removed the threshold in the latest patches because it didn't make much sense when a few long-lived objects contained a lot of objects not tracked by the GC.

Another improvement I've included in the latest patches (but which is orthogonal to the algorithmic change) is that simple tuples and even simple dicts are not tracked by the GC if they don't need to be. A few examples (gc.is_tracked() is a new function which returns True if an object is tracked by the GC):
>>> import gc
>>> gc.is_tracked(())
False
>>> gc.is_tracked((1,2))
False
>>> gc.is_tracked((1,(2, "a", None)))
False
>>> gc.is_tracked((1,(2, "a", None, {})))
True

>>> d = {}
>>> gc.is_tracked(d)
False
>>> d[1,2] = 3,4
>>> gc.is_tracked(d)
False
>>> d[5] = None, "a", (1,2,3)
>>> gc.is_tracked(d)
False
>>> d[6] = {}
>>> gc.is_tracked(d)
True
>>> gc.is_tracked(d[6])
False
Regards Antoine.
I've removed the threshold in the latest patches because it didn't make much sense when a few long-lived objects contained a lot of objects not tracked by the GC.
Another improvement I've included in the latest patches (but which is orthogonal to the algorithmic change) is that simple tuples and even simple dicts are not tracked by the GC if they don't need to be. A few examples (gc.is_tracked() is a new function which returns True if an object is tracked by the GC):
As they are orthogonal, I think they should be considered separately, but in particular committed separately. FWIW, I'm in favor of both (but haven't reviewed the non-cyclic tuples one yet). So despite the organizational overhead, I'd appreciate if you could create separate patches, if not separate issues. Regards, Martin
participants (5)
- "Martin v. Löwis"
- Antoine Pitrou
- Christian Heimes
- Greg Ewing
- Kevin Jacobs <jacobs@bioinformed.com>