Hi! I was thinking: why not let Python's gc break cycles that contain only one object with a __del__ method? I don't see a problem with calling the __del__ method and then proceeding as usual (breaking the cycle if it wasn't already broken by __del__). Many thanks, Yoav Glazner
On Mon, Sep 13, 2010 at 8:09 AM, yoav glazner <yoavglazner@gmail.com> wrote:
why not let python gc break cycles with only one object.__del__ ?
If you can point to the code that prevents this, please report a bug. The last time I checked, there were proposals to either add a __close__ or weaken __del__ to handle multi-__del__ cycles -- but single-__del__ cycles were already handled OK. -jJ
On Mon, 13 Sep 2010 12:16:36 -0400 Jim Jewett <jimjjewett@gmail.com> wrote:
The last time I checked, there were proposals to either add a __close__ or weaken __del__ to handle multi-__del__ cycles -- but single-__del__ cycles were already handled OK.
They aren't:
>>> class C(list):
...     def __del__(self): pass
...
>>> c = C()
>>> c.append(c)
>>> del c
>>> import gc
>>> gc.collect()
1
>>> gc.garbage
[[[...]]]
>>> type(gc.garbage[0])
<class '__main__.C'>
[Jim Jewett]
The last time I checked ... single-__del__ cycles were already handled OK.
[Antoine Pitrou]
They aren't: ...
Antoine's right, unless things have changed dramatically since last time I was intimate with that code. CPython's "cyclic garbage detection" makes no attempt to analyze cycle structure. It infers that all trash it sees must be in cycles simply because the trash hasn't already been collected by the regular refcount-based gc. The presence of __del__ on a trash object then disqualifies it from further analysis, but there's no analysis of cycle structure regardless. Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
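The behaviour described above can be checked with a small script. This is only a sketch: the helper name `self_cycle_outcome` and the `events` list are made up for illustration, and the result depends on the interpreter. On the interpreters discussed in this thread the object is expected to end up in gc.garbage, as in Antoine's session; on CPython 3.4 and later (PEP 442) I would expect the finalizer to run instead.

```python
import gc


def self_cycle_outcome():
    """Build a one-object cycle whose only member has __del__, then collect.

    Hypothetical helper: it reports what happened rather than assuming
    a particular interpreter behaviour.
    """
    events = []

    class C(list):
        def __del__(self):
            events.append("finalized")

    c = C()
    c.append(c)   # the object now refers to itself
    del c         # drop the only outside reference
    gc.collect()

    # Objects the collector gave up on are parked in gc.garbage.
    stuck = [o for o in gc.garbage if isinstance(o, C)]
    return {"finalizer_ran": bool(events), "in_garbage": bool(stuck)}


print(self_cycle_outcome())
```

Exactly one of the two flags should be true: either the collector called __del__ and freed the object, or it declined to and kept the object alive in gc.garbage.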
Tim Peters <tim.peters@...> writes:
Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
And hopefully no one will. That would be very brittle.
On Mon, 13 Sep 2010 19:22:02 +0000 (UTC) Benjamin <benjamin@python.org> wrote:
Tim Peters <tim.peters@...> writes:
Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
And hopefully no one will. That would be very brittle.
Why would it be?
Antoine Pitrou <solipsis@...> writes:
On Mon, 13 Sep 2010 19:22:02 +0000 (UTC) Benjamin <benjamin@...> wrote:
Tim Peters <tim.peters@...> writes:
Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
And hopefully no one will. That would be very brittle.
Why would it be?
Because if your cycle suddenly had more than one __del__, it would stop being collected.
On 13 September 2010 20:22, Benjamin <benjamin@python.org> wrote:
Tim Peters <tim.peters@...> writes:
Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
And hopefully no one will. That would be very brittle.
More brittle than what PyPy, IronPython and (presumably) Jython do? (Which is to make cycles collectable by arbitrarily breaking them, IIUC.) Michael
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On Tue, Sep 14, 2010 at 3:25 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Jim Jewett]
The last time I checked ... single-__del__ cycles were already handled OK.
[Antoine Pitrou]
They aren't: ...
Antoine's right, unless things have changed dramatically since last time I was intimate with that code. CPython's "cyclic garbage detection" makes no attempt to analyze cycle structure. It infers that all trash it sees must be in cycles simply because the trash hasn't already been collected by the regular refcount-based gc. The presence of __del__ on a trash object then disqualifies it from further analysis, but there's no analysis of cycle structure regardless.
I had a skim through that code last night, and as far as I can tell it still works that way. However, it should be noted that the cyclic GC actually does release everything *else* in the cycle - it's solely the objects with __del__ methods that remain alive. There does appear to be a *little* bit of structural analysis going on - it looks like the "finalizers" list ends up containing both objects with __del__ methods and all other objects in the cyclic trash that are reachable from the objects with __del__ methods.
Of course it doesn't _have_ to be that way. Nobody cared enough yet to add a pile of new code to special-case cycles with a single __del__.
Just from skimming the code, I wonder if, once finalizers has been figured out, the GC could further partition that list into "to_delete" (no __del__ method), "to_finalize" (__del__ method, but all referrers in cycle have no __del__ method) and "uncollectable" (multiple __del__ methods in cycle).

Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods. Objects that appear only in the first list could safely have their finalizers invoked, while those that also appear in the latter could not.

This is definitely a case of "code talks" though - there's no fundamental problem with the idea, but also no great incentive for anyone to code it when __del__ is comparatively easy to avoid (although not trivial, see Raymond's recent modifications to OrderedDict to avoid exactly this issue).

Or, accept that __del__ is evil, and try to come up with a workable proposal for that better weakref callback based scheme Jim mentioned.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods.
But since it's a cycle, isn't *everything* in the cycle going to be reachable from everything else? -- Greg
[Nick Coghlan]
Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods.
[Greg Ewing]
But since it's a cycle, isn't *everything* in the cycle going to be reachable from everything else?
Note that I was sloppy in saying that CPython's cyclic gc only sees trash objects in cycles. More accurately, it sees trash objects in cycles, and objects (which may or may not be in cycles) reachable only from trash objects in cycles. For example, if objects A and B point to each other, that's a cycle. If A also happens to point to D, where D has a __del__ method, and nothing else points to D, then that's a case where D is not in a cycle, but is nevertheless trash if A and B are trash. And if A and B lack finalizers, then CPython's cyclic gc will reclaim D, despite that it does have a __del__. That pattern is exploitable too. If, e.g., you have some resource R that needs to be cleaned up, owned by an object A that may participate in cycles, it's often possible to put R in a different, very simple object with a __del__ method, and have A point to that latter object instead.
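A minimal sketch of the pattern Tim describes, under stated assumptions: `ResourceHolder`, `Node`, `demo` and the `log` list are hypothetical names invented for this illustration, and the "resource" is just a string. The object that can end up in cycles has no __del__; the cleanup lives in a small separate object that refers to nothing in the cycle.

```python
import gc


class ResourceHolder:
    """Hypothetical leaf object: owns the resource and refers to no cycle."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        # Stand-in for real cleanup such as closing a handle.
        self.log.append("released " + self.name)


class Node:
    """Hypothetical object that may take part in cycles; it has no __del__."""

    def __init__(self, name, log):
        self.holder = ResourceHolder(name, log)
        self.peer = None


def demo():
    log = []
    a, b = Node("R", log), Node("S", log)
    a.peer, b.peer = b, a   # a and b now form a cycle
    del a, b                # only the cycle keeps them alive
    gc.collect()            # the cycle is reclaimed, which frees the holders
    return log


print(sorted(demo()))  # ['released R', 'released S']
```

The holders are reachable only from the trash cycle, so they are released along with it, and the order in which the two finalizers run is not something this sketch relies on.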
On Mon, Sep 13, 2010 at 8:04 PM, Tim Peters <tim.peters@gmail.com> wrote:
[Nick Coghlan]
Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods.
[Greg Ewing]
But since it's a cycle, isn't *everything* in the cycle going to be reachable from everything else?
Note that I was sloppy in saying that CPython's cyclic gc only sees trash objects in cycles. More accurately, it sees trash objects in cycles, and objects (which may or may not be in cycles) reachable only from trash objects in cycles. For example, if objects A and B point to each other, that's a cycle. If A also happens to point to D, where D has a __del__ method, and nothing else points to D, then that's a case where D is not in a cycle, but is nevertheless trash if A and B are trash. And if A and B lack finalizers, then CPython's cyclic gc will reclaim D, despite that it does have a __del__.
That pattern is exploitable too. If, e.g., you have some resource R that needs to be cleaned up, owned by an object A that may participate in cycles, it's often possible to put R in a different, very simple object with a __del__ method, and have A point to that latter object instead.
Yeah, I think we even recommended this pattern at some point. ISTR we designed the new io library to exploit it. -- --Guido van Rossum (python.org/~guido)
On 9/13/2010 11:07 PM, Guido van Rossum wrote:
On Mon, Sep 13, 2010 at 8:04 PM, Tim Peters <tim.peters@gmail.com> wrote:
[Nick Coghlan]
Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods.
[Greg Ewing]
But since it's a cycle, isn't *everything* in the cycle going to be reachable from everything else?
That pattern is exploitable too. If, e.g., you have some resource R that needs to be cleaned up, owned by an object A that may participate in cycles, it's often possible to put R in a different, very simple object with a __del__ method, and have A point to that latter object instead.
Yeah, I think we even recommended this pattern at some point. ISTR we designed the new io library to exploit it.
Yes, this topic came up some while back on this list and Tim's solution is exactly the design pattern I suggested then: http://mail.python.org/pipermail/python-ideas/2009-October/006222.html -- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
On Tue, Sep 14, 2010 at 12:44 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
Alternatively, when building finalizers, build two lists: one for objects with __del__ methods and one for objects that are reachable from objects with __del__ methods.
But since it's a cycle, isn't *everything* in the cycle going to be reachable from everything else?
In addition to what Tim said, there may be more than one cycle being collected. So you can have situations like objects A, B, C in one cycle and D, E, F in a different cycle. Suppose A, B and D all have __del__ methods. Then your two lists would be:

__del__ method: A, B, D
Reachable from objects with __del__ method: A, B, C, E, F

It's just another way of viewing what the OP described: cycles containing only a single object with __del__ don't actually have an ordering problem, so you can just call it before you destroy any of the objects.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
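Nick's A..F example can be worked through on a toy graph. The sketch below is not CPython's collector. The function `partition`, the dict-based graph, the edge directions (A -> B -> C -> A and D -> E -> F -> D) and the rule that an object does not count as reachable from itself are all assumptions, chosen so that the result matches the two lists given in the message.

```python
def partition(graph, has_del):
    """Split trash objects along the lines of the two-list proposal.

    graph:   dict mapping each object name to the names it refers to
    has_del: set of names whose objects define __del__
    Returns (with_del, reachable_from_del, safe_to_finalize) as sorted lists.
    """

    def reachable_from(start):
        # Depth-first walk over outgoing references.
        seen, stack = set(), list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        return seen

    reachable = set()
    for node in has_del:
        # Assumption: reaching yourself around your own cycle does not count.
        reachable |= reachable_from(node) - {node}

    safe = has_del - reachable
    return sorted(has_del), sorted(reachable), sorted(safe)


graph = {
    "A": ["B"], "B": ["C"], "C": ["A"],   # first cycle
    "D": ["E"], "E": ["F"], "F": ["D"],   # second cycle
}
print(partition(graph, {"A", "B", "D"}))
# (['A', 'B', 'D'], ['A', 'B', 'C', 'E', 'F'], ['D'])
```

Only D appears in the first list and not the second, so under this scheme D is the one object whose finalizer could be called without an ordering problem, while A and B would remain uncollectable.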
participants (11)
- Antoine Pitrou
- Benjamin
- Benjamin Peterson
- Greg Ewing
- Guido van Rossum
- Jim Jewett
- Michael Foord
- Nick Coghlan
- Scott Dial
- Tim Peters
- yoav glazner