[issue9141] Allow objects to decide if they can be collected by GC

Daniel Stutzbach report at bugs.python.org
Sat Jul 10 02:33:35 CEST 2010


Daniel Stutzbach <daniel at stutzbachenterprises.com> added the comment:

2010/7/9 Kristján Valur Jónsson <report at bugs.python.org>

> Your message was classified as spam, I have no idea why, but this is why I
> only noticed it now.
>

Yes, I just noticed that tonight as well.  I filed a bug on the meta-tracker
in the hopes that someone can dig into the underlying cause.

> Your suggestion is interesting, I hadn't thought of that.  Yes, it is
> possible to use the track/untrack functions (I think), but that would mean
> that you would have to monitor your object for every state change and
> reliably detect the transition from one state to another.  Being able to
> query the current state and let gc know:  "No, I cannot be collected as I am
> now" is a much more robust solution from the programmer's perspective.
>
> A further difference is this:  If an object isn't tracked, it won't be
> collected if it is part of a cycle, but it will not be put in gc.garbage
> either.  In effect, it will just remain in memory, unreachable, with no
> chance of it ever being released.
>

Yes, I see.  I have used the track/untrack approach, but it was in a very
different situation.  I have a long-running C function which keeps alive a
large number of objects.  At the start of the function, I untrack them all
as a performance optimization, so the garbage collector does not have to
spend time traversing them.  Before the function returns, I track them
again.
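
For illustration, the untrack/retrack pattern looks roughly like the sketch
below. My real code is C; this version is Python only so that it is easy to
run, and it reaches PyObject_GC_UnTrack and PyObject_GC_Track through ctypes,
so it assumes CPython. The name long_running and the summing "work" are
hypothetical placeholders, not the actual function.

```python
import ctypes
import gc

# CPython-only assumption: call the real C API functions through ctypes.
_untrack = ctypes.pythonapi.PyObject_GC_UnTrack
_untrack.argtypes = [ctypes.py_object]
_untrack.restype = None
_track = ctypes.pythonapi.PyObject_GC_Track
_track.argtypes = [ctypes.py_object]
_track.restype = None


def long_running(objects):
    """Hypothetical stand-in for the long-running C function."""
    # Untrack only what is currently tracked, and remember exactly which
    # objects those were: PyObject_GC_Track must not be called on an
    # object that is already tracked.
    untracked = [o for o in objects if gc.is_tracked(o)]
    for o in untracked:
        _untrack(o)
    try:
        # Placeholder for the real work; the collector no longer
        # traverses the untracked objects while this runs.
        return sum(len(o) for o in objects)
    finally:
        # Put the objects back under GC control before returning.
        for o in untracked:
            _track(o)


objects = [[i, i + 1] for i in range(1000)]
total = long_running(objects)
```

The try/finally matters: if the objects were left untracked after an error,
any cycle through them could never be collected.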

I see now why that wouldn't work for your use-case.  Thank you.  I like the
idea of your patch: it gives objects the opportunity to tell the GC, on the
fly, that they are uncollectable.

I'm concerned about the performance impact of making tp_traverse do
double-duty, though.  Calling tp_traverse for every object in a cycle will
have the effect of making an extra pass on every reference from every object
participating in the cycle.  For example, consider a large list that's part
of a cycle.  If we call the list's tp_traverse to establish whether
it's collectable, the list's tp_traverse will call visit() on every item in the
list.  Even though you've made visit() a do-nothing function, that's still a
function call per reference.  It seems a shame to do all of that work
unnecessarily.
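
To make the cost concrete, here is a toy model in Python. It is not the
interpreter's code: ListLike and its traverse method are hypothetical
stand-ins that only mimic the shape of a C tp_traverse, and the visit
callback counts its calls purely so the number can be inspected.

```python
class ListLike:
    """Toy stand-in for a list with a tp_traverse-style method."""

    def __init__(self, items):
        self.items = items

    def traverse(self, visit, arg):
        # Same shape as a C tp_traverse: call visit() on every referenced
        # object and stop early if visit() returns a nonzero value.
        for item in self.items:
            result = visit(item, arg)
            if result:
                return result
        return 0


calls = 0


def counting_visit(obj, arg):
    """Does no useful work; it only counts how often it is called."""
    global calls
    calls += 1
    return 0


big = ListLike(list(range(100_000)))
big.traverse(counting_visit, None)
print(calls)  # one visit() call per item in the list
```

Even with a visit() that does nothing useful, the extra pass costs one call
per reference, and that is the overhead I am worried about.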

----------
Added file: http://bugs.python.org/file17925/unnamed

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9141>
_______________________________________

