[Python-Dev] gcmodule issue w/adding __del__ to generator objects

Phillip J. Eby pje at telecommunity.com
Sun Jun 19 04:15:54 CEST 2005


At 06:50 PM 6/18/2005 -0600, Neil Schemenauer wrote:
>On Sat, Jun 18, 2005 at 06:24:48PM -0400, Phillip J. Eby wrote:
> > So, I think I've got this sorted out, assuming that I'm not doing
> > something hideously insane by having 'has_finalizer()' always
> > check tp_del even for non-heap types, and defining a tp_del slot
> > for generators to call close() in.
>
>That sounds like the right thing to do.
>
>I suspect the "uncollectable cycles" problem will not be completely
>solvable.  With this change, all generators become objects with
>finalizers.  In reality, a 'file' object, for example, has a
>finalizer as well but it gets away without telling the GC that
>because its finalizer doesn't do anything "evil".  Since generators
>can do arbitrary things, the GC must assume the worst.

Yep.  It's too bad that there's no simple way to guarantee that the 
generator won't resurrect anything.  On the other hand, close() is 
guaranteed to give the generator at most one chance to do this.  So, 
perhaps there's some way we could have the GC close() generators in 
unreachable cycles.  No, wait, that would mean they could resurrect things, 
right?  Argh.
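
To make the "at most one chance" point concrete, here is a minimal sketch in present-day Python. The names tracked() and demo() are made up for illustration. It shows the cleanup code running once no matter how often close() is called, and also that the cleanup is arbitrary Python code, which is why the GC cannot assume it is harmless.

```python
def tracked(log):
    """Generator whose cleanup records itself in the caller's log."""
    try:
        yield "working"
    finally:
        # Arbitrary code can run here, including code that stores
        # references somewhere still reachable.
        log.append("cleanup")


def demo():
    log = []
    gen = tracked(log)
    next(gen)    # suspend the generator inside the try block
    gen.close()  # raises GeneratorExit at the yield; the finally block runs
    gen.close()  # the generator is already finished, so this does nothing
    del gen      # nothing is left for deallocation to run
    return log


if __name__ == "__main__":
    print(demo())
```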


>Most cycles involving enhanced generators can probably be broken by
>the GC because the generator is not in the strongly connected part
>of the cycle.  The GC will have to work a little harder to figure that
>out but that's probably not too significant.

Yep; by setting the generator's frame to None, I was able to significantly 
reduce the number of generator cycles in the tests.
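
As an illustration of the frame being dropped, the sketch below checks the gi_frame attribute as it behaves in current CPython, which may differ in detail from the patch discussed here. counter() and demo() are hypothetical names.

```python
def counter(limit):
    for i in range(limit):
        yield i


def demo():
    gen = counter(2)
    had_frame = gen.gi_frame is not None  # a live generator owns a frame
    list(gen)                             # run the generator to completion
    frame_gone = gen.gi_frame is None     # a finished generator has none
    return had_frame, frame_gone


if __name__ == "__main__":
    print(demo())
```

Once the frame is gone, the generator no longer refers to its locals, so it cannot keep a cycle alive through them.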


>The real problem is that some cycles involving enhanced generators
>will not be breakable by the GC.  I think some programs that used to
>work okay are now going to start leaking memory because objects will
>accumulate in gc.garbage.

Yep, unless we .close() generators after adding them to gc.garbage, which 
*might* be an option.  Although, I suppose if it *were* an option, then why 
doesn't GC already have some sort of ability to do this?  (i.e. run __del__ 
methods on items in gc.garbage, then remove them if their refcount drops to 
1 as a result).

[...pause to spend 5 minutes working it out in pseudocode...]

Okay, I think I see why you can't do it.  You could guarantee that all 
relevant __del__ methods get called, but it's bloody difficult to end up 
with only unreachable items in gc.garbage afterwards.   I think gc would 
have to keep a new list for items reachable from finalizers, that don't 
themselves have finalizers.  Then, before creating gc.garbage, you walk the 
finalizers and call their finalization (__del__) methods.  Then, you put 
any remaining items that are in either the finalizer list or the 
reachable-from-finalizers list into gc.garbage.
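
The scheme above can be sketched as a toy model in Python. This is not how gcmodule.c works. Node, reachable_from(), survivors() and collect() are invented for illustration, refcounting is only imitated, and the model assumes finalizers never create new references (that is, it ignores resurrection).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)  # identity-based equality, like real objects
class Node:
    name: str
    refs: List["Node"] = field(default_factory=list)
    finalizer: Optional[Callable[["Node"], None]] = None


def reachable_from(roots):
    """Return every node reachable from roots (roots included)."""
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(node.refs)
    return seen


def survivors(candidates):
    """Imitate refcounting: keep freeing nodes that nothing alive refers to."""
    alive = set(candidates)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if not any(node in other.refs for other in alive):
                alive.discard(node)
                changed = True
    return alive


def collect(unreachable):
    """Finalize first, then report what would still land in gc.garbage."""
    finalizers = [n for n in unreachable if n.finalizer is not None]
    # The proposed extra list: reachable from a finalizer, no finalizer itself.
    tainted = [n for n in reachable_from(finalizers)
               if n.finalizer is None and n in unreachable]
    for node in finalizers:
        node.finalizer(node)  # each finalizer is called exactly once
    return sorted(n.name for n in survivors(finalizers + tainted))


def demo():
    # Case 1: a generator-like node whose finalizer drops its references.
    owner = Node("owner")
    gen = Node("gen", refs=[owner], finalizer=lambda n: n.refs.clear())
    owner.refs.append(gen)
    closed = collect([owner, gen])

    # Case 2: a finalizer that leaves the cycle intact.
    a = Node("a", finalizer=lambda n: None)
    b = Node("b", refs=[a])
    a.refs.append(b)
    stuck = collect([a, b])
    return closed, stuck


if __name__ == "__main__":
    print(demo())
```

In the first case nothing is left over; in the second the finalizer has run but both nodes still end up in the garbage list.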

This approach might need a new type slot, but it seems like it would let us 
guarantee that finalizers get called, even if the object ends up in garbage 
as a result.  In the case of generators, however, close() guarantees that 
the generator releases all its references, and so can no longer be part of 
a cycle.  Thus, it would guarantee eventual cleanup of all 
generators.  And, it would lift the general limitation on __del__ methods.
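
A small experiment along these lines, using present-day CPython and relying on its reference counting: Owner, worker(), owner_freed() and demo() are hypothetical names. With the cyclic collector switched off, an owner/generator cycle is only freed if the generator is closed first.

```python
import gc
import weakref


class Owner:
    """Hypothetical object that stores a generator referring back to it."""


def worker(owner):
    # The frame keeps 'owner' alive for as long as the generator is suspended.
    while True:
        yield


def owner_freed(close_first):
    owner = Owner()
    owner.gen = worker(owner)  # cycle: owner -> gen -> frame -> owner
    next(owner.gen)
    probe = weakref.ref(owner)
    if close_first:
        owner.gen.close()      # the frame is released, breaking the cycle
    del owner
    return probe() is None     # True if refcounting alone freed the owner


def demo():
    was_enabled = gc.isenabled()
    gc.disable()               # rule out help from the cyclic collector
    try:
        return owner_freed(close_first=False), owner_freed(close_first=True)
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(demo())
```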

Hm.  Sounds too good to be true.  Surely if this were possible, Uncle Timmy 
would've thought of it already, no?  Guess we'll have to wait and see what 
he thinks.


>Now, I could be wrong about all this.  I have not been following
>the PEP 343 discussion too closely.  Maybe Guido has some clever
>idea.  Also, I find it difficult to hold in my head a complete model
>of how the GC now works.  It's an incredibly subtle piece of code.
>Perhaps Tim can comment.

I'm hoping Uncle Timmy can work his usual algorithmic magic here and 
provide us with a brilliant but impossible-for-mere-mortals-to-understand 
solution.  (The impossible-to-understand part being optional, of course. :) )


