[Python-Dev] gcmodule issue w/adding __del__ to generator objects

Phillip J. Eby pje at telecommunity.com
Sun Jun 19 22:16:30 CEST 2005


At 10:15 PM 6/18/2005 -0400, Phillip J. Eby wrote:
>Okay, I think I see why you can't do it.  You could guarantee that all
>relevant __del__ methods get called, but it's bloody difficult to end up
>with only unreachable items in gc.garbage afterwards.   I think gc would
>have to keep a new list for items reachable from finalizers, that don't
>themselves have finalizers.  Then, before creating gc.garbage, you walk the
>finalizers and call their finalization (__del__) methods.  Then, you put
>any remaining items that are in either the finalizer list or the
>reachable-from-finalizers list into gc.garbage.
>
>This approach might need a new type slot, but it seems like it would let us
>guarantee that finalizers get called, even if the object ends up in garbage
>as a result.  In the case of generators, however, close() guarantees that
>the generator releases all its references, and so can no longer be part of
>a cycle.  Thus, it would guarantee eventual cleanup of all
>generators.  And, it would lift the general limitation on __del__ methods.
>
>Hm.  Sounds too good to be true.  Surely if this were possible, Uncle Timmy
>would've thought of it already, no?  Guess we'll have to wait and see what
>he thinks.

Or maybe not.  After sleeping on it, I realized that the problems are all 
in when and how often __del__ is called.  The idea I had above would end up 
calling __del__ twice on non-generator objects.  For generators it's not a 
problem because the first call ends up ensuring that the second call is a 
no-op.

However, the *order* of __del__ calls makes a difference, even for 
generators.  What good is a finally: clause if all the objects reachable 
from it have been finalized already, anyway?

Ultimately, I'm thinking that maybe we were right not to allow try-finally 
to cross yield boundaries in generators.  It doesn't seem like you can 
guarantee anything about the behavior in the presence of cycles, so what's 
the point?

For a while I played around with the idea that maybe we could still support 
'with:' in generators, though, because to implement that we could make 
frames call __exit__ on any pending 'with' blocks as part of their tp_clear 
operation.  This would only work, however, if the objects with __exit__ 
methods don't have any references back to the frame.  In essence, you'd 
need a way to put the __exit__ objects on a GC-managed list that wouldn't 
run until after all the tp_clear calls had finished.

But even that is tough to make guarantees about.  For example, can you 
guarantee in that case that a generator's 'with:' blocks are __exit__-ed in 
the proper order?

Really, if we do allow 'with' and 'try-finally' to surround yield, I think 
we're going to have to tell people that it only works if you use a with or 
try-finally in some non-generator code to ensure that the generator.close() 
gets called, and that if you end up creating a garbage cycle, we either 
have to let it end up in gc.garbage, or just not execute its finally clause 
or __exit__ methods.

Of course, this sort of happens right now for other things with __del__; if 
it's part of a cycle the __del__ method never gets called.  The only 
difference is that it hangs around in gc.garbage, doing nothing useful.  If 
it's garbage, it's not reachable from anywhere else, so it does nobody any 
good to have it around.  So, maybe we should just say, "sucks to be you" 
and tp_clear anything that we'd otherwise have put in gc.garbage.  :)

In other words, since we're not going to call those __del__ methods anyway, 
maybe it just needs to be part of the language semantics that __del__ isn't 
guaranteed to be called, and a garbage collector that can't find a safe way 
to call it, doesn't have to.  tp_dealloc for classic classes and heap types 
could then just skip calling __del__ if they've already been cleared... oh 
wait, how do you know you've been cleared?  Argh.  Another nice idea runs 
up on the rocks of reality.

On the other hand, if you go ahead and run __del__ after tp_clear, the 
__del__ method will quickly run afoul of an AttributeError and die with 
only a minor spew to sys.stderr, thus encouraging people to get rid of 
their silly useless __del__ methods on objects that normally end up in 
cycles.  :)



More information about the Python-Dev mailing list