GC and finalizers [was: No destructor]

Paul Duffin pduffin at hursley.ibm.com
Thu Aug 31 12:33:49 EDT 2000


Martin von Loewis wrote:
> 
> Paul Duffin <pduffin at hursley.ibm.com> writes:
> 
> > A little picture describing the cycle would be useful.
> 
> Maybe, yes. However, people participating in this thread should have
> no problems following a few lines of Python code.
> 

True enough, but as a visitor interested in learning from you lot,
I know very little Python.

> > The clean up sequence would have to be something more like.
> >
> >       __del__ is called for all objects in the cycle.
> 
> In what order? What if one invocation of __del__ restores the entire
> cycle back to life?
> 

That is a problem. You either detect it as soon as it happens and
stop, or you keep going and check afterwards. I think calling the
destructor on all objects makes the most sense.

Why would a Python object want to preserve itself? One possible
reason would be to store itself in a cache. Are there any others?

> >       If any objects now have external references then the garbage
> >       collection is over.
> 
> What if there are other objects living in cycles (but different
> cycles)?  Do you know an efficient algorithm (in time and space) that
> computes all objects in a cycle?
> 

By "any objects" I meant any objects in this cycle. I am not quite sure 
how to determine whether an object has acquired a reference external to
this cycle.
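
One possible way to tell, sketched here on a hand-built toy graph rather
than real interpreter state: take each object's total reference count
and subtract the references that come from other objects in the same
group. Whatever is left over must come from outside. The function name
`external_refs` and the two dictionaries are assumptions of this sketch.

```python
def external_refs(refcounts, edges):
    """Return {name: number of references coming from outside the group}.

    refcounts: total reference count of each object in the group
    edges:     the references each object holds to other objects
    Both are hand-built stand-ins for interpreter state.
    """
    remaining = dict(refcounts)
    for source, targets in edges.items():
        for target in targets:
            if target in remaining:
                remaining[target] -= 1   # discount a reference from inside
    return remaining


# a -> b -> c -> a is a cycle; b also has one reference from outside.
refcounts = {"a": 1, "b": 2, "c": 1}
edges = {"a": ["b"], "b": ["c"], "c": ["a"]}
result = external_refs(refcounts, edges)
```

Here only `b` is left with a non-zero count, so only `b` (and whatever
it reaches) is still in use from outside.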

If necessary, the dictionaries of these objects could be emptied
after all the finalizers have been called, which should eliminate all
possible sources of cycles and guarantee that the cycle is broken. At
this point it is easy to determine whether an object has a reference
from outside the cycle: simply check whether the object has any
references apart from the one held by the garbage collector (see
below).
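
A rough sketch of that check, assuming CPython. `sys.getrefcount`
includes some bookkeeping references of its own, so the sketch compares
against a fresh probe object instead of a fixed number. `Node`,
`clear_dicts` and `externally_referenced` are made-up names, and the
list `held` plays the part of the collector's own references.

```python
import sys


class Node:
    pass


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a          # two objects referring to each other
    return [a, b]


def clear_dicts(held):
    # Emptying the instance dictionaries removes every link in the cycle.
    for obj in held:
        obj.__dict__.clear()


def externally_referenced(held):
    # A probe object owned only by a list gives the count that an object
    # has when the "collector" list is its sole owner.
    probe = [Node()]
    baseline = [sys.getrefcount(o) for o in probe][0]
    return [sys.getrefcount(o) > baseline for o in held]


held = make_cycle()                  # the collector's references
outside = held[1]                    # one extra reference from outside
clear_dicts(held)
flags = externally_referenced(held)
```

Only the second object reports an extra reference. The cost is the one
noted in the next paragraph: both objects have lost their attributes.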

I am not sure whether this is satisfactory, as it basically clears all
objects in the cycle, making them practically useless.

> > h.next.item is obviously a reference to an object so assuming that it
> > is the only reference to the object when the dict is cleared the
> > reference will go and the resource object will be freed and should clean
> > itself up.
> 
> No, it is an external reference. h.next.item was meant as an integer
> object (so it is still an object, just not an instance object); releasing
> this integer will *not* release the resource.
> 

I am a little confused: is the integer the resource, or is the resource
something else which the integer magically refers to?

If the integer is the resource, then as long as the object releases its
reference there is no problem, or is there?

> [I'm having difficulties to parse the next sentence, please correct
>  me if I broke it up in the wrong spots.]
> 
> > Am I right in saying that because Python does not prevent external accesses
> > to an object that any object could be the cause of a cycle
> 
> It is true that any instance object could be the cause of a cycle,
> yes.  I don't know whether the ability to external access is the
> cause; I'd rather think it is not.
> 

What I meant was that allowing external access to the object
(specifically the dictionary) means that objects which do not normally
cause cycles could be made to cause cycles by an external party. Hence
every object would have to be written to cope with being the cause of a
cycle, instead of just those which were intended to cause cycles.

> > so every __del__ function would have to check that any external
> > objects it was referencing were still valid, i.e. not just deleted.
> 
> In current Python, no object has to check whether something it
> references is deleted - objects that are still referenced are never
> deleted.
> 

True. The point is that the object has not yet been deleted; rather, it
has been marked as deleted somehow.

> > Because the instance dictionaries would not be cleared until all
> > __del__ functions had been called then self.next.item exists when
> > Head.__del__ is called. It should probably also check that self.next
> > and self.next.item are valid anyway.
> 
> Calling all finalizers first before clearing any object causes
> different problems. How do you deal with reference counting then?
> Given a single finalizer
> 
> def __del__(self):
>    self.next = None
> 
> how do you deal with the DECREF of the old value of self.next? It used
> to be 1, before collection started; it then goes to zero. Normally,
> you should call the finalizer of the object now. However, you may have
> called that already, as self.next may have been found earlier in the
> cycle. What do you do?
> 

In a previous post (a long time ago) I covered this problem. Basically,
before the garbage collector starts cleaning up the objects in a cycle,
it creates a counted reference to each one. This prevents objects in
the cycle from being freed if the finalizer of one object breaks the
cycle.
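
A toy version of that idea, not how any real collector is written. The
method `finalize` stands in for the `__del__` in the quoted example, so
that the sketch controls when it runs, and `collect_cycle` is a made-up
name for the collector's pass over one cycle.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.finalized = 0       # how many times finalize() has run

    def finalize(self):
        # Stand-in for the quoted finalizer: drop the link to the neighbour.
        self.finalized += 1
        self.next = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.next, b.next = b, a
    return [a, b]


def collect_cycle(cycle):
    held = list(cycle)           # the collector's own counted references
    for obj in held:
        # This may drop the last link to a neighbour, but the neighbour
        # cannot be freed yet because `held` still refers to it.
        obj.finalize()
    return held                  # releasing this list frees the objects


survivors = collect_cycle(make_cycle())
counts = [node.finalized for node in survivors]
```

Each finalizer runs exactly once, and nothing is freed until the
collector lets go of its own references.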

Thinking about this problem hurts my head ;-)

The current solution, adding cycles which could not be cleaned up to a
global list, does ensure that an application can find those objects,
but what is it supposed to do with them?

If it is going to clean them up, then it will run into exactly the same
problems that the garbage collector does, especially if the objects
in the cycle were created in the depths of a module and the application
has no special knowledge about how to clean them up.

Is exiting a valid thing for a Python application to do in this
situation?

Maybe a compromise solution would be for objects which expect to be
part of a cycle to provide another method, e.g. __cycle__, which
is either a very careful replacement for __del__ or just tries
to break the cycle. If a cycle is detected which contains multiple
objects with finalizers and no __cycle__ methods, then the garbage
is added to the global list.
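
To show what I have in mind, a minimal sketch of that policy. Nothing
here exists in Python: `__cycle__` is only the name proposed above, and
`garbage`, `handle_cycle`, `Plain` and `Careful` are invented for the
example.

```python
garbage = []   # stand-in for the collector's global list


def handle_cycle(cycle):
    """Toy policy for one detected cycle, following the proposal above."""
    stuck = [o for o in cycle
             if hasattr(o, "__del__") and not hasattr(o, "__cycle__")]
    if len(stuck) > 1:
        garbage.extend(cycle)        # give up: leave it to the application
        return False
    for o in cycle:
        breaker = getattr(o, "__cycle__", None)
        if breaker is not None:
            breaker()                # the object breaks its own links
    return True


class Plain:
    def __init__(self):
        self.next = None

    def __del__(self):
        pass


class Careful(Plain):
    def __cycle__(self):
        self.next = None             # just break the cycle


x, y = Careful(), Careful()
x.next, y.next = y, x
ok = handle_cycle([x, y])            # both objects cooperate

p, q = Plain(), Plain()
p.next, q.next = q, p
bad = handle_cycle([p, q])           # two finalizers, no __cycle__
```

The first cycle is broken by the objects themselves; the second ends up
in `garbage` untouched.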


