[Python-checkins] python/dist/src/Modules gcmodule.c,2.66,2.67

tim_one@users.sourceforge.net tim_one@users.sourceforge.net
Mon, 07 Apr 2003 12:21:21 -0700


Update of /cvsroot/python/python/dist/src/Modules
In directory sc8-pr-cvs1:/tmp/cvs-serv25392/python/Modules

Modified Files:
	gcmodule.c 
Log Message:
Reworked has_finalizer() to use the new _PyObject_Lookup() instead
of PyObject_HasAttr(); the former promises never to execute
arbitrary Python code.  Undid many of the changes recently made to
worm around the worst consequences of that PyObject_HasAttr() could
execute arbitrary Python code.

Compatibility is hard to discuss, because the dangerous cases are
so perverse, and much of this appears to rely on implementation
accidents.

To start with, using hasattr() to check for __del__ wasn't only
dangerous, in some cases it was wrong:  if an instance of an old-
style class didn't have "__del__" in its instance dict or in any
base class dict, but a getattr hook said __del__ existed, then
hasattr() said "yes, this object has a __del__".  But
instance_dealloc() ignores the possibility of getattr hooks when
looking for a __del__, so while object.__del__ succeeds, no
__del__ method is called when the object is deleted.  gc was
therefore incorrect in believing that the object had a finalizer.

The new method doesn't suffer that problem (like instance_dealloc(),
_PyObject_Lookup() doesn't believe __del__ exists in that case), but
does suffer a somewhat opposite-- and even more obscure --oddity:
if an instance of an old-style class doesn't have "__del__" in its
instance dict, and a base class does have "__del__" in its dict,
and the first base class with a "__del__" associates it with a
descriptor (an object with a __get__ method), *and* if that
descriptor raises an exception when __get__ is called, then
(a) the current method believes the instance does have a __del__,
but (b) hasattr() does not believe the instance has a __del__.

While these disagree, I believe the new method is "more correct":
because the descriptor *will* be called when the object is
destructed, it can execute arbitrary Python code at the time the
object is destructed, and that's really what gc means by "has a
finalizer":  not specifically a __del__ method, but more generally
the possibility of executing arbitrary Python code at object
destruction time.  Code in a descriptor's __get__() executed at
destruction time can be just as problematic as code in a
__del__() executed then.

So I believe the new method is better on all counts.

Bugfix candidate, but it's unclear to me how all this differs in
the 2.2 branch (e.g., new-style and old-style classes already
took different gc paths in 2.3 before this last round of patches,
but don't in the 2.2 branch).


Index: gcmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/gcmodule.c,v
retrieving revision 2.66
retrieving revision 2.67
diff -C2 -d -r2.66 -r2.67
*** gcmodule.c	6 Apr 2003 23:30:52 -0000	2.66
--- gcmodule.c	7 Apr 2003 19:21:15 -0000	2.67
***************
*** 360,366 ****
  /* Return true if object has a finalization method.
   * CAUTION:  An instance of an old-style class has to be checked for a
!  *__del__ method, and that can cause arbitrary Python code to get executed
!  * via the class's __getattr__ hook (if any).  This function can therefore
!  * mutate the object graph, and that's been the source of subtle bugs.
   */
  static int
--- 360,367 ----
  /* Return true if object has a finalization method.
   * CAUTION:  An instance of an old-style class has to be checked for a
!  *__del__ method, and earlier versions of this used to call PyObject_HasAttr,
!  * which in turn could call the class's __getattr__ hook (if any).  That
!  * could invoke arbitrary Python code, mutating the object graph in arbitrary
!  * ways, and that was the source of some excruciatingly subtle bugs.
   */
  static int
***************
*** 368,376 ****
  {
  	if (PyInstance_Check(op)) {
- 		/* This is the dangerous path:  hasattr can invoke
- 		 * the class __getattr__(), and that can do anything.
- 		 */
  		assert(delstr != NULL);
! 		return PyObject_HasAttr(op, delstr);
  	}
  	else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE))
--- 369,374 ----
  {
  	if (PyInstance_Check(op)) {
  		assert(delstr != NULL);
! 		return _PyInstance_Lookup(op, delstr) != NULL;
  	}
  	else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE))
***************
*** 380,415 ****
  }
  
! /* Move all objects out of unreachable, into collectable or finalizers.
!  * It's possible that some objects will get collected (via refcount falling
!  * to 0), or resurrected, as a side effect of checking for __del__ methods.
!  * After, finalizers contains all the objects from unreachable that haven't
!  * been collected by magic, and that have a finalizer.  gc_refs is
!  * GC_REACHABLE for all of those.  collectable contains all the remaining
!  * objects from unreachable, and gc_refs remains GC_TENTATIVELY_UNREACHABLE
!  * for those (we're still not sure they're reclaimable after this!  Some
!  * may yet by reachable *from* the objects in finalizers).
   */
  static void
! move_finalizers(PyGC_Head *unreachable, PyGC_Head *collectable,
! 		PyGC_Head *finalizers)
  {
! 	while (!gc_list_is_empty(unreachable)) {
! 		PyGC_Head *gc = unreachable->gc.gc_next;
  		PyObject *op = FROM_GC(gc);
! 		int finalizer;
  
  		assert(IS_TENTATIVELY_UNREACHABLE(op));
  
! 		finalizer = has_finalizer(op);
! 		if (unreachable->gc.gc_next == gc) {
  			gc_list_remove(gc);
! 			if (finalizer) {
! 				gc_list_append(gc, finalizers);
! 				gc->gc.gc_refs = GC_REACHABLE;
! 			}
! 			else
! 				gc_list_append(gc, collectable);
  		}
! 		/* else has_finalizer() deleted op via side effect */
  	}
  }
--- 378,403 ----
  }
  
! /* Move the objects in unreachable with __del__ methods into finalizers.
!  * The objects remaining in unreachable do not have __del__ methods, and
!  * gc_refs remains GC_TENTATIVELY_UNREACHABLE for them.  The objects
!  * moved into finalizers have gc_refs changed to GC_REACHABLE.
   */
  static void
! move_finalizers(PyGC_Head *unreachable, PyGC_Head *finalizers)
  {
! 	PyGC_Head *gc = unreachable->gc.gc_next;
! 
! 	while (gc != unreachable) {
  		PyObject *op = FROM_GC(gc);
! 		PyGC_Head *next = gc->gc.gc_next;
  
  		assert(IS_TENTATIVELY_UNREACHABLE(op));
  
! 		if (has_finalizer(op)) {
  			gc_list_remove(gc);
! 			gc_list_append(gc, finalizers);
! 			gc->gc.gc_refs = GC_REACHABLE;
  		}
! 		gc = next;
  	}
  }
***************
*** 431,439 ****
  
  /* Move objects that are reachable from finalizers, from the unreachable set
!  * into the reachable_from_finalizers set.
   */
  static void
! move_finalizer_reachable(PyGC_Head *finalizers,
! 			 PyGC_Head *reachable_from_finalizers)
  {
  	traverseproc traverse;
--- 419,426 ----
  
  /* Move objects that are reachable from finalizers, from the unreachable set
!  * into finalizers set.
   */
  static void
! move_finalizer_reachable(PyGC_Head *finalizers)
  {
  	traverseproc traverse;
***************
*** 444,448 ****
  		(void) traverse(FROM_GC(gc),
  				(visitproc)visit_move,
! 				(void *)reachable_from_finalizers);
  	}
  }
--- 431,435 ----
  		(void) traverse(FROM_GC(gc),
  				(visitproc)visit_move,
! 				(void *)finalizers);
  	}
  }
***************
*** 476,488 ****
  /* Handle uncollectable garbage (cycles with finalizers, and stuff reachable
   * only from such cycles).
!  * If DEBUG_SAVEALL or hasfinalizer, the objects in finalizers are appended
!  * to the module garbage list (a Python list).  The objects in finalizers
!  * are merged into the old list regardless.
   * Returns 0 if all OK, <0 on error (out of memory to grow the garbage list).
   * The finalizers list is made empty on a successful return.
   */
  static int
! handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old, int hasfinalizer)
  {
  	if (garbage == NULL) {
  		garbage = PyList_New(0);
--- 463,478 ----
  /* Handle uncollectable garbage (cycles with finalizers, and stuff reachable
   * only from such cycles).
!  * If DEBUG_SAVEALL, all objects in finalizers are appended to the module
!  * garbage list (a Python list), else only the objects in finalizers with
!  * __del__ methods are appended to garbage.  All objects in finalizers are
!  * merged into the old list regardless.
   * Returns 0 if all OK, <0 on error (out of memory to grow the garbage list).
   * The finalizers list is made empty on a successful return.
   */
  static int
! handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old)
  {
+ 	PyGC_Head *gc = finalizers->gc.gc_next;
+ 
  	if (garbage == NULL) {
  		garbage = PyList_New(0);
***************
*** 490,497 ****
  			Py_FatalError("gc couldn't create gc.garbage list");
  	}
! 	if ((debug & DEBUG_SAVEALL) || hasfinalizer) {
! 		if (append_objects(garbage, finalizers) < 0)
! 			return -1;
  	}
  	gc_list_merge(finalizers, old);
  	return 0;
--- 480,492 ----
  			Py_FatalError("gc couldn't create gc.garbage list");
  	}
! 	for (; gc != finalizers; gc = gc->gc.gc_next) {
! 		PyObject *op = FROM_GC(gc);
! 
! 		if ((debug & DEBUG_SAVEALL) || has_finalizer(op)) {
! 			if (PyList_Append(garbage, op) < 0)
! 				return -1;
! 		}
  	}
+ 
  	gc_list_merge(finalizers, old);
  	return 0;
***************
*** 542,548 ****
  	PyGC_Head *old; /* next older generation */
  	PyGC_Head unreachable;
- 	PyGC_Head collectable;
  	PyGC_Head finalizers;
- 	PyGC_Head reachable_from_finalizers;
  	PyGC_Head *gc;
  
--- 537,541 ----
***************
*** 607,635 ****
  	 * instance objects with __del__ methods.
  	 *
! 	 * Move each unreachable object into the collectable set or the
! 	 * finalizers set.  Because we need to check for __del__ methods on
! 	 * instances of classic classes, arbitrary Python code may get
! 	 * executed by getattr hooks:  that may resurrect or deallocate (via
! 	 * refcount falling to 0) unreachable objects, so this is very
! 	 * delicate.
! 	 */
! 	gc_list_init(&collectable);
  	gc_list_init(&finalizers);
! 	move_finalizers(&unreachable, &collectable, &finalizers);
  	/* finalizers contains the unreachable objects with a finalizer;
  	 * unreachable objects reachable only *from* those are also
! 	 * uncollectable; we move those into a separate list
! 	 * (reachable_from_finalizers) so we don't have to do the dangerous
! 	 * has_finalizer() test again later.
  	 */
! 	gc_list_init(&reachable_from_finalizers);
! 	move_finalizer_reachable(&finalizers, &reachable_from_finalizers);
! 	/* And move everything only reachable from the reachable stuff. */
! 	move_finalizer_reachable(&reachable_from_finalizers,
! 				 &reachable_from_finalizers);
  
  	/* Collect statistics on collectable objects found and print
  	 * debugging information. */
! 	for (gc = collectable.gc.gc_next; gc != &collectable;
  			gc = gc->gc.gc_next) {
  		m++;
--- 600,616 ----
  	 * instance objects with __del__ methods.
  	 *
! 	 * Move unreachable objects with finalizers into a different list.
!  	 */
  	gc_list_init(&finalizers);
! 	move_finalizers(&unreachable, &finalizers);
  	/* finalizers contains the unreachable objects with a finalizer;
  	 * unreachable objects reachable only *from* those are also
! 	 * uncollectable, and we move those into the finalizers list too.
  	 */
! 	move_finalizer_reachable(&finalizers);
  
  	/* Collect statistics on collectable objects found and print
  	 * debugging information. */
! 	for (gc = unreachable.gc.gc_next; gc != &unreachable;
  			gc = gc->gc.gc_next) {
  		m++;
***************
*** 641,645 ****
  	 * the reference cycles to be broken. It may also cause some objects
  	 * in finalizers and/or reachable_from_finalizers to be freed */
! 	delete_garbage(&collectable, old);
  
  	/* Collect statistics on uncollectable objects found and print
--- 622,626 ----
  	 * the reference cycles to be broken. It may also cause some objects
  	 * in finalizers and/or reachable_from_finalizers to be freed */
! 	delete_garbage(&unreachable, old);
  
  	/* Collect statistics on uncollectable objects found and print
***************
*** 652,662 ****
  			debug_cycle("uncollectable", FROM_GC(gc));
  	}
- 	for (gc = reachable_from_finalizers.gc.gc_next;
- 	     gc != &reachable_from_finalizers;
- 	     gc = gc->gc.gc_next) {
- 		n++;
- 		if (debug & DEBUG_UNCOLLECTABLE)
- 			debug_cycle("uncollectable", FROM_GC(gc));
- 	}
  	if (debug & DEBUG_STATS) {
  		if (m == 0 && n == 0) {
--- 633,636 ----
***************
*** 674,684 ****
  	 * this if they insist on creating this type of structure.
  	 */
! 	if (handle_finalizers(&finalizers, old, 1) == 0)
! 		(void)handle_finalizers(&reachable_from_finalizers, old, 0);
  
  	if (PyErr_Occurred()) {
! 		if (gc_str == NULL) {
  		    gc_str = PyString_FromString("garbage collection");
- 		}
  		PyErr_WriteUnraisable(gc_str);
  		Py_FatalError("unexpected exception during garbage collection");
--- 648,656 ----
  	 * this if they insist on creating this type of structure.
  	 */
! 	(void)handle_finalizers(&finalizers, old);
  
  	if (PyErr_Occurred()) {
! 		if (gc_str == NULL)
  		    gc_str = PyString_FromString("garbage collection");
  		PyErr_WriteUnraisable(gc_str);
  		Py_FatalError("unexpected exception during garbage collection");