[pypy-commit] pypy default: Document in even more details the issue of delayed __del__,

arigo noreply at buildbot.pypy.org
Fri Apr 11 11:11:42 CEST 2014


Author: Armin Rigo <arigo at tunes.org>
Branch: 
Changeset: r70544:25176f5d15bf
Date: 2014-04-11 11:11 +0200
http://bitbucket.org/pypy/pypy/changeset/25176f5d15bf/

Log:	Document in even more details the issue of delayed __del__, prompted
	by issue 878.

diff --git a/pypy/doc/cpython_differences.rst b/pypy/doc/cpython_differences.rst
--- a/pypy/doc/cpython_differences.rst
+++ b/pypy/doc/cpython_differences.rst
@@ -106,23 +106,43 @@
 Differences related to garbage collection strategies
 ----------------------------------------------------
 
-Most of the garbage collectors used or implemented by PyPy are not based on
+The garbage collectors used or implemented by PyPy are not based on
 reference counting, so the objects are not freed instantly when they are no
 longer reachable.  The most obvious effect of this is that files are not
 promptly closed when they go out of scope.  For files that are opened for
 writing, data can be left sitting in their output buffers for a while, making
-the on-disk file appear empty or truncated.
+the on-disk file appear empty or truncated.  Moreover, you might reach your
+OS's limit on the number of concurrently opened files.
 
-Fixing this is essentially not possible without forcing a
+Fixing this is essentially impossible without forcing a
 reference-counting approach to garbage collection.  The effect that you
 get in CPython has clearly been described as a side-effect of the
 implementation and not a language design decision: programs relying on
 this are basically bogus.  It would anyway be insane to try to enforce
 CPython's behavior in a language spec, given that it has no chance to be
 adopted by Jython or IronPython (or any other port of Python to Java or
-.NET, like PyPy itself).
+.NET).
 
-This affects the precise time at which ``__del__`` methods are called, which
+Even the naive idea of forcing a full GC when we're getting dangerously
+close to the OS's limit can be very bad in some cases.  If your program
+leaks open files heavily, then it would work, but force a complete GC
+cycle every n'th leaked file.  The value of n is a constant, but the
+program can take an arbitrary amount of memory, which makes a complete
+GC cycle arbitrarily long.  The end result is that PyPy would spend an
+arbitrarily large fraction of its run time in the GC --- slowing down
+the actual execution, not by 10% nor 100% nor 1000% but by essentially
+any factor.
+
+To the best of our knowledge this problem has no better solution than
+fixing the programs.  If it occurs in 3rd-party code, this means going
+to the authors and explaining the problem to them: they need to close
+their open files in order to run on any non-CPython-based implementation
+of Python.
+
+---------------------------------
+
+Here are some more technical details.  This issue affects the precise
+time at which ``__del__`` methods are called, which
 is not reliable in PyPy (nor Jython nor IronPython).  It also means that
 weak references may stay alive for a bit longer than expected.  This
 makes "weak proxies" (as returned by ``weakref.proxy()``) somewhat less
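
A reader's note on the "close their open files" advice in the hunk above: the following is a minimal sketch of what that fix usually looks like, assuming ordinary file I/O. The function names `write_report` and `read_report` and the file name are hypothetical and not taken from the PyPy sources. The point is only that a `with` block closes the file when the block exits, so neither buffered data nor the file descriptor has to wait for the garbage collector.

```python
import os
import tempfile


def write_report(path, lines):
    """Write lines to path, closing the file deterministically."""
    # Leaving the with-block closes (and therefore flushes) the file,
    # even if an exception is raised inside it. This does not depend on
    # reference counting, so it behaves the same on CPython and PyPy.
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


def read_report(path):
    """Read the lines back; the descriptor is released on block exit."""
    with open(path) as f:
        return f.read().splitlines()


if __name__ == "__main__":
    # Hypothetical usage: the directory and file name are illustrative only.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "report.txt")
        write_report(target, ["alpha", "beta"])
        print(read_report(target))
```

By contrast, a pattern such as `open(path, "w").write(data)` relies on the file object being finalized promptly, which is exactly the behavior the documentation change describes as unreliable outside CPython.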

