Garbage announcement printed on interpreter shutdown
Hey #python-dev, I'd like to ask your opinion on this change; I think it should be reverted or at least made silent by default. Basically, it prints a warning like gc: 2 uncollectable objects at shutdown: Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them. at interpreter shutdown if gc.garbage is nonempty. IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it, assuming it came from their console application (imagine Mercurial printing this). Opinions? Georg Am 09.08.2010 00:18, schrieb antoine.pitrou:
Author: antoine.pitrou Date: Mon Aug 9 00:18:46 2010 New Revision: 83861
Log: Issue #477863: Print a warning at shutdown if gc.garbage is not empty.
Modified: python/branches/py3k/Doc/library/gc.rst python/branches/py3k/Doc/whatsnew/3.2.rst python/branches/py3k/Include/pythonrun.h python/branches/py3k/Lib/test/test_gc.py python/branches/py3k/Misc/NEWS python/branches/py3k/Modules/gcmodule.c python/branches/py3k/Python/pythonrun.c
Modified: python/branches/py3k/Doc/library/gc.rst ============================================================================== --- python/branches/py3k/Doc/library/gc.rst (original) +++ python/branches/py3k/Doc/library/gc.rst Mon Aug 9 00:18:46 2010 @@ -177,6 +177,15 @@ If :const:`DEBUG_SAVEALL` is set, then all unreachable objects will be added to this list rather than freed.
+ .. versionchanged:: 3.2 + If this list is non-empty at interpreter shutdown, a warning message + gets printed: + + :: + + gc: 2 uncollectable objects at shutdown: + Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them. + The following constants are provided for use with :func:`set_debug`:
@@ -197,6 +206,9 @@ reachable but cannot be freed by the collector). These objects will be added to the ``garbage`` list.
+ .. versionchanged:: 3.2 + Also print the contents of the :data:`garbage` list at interpreter + shutdown (rather than just its length), if it isn't empty.
.. data:: DEBUG_SAVEALL
Modified: python/branches/py3k/Doc/whatsnew/3.2.rst ============================================================================== --- python/branches/py3k/Doc/whatsnew/3.2.rst (original) +++ python/branches/py3k/Doc/whatsnew/3.2.rst Mon Aug 9 00:18:46 2010 @@ -119,6 +119,11 @@ * The :class:`ftplib.FTP` class now supports the context manager protocol (Contributed by Tarek Ziadé and Giampaolo Rodolà; :issue:`4972`.)
+* A warning message will now get printed at interpreter shutdown if + the :data:`gc.garbage` list isn't empty. This is meant to make the + programmer aware that his code contains object finalization issues. + (Added by Antoine Pitrou; :issue:`477863`.) + * The :func:`shutil.copytree` function has two new options:
* *ignore_dangling_symlinks*: when ``symlinks=False`` (meaning that the
Modified: python/branches/py3k/Include/pythonrun.h ============================================================================== --- python/branches/py3k/Include/pythonrun.h (original) +++ python/branches/py3k/Include/pythonrun.h Mon Aug 9 00:18:46 2010 @@ -148,6 +148,7 @@ PyAPI_FUNC(void) PyByteArray_Fini(void); PyAPI_FUNC(void) PyFloat_Fini(void); PyAPI_FUNC(void) PyOS_FiniInterrupts(void); +PyAPI_FUNC(void) _PyGC_Fini(void);
/* Stuff with no proper home (yet) */ PyAPI_FUNC(char *) PyOS_Readline(FILE *, FILE *, char *);
Modified: python/branches/py3k/Lib/test/test_gc.py ============================================================================== --- python/branches/py3k/Lib/test/test_gc.py (original) +++ python/branches/py3k/Lib/test/test_gc.py Mon Aug 9 00:18:46 2010 @@ -1,5 +1,5 @@ import unittest -from test.support import verbose, run_unittest +from test.support import verbose, run_unittest, strip_python_stderr import sys import gc import weakref @@ -466,6 +466,42 @@ # would be damaged, with an empty __dict__. self.assertEqual(x, None)
+ def test_garbage_at_shutdown(self): + import subprocess + code = """if 1: + import gc + class X: + def __init__(self, name): + self.name = name + def __repr__(self): + return "
" %% self.name + def __del__(self): + pass + + x = X('first') + x.x = x + x.y = X('second') + del x + if %d: + gc.set_debug(gc.DEBUG_UNCOLLECTABLE) + """ + def run_command(code): + p = subprocess.Popen([sys.executable, "-c", code], + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + stdout, stderr = p.communicate() + self.assertEqual(p.returncode, 0) + self.assertEqual(stdout.strip(), b"") + return strip_python_stderr(stderr) + + stderr = run_command(code % 0) + self.assertIn(b"gc: 2 uncollectable objects at shutdown", stderr) + self.assertNotIn(b"[ , ]", stderr) + # With DEBUG_UNCOLLECTABLE, the garbage list gets printed + stderr = run_command(code % 1) + self.assertIn(b"gc: 2 uncollectable objects at shutdown", stderr) + self.assertIn(b"[ , ]", stderr) + class GCTogglingTests(unittest.TestCase): def setUp(self): gc.enable() Modified: python/branches/py3k/Misc/NEWS ============================================================================== --- python/branches/py3k/Misc/NEWS (original) +++ python/branches/py3k/Misc/NEWS Mon Aug 9 00:18:46 2010 @@ -30,6 +30,8 @@ Extensions ----------
+- Issue #477863: Print a warning at shutdown if gc.garbage is not empty. + - Issue #6869: Fix a refcount problem in the _ctypes extension.
- Issue #5504: ctypes should now work with systems where mmap can't
Modified: python/branches/py3k/Modules/gcmodule.c ============================================================================== --- python/branches/py3k/Modules/gcmodule.c (original) +++ python/branches/py3k/Modules/gcmodule.c Mon Aug 9 00:18:46 2010 @@ -1295,17 +1295,16 @@
static struct PyModuleDef gcmodule = { PyModuleDef_HEAD_INIT, - "gc", - gc__doc__, - -1, - GcMethods, - NULL, - NULL, - NULL, - NULL + "gc", /* m_name */ + gc__doc__, /* m_doc */ + -1, /* m_size */ + GcMethods, /* m_methods */ + NULL, /* m_reload */ + NULL, /* m_traverse */ + NULL, /* m_clear */ + NULL /* m_free */ };
- PyMODINIT_FUNC PyInit_gc(void) { @@ -1364,6 +1363,37 @@ return n; }
+void +_PyGC_Fini(void) +{ + if (garbage != NULL && PyList_GET_SIZE(garbage) > 0) { + PySys_WriteStderr( + "gc: " + "%" PY_FORMAT_SIZE_T "d uncollectable objects at shutdown:\n", + PyList_GET_SIZE(garbage) + ); + if (debug & DEBUG_UNCOLLECTABLE) { + PyObject *repr = NULL, *bytes = NULL; + repr = PyObject_Repr(garbage); + if (!repr || !(bytes = PyUnicode_EncodeFSDefault(repr))) + PyErr_WriteUnraisable(garbage); + else { + PySys_WriteStderr( + " %s\n", + PyBytes_AS_STRING(bytes) + ); + } + Py_XDECREF(repr); + Py_XDECREF(bytes); + } + else { + PySys_WriteStderr( + " Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them.\n" + ); + } + } +} + /* for debugging */ void _PyGC_Dump(PyGC_Head *g)
Modified: python/branches/py3k/Python/pythonrun.c ============================================================================== --- python/branches/py3k/Python/pythonrun.c (original) +++ python/branches/py3k/Python/pythonrun.c Mon Aug 9 00:18:46 2010 @@ -404,6 +404,9 @@ while (PyGC_Collect() > 0) /* nothing */; #endif + /* We run this while most interpreter state is still alive, so that + debug information can be printed out */ + _PyGC_Fini();
/* Destroy all modules */ PyImport_Cleanup(); _______________________________________________ Python-checkins mailing list Python-checkins@python.org http://mail.python.org/mailman/listinfo/python-checkins
-- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Georg Brandl writes:
it prints a warning (...) at interpreter shutdown if gc.garbage is nonempty.
IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default
Opinions?
Agreed, this should be reverted for the reasons you give but DO LEAVE THIS on by default for regrtest (or maybe unittest in general) :) It has already proved useful for me. Is that doable? -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/
On Fri, Sep 10, 2010 at 4:32 PM, Georg Brandl
IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it,
Agreed; this should be silent by default. -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> "A storm broke loose in my mind." --Albert Einstein
2010/9/10 Fred Drake
On Fri, Sep 10, 2010 at 4:32 PM, Georg Brandl
wrote: IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it,
Agreed; this should be silent by default.
+1. I suggest to enable it only when Py_DEBUG (or Py_TRACE_REFS or Py_REF_DEBUG?) is defined. -- Amaury Forgeot d'Arc
On Fri, Sep 10, 2010 at 3:32 PM, Georg Brandl
IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it, assuming it came from their console application (imagine Mercurial printing this).
A non-empty gc.garbage often indicates that there is a bug in the program and that it is probably leaking memory [1]. That's a little different from a DeprecationWarning which doesn't indicate a bug; it just indicates that the program might not run correctly using a future version of Python. I think a better comparison would be with exceptions throw from a __del__, which (as far as I know) are still printed to the console. +1 on adding a way to enable/disable the feature. -1 on removing the feature -0 on making it disabled by default [1] I know that some large, long-running programs periodically check gc.garbage and carefully choose where to break cycles, but those are the exception and not the rule. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com
On Sep 10, 2010, at 5:10 PM, Amaury Forgeot d'Arc wrote:
2010/9/10 Fred Drake
: On Fri, Sep 10, 2010 at 4:32 PM, Georg Brandl
wrote: IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it,
Agreed; this should be silent by default.
+1. I suggest to enable it only when Py_DEBUG (or Py_TRACE_REFS or Py_REF_DEBUG?) is defined.
Would it be possible to treat it the same way as a deprecation warning, and show it under the same conditions? It would be nice to know if my Python program is leaking uncollectable objects without rebuilding the interpreter.
On Sat, Sep 11, 2010 at 9:42 AM, Glyph Lefkowitz
On Sep 10, 2010, at 5:10 PM, Amaury Forgeot d'Arc wrote:
2010/9/10 Fred Drake
: On Fri, Sep 10, 2010 at 4:32 PM, Georg Brandl
wrote: IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it,
Agreed; this should be silent by default.
+1. I suggest to enable it only when Py_DEBUG (or Py_TRACE_REFS or Py_REF_DEBUG?) is defined.
Would it be possible to treat it the same way as a deprecation warning, and show it under the same conditions? It would be nice to know if my Python program is leaking uncollectable objects without rebuilding the interpreter.
My suggestion: 1. Add a gc.WARN_UNCOLLECTABLE flag on gc.set_debug that enables the warning message. 2. Have regrtest explicitly set this for our own test suite As far as automatically turning it on for third party test suites goes, we could either: - require them to turn it on explicitly via gc.set_debug - have gc.WARN_UNCOLLECTABLE default to true for non-optimised runs (__debug__ == True) and false for runs with -O or -OO (__debug__ == False) - set it by looking at the -W arguments passed in at interpreter startup (e.g. enable it when all warnings are enabled, leave it disabled by default otherwise) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Sep 10, 2010 at 14:55, Daniel Stutzbach
On Fri, Sep 10, 2010 at 3:32 PM, Georg Brandl
wrote: IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and probably are going to be quite confused by it, assuming it came from their console application (imagine Mercurial printing this).
A non-empty gc.garbage often indicates that there is a bug in the program and that it is probably leaking memory [1]. That's a little different from a DeprecationWarning which doesn't indicate a bug; it just indicates that the program might not run correctly using a future version of Python. I think a better comparison would be with exceptions throw from a __del__, which (as far as I know) are still printed to the console.
Sure, but exceptions printed out by a __del__ method during interpreter shutdown are not explicitly done as part of the shutdown process, they just happen to typically be triggered by a shutdown. This gc info, OTOH, is explicitly debugging information that is only printed out (typically) at shutdown and thus at a point where it will no longer effect semantics or the performance of the application. So I view this as entirely debugging information and thus in no way critical for a user to know about. -Brett
+1 on adding a way to enable/disable the feature. -1 on removing the feature -0 on making it disabled by default
[1] I know that some large, long-running programs periodically check gc.garbage and carefully choose where to break cycles, but those are the exception and not the rule. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org
Hello,
I'd like to ask your opinion on this change; I think it should be reverted or at least made silent by default. Basically, it prints a warning like
gc: 2 uncollectable objects at shutdown: Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them.
at interpreter shutdown if gc.garbage is nonempty.
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable). One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shutdown OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings). What do you think? Antoine.
2010/9/29 Antoine Pitrou
Hello,
I'd like to ask your opinion on this change; I think it should be reverted or at least made silent by default. Basically, it prints a warning like
gc: 2 uncollectable objects at shutdown: Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them.
at interpreter shutdown if gc.garbage is nonempty.
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable).
One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shutdown OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings).
What do you think?
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script. -- Regards, Benjamin
On Wed, Sep 29, 2010 at 14:27, Benjamin Peterson
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Yeah, I often don't close files in scripts that I know are short running or only ever open one or two files, and I don't think I should be warned about that. Cheers, Dirkjan
Le mercredi 29 septembre 2010 à 07:27 -0500, Benjamin Peterson a écrit :
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable).
One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shutdown OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings).
What do you think?
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Isn't the "with" statement appropriate in these cases? My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script. But if it doesn't, what about enabling it with a command-line switch?
On Wed, 29 Sep 2010 10:42:27 pm Antoine Pitrou wrote:
My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script.
But if it doesn't, what about enabling it with a command-line switch?
I think the ability to detect such file descriptor leaks would be valuable, but I'm not sure that it should be running all the time. At the risk of bike-shedding, is it something which could be controlled at runtime, like garbage collection? E.g. something like: gc.enable_file_warnings() run_my_tests_for_leakage() gc.disable_file_warnings() or similar. (I'm not wedded to it being in the gc module.) Otherwise, I'm +0.25 on enabling it with a command line switch, and -0 on turning it on by default. -- Steven D'Aprano
On 09/29/2010 02:42 PM, Antoine Pitrou wrote:
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Isn't the "with" statement appropriate in these cases?
A hacked-together quick script might contain code like: parse(open(bla).read()) Compared to this, "with" adds a new indentation level and a new variable, while breaking the flow of the code: with open(bla) as foo: contents = foo.read() parse(contents) People used to writing production code under stringent guidelines (introduced for good reason) will probably not be sympathetic to quick-hack usage patterns, but Python is used on both sides of the fence.
On Sep 29, 2010, at 11:11 PM, Steven D'Aprano wrote:
On Wed, 29 Sep 2010 10:42:27 pm Antoine Pitrou wrote:
My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script.
But if it doesn't, what about enabling it with a command-line switch?
I think the ability to detect such file descriptor leaks would be valuable, but I'm not sure that it should be running all the time. At the risk of bike-shedding, is it something which could be controlled at runtime, like garbage collection? E.g. something like:
gc.enable_file_warnings() run_my_tests_for_leakage() gc.disable_file_warnings()
or similar. (I'm not wedded to it being in the gc module.)
I don't think it should be in the gc module, but I would prefer it be enabled and controlled through a separate module, rather than something Python does automatically for your convenience. -Barry
On Wed, Sep 29, 2010 at 05:42, Antoine Pitrou
Le mercredi 29 septembre 2010 à 07:27 -0500, Benjamin Peterson a écrit :
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable).
One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shutdown OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings).
What do you think?
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Isn't the "with" statement appropriate in these cases?
Yes, which is why I suspect people saying they don't bother have been programming Python for a while and are not in the habit yet of using a 'with' statement. The amount of extra typing compared to inlining a call is minimal.
My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script.
Does everyone here run all their code under py-debug? If not then I say switch it on when py-debug is on so that we at least detect the leaks in the stdlib without having to think about it.
But if it doesn't, what about enabling it with a command-line switch?
Sure, but I say always turn it on under py-debug.
On Wed, Sep 29, 2010 at 9:31 AM, Brett Cannon
On Wed, Sep 29, 2010 at 05:42, Antoine Pitrou
wrote: Le mercredi 29 septembre 2010 à 07:27 -0500, Benjamin Peterson a écrit :
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable).
One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shutdown OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings).
What do you think?
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Isn't the "with" statement appropriate in these cases?
Yes, which is why I suspect people saying they don't bother have been programming Python for a while and are not in the habit yet of using a 'with' statement. The amount of extra typing compared to inlining a call is minimal.
My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script.
Does everyone here run all their code under py-debug? If not then I say switch it on when py-debug is on so that we at least detect the leaks in the stdlib without having to think about it.
But if it doesn't, what about enabling it with a command-line switch?
Sure, but I say always turn it on under py-debug.
This would be a big +1 from me. Geremy Condra
On Wed, Sep 29, 2010 at 11:40 PM, Barry Warsaw
I don't think it should be in the gc module, but I would prefer it be enabled and controlled through a separate module, rather than something Python does automatically for your convenience.
The os module would seem to be the place to enable/disable tracking of OS level resource leaks (i.e. file descriptors and possible HANDLES on Windows). I'm not sure how practical this idea will prove to implement though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 29 September 2010 22:25, Nick Coghlan
On Wed, Sep 29, 2010 at 11:40 PM, Barry Warsaw
wrote: I don't think it should be in the gc module, but I would prefer it be enabled and controlled through a separate module, rather than something Python does automatically for your convenience.
The os module would seem to be the place to enable/disable tracking of OS level resource leaks (i.e. file descriptors and possible HANDLES on Windows).
Heh, I was expecting the sys module to be the natural choice because this would be changing interpreter behaviour. It's just random bikeshedding at this point however. Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org
Sorry for late post. On 2010/09/29 20:01, Antoine Pitrou wrote:
Furthermore, it can produce real bugs, especially under Windows when coupled with refererence cycles created by traceback objects
I think this can be relaxed with the patch in #9815. ;-)
participants (17)
-
Amaury Forgeot d'Arc
-
Antoine Pitrou
-
Barry Warsaw
-
Benjamin Peterson
-
Brett Cannon
-
Daniel Stutzbach
-
Dirkjan Ochtman
-
Floris Bruynooghe
-
Fred Drake
-
Georg Brandl
-
geremy condra
-
Glyph Lefkowitz
-
Hirokazu Yamamoto
-
Hrvoje Niksic
-
Nick Coghlan
-
Steven D'Aprano
-
Łukasz Langa