Garbage announcement printed on interpreter shutdown

Hey #python-dev,

I'd like to ask your opinion on this change; I think it should be reverted or at least made silent by default. Basically, it prints a warning like

    gc: 2 uncollectable objects at shutdown: Use gc.set_debug(gc.DEBUG_UNCOLLECTABLE) to list them.

at interpreter shutdown if gc.garbage is nonempty.

IMO this runs contrary to the decision we made when DeprecationWarnings were made silent by default: it spews messages not only at developers, but also at users, who don't need it and are probably going to be quite confused by it, assuming it came from their console application (imagine Mercurial printing this).

Opinions?

Georg

On 09.08.2010 00:18, antoine.pitrou wrote:
-- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
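For readers who want to see what the check looks at, here is a minimal sketch of inspecting gc.garbage from Python code. It is not the interpreter's shutdown code. The helper name uncollectable_summary and the wording of its message are made up for illustration, and the demo sets gc.DEBUG_SAVEALL only because that flag forces every unreachable object into gc.garbage, which makes the example reproducible.

```python
import gc


def uncollectable_summary():
    """Run a full collection and describe what ended up in gc.garbage.

    Hypothetical helper: returns None when gc.garbage is empty, otherwise
    a short message loosely modelled on the one quoted above.
    """
    gc.collect()
    count = len(gc.garbage)
    if count == 0:
        return None
    return "gc: %d objects in gc.garbage" % count


class Node:
    """Tiny object that points at itself, forming a reference cycle."""

    def __init__(self):
        self.me = self


def demo():
    old_flags = gc.get_debug()
    gc.collect()
    del gc.garbage[:]
    # DEBUG_SAVEALL stores every unreachable object in gc.garbage instead of
    # freeing it; used here only to get a non-empty list on demand.
    gc.set_debug(gc.DEBUG_SAVEALL)
    try:
        Node()  # dropped immediately, but the self-reference keeps it alive
        return uncollectable_summary()
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[:]
```

Calling demo() returns a summary string, while uncollectable_summary() on its own returns None in a process that has no garbage.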

Georg Brandl writes:
Agreed, this should be reverted for the reasons you give but DO LEAVE THIS on by default for regrtest (or maybe unittest in general) :) It has already proved useful for me. Is that doable? -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/

On Fri, Sep 10, 2010 at 4:32 PM, Georg Brandl <g.brandl@gmx.net> wrote:
Agreed; this should be silent by default. -Fred -- Fred L. Drake, Jr. <fdrake at gmail.com> "A storm broke loose in my mind." --Albert Einstein

On Sep 10, 2010, at 5:10 PM, Amaury Forgeot d'Arc wrote:
Would it be possible to treat it the same way as a deprecation warning, and show it under the same conditions? It would be nice to know if my Python program is leaking uncollectable objects without rebuilding the interpreter.

On Sat, Sep 11, 2010 at 9:42 AM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
My suggestion:

1. Add a gc.WARN_UNCOLLECTABLE flag on gc.set_debug that enables the warning message.
2. Have regrtest explicitly set this for our own test suite.

As far as automatically turning it on for third party test suites goes, we could either:

- require them to turn it on explicitly via gc.set_debug
- have gc.WARN_UNCOLLECTABLE default to true for non-optimised runs (__debug__ == True) and false for runs with -O or -OO (__debug__ == False)
- set it by looking at the -W arguments passed in at interpreter startup (e.g. enable it when all warnings are enabled, leave it disabled by default otherwise)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
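The three alternatives for the default can be compared side by side in a small sketch. Everything here is hypothetical: the function name, the policy labels, and the reading of "all warnings are enabled" as a bare -W all or -W always are assumptions made for illustration, not part of the proposal.

```python
import sys


def warn_uncollectable_default(warnoptions=None, debug=__debug__, policy="explicit"):
    """Decide whether the shutdown message would be on by default.

    Hypothetical model of the three alternatives listed above:
      "explicit" - off unless a test suite turns it on itself
      "debug"    - follow __debug__ (False under -O / -OO)
      "warnings" - on only when all warnings were requested via -W
    """
    if warnoptions is None:
        warnoptions = sys.warnoptions
    if policy == "explicit":
        return False
    if policy == "debug":
        return bool(debug)
    if policy == "warnings":
        # Assumption: only a bare "all" or "always" counts as "all warnings
        # enabled"; options restricted to one category do not.
        return any(opt.strip() in ("all", "always") for opt in warnoptions)
    raise ValueError("unknown policy: %r" % (policy,))
```

With policy="debug" the result simply mirrors __debug__, so running under -O would silence the message without any extra switch.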

On Fri, Sep 10, 2010 at 3:32 PM, Georg Brandl <g.brandl@gmx.net> wrote:
A non-empty gc.garbage often indicates that there is a bug in the program and that it is probably leaking memory [1]. That's a little different from a DeprecationWarning, which doesn't indicate a bug; it just indicates that the program might not run correctly using a future version of Python. I think a better comparison would be with exceptions thrown from a __del__, which (as far as I know) are still printed to the console.

+1 on adding a way to enable/disable the feature.
-1 on removing the feature
-0 on making it disabled by default

[1] I know that some large, long-running programs periodically check gc.garbage and carefully choose where to break cycles, but those are the exception and not the rule.

-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
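The comparison with __del__ can be made concrete. An exception raised inside __del__ cannot propagate to the caller; the interpreter reports it on stderr instead. The sketch below intercepts that report with sys.unraisablehook, which exists in Python 3.8 and later and therefore postdates this thread. The class and function names are illustrative, and the immediate finalization on del relies on CPython's reference counting.

```python
import sys


class Noisy:
    def __del__(self):
        raise RuntimeError("boom in __del__")


def capture_del_exception():
    """Drop a Noisy instance and return the exception types reported."""
    seen = []
    old_hook = sys.unraisablehook
    # Store only the exception type; keeping the value or the object alive
    # could create reference cycles of its own.
    sys.unraisablehook = lambda unraisable: seen.append(unraisable.exc_type)
    try:
        obj = Noisy()
        del obj  # CPython runs __del__ here; the exception is not raised
    finally:
        sys.unraisablehook = old_hook
    return seen
```

Without the hook replacement, the same code prints an "Exception ignored in" report to stderr and the program carries on.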

On Fri, Sep 10, 2010 at 14:55, Daniel Stutzbach <daniel@stutzbachenterprises.com> wrote:
Sure, but exceptions printed out by a __del__ method during interpreter shutdown are not explicitly done as part of the shutdown process, they just happen to typically be triggered by a shutdown. This gc info, OTOH, is explicitly debugging information that is only printed out (typically) at shutdown and thus at a point where it will no longer affect semantics or the performance of the application. So I view this as entirely debugging information and thus in no way critical for a user to know about. -Brett

Hello,
I would like to piggy-back on this discussion to suggest further warnings (either by default, or switchable). One feature I've often considered would be to add a warning in FileIO and socket dealloc if these objects haven't been closed explicitly. In most situations, relying on garbage collection to shut down OS resources (here, file descriptors) is something we like to discourage. Furthermore, it can produce real bugs, especially under Windows when coupled with reference cycles created by traceback objects (the random test_tarfile failures on the Windows buildbots were a symptom of that; their cause would have been obvious with such warnings). What do you think? Antoine.
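The idea of warning at deallocation can be sketched in pure Python. This is an illustration only, not the FileIO or socket implementation: TrackedResource is a hypothetical stand-in for a file-like object, and ResourceWarning is a built-in category only in Python 3.2 and later, so it postdates this message.

```python
import warnings


class TrackedResource:
    """Hypothetical stand-in for a file or socket object."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True

    def __del__(self):
        # Only complain when the owner never called close().
        if not self.closed:
            warnings.warn("unclosed resource %r" % (self.name,), ResourceWarning)


def leaked_warnings(close_first):
    """Create and drop a resource; return the warning messages recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        res = TrackedResource("demo")
        if close_first:
            res.close()
        del res  # on CPython the finalizer runs right here
    return [str(w.message) for w in caught]
```

A resource that was closed explicitly produces no warning, so well-behaved code pays nothing beyond the flag check in the finalizer.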

On Wed, Sep 29, 2010 at 14:27, Benjamin Peterson <benjamin@python.org> wrote:
It seems like a slippery slope. Sometimes you really don't care like when you're just hacking together a quick script.
Yeah, I often don't close files in scripts that I know are short running or only ever open one or two files, and I don't think I should be warned about that. Cheers, Dirkjan

On Wednesday, 29 September 2010 at 07:27 -0500, Benjamin Peterson wrote:
Isn't the "with" statement appropriate in these cases? My assumption is/was that the benefit of warning against leaks in real applications (or even - sigh - the standard library) would outweigh the inconvenience when hacking together a quick script. But if it doesn't, what about enabling it with a command-line switch?

On Wed, 29 Sep 2010 10:42:27 pm Antoine Pitrou wrote:
I think the ability to detect such file descriptor leaks would be valuable, but I'm not sure that it should be running all the time. At the risk of bike-shedding, is it something which could be controlled at runtime, like garbage collection? E.g. something like:

    gc.enable_file_warnings()
    run_my_tests_for_leakage()
    gc.disable_file_warnings()

or similar. (I'm not wedded to it being in the gc module.)

Otherwise, I'm +0.25 on enabling it with a command line switch, and -0 on turning it on by default.

-- Steven D'Aprano
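A runtime toggle along these lines can be approximated with the warnings module. The sketch below assumes a Python version that has ResourceWarning (3.2 or later) and an implementation, such as CPython, that emits it when an unclosed file object is finalized. The names file_warnings_enabled, leak_one_file and count_leaks are hypothetical and stand in for the enable/disable pair suggested above.

```python
import contextlib
import gc
import os
import tempfile
import warnings


@contextlib.contextmanager
def file_warnings_enabled():
    """Hypothetical toggle: record ResourceWarning only inside the block."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", ResourceWarning)
        yield caught
        gc.collect()  # give non-refcounting collectors a chance to finalize


def leak_one_file(path):
    f = open(path)
    f.read()
    # f goes out of scope here without close()


def count_leaks():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with file_warnings_enabled() as caught:
            leak_one_file(path)
        return sum(1 for w in caught if issubclass(w.category, ResourceWarning))
    finally:
        os.remove(path)
```

Because catch_warnings restores the previous filters on exit, code outside the block keeps whatever warning configuration it had before.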

On Sep 29, 2010, at 11:11 PM, Steven D'Aprano wrote:
I don't think it should be in the gc module, but I would prefer it be enabled and controlled through a separate module, rather than something Python does automatically for your convenience. -Barry

On Wed, Sep 29, 2010 at 11:40 PM, Barry Warsaw <barry@python.org> wrote:
The os module would seem to be the place to enable/disable tracking of OS level resource leaks (i.e. file descriptors and possibly HANDLEs on Windows). I'm not sure how practical this idea will prove to implement though. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 29 September 2010 22:25, Nick Coghlan <ncoghlan@gmail.com> wrote:
Heh, I was expecting the sys module to be the natural choice because this would be changing interpreter behaviour. It's just random bikeshedding at this point however. Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org

On 09/29/2010 02:42 PM, Antoine Pitrou wrote:
A hacked-together quick script might contain code like:

    parse(open(bla).read())

Compared to this, "with" adds a new indentation level and a new variable, while breaking the flow of the code:

    with open(bla) as foo:
        contents = foo.read()
    parse(contents)

People used to writing production code under stringent guidelines (introduced for good reason) will probably not be sympathetic to quick-hack usage patterns, but Python is used on both sides of the fence.
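For comparison, a one-line form that still closes the file is possible with pathlib, which was added to the standard library well after this thread (Path.read_text needs Python 3.5 or later). In this sketch parse is a placeholder for whatever the script does with the contents, and demo exists only to exercise it.

```python
import os
import tempfile
from pathlib import Path


def parse(text):
    """Placeholder for the parse() in the message above: split into words."""
    return text.split()


def parse_file(bla):
    # Path.read_text() opens the file, reads it, and closes it before returning.
    return parse(Path(bla).read_text())


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "bla.txt")
        Path(path).write_text("a b c")
        return parse_file(path)
```

This keeps the inline shape of the quick-hack version while avoiding reliance on garbage collection to release the file descriptor.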

On Wed, Sep 29, 2010 at 05:42, Antoine Pitrou <solipsis@pitrou.net> wrote:
Yes, which is why I suspect people saying they don't bother have been programming Python for a while and are not in the habit yet of using a 'with' statement. The amount of extra typing compared to inlining a call is minimal.
Does everyone here run all their code under py-debug? If not then I say switch it on when py-debug is on so that we at least detect the leaks in the stdlib without having to think about it.
But if it doesn't, what about enabling it with a command-line switch?
Sure, but I say always turn it on under py-debug.

Sorry for late post. On 2010/09/29 20:01, Antoine Pitrou wrote:
Furthermore, it can produce real bugs, especially under Windows when coupled with reference cycles created by traceback objects
I think this can be relaxed with the patch in #9815. ;-)

participants (17)
- Amaury Forgeot d'Arc
- Antoine Pitrou
- Barry Warsaw
- Benjamin Peterson
- Brett Cannon
- Daniel Stutzbach
- Dirkjan Ochtman
- Floris Bruynooghe
- Fred Drake
- Georg Brandl
- geremy condra
- Glyph Lefkowitz
- Hirokazu Yamamoto
- Hrvoje Niksic
- Nick Coghlan
- Steven D'Aprano
- Łukasz Langa