[Python-Dev] Idea: reduce GC threshold in development mode (-X dev)

Serhiy Storchaka storchaka at gmail.com
Fri Jun 8 04:17:20 EDT 2018

08.06.18 10:48, Victor Stinner wrote:
> Yury Selivanov pushed his implementation of PEP 567 -- Context
> Variables on January 23, 2018. Yesterday, 4 months after the commit
> and only 3 weeks before the 3.7.0 final release, a crash was found in
> the implementation:
> https://bugs.python.org/issue33803
> (it's now fixed, don't worry Ned!)
> The bug is a "common" mistake in an object constructor implemented in
> C: the object was tracked by the garbage collector before it was fully
> initialized, and a GC collection caused a crash somewhere in "object
> traversing". By "common", I mean that I saw this exact bug between 5
> and 10 times over the last 5 years.
> In the bpo issue, I asked why we only spotted the bug yesterday. It
> seems that changing the threshold of GC generation 0 from 700 to 5
> triggers the bug immediately in test_context (the tests for PEP 567). I
> wrote a proof-of-concept patch to change the threshold when using -X
> dev.
> Question: Do you think that bugs spotted by a GC collection are common
> enough to change the GC thresholds in development mode (new -X dev
> flag of Python 3.7)?
> GC collections detect various kinds of bugs. Another "common" bug is
> when an object somehow remains alive in the GC although its memory has
> been freed: with PYTHONMALLOC=debug (a debug feature already enabled by
> -X dev), a GC collection will always crash in such a case.
> I'm not sure about the exact thresholds that would be used in
> development mode. The general question is whether it would be useful.
> The side question is whether reducing the threshold would kill
> performance.
> As for performance, -X dev enables debug features which have
> an "acceptable" cost in terms of performance and memory, but the enabled
> features are chosen on a case-by-case basis. For example, I chose
> *not* to enable tracemalloc with -X dev because the cost in terms of CPU
> *and* memory is too high (usually 2x slower and 2x the memory).

Reducing the GC threshold can hide other bugs that will be reproducible
only in release mode (because resources are released earlier or objects
are destroyed in a different order).

What is the cost of traversing all objects? Would it be too high to
traverse all objects every time a garbage collection could potentially
happen, without modifying any data, just checking the consistency of the
GC headers?
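
To get a rough feel for that cost, a read-only walk over all tracked
objects can be timed from Python. This is only a sketch, not the C-level
GC header check itself: the function name traverse_all_tracked is
hypothetical, and it relies on gc.get_referents() invoking each
object's tp_traverse slot, which is the code path where a half-initialized
object would typically blow up.

```python
import gc
import time


def traverse_all_tracked():
    """Visit every GC-tracked object and its direct referents, read-only.

    Hypothetical helper: gc.get_referents() calls the object's tp_traverse
    slot, so a broken traverse function in a C type is exercised here.
    """
    objects = gc.get_objects()
    referents = 0
    for obj in objects:
        referents += len(gc.get_referents(obj))
    return len(objects), referents


start = time.perf_counter()
n_objects, n_referents = traverse_all_tracked()
elapsed = time.perf_counter() - start
print(f"{n_objects} tracked objects, {n_referents} referents, {elapsed:.4f}s")
```

The timing depends heavily on how many objects the process has alive, so
it is worth measuring inside a real test run rather than a fresh
interpreter.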

It may also be worth writing up suggestions for testing extensions
(including setting a low GC threshold) and including them in the Devguide
(for core developers) and in the "Extending and Embedding" section of the
documentation (for authors of third-party extensions).
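
One such suggestion could look like the sketch below: temporarily lower
the generation 0 threshold while the tests for an extension type run, so
that collections happen while objects are still being constructed. The
helper name low_gc_threshold and the Node class are hypothetical; the
value 5 is the one from Victor's experiment, and only gc.get_threshold()
and gc.set_threshold() from the standard library are assumed.

```python
import contextlib
import gc


@contextlib.contextmanager
def low_gc_threshold(threshold0=5):
    """Temporarily lower the generation 0 threshold (hypothetical helper)."""
    old = gc.get_threshold()
    # Keep the thresholds of the older generations unchanged.
    gc.set_threshold(threshold0, *old[1:])
    try:
        yield
    finally:
        gc.set_threshold(*old)


class Node:
    """Stand-in for an extension type; creates a reference cycle."""

    def __init__(self):
        self.me = self


before = gc.get_threshold()
with low_gc_threshold():
    inside = gc.get_threshold()
    # With a threshold of 5, a collection is triggered after only a few
    # allocations, so it is likely to run during object construction.
    for _ in range(1000):
        Node()
after = gc.get_threshold()
```

Restoring the previous thresholds in the finally clause keeps the rest of
the test suite running with its normal collection schedule.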
