So, on what principled basis do we exempt, say, ints from participating in cyclic GC too? Do we have to "just know" that a cycle can't be reached from an int's type object? If so, is that even true? Or just convenient to pretend to believe to avoid adding 16 more bytes to each int object and grossly slowing gc scans in the presence of many ints? ;-) If it is true, how do we know for which type objects it is and isn't true?
In this case, the int object doesn't hold a strong reference to its type, because int is not a heap type, so that's fine. The problem appeared after the changes made to heap types in 3.8 (https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api).
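As a rough illustration of the static/heap distinction, the sketch below inspects type flags and GC tracking from Python. It relies on CPython implementation details: `type.__flags__` and the `Py_TPFLAGS_HEAPTYPE` bit value (`1 << 9`, taken from CPython's `Include/object.h`). `Point` and `is_heap_type` are hypothetical names used only for this example.

```python
import gc

# Bit value of Py_TPFLAGS_HEAPTYPE in CPython's Include/object.h.
# This is an implementation detail, not a documented Python-level API.
Py_TPFLAGS_HEAPTYPE = 1 << 9


class Point:
    """A class created at runtime, so CPython allocates it as a heap type."""


def is_heap_type(tp):
    # __flags__ exposes tp_flags on CPython.
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


int_is_heap = is_heap_type(int)          # int is a static type
point_is_heap = is_heap_type(Point)      # Point is a heap type

int_tracked = gc.is_tracked(12345)       # ints do not participate in cyclic GC
point_tracked = gc.is_tracked(Point())   # instances of Point do

print(int_is_heap, point_is_heap, int_tracked, point_tracked)
```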
On Thu, 27 May 2021 at 23:12, Tim Peters <tim.peters@gmail.com> wrote:
[Victor Stinner <vstinner@python.org>]
... For a more concrete example, read the "_thread lock traverse" section of my article on these problems: https://vstinner.github.io/subinterpreter-leaks.html
There were two reference cycles, and both were "connected" with a lock object in the middle (look at my drawing). The lock object was *not* directly part of any ref cycle. But because it had no traverse function, the GC failed to break both cycles in a single collection. A second manual GC collection was needed to break the second cycle, to *work around* the issue.
Thanks! Let me correct/refine my characterization. For cyclic GC to work as intended, every object from which a cycle may be reached must participate in cyclic GC, and its tp_traverse must visit every contained object from which a cycle may be reached. Necessary and sufficient, there. Contrary to previous stabs, it's actually irrelevant whether the object may itself be directly in a cycle. As in your example, it can do damage to the intent if an object is merely a (or just on an) acyclic bridge between cycles (even though not _itself_ in a cycle).
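The bridge scenario can be sketched with a toy model of cycle detection. This is a simplification and not CPython's actual implementation: it assumes there are no references from outside the heap, and the names `Obj`, `collect`, and `build_heap` are hypothetical. Each pass computes reference counts, subtracts the references that tracked objects expose through traversal, treats anything with a leftover count as externally reachable, clears the rest, and then lets plain refcounting free whatever drops to zero.

```python
class Obj:
    """Toy heap object; `tracked` stands in for 'has tp_traverse and is GC-tracked'."""

    def __init__(self, name, tracked=True):
        self.name = name
        self.tracked = tracked
        self.refs = []


def collect(heap):
    """Run one toy collection over `heap` (mutated in place); return freed names."""
    # True reference counts come from every live object, tracked or not.
    refcount = {o: 0 for o in heap}
    for o in heap:
        for r in o.refs:
            refcount[r] += 1

    # Subtract only the references the collector can see via traversal.
    gc_refs = {o: refcount[o] for o in heap if o.tracked}
    for o in gc_refs:
        for r in o.refs:
            if r in gc_refs:
                gc_refs[r] -= 1

    # Anything with a leftover count looks externally referenced.
    reachable = set()
    stack = [o for o, n in gc_refs.items() if n > 0]
    while stack:
        o = stack.pop()
        if o in reachable:
            continue
        reachable.add(o)
        stack.extend(r for r in o.refs if r in gc_refs)

    # Break the cycles among unreachable tracked objects.
    for o in gc_refs:
        if o not in reachable:
            o.refs.clear()

    # Plain refcounting then frees everything with no incoming references.
    freed = []
    while True:
        incoming = {o: 0 for o in heap}
        for o in heap:
            for r in o.refs:
                incoming[r] += 1
        dead = [o for o in heap if incoming[o] == 0]
        if not dead:
            break
        for o in dead:
            heap.remove(o)
            freed.append(o.name)
    return sorted(freed)


def build_heap(lock_tracked):
    """Cycle a1<->a2, an acyclic bridge a1 -> lock -> t1, and cycle t1<->t2."""
    a1, a2, t1, t2 = Obj("a1"), Obj("a2"), Obj("t1"), Obj("t2")
    lock = Obj("lock", tracked=lock_tracked)
    a1.refs += [a2, lock]
    a2.refs.append(a1)
    lock.refs.append(t1)
    t1.refs.append(t2)
    t2.refs.append(t1)
    return [a1, a2, lock, t1, t2]


heap = build_heap(lock_tracked=False)
first_pass = collect(heap)    # only the first cycle and the lock go away
second_pass = collect(heap)   # a second pass is needed for the other cycle

single_pass = collect(build_heap(lock_tracked=True))
print(first_pass, second_pass, single_pass)
```

In this model the untracked lock makes `t1` look externally referenced during the first pass, so the second cycle survives until the lock itself has been freed. With the lock tracked and traversed, one pass finds all five objects.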
So, if we're saying that type objects in general may be in cycles, then it's necessary that absolutely every object participate in GC, and its tp_traverse must visit absolutely every object it points to (every object points to a type object, as does every contained object likewise: there are potential cycles everywhere).
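To make "type objects may be in cycles" concrete, here is a small sketch assuming CPython behaviour. A class created at runtime typically sits in reference cycles of its own (for example, its `__mro__` tuple contains the class itself), and the example adds an explicit class -> instance -> class cycle on top. `make_class_cycle` and `Node` are hypothetical names.

```python
import gc
import weakref


def make_class_cycle():
    class Node:  # created at runtime, so a heap type on CPython
        pass

    # The class dict references the instance, and the instance references
    # its type, so the type object is part of a cycle.
    Node.instance = Node()
    return weakref.ref(Node)


gc.collect()   # start from a clean state
gc.disable()   # keep automatic collections from interfering
try:
    ref = make_class_cycle()
    alive_before = ref() is not None   # refcounting alone cannot free the class
    gc.collect()
    alive_after = ref() is not None    # the cyclic collector reclaims it
finally:
    gc.enable()

print(alive_before, alive_after)
```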
So, on what principled basis do we exempt, say, ints from participating in cyclic GC too? Do we have to "just know" that a cycle can't be reached from an int's type object? If so, is that even true? Or just convenient to pretend to believe to avoid adding 16 more bytes to each int object and grossly slowing gc scans in the presence of many ints? ;-) If it is true, how do we know for which type objects it is and isn't true?