IMPORTANT: Python 3.10b2 release blockers
Hi,
Tomorrow is the scheduled release of Python 3.10 beta 2 but unfortunately we have several release blockers:
https://bugs.python.org/issue41282: Deprecate and remove distutils
https://bugs.python.org/issue40222: "Zero cost" exception handling
https://bugs.python.org/issue42972: [C API] Heap types (PyType_FromSpec) must fully implement the GC protocol
https://bugs.python.org/issue44043: 3.10 b1 armhf Bus Error in hashlib test: test_gil
We also have the address sanitizer buildbot failing:
https://buildbot.python.org/all/#/builders/582/builds/165/steps/5/logs/stdio
and some segmentation faults on the fedora stable buildbot:
https://buildbot.python.org/all/#/builders/543/builds/190
Some of these issues have PRs, but some do not. If you are involved in, or maintain, one of the areas these issues touch, please take a look at them and do one of the following:
- Fix the issue by making a PR
- Review an existing PR and / or merge it
- If you intend to mark it as a deferred blocker, please provide a rationale and contact me first by pinging me in the issue.
Until these issues are fixed or deferred, the release team will not be able to make a new beta release.
Thanks for your help,
Regards from stormy London, Pablo Galindo Salgado
Small correction:
https://bugs.python.org/issue40222: "Zero cost" exception handling
and the segfaults on these buildbots:
https://buildbot.python.org/all/#/builders/582/builds/165/steps/5/logs/stdio
https://buildbot.python.org/all/#/builders/543/builds/190
are 3.11 (main branch) only, but they are also quite important to get fixed as soon as possible, so the buildbot failures don't pile up.
On the other hand, it seems that there is a nasty race condition in test_asyncio, and many of the refleak builders for 3.10 hang, rendering them useless:
https://buildbot.python.org/all/#/builders/693/builds/21
https://buildbot.python.org/all/#/builders/677/builds/22
https://buildbot.python.org/all/#/builders/669/builds/22
...
You can access the release dashboard for the buildbots here:
https://buildbot.python.org/all/#/release_status
Hi,
Friendly reminder that the Python 3.10 beta 2 is still blocked on:
https://bugs.python.org/issue42972
Thanks for your help,
Regards from stormy London, Pablo Galindo Salgado
On 5/26/21 7:21 AM, Pablo Galindo Salgado wrote:
Hi,
Friendly reminder that the Python 3.10 beta 2 is still blocked on:
https://bugs.python.org/issue42972
Thanks for your help,
Regards from stormy London, Pablo Galindo Salgado
I took a quick look at that issue. My cursory understanding is that there's a gigantic list of work that just needs seasoned CPython core devs to power through--specifically, implementing full GC protocol for a long list of extension types that (I'm assuming) happen to hold references to other objects. Is this friendly reminder your way of asking for volunteers to carve off a chunk and do the work?
Cheers,
//arry/
The friendly reminder is so we don't forget that we are blocked by the issue. It is also a small and friendly ping to the core devs who made the changes that require this fix in the first place, asking them to get involved in the issue, as the process is blocked by changes they authored or reviewed and merged.
Regarding the work itself, Erlend has created a bunch of pull requests that cover most of it. I am reviewing them as fast as possible but, as you said, there is a lot to do, so I would certainly welcome more reviews.
Hi Pablo,
could you or Erlend please explain why types which don't reference any other objects need to participate in GC for deallocation?
Many PRs or checked in patches only do this:
+static int
+ucd_traverse(PreviousDBVersion *self, visitproc visit, void *arg)
+{
+    Py_VISIT(Py_TYPE(self));
+    return 0;
+}
AFAIK (but could be wrong, of course), the type object itself does not reference any other objects related to the object that is being GCed.
By having (nearly) all stdlib types participate in GC, even ones which don't reference other objects and cannot be parts of reference circles, instead of immediately deleting them, we will keep those objects alive for much longer than necessary, potentially causing a resource overhead regression.
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Experts (#1, May 27 2021)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/
Hi Marc,
Yes, check out this from the 3.9 what's new document:
https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api
Instances of heap-allocated types (such as those created with PyType_FromSpec() and similar APIs) hold a reference to their type object since Python 3.8. As indicated in the “Changes in the C API” of Python 3.8, for the vast majority of cases, there should be no side effect but for types that have a custom tp_traverse function, ensure that all custom tp_traverse functions of heap-allocated types visit the object’s type.
Example:
int
foo_traverse(foo_struct *self, visitproc visit, void *arg)
{
    // Rest of the traverse function
#if PY_VERSION_HEX >= 0x03090000
    // This was not needed before Python 3.9 (Python issue 35810 and 40217)
    Py_VISIT(Py_TYPE(self));
#endif
    return 0;
}
If your traverse function delegates to tp_traverse of its base class (or another type), ensure that Py_TYPE(self) is visited only once. Note that only heap types are expected to visit the type in tp_traverse.
For example, if your tp_traverse function includes:
    base->tp_traverse(self, visit, arg)
then add:
#if PY_VERSION_HEX >= 0x03090000
    // This was not needed before Python 3.9 (Python issue 35810 and 40217)
    if (base->tp_flags & Py_TPFLAGS_HEAPTYPE) {
        // a heap type's tp_traverse already visited Py_TYPE(self)
    } else {
        Py_VISIT(Py_TYPE(self));
    }
#endif
(See bpo-35810 and bpo-40217 for more information.)
Regards from sunny London, Pablo
[Pablo Galindo Salgado <pablogsal@gmail.com>]
Hi Marc,
Yes, check out this from the 3.9 what's new document:
https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api
Instances of heap-allocated types (such as those created with PyType_FromSpec() and similar APIs) hold a reference to their type object since Python 3.8. ...
I think Marc-Andre's question is more subtle than that:
[Marc-Andre Lemburg, <mal@egenix.com>]
Hi Pablo,
could you or Erlend please explain why types which don't reference any other objects need to participate in GC for deallocation ? ... AFAIK (but could be wrong, of course), the type object itself does not reference any other objects related to the object that is being GCed.
To make that more formal, if there's a chain of pointers from an object X that reaches X again (X is a direct member of a cycle), then for cyclic gc to work X's tp_traverse must visit all directly contained objects that could be the second object in a direct cycle reaching X again.
But, for example, if X contains a tuple of integers, there's no need for X's tp_traverse to visit that tuple. It's impossible to reach X from the tuple. Cyclic gc cares only about cycles; chasing non-cyclic dead ends doesn't accomplish anything beyond burning processor cycles.
This is, I believe, akin to what Marc-Andre is bringing up: if X can't be reached _from_ X's type object, there's no need for X's tp_traverse to visit X's type object. It _can_ be visited, but it would be a waste of time.
... By having (nearly) all stdlib types participate in GC, even ones which don't reference other objects and cannot be parts of reference circles, instead of immediately deleting them, we will keep those objects alive for much longer than necessary, potentially causing a resource overhead regression.
But I think "waste of time" is the worst of it. Participating in cyclic gc does nothing to delay refcounting from recycling objects ASAP. gc only reclaims objects that are reachable only from dead cycles; everything else in CPython is reclaimed the instant its refcount falls to 0, and that's so regardless of whether it participates in cyclic gc.
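The refcounting point above can be checked from pure Python. What follows is a minimal sketch, assuming CPython in its default (non-free-threaded) build; the class name Node and both helper functions are made up for illustration and do not come from the thread. It shows a GC-tracked object being freed the moment its last reference goes away, even with the cyclic collector disabled, while a self-referencing object has to wait for gc.collect().

```python
import gc
import weakref


class Node:
    """Hypothetical class; its instances are tracked by the cyclic GC."""


def reclaimed_without_gc():
    gc.disable()  # rule out any automatic collection
    try:
        obj = Node()
        tracked = gc.is_tracked(obj)
        probe = weakref.ref(obj)
        del obj  # last reference gone: refcounting frees it right here
        return tracked, probe() is None
    finally:
        gc.enable()


def cycle_needs_gc():
    gc.disable()
    try:
        obj = Node()
        obj.me = obj  # self-reference: the refcount can never reach 0
        probe = weakref.ref(obj)
        del obj
        alive_before = probe() is not None
        gc.collect()  # only the cyclic collector can reclaim it
        return alive_before, probe() is None
    finally:
        gc.enable()


print(reclaimed_without_gc())
print(cycle_needs_gc())
```

Under those assumptions both calls should report a pair of true values: the first object is tracked yet freed without any collection, and the second survives until the explicit gc.collect().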
On 27.05.2021 19:40, Tim Peters wrote:
[Pablo Galindo Salgado <pablogsal@gmail.com>]
Hi Marc,
Yes, check out this from the 3.9 what's new document:
https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api
Instances of heap-allocated types (such as those created with PyType_FromSpec() and similar APIs) hold a reference to their type object since Python 3.8. ...
Thanks for these details. I wasn't aware of that change.
So do all those instances have a larger memory footprint than in Python 3.7?
I think Marc-Andre's question is more subtle than that:
[Marc-Andre Lemburg, <mal@egenix.com>]
Hi Pablo,
could you or Erlend please explain why types which don't reference any other objects need to participate in GC for deallocation ? ... AFAIK (but could be wrong, of course), the type object itself does not reference any other objects related to the object that is being GCed.
To make that more formal, if there's a chain of pointers from an object X that reaches X again (X is a direct member of a cycle), then for cyclic gc to work X's tp_traverse must visit all directly contained objects that could be the second object in a direct cycle reaching X again.
But, for example, if X contains a tuple of integers, there's no need for X's tp_traverse to visit that tuple. It's impossible to reach X from the tuple. Cyclic gc cares only about cycles; chasing non-cyclic dead ends doesn't accomplish anything beyond burning processor cycles.
This is, I believe, akin to what Marc-Andre is bringing up: if X can't be reached _from_ X's type object, there's no need for X's tp_traverse to visit X's type object. It _can_ be visited, but it would be a waste of time.
Indeed, that's what I was after. GC is really only needed for objects which can potentially participate in cycles which need to be reclaimed. As I understand the logic (even with the objects referencing the constant type objects on the heap), many of the objects which are now made GC traversal aware don't really have a need for it -- their traversal only finds type objects, no other objects.
... By having (nearly) all stdlib types participate in GC, even ones which don't reference other objects and cannot be parts of reference circles, instead of immediately deleting them, we will keep those objects alive for much longer than necessary, potentially causing a resource overhead regression.
But I think "waste of time" is the worst of it. Participating in cyclic gc does nothing to delay refcounting from recycling objects ASAP. gc only reclaims objects that are reachable only from dead cycles; everything else in CPython is reclaimed the instant its refcount falls to 0, and that's so regardless of whether it participates in cyclic gc.
Oh, thanks for the explanation. I was under the impression that GC-aware objects are added to a GC pool for processing at the next GC run. If that's not the case in general -- only if they are part of dead cycles -- then it's merely wasting time on traversing known dead ends... and developer time for adding the unnecessary logic ;-)
Perhaps this provides an easy way to unblock the release :-)
[Tim]
But I think "waste of time" is the worst of it. Participating in cyclic gc does nothing to delay refcounting from recycling objects ASAP. gc only reclaims objects that are reachable only from dead cycles; everything else in CPython is reclaimed the instant its refcount falls to 0, and that's so regardless of whether it participates in cyclic gc.
[Marc-Andre Lemburg]
Oh, thanks for the explanation. I was under the impression that GC-aware objects are added to a GC pool for processing at the next GC run.
At creation, they're added to "the youngest GC generation", which is a doubly-linked list. It's doubly-linked precisely so (among other things ;-) ) they can be removed from the youngest generation in O(1) time if refcounting destroys them before the next GC run. GC only looks at the objects still in the doubly-linked list when it runs. The only effect it has on refcounting is to increase the time it takes refcounting to do reclamation (refcounting has to adjust the double-linked list pointers, to reflect that the object no longer exists).
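The youngest-generation bookkeeping described above can be observed with the gc module. This is a rough sketch, assuming CPython 3.8 or later (for the generation argument of gc.get_objects()) in its default build; Tracked, in_youngest_generation and observe are hypothetical names used only here.

```python
import gc


class Tracked:
    """Hypothetical class; instances of ordinary classes are GC-tracked."""


def in_youngest_generation(kind):
    # Scan the youngest generation for any object whose exact type is `kind`.
    return any(type(o) is kind for o in gc.get_objects(generation=0))


def observe():
    gc.disable()  # keep automatic collections from promoting the object
    try:
        obj = Tracked()
        linked = in_youngest_generation(Tracked)
        del obj  # refcount reaches 0: unlinked from the list and freed
        unlinked = not in_youngest_generation(Tracked)
        return linked, unlinked
    finally:
        gc.enable()


linked, unlinked = observe()
print(linked, unlinked)
```

The object shows up in generation 0 right after creation and is gone from it as soon as refcounting destroys it, without any collection having run.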
If that's not the case in general -- only if they are part of dead cycles -- then it's merely wasting time on traversing known dead ends... and developer time for adding the unnecessary logic ;-)
And some conceptual confusion. For example, I noted in a later message that we've apparently added a tp_traverse to regexp pattern objects. "Why?" is baffling to me: how could they possibly participate in a cycle? My best guess is that there's no need for them to participate in cyclic gc at all, despite that - sure - they have a type pointer.
"Why?" is baffling to me: how could they possibly participate in a cycle?
If the type object is a heap type (mutable by default), someone could just add to it a reference that puts it in a cycle with the instance.
Even if that's not the case, IIRC, since the type refers to the module and the module to its globals, it is enough to place an instance of the type somewhere reachable from the globals of the module where the type object lives to get a cycle involving the type object and the instance.
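The first scenario, a reference added directly to a mutable heap type, has a simple pure-Python analogue. The sketch below assumes CPython; Widget, singleton and make_cycle_through_type are invented names, and a class built with type() stands in for a C heap type created with PyType_FromSpec().

```python
import gc
import weakref


def make_cycle_through_type():
    # type() builds a new heap-allocated class at run time.
    Widget = type("Widget", (), {})
    instance = Widget()
    # The instance already refers to its type; this adds type -> instance.
    Widget.singleton = instance
    return weakref.ref(instance)


gc.disable()  # make sure only the explicit collect() below runs
probe = make_cycle_through_type()
alive_after_refcounting = probe() is not None  # the cycle keeps it alive
gc.collect()
freed_by_gc = probe() is None
gc.enable()
print(alive_after_refcounting, freed_by_gc)
```

Refcounting alone cannot free the instance here, because the type keeps it alive and it keeps the type alive; the collector can only see that if the traversal from the instance reaches the type.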
So do all those instances have a larger memory footprint than in Python 3.7?
I am afraid that's the case. This is one of the costs of converting static types to heap types.
[Tim Peters <tim.peters@gmail.com>]
... This is, I believe, akin to what Marc-Andre is bringing up: if X can't be reached _from_ X's type object, there's no need for X's tp_traverse to visit X's type object. It _can_ be visited, but it would be a waste of time.
Ya, I need to retract that :-) If X's type object is in a cycle not directly containing X, and X is in a cycle not directly containing its type object, and both cycles are dead, then X's tp_traverse must visit X's type object for cyclic gc to deduce that the cycle directly containing the type object _is_ dead.
Else only the cycle containing X will be reclaimed, and the cycle containing the type object will have to wait for another gc run.
But that doesn't apply to some of the patches we're seeing. We're seeing visits to things that can't possibly be parts of cycles:
+static int
+pattern_traverse(PatternObject *self, visitproc visit, void *arg)
+{
+    Py_VISIT(Py_TYPE(self));
+    Py_VISIT(self->groupindex);
+    Py_VISIT(self->indexgroup);
+    Py_VISIT(self->pattern);
+    return 0;
+}
For example, self->pattern there is a string. For that matter, it's hard to conceive of how a regexp pattern object could possibly be a direct member of any cycle.
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
But that could happen easily for heap types as they are mutable by default. For instance, you set the instance in a global:
type -> module -> globals -> instance -> type
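A cycle of this shape can be approximated in pure Python. The sketch below assumes CPython 3.4 or later (where deallocating a module does not clear its dict); it is only an analogue, since a Python class reaches the module globals through a method's __globals__ rather than through a module pointer stored on the type. The module name demo_mod, the class Widget and the helper build_module are hypothetical.

```python
import gc
import types
import weakref

SOURCE = """
class Widget:
    def method(self):
        return None

instance = Widget()  # a module global holding an instance of the type
"""


def build_module():
    mod = types.ModuleType("demo_mod")  # never registered in sys.modules
    exec(SOURCE, mod.__dict__)
    # Chain: Widget -> method -> globals dict -> instance -> Widget
    return weakref.ref(mod.Widget)


gc.disable()  # make sure only the explicit collect() below runs
type_probe = build_module()  # the module object itself is already gone
type_alive = type_probe() is not None
gc.collect()
type_freed = type_probe() is None
gc.enable()
print(type_alive, type_freed)
```

Dropping the module is not enough to release the type: the globals dict, the instance and the type keep one another alive until the cyclic collector runs.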
On 27.05.2021 20:20, Pablo Galindo Salgado wrote:
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
But that could happen easily for heap types as they are mutable by default. For instance, you set the instance in a global:
type -> module -> globals -> instance -> type
Module dicts are cleared during interpreter shutdown to break such cycles. You would not really want to use GC to clear type objects: if the GC collected the type object before the instance, it would create situations that are really difficult to handle :-)
Also: why are types mutable? AFAIK, they are only meant to be mutable at the C level, and then only until they are fully initialized (PyType_Ready() called).
Module dicts are cleared during interpreter shutdown to break such cycles.
That is precisely what's not working because of the cycles; see the first message in the issue:
https://bugs.python.org/msg385297
Also: why are types mutable? AFAIK, they are only meant to be mutable at the C level, and then only until they are fully initialized (PyType_Ready() called).
See https://bugs.python.org/issue43916
On Thu, 27 May 2021 at 19:38, Marc-Andre Lemburg <mal@egenix.com> wrote:
On 27.05.2021 20:20, Pablo Galindo Salgado wrote:
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
But that could happen easily for heap types as they are mutable by default. For instance, you set the instance in a global:
type -> module -> globals -> instance -> type
Modules dicts are cleared during interpreter shutdown to break such cycles. You would not really want to use GC to clear type objects: if you GC the type object before the instance, this would create really difficult to handle situations :-)
Also: Why are types mutable ? AFAIK, they are only meant to be mutable at C level and then only until they are fully initialized (PyType_Ready() called).
Sorry, the last link should have been:
https://bugs.python.org/issue43908
[Tim]
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
[Pablo]
But that could happen easily for heap types as they are mutable by default. For instance, you set the instance in a global:
type -> module -> globals -> instance -> type
Sorry, but this is apparently a rat's nest and that's too sketchy for me ;-)
Can you flesh this out for what stumbled into being my running example? That is, how could a regexp pattern object be part of a cycle?
>>> import re
>>> p = re.compile("ab*c")
>>> type(p)
<class 're.Pattern'>
>>> type(p).__module__
're'
>>> type(type(p).__module__)
<class 'str'>
That is, its __module__ attribute is a string, not a module object.
And, in general, I can't add attributes to p, or change their bindings:
>>> p.flags
32
>>> p.flags = re
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readonly attribute
>>> p.newattr = re
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 're.Pattern' object has no attribute 'newattr'
I just don't see a way to trick it into being part of a cycle.
Not denying that safe is better than sorry, though.
Can you flesh this out for what stumbled into being my running example? That is, how could a regexp pattern object be part of a cycle?
Let me try to remember when we saw this problem in the past, but at first sight, it seems that this indeed cannot happen in the regular case.
And, in general, I can't add attributes to p, or change their bindings:
That's after the changes to add a new Py_TPFLAGS_IMMUTABLETYPE flag, which predates the creation of the issue we are discussing. But in general, new heap types are not immutable by default (check https://bugs.python.org/issue43908). For instance, in python3.8:
>>> array.array.x = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'array.array'
but in python 3.10.0a7 (before Py_TPFLAGS_IMMUTABLETYPE):
Python 3.10.0a7 (tags/v3.10.0a7:53e55290cf, May 27 2021, 20:20:26) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import array
>>> instance = array.array("f")
>>> type = array.array
>>> type.instance = instance
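The same "type -> instance -> type" shape can be reproduced without any extension module, since every class defined in Python is a heap type. The following is only a minimal sketch of that idea; the names `Heap`, `build_cycle` and `static_type_is_locked` are made up for illustration and do not come from the issue.

```python
import gc
import weakref


def static_type_is_locked():
    """Return True if a static type such as int refuses new attributes."""
    try:
        int.instance = 0
    except TypeError:
        return True
    return False


def build_cycle():
    """Create a heap type whose attribute points back at one of its instances."""
    class Heap:  # hypothetical name; any Python-level class is a heap type
        pass

    Heap.instance = Heap()  # type -> instance -> type
    return weakref.ref(Heap)


gc.disable()  # keep an automatic collection from running mid-demo
try:
    type_ref = build_cycle()
    alive_before = type_ref() is not None  # refcounting alone cannot free it
    gc.collect()
    alive_after = type_ref() is not None   # the cyclic GC reclaimed the type
finally:
    gc.enable()

print(static_type_is_locked(), alive_before, alive_after)  # True True False
```

Here the cycle is collected because both the class and its instance take part in cyclic GC; the discussion in this thread is about C-level heap types whose instances do not yet do so.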
[Tim]
Can you flesh this out for what stumbled into being my running example? That is, how could a regexp pattern object be part of a cycle?
Found the problem that we saw regarding a cycle involving the type. That comes from this comment:
https://github.com/python/cpython/pull/23811#issuecomment-747788766
Tim, check this out:
>>> import re, gc
>>> x = re.compile("x")
>>> gc.get_referents(x.__class__)[-1]
<module '_sre' (built-in)>
That seems due to:
https://github.com/python/cpython/blob/e90e0422182f4ca7faefd19c629f84aebb34e...
[Pablo]
Tim, check this out:
>>> import re, gc
>>> x = re.compile("x")
>>> gc.get_referents(x.__class__)[-1]
<module '_sre' (built-in)>

Cool! So presumably this constructs a cycle involving a pattern object:

>>> import re
>>> p = re.compile("ab*c")
>>> import _sre
>>> _sre.WOWZA = p

Indeed, under the current main branch:

>>> import gc
>>> gc.get_referents(type(p))[-1].WOWZA is p
True

Then again, type objects are in cycles regardless:

>>> type(p).__mro__[0] is type(p)
True
>>> dict.__mro__[0] is dict
True
>>> int.__mro__[0] is int
True
Etc.
I suppose I could ask why heap types were fiddled to point to their module objects too - but that's really got nothing to do with getting the release done, so I won't :-)
On Fri, May 28, 2021 at 6:55 AM Tim Peters <tim.peters@gmail.com> wrote:
I suppose I could ask why heap types were fiddled to point to their module objects too - but that's really got nothing to do with getting the release done, so I won't :-)
PyHeapTypeObject.ht_module was added by the PEP 573 "Module State Access from C Extension Methods": https://www.python.org/dev/peps/pep-0573/
It is set if you create a type using PyType_FromModuleAndSpec(). Why does it matter to store the module? Well, previously, it was not possible to unload a C extension, which prevented memory from being released at Python exit, and many Python objects survived Py_Finalize(). This behavior is unpleasant and can cause issues when Python is embedded. The issue gets worse if you consider the subinterpreter use case (also when embedding Python). See https://bugs.python.org/issue1635741: "Py_Finalize() doesn't clear all Python objects at exit".
What is the relationship between unloading C extensions and PyHeapTypeObject.ht_module? Well, to be able to unload an extension, the extension must implement the multiphase initialization API (PEP 489) and it must be possible to create multiple instances of the same extension. For correctness, it becomes important that each extension instance has its own types and its own variable values. Global variables must be made per module instance.
In practice, it means that global variables must be moved into a "module state": PyModule_GetState(). The new problem becomes: how can you access the module state from type methods, which only get the instance as a parameter and not the module? This is where PyHeapTypeObject.ht_module comes into play: PyType_GetModule() gives you the module.
Simple example:
static int
_abcmodule_exec(PyObject *module)
{
    ...
    state->_abc_data_type = (PyTypeObject *)PyType_FromModuleAndSpec(module, &_abc_data_type_spec, NULL);
    if (state->_abc_data_type == NULL) {
        return -1;
    }
    ...
}

static PyObject *
abc_data_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    ...
    state = PyType_GetModuleState(type);
    ...
}

The tp_new function of the _abc._abc_data type gets the type. It retrieves the module state using PyType_GetModuleState().
This example is the simple case, since the type cannot be subclassed (it doesn't have the Py_TPFLAGS_BASETYPE flag). Otherwise, you would need the newly added _PyType_GetModuleByDef() function to find the right type in the MRO, since PyHeapTypeObject.ht_module can be different in a subclass (it can be NULL). You need to get the "defining class" to retrieve the module state related to your function. For example, for an ABC type, you would like to retrieve the ABC module state in your ABC functions.
In the end, the great news is that extension modules behave like modules implemented in Python: each extension instance has its own types, variable values, internal state, etc. For example, "reloading" an extension to get a "fresh module" for unit tests works as expected. Moreover, it really is possible to fully unload an extension.
Finally, if your heap types fully implement the GC protocol (traverse, clear, GC type flag), unloading the extension is able to destroy all Python objects that it created ;-) (except for objects that you kept alive on purpose, which can keep the extension partially alive ;-))
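As a loose pure-Python analogy (not the C API itself), the "each instance has its own types and state" property can be sketched by executing the same source into two separate module objects. Everything named here (`SOURCE`, `load_instance`, `demo_first`, `demo_second`, `Data`, `counter`) is hypothetical and exists only for this illustration.

```python
import types

# Source shared by every "instance" of the toy module.
SOURCE = '''
counter = 0  # per-module state, loosely comparable to PyModule_GetState()

class Data:
    def bump(self):
        # The method reaches the state of its defining module through its
        # globals, loosely comparable to PyType_GetModuleState(type) in C.
        global counter
        counter += 1
        return counter
'''


def load_instance(name):
    """Create a fresh module object and run SOURCE inside its namespace."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


first = load_instance("demo_first")
second = load_instance("demo_second")

first.Data().bump()
first.Data().bump()
second.Data().bump()

print(first.counter, second.counter)  # 2 1
print(first.Data is second.Data)      # False
```

In C there is no per-function globals dict to lean on, which is why the type has to carry a pointer to its module.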
Hey, it's a funny issue!
For more context about these issues and subinterpreters, see the talk that Dong-hee Na and I gave at the Language Summit:
- https://pyfound.blogspot.com/2021/05/the-2021-python-language-summit_16.html
- https://github.com/vstinner/talks/raw/main/2021-PyconUS/subinterpreters.pdf
Victor
Night gathers, and now my watch begins. It shall not end until my death.
On 28. 05. 21 6:55, Tim Peters wrote:
I suppose I could ask why heap types were fiddled to point to their module objects too - but that's really got nothing to do with getting the release done, so I won't :-)
Not all heap types need to point to their module objects. Just the ones that need the module's state -- for example, if their method needs to raise a module-specific exception.
On Thu, May 27, 2021 at 8:24 PM Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
And if a type pointer is the only thing being visited, then there's no point unless the object can itself be reachable from the type object.
But that could happen easily for heap types as they are mutable by default. For instance, you set the instance in a global:
type -> module -> globals -> instance -> type
Immutable heap types have the same leak problem. Making a mutable heap type immutable doesn't solve the problem.
Example of my msg385297:
"If one of these conditions is not met, the GC can fail to destroy a type during a GC collection. If an instance is kept alive late while a Python interpreter is being deleted, it's possible that the type is never deleted, which can keep indirectly many objects alive and so don't delete them neither."
For a more concrete example, read the "_thread lock traverse" section of my article on these problems: https://vstinner.github.io/subinterpreter-leaks.html
There were two reference cycles, and both were "connected" with a lock object in the middle (look at my drawing). The lock object was *not* directly part of any ref cycle. But because it had no traverse function, the GC failed to break both cycles in a single collection. A second manual GC collection was needed to break the second cycle, to *work around* the issue.
The problem is that a small reference cycle can keep tons of Python objects alive way longer than expected. If a heap type doesn't fully implement the GC protocol, in the worst case, these objects are never deleted.
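The "bridge between two cycles" effect can be reproduced with a toy model of the collector. This is only a sketch under simplifying assumptions: `Obj`, `link`, `destroy`, `collect` and `survivors` are invented names, the refcount-subtraction pass is a simplification of what CPython does, and real collection goes through tp_clear rather than the direct destruction used here.

```python
class Obj:
    """Toy object: a refcount plus a list of outgoing strong references."""

    def __init__(self, name, tracked=True):
        self.name = name
        self.tracked = tracked  # False models a type without tp_traverse
        self.refs = []
        self.refcount = 0
        self.alive = True


def link(src, dst):
    src.refs.append(dst)
    dst.refcount += 1


def destroy(obj):
    """Free obj and drop its references, as plain refcounting would."""
    if not obj.alive:
        return
    obj.alive = False
    for target in obj.refs:
        target.refcount -= 1
        if target.refcount == 0:
            destroy(target)


def collect(heap):
    """One GC pass: subtract internal references, keep what roots reach."""
    tracked = [o for o in heap if o.alive and o.tracked]
    gc_refs = {o: o.refcount for o in tracked}
    for o in tracked:
        for target in o.refs:  # only tracked objects get traversed
            if target in gc_refs:
                gc_refs[target] -= 1
    pending = [o for o in tracked if gc_refs[o] > 0]  # referenced from outside
    live = set(pending)
    while pending:
        for target in pending.pop().refs:
            if target in gc_refs and target not in live:
                live.add(target)
                pending.append(target)
    for o in tracked:
        if o not in live:
            destroy(o)


def survivors(bridge_tracked, passes=1):
    """Build cycle A -> lock -> cycle B with no external references."""
    a1, a2, b1, b2 = (Obj(n) for n in ("a1", "a2", "b1", "b2"))
    lock = Obj("lock", tracked=bridge_tracked)
    link(a1, a2); link(a2, a1)          # cycle A
    link(b1, b2); link(b2, b1)          # cycle B
    link(a2, lock); link(lock, b1)      # acyclic bridge between them
    heap = [a1, a2, lock, b1, b2]
    for _ in range(passes):
        collect(heap)
    return [o.name for o in heap if o.alive]


print(survivors(bridge_tracked=False))            # ['b1', 'b2']
print(survivors(bridge_tracked=False, passes=2))  # []
print(survivors(bridge_tracked=True))             # []
```

With the untracked lock, its reference to b1 looks like an external reference during the first pass, so cycle B is only found dead after cycle A and the lock are gone.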
Victor
[Victor Stinner <vstinner@python.org>]
... For a more concrete example, read the "_thread lock traverse" section of my article on these problems: https://vstinner.github.io/subinterpreter-leaks.html
There were two reference cycles, and both were "connected" with a lock object in the middle (look at my drawing). The lock object was *not* directly part of any ref cycle. But because it had no traverse function, the GC failed to break both cycles in a single collection. A second manual GC collection was needed to break the second cycles, to *work around* the issue.
Thanks! Let me correct/refine my characterization. For cyclic GC to work as intended, every object from which a cycle may be reached must participate in cyclic GC, and its tp_traverse must visit every contained object from which a cycle may be reached. Necessary and sufficient, there. Contrary to previous stabs, it's actually irrelevant whether the object may itself be directly in a cycle. As in your example, it can do damage to the intent if an object is merely a (or just on an) acyclic bridge between cycles (even though not _itself_ in a cycle).
So, if we're saying that type objects in general may be in cycles, then it's necessary that absolutely every object participate in GC, and its tp_traverse must visit absolutely every object it points to (every object points to a type object, as does every contained object likewise: there are potential cycles everywhere).
So, on what principled basis do we exempt, say, ints from participating in cyclic GC too? Do we have to "just know" that a cycle can't be reached from an int's type object? If so, is that even true? Or just convenient to pretend to believe to avoid adding 16 more bytes to each int object and grossly slowing gc scans in the presence of many ints? ;-) If it is true, how do we know for which type objects it is and isn't true?
So, on what principled basis do we exempt, say, ints from participating in cyclic GC too? Do we have to "just know" that a cycle can't be reached from an int's type object? If so, is that even true? Or just convenient to pretend to believe to avoid adding 16 more bytes to each int object and grossly slowing gc scans in the presence of many ints? ;-) If it is true, how do we know for which type objects it is and isn't true?
In this case, the int object doesn't have a reference to its type because it is not a heap type, so that's fine. The problem appeared after the changes made in 3.8 to heap types (https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api)
[Tim]
So, on what principled basis do we exempt, say, ints from participating in cyclic GC too?
[Pablo]
In this case, the int object doesn't have a reference to its type because is not a heap type so that's fine.
That baffled me at first, because _every_ object contains a pointer to its type object. But in the case of int, that doesn't count as "having a reference" to you, probably because of this:
static inline void
_PyObject_Init(PyObject *op, PyTypeObject *typeobj)
{
    assert(op != NULL);
    Py_SET_TYPE(op, typeobj);
    if (_PyType_HasFeature(typeobj, Py_TPFLAGS_HEAPTYPE)) {
        Py_INCREF(typeobj);
    }
    _Py_NewReference(op);
}
So, ya, an int does have a pointer to its type object, but it's effectively exempted from even the refcounting system. That makes the int type object effectively immortal, which would in turn make any otherwise-dead cycle reachable from the int type object immortal too.
So no point in CPython's peculiar form of GC chasing it (unlike, e.g., mark-&-sweep, CPython doesn't try to find the set of live objects).
This isn't an entirely happy state of affairs ;-)
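The difference can be observed from Python, at least on a default (GIL) CPython build; free-threaded builds and other implementations may report different numbers. This is a rough sketch, and `Heap` and `refcount_delta` are hypothetical names used only here.

```python
import gc
import sys


class Heap:  # any class defined in Python is a heap type
    pass


def refcount_delta(type_obj, make_instance, count=100):
    """Measure how much creating `count` instances moves the type's refcount."""
    before = sys.getrefcount(type_obj)
    instances = [make_instance(i) for i in range(count)]
    after = sys.getrefcount(type_obj)
    del instances
    return after - before


heap_delta = refcount_delta(Heap, lambda i: Heap())
static_delta = refcount_delta(int, lambda i: 10**9 + i)  # fresh int objects

print("heap type refcount delta:", heap_delta)
print("static type refcount delta:", static_delta)
print("int instance tracked by gc:", gc.is_tracked(10**100))
print("heap type tracked by gc:", gc.is_tracked(Heap))
```

On such a build one would expect each Heap instance to add one reference to its type while new ints leave the refcount of int unchanged, matching the Py_INCREF guarded by Py_TPFLAGS_HEAPTYPE above.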
On 25/05/2021 00.45, Pablo Galindo Salgado wrote:
Hi,
Tomorrow is the scheduled release of Python 3.10 beta 2 but unfortunately we have several release blockers:
https://bugs.python.org/issue44043: 3.10 b1 armhf Bus Error in hashlib test: test_gil
The problem is already fixed. I forgot to close the release blocker bug after Greg and I took care of https://bugs.python.org/issue36515.
Christian
participants (7)
- Christian Heimes
- Larry Hastings
- Marc-Andre Lemburg
- Pablo Galindo Salgado
- Petr Viktorin
- Tim Peters
- Victor Stinner