On 27.05.2021 19:40, Tim Peters wrote:
[Pablo Galindo Salgado pablogsal@gmail.com]
Hi Marc,
Yes, check out this from the 3.9 what's new document:
https://docs.python.org/3/whatsnew/3.9.html#changes-in-the-c-api
Instances of heap-allocated types (such as those created with PyType_FromSpec() and similar APIs) hold a reference to their type object since Python 3.8. ...
Thanks for these details. I wasn't aware of that change.
So all those instances have an increase in memory footprint compared to Python 3.7 ?
I think Marc-Andre's question is more subtle than that:
[Marc-Andre Lemburg, mal@egenix.com]
Hi Pablo,
could you or Erlend please explain why types which don't reference any other objects need to participate in GC for deallocation ? ... AFAIK (but could be wrong, of course), the type object itself does not reference any other objects related to the object that is being GCed.
To make that more formal, if there's a chain of pointers from an object X that reaches X again (X is a direct member of a cycle), then for cyclic gc to work X's tp_traverse must visit all directly contained objects that could be the second object in a direct cycle reaching X again.
But, for example, if X contains a tuple of integers, there's no need for X's tp_traverse to visit that tuple. It's impossible to reach X from the tuple. Cyclic gc cares only about cycles; chasing non-cyclic dead ends doesn't accomplish anything beyond burning processor cycles.
This is, I believe, akin to what Marc-Andre is bringing up: if X can't be reached _from_ X's type object, there's no need for X's tp_traverse to visit X's type object. It _can_ be visited, but it would be a waste of time.
Indeed, that's what I was after. GC is really only needed for objects which can potentially participate in cycles which need to be reclaimed. As I understand the logic (even with the objects referencing the constant type objects on the heap), many of the objects which are now made GC traversal aware don't really have a need for it -- their traversal only finds type objects, no other objects.
... By having (nearly) all stdlib types participate in GC, even ones which don't reference other objects and cannot be parts of reference circles, instead of immediately deleting them, we will keep those objects alive for much longer than necessary, potentially causing a resource overhead regression.
But I think "waste of time" is the worst of it. Participating in cyclic gc does nothing to delay refcounting from recycling objects ASAP. gc only reclaims objects that are reachable only from dead cycles; everything else in CPython is reclaimed the instant its refcount falls to 0, and that's so regardless of whether it participates in cyclic gc.
Oh, thanks for the explanation. I was under the impression that GC-aware objects are added to a GC pool for processing at the next GC run. If that's not the case in general -- only if they are part of dead cycles -- then it's merely wasting time on traversing known dead ends... and developer time for adding the unnecessary logic ;-)
Perhaps this provides an easy way to unblock the release :-)
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Experts (#1, May 27 2021)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs :::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/