[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

Feb. 17, 2022

      Thanks for the feedback.  My responses are inline below.

-eric

On Wed, Feb 16, 2022 at 6:36 AM Petr Viktorin <encukou@gmail.com> wrote:
...
Thank you very much for writing this down! It's very helpful to see a
concrete proposal, and the current state of this idea.
I like the change,
That's good to hear. :)
...
but I think it's unfortunately more complicated than
the PEP suggests.
That would be unsurprising. :)
...
...
This proposal is CPython-specific and, effectively, describes
internal implementation details.
I think that is a naïve statement. Refcounting is
implementation-specific, but it's hardly an *internal* detail.
Sorry for any confusion.  I didn't mean to say that refcounting is an
internal detail.  Rather, I was talking about how the proposed change
in refcounting behavior doesn't affect any guaranteed/documented
behavior, hence "internal".

Perhaps I missed some documented behavior?  I was going off the following:

* https://docs.python.org/3.11/c-api/intro.html#objects-types-and-reference-co...
* https://docs.python.org/3.11/c-api/structures.html#c.Py_REFCNT
...
There is
code that targets CPython specifically, and relies on the details.
Could you elaborate?  Do you mean such code relies on specific refcount values?
...
The refcount has public getters and setters,
Agreed.  However, what behavior do users expect and what guarantees do
we make?  Do we indicate how to interpret the refcount value they
receive?  What are the use cases under which a user would set an
object's refcount to a specific value?  Are users setting the refcount
of objects they did not create?
...
and you need a pretty good
grasp of the concept to write a C extension.
I would not expect this to be affected by this PEP, except in cases
where users are checking/modifying refcounts for objects they did not
create (since none of their objects will be immortal).
...
I think that it's safe to assume that this will break people's code,
Do you have some use case in mind, or an example?  From my perspective
I'm having a hard time seeing what this proposed change would break.

That said, Kevin Modzelewski indicated [1] that there were affected
cases for Pyston (though their change in behavior is slightly
different).

[1] https://mail.python.org/archives/list/python-dev@python.org/message/TPLEYDCX...
...
and
this PEP should convince us that the breakage is worth it rather than
dismiss the issue.
Sorry, I didn't mean to be dismissive.  I agree that if there is
breakage this PEP must address it.
...
It would be good to note that “container” refers to the GC term, as in
https://devguide.python.org/garbage_collector/#identifying-reference-cycles
and not e.g.
https://docs.python.org/3/library/collections.abc.html#collections.abc.Conta...
+1
...
...
This has a concrete impact on active projects in the Python community.
Below we describe several ways in which refcount modification has
a real negative effect on those projects.  None of that would
happen for objects that are truly immutable.
Reducing Cache Invalidation
---------------------------
Explicitly saying “CPU cache” would make the PEP easier to skim.
+1
...
...
Every modification of a refcount causes the corresponding cache
line to be invalidated.  This has a number of effects.
For one, the write must be propagated to other cache levels
and to main memory.  This has small effect on all Python programs.
Immortal objects would provide a slight relief in that regard.
On top of that, multi-core applications pay a price.  If two threads
are interacting with the same object (e.g. ``None``)  then they will
end up invalidating each other's caches with each incref and decref.
This is true even for otherwise immutable objects like ``True``,
``0``, and ``str`` instances.  This is also true even with
the GIL, though the impact is smaller.
This looks out of context. Python has a per-process GIL. It should it go
after the next section.
This isn't about a data race.  I'm talking about how if an object is
active in two different threads (on distinct cores) then incref/decref
in one thread will invalidate the cache (line) in the other thread.
The only impact of the GIL in this case is that the two threads aren't
running simultaneously and the cache invalidation on the idle thread
has less impact.

Perhaps I've missed something?
...
...
The proposed solution is obvious enough that two people came to the
same conclusion (and implementation, more or less) independently.
Who was it? Assuming it's not a secret :)
Me and Eddit. :)  I don't mind saying so.
...
...
In the case of per-interpreter GIL, the only realistic alternative
is to move all global objects into ``PyInterpreterState`` and add
one or more lookup functions to access them.  Then we'd have to
add some hacks to the C-API to preserve compatibility for the
may objects exposed there.  The story is much, much simpler
with immortal objects
Weren't you planning a PEP on subinterpreter GIL as well? Do you want to
submit them together?
I'd have to think about that.  The other PEP I'm writing for
per-interpreter GIL doesn't require immortal objects.  They just
simplify a number of things.  That's my motivation for writing this
PEP, in fact. :)
...
IMO, as it is, the PEP's motivation doesn't really stand on its own.
It's only worth it as a step towards per-interpreter GIL.
I expect Eddie would argue otherwise, but I probably wouldn't have
written this PEP if it weren't for its benefit to per-interpreter GIL.
...
(We might have a catch-24 situation in the way we currently handle PEPs.
That would mean we need to change the process, maybe even permanently.
IMO, this PEP will be very helpful in untangling the situation.)
Glad to help. :)
...
...
Backward Compatibility
-----------------------
This proposal is completely compatible.  It is internal-only so no API
is changing.
The approach is also compatible with extensions compiled to the stable
ABI.  Unfortunately, they will modify the refcount and invalidate all
the performance benefits of immortal objects.  However, the high bit
of the refcount will still match ``_Py_IMMORTAL_REFCNT`` so we can
still identify such objects as immortal.
So, any extension that uses the stable ABI will break an invariant.
What'll be the impact? The total refcount will probably go out of sync,
anything else?
The impact would be: objects incref/decref'ed by such a module would
be exposed to some of the performance penalties described earlier in
the PEP.  I expect the potential aggregate cost would be relatively
small.
...
If an extension DECREFs an immortal object, will it still match
_Py_IMMORTAL_REFCNT? How is that guaranteed?
It wouldn't match _Py_IMMORTAL_REFCNT, but the high bit of
_Py_IMMORTAL_REFCNT would still match.  That bit is what we would
actually be checking, rather than the full value.
...
What about extensions compiled with Python 3.11 (with this PEP) that use
an older version of the stable ABI, and thus should be compatible with
3.2+? Will they use the old versions of the macros? How will that be tested?
It wouldn't matter unless an object's refcount reached
_Py_IMMORTAL_REFCNT, at which point incref/decref would start
noop'ing.  What is the likelihood (in real code) that an object's
refcount would grow that far?  Even then, would such an object ever be
expected to go back to 0 (and be dealloc'ed)?  Otherwise the point is
moot.
...
...
No user-facing behavior changes, with the following exceptions:
* code that inspects the refcount (e.g. ``sys.getrefcount()``
   or directly via ``ob_refcnt``) will see a really, really large
   value
* ``Py_SET_REFCNT()`` will be a no-op for immortal objects
Neither should cause a problem.
Alternate Python Implementations
--------------------------------
This proposal is CPython-specific.
IMO it's specific to the C API, which is wider than just CPython. I
don't think we can just assume it'll have no impact on other
implementations.
Fair enough.
...
...
Non-Obvious Consequences
------------------------
* immortal containers effectively immortalize each contained item
* the same is true for objects held internally by other objects
   (e.g. ``PyTypeObject.tp_subclasses``)
So, do immortal lists immortalize values append()ed to them? (Can you
even have an immortal list?  Are there limits on what can be immortal?)
We have no plans to do more than ever explicitly immortalize objects.
So an immortal list is fine but it would have no effect on the
immortality of items it contains, other than implicitly (since the
list holds a reference to each item).

In general, it would be best to only immortalize immutable objects.
If we want to share any objects shared between threads without
protection (e.g. per-interpreter GIL) then such objects must be
immortal and immutable.  So lists and dicts, etc. couldn't be shared
(assuming we can't prevent further mutation).

However, for objects that will never be shared, it can be practical to
make some of them immortal too.  For example, sys.modules is a
per-interpreter dict that we do not expect to ever get freed until the
corresponding interpreter is finalized.  By making it immortal, we no
longer incur the extra overhead during incref/decref.

We can apply this idea in the pursuit of getting back some of that 4%
performance we lost.  At the end of runtime init we can mark *all*
objects as immortal and avoid the extra cost in incref/decref.  We
only need to worry about immutability with objects that we plan on
sharing between threads without a GIL.

(FYI, we still need to look closely at the impact of this approach on GC.)
...
...
* an immortal object's type is effectively immortal
Should this be enforced?
There is nothing to enforce.  The object holds a reference to its type
so the type will never be cleaned up as long as the immortal object
isn't.  Hence the type of an immortal object is effectively immortal.
We don't need the type to actually be marked as immortal.
...
...
* though extremely unlikely (and technically hard), any object could
   be incref'ed enough to reach ``_Py_IMMORTAL_REFCNT`` and then
   be treated as immortal
What would it take?
Basically, you;d have to do it deliberately (e.g. incref the object in
a tight loop).  Even with a tight loop it would take a long time to
count up to 2^60 or whatever the chosen value is.
...
...
This is not meant to be a public feature but rather an internal one.
So the proposal does *not* including adding any new public C-API,
nor any Python API.  However, this does not prevent us from
adding (publicly accessible) private API to do things
like immortalize an object or tell if one
is immortal.
This is a public change.
I agree that the change to the implementation of some public API is
certainly public, as is the change in behavior for immortal objects,
as is the potential <4% performance regression.  By "public feature" I
was referring to immortal objects.  We are not exposing that to users,
other than that they might notice some objects now have a really high
refcount that does not change.
...
Py_INCREF increments the reference count.
Py_REFCNT gets the reference count.
For immortal objects, Py_INCREF will no longer function as documented in
3.10, and Py_REFCNT can be used to witness it. Both are public API.
You are right that "Increment the reference count for object o." (as
documented) will not be true for an immortal object.  Instead it would
be something like "indicate that there is an additional reference for
object o".  I'll be sure to update the PEP, to add that change to the
docs wording.

Regardless, how important is that distinction?  If it matters then
clearly this proposal needs to change.  As an exercise, we can
consider one of the most used objects, None, and that we would make it
immortal.  How would that impact users of Py_INCREF() and Py_REFCNT()?
...
...
Immortal Global Objects
[snip]
There will likely be others we have not enumerated here.
How will the candidates be chosen?
Any objects that we would expect to share globally (ergo otherwise
immutable) will be made immortal.  That means the static types, the
builtin singletons, the objects in _PyRuntimeState.global_objects,
etc.
...
...
Making a mutable object immortal isn't particularly helpful.
The exception is if you can ensure the object isn't actually
modified again.  Since we aren't enforcing any immutability
for immortal objects it didn't make sense to emphasis
that relationship.
Is it just not helpful, or is it disallowed?
It is not disallowed.

Also, I need to clarify that section since there are cases where
making a mutable object immortal can provide performance benefits, as
described earlier.
...
What about __subclasses__/tp_subclasses?
That's one we'll have to deal with specially, e.g. for core static
types we'd store the object on PyInterpreterState.  Then the
__subclasses__ getter would do a lookup on the current interpreter,
instead of using tp_subclasses.  We could get rid of tp_subclasses or
perhaps use it only for the main interpreter.

[Python-Dev] Re: PEP 683: "Immortal Objects, Using a Fixed Refcount"

Eric Snow