responses inline

-eric

On Wed, Mar 9, 2022 at 8:23 AM Petr Viktorin <encukou@gmail.com> wrote:
"periodically reset the refcount for immortal objects (only enable this if a stable ABI extension is imported?)" -- that sounds quite expensive, both at runtime and maintenance-wise.
Are you talking just about "(only enable this if a stable ABI extension is imported?)"? Such a check could certainly be expensive, but it doesn't have to be. However, I'm guessing that you are actually talking about the mechanism to periodically reset the refcount.

The actual periodic reset doesn't seem like it needs to be all that expensive overall. It would just need to be in a place that gets triggered often enough, but not so often that the extra cost of resetting the refcount would be a problem.

One important factor is whether we need to worry about potential de-immortalization for all immortal objects or only for a specific subset, like the most commonly used objects (at least, most commonly used by the problematic older stable ABI extensions). Mostly, we only need to be concerned with the objects that are likely to trigger de-immortalization in those extensions. Realistically, there aren't many potential immortal objects that would be exposed to the de-immortalization problem (e.g. None, True, False), so we could limit this workaround to them.

A variety of options come to mind. In each case we would reset the refcount of a given object if it is immortal. (We would also only do so if the refcount actually changed--to avoid cache invalidation and copy-on-write.)

If we need to worry about *all* immortal objects then I see several options:

1. in a single place where stable ABI extensions are likely to pass all objects often enough
2. in a single place where all objects pass through often enough

On the other hand, if we only need to worry about a fixed set of objects, the following options come to mind:

1. in a single place that is likely to be called by older stable ABI extensions
2. in a place that runs often enough, targeting a hard-coded group of immortal objects (common static globals like None)
   * perhaps in the eval breaker code, in exception handling, etc.
3. like (2), but rotating through subsets of the hard-coded group (to reduce the overall cost)
4. like (2), but spread out in type-specific code (e.g. static types could be reset in type_dealloc())

Again, none of these resets would sit in code that runs so often that the overhead would add up. (A sketch of option (2) follows below.)
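For concreteness, here's a minimal sketch of option (2), assuming a hard-coded group of the usual suspects and a hypothetical _Py_IMMORTAL_REFCNT constant for the initial immortal refcount (none of these names are settled):

    /* Sketch only: reset drifted refcounts for a hard-coded set of
       immortal objects. */
    static void
    reset_immortal_refcounts(void)
    {
        PyObject *targets[] = {Py_None, Py_True, Py_False};
        for (size_t i = 0; i < Py_ARRAY_LENGTH(targets); i++) {
            PyObject *op = targets[i];
            /* Only write if the count actually drifted, to avoid
               cache invalidation and copy-on-write. */
            if (Py_REFCNT(op) != _Py_IMMORTAL_REFCNT) {
                Py_SET_REFCNT(op, _Py_IMMORTAL_REFCNT);
            }
        }
    }

Called from, say, the eval breaker, that's a handful of predictable branches plus at most three writes per trigger.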
"provide a runtime flag for disabling immortality" also doesn't sound workable to me. We'd essentially need to run all tests twice every time to make sure it stays working.
Yeah, that makes it not worth it.
"Special-casing immortal objects in tp_dealloc() for the relevant types (but not int, due to frequency?)" sounds promising.
The "relevant types" are those for which we skip calling incref/decref entirely, like in Py_RETURN_NONE. This skipping is one of the optional optimizations, so we're entirely in control of if/when to apply it.
We would definitely do it for those types. NoneType and bool already have a tp_dealloc that calls Py_FatalError() if triggered. The tp_dealloc for str & tuple has special casing for some singletons that does likewise. In PyType_Type.tp_dealloc we have a similar assert for static types. In each case we would instead reset the refcount to the initial immortal value.

Regardless, in practice we may only need to worry (as noted above) about the most commonly used global objects, so perhaps we could stop there. However, that depends on the level of risk, and whether it warrants incurring the additional potential performance/maintenance costs. What is the likelihood of actual crashes due to pathological de-immortalization in older stable ABI extensions? I don't have a clear answer to offer on that, but I'd only expect it to be a problem if such extensions are used heavily in (very) long-running processes.
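As a rough illustration (a sketch, not an actual patch), NoneType's dealloc could revive the object instead of fatally erroring; _Py_IMMORTAL_REFCNT is again a stand-in name:

    /* Sketch: instead of Py_FatalError("deallocating None"), restore
       the immortal refcount so a stray unpaired decref from an older
       stable-ABI extension is harmless. */
    static void
    none_dealloc(PyObject *op)
    {
        Py_SET_REFCNT(op, _Py_IMMORTAL_REFCNT);
    }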
How much would it slow things back down if it wasn't done for ints at all?
I'll look into that. We're talking about the ~260 small ints, so it depends on how much they are used relative to all the other int objects that are used in a program.
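The cost in question is roughly one extra branch on every int dealloc, along these lines (the _is_small_int() range check is hypothetical):

    /* Sketch: per-dealloc cost of special-casing the small ints. */
    static void
    long_dealloc(PyObject *op)
    {
        if (_is_small_int(op)) {
            /* immortal singleton: revive it instead of freeing */
            Py_SET_REFCNT(op, _Py_IMMORTAL_REFCNT);
            return;
        }
        Py_TYPE(op)->tp_free(op);
    }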
Some more reasoning for not worrying about de-immortalizing in types without this optimization:

These objects will be de-immortalized with a refcount around 2^29, and then incref/decref go back to being paired properly. If 2^29 is much higher than the true reference count at de-immortalization, this'll just cause a memory leak at shutdown.

And it's probably OK to assume that the true reference count of an object can't be anywhere near 2^29: most of the time, to hold a reference you also need to have a pointer to the referenced object, and there ain't enough memory for that many pointers. This isn't a formally sound assumption, of course -- you can incref a million times with a single pointer if you pair the decrefs correctly. But it might be why we had no issues with "int won't overflow", an assumption which would fail with just 4× higher numbers.
Yeah, if we're dealing with properly paired incref/decref then the worry about crashing after de-immortalization is mostly gone. The problem is where in the runtime we would simply not call Py_INCREF() on certain objects because we know they are immortal. For instance, Py_RETURN_NONE (outside the older stable ABI) would no longer incref, while the problematic stable ABI extension would keep actually decref'ing until we crash. Again, I'm not sure what the likelihood of this case is. It seems very unlikely to me.
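To make the mismatch concrete, here's a hedged sketch (the optimized macro shown is hypothetical, not something we've shipped):

    /* Inside an immortal-objects CPython, returning None might skip
       the incref entirely (hypothetical): */
    #define Py_RETURN_NONE return Py_None

    /* Meanwhile, caller code built against an older stable ABI still
       gives back the reference it believes the callee took: */
    PyObject *res = PyObject_CallNoArgs(callable);  /* returns None */
    /* ... use res ... */
    Py_DECREF(res);  /* unpaired: None's refcount drifts downward */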
Of course, this argument would apply to immortalization and 64-bit builds as well. I wonder if there are holes in it :)
With the massive numbers involved on 64-bit, the problem is super unlikely, so I don't think we need to worry. Or perhaps I misunderstood your point?
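A back-of-envelope check, assuming (for illustration only) that the 64-bit immortal refcount starts somewhere around 2^62:

    2^62 unpaired decrefs ≈ 4.6 × 10^18
    at 10^9 decrefs/second → ~4.6 × 10^9 seconds ≈ 146 years

Even a tight loop of nothing but stray decrefs couldn't drag the count to zero within any realistic process lifetime.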
Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is valid, refcount values 1 and 0 can no longer be treated specially. That's probably not a practical issue for the relevant types, but it's one more thing to think about when applying the optimization.
Given the low chance of the pathological case, the nature of the conditions where it might happen, and the specificity of 0 and 1 amongst all the possible values, I wouldn't consider this a problem.
There's also the other direction to consider: if an old stable-ABI extension does unpaired *increfs* on an immortal object, it'll eventually overflow the refcount. When the refcount is negative, decref will currently crash if built with Py_DEBUG, and I think we want to keep that check/crash. (Note that either Python itself or any extension could be built with Py_DEBUG.) Hopefully we can live with that, and hope anyone running with Py_DEBUG who hits it will send a useful report.
Yeah, the overflow case is not new. The only difference here (with the specialized tp_dealloc) is that we'd reset the refcount for immortal objects. Of course, we could adjust the check in Py_DECREF() to accommodate both situations and keep crashing like it does now.
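Something along these lines, perhaps (a sketch only; _Py_IsImmortal() is a hypothetical helper):

    /* Sketch of an adjusted decref that keeps the Py_DEBUG check. */
    static inline void
    _Py_DECREF_sketch(PyObject *op)
    {
        if (--op->ob_refcnt == 0) {
            _Py_Dealloc(op);  /* immortal types reset ob_refcnt here */
        }
    #ifdef Py_DEBUG
        else if (op->ob_refcnt < 0 && !_Py_IsImmortal(op)) {
            /* Keep crashing on genuine underflow, but tolerate an
               immortal object whose count wrapped past the sign bit. */
            Py_FatalError("negative refcount on non-immortal object");
        }
    #endif
    }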
Or is there another bit before the sign this'll mess up?
Nope, the sign bit is still highest-order.