PEP 683: "Immortal Objects, Using a Fixed Refcount" (round 3)
I've updated PEP 683 for the feedback I've gotten. Thanks again for that!
The updated PEP text is included below. The largest changes involve
either the focus of the PEP (internal mechanism to mark objects
immortal) or the possible ways that things can break on older 32-bit
stable ABI extensions. All other changes are smaller.
Given the last round of discussion, I'm hoping this will be the last
round before we go to the steering council.
-eric
----------------
PEP: 683
Title: Immortal Objects, Using a Fixed Refcount
Author: Eric Snow
On Mon, Feb 28, 2022 at 6:01 PM Eric Snow wrote:
The updated PEP text is included below. The largest changes involve either the focus of the PEP (internal mechanism to mark objects immortal) or the possible ways that things can break on older 32-bit stable ABI extensions. All other changes are smaller.
In particular, I'm hoping to get your thoughts on the "Accidental De-Immortalizing" section. While I'm confident we will find a good solution, I'm not yet confident about the specific solution. So feedback would be appreciated. Thanks! -eric
On 09. 03. 22 4:58, Eric Snow wrote:
On Mon, Feb 28, 2022 at 6:01 PM Eric Snow
wrote:
The updated PEP text is included below. The largest changes involve either the focus of the PEP (internal mechanism to mark objects immortal) or the possible ways that things can break on older 32-bit stable ABI extensions. All other changes are smaller.
In particular, I'm hoping to get your thoughts on the "Accidental De-Immortalizing" section. While I'm confident we will find a good solution, I'm not yet confident about the specific solution. So feedback would be appreciated. Thanks!
Hi,

I like the newest version, except this one section is concerning.

"periodically reset the refcount for immortal objects (only enable this if a stable ABI extension is imported?)" -- that sounds quite expensive, both at runtime and maintenance-wise.

"provide a runtime flag for disabling immortality" also doesn't sound workable to me. We'd essentially need to run all tests twice every time to make sure it stays working.

"Special-casing immortal objects in tp_dealloc() for the relevant types (but not int, due to frequency?)" sounds promising. The "relevant types" are those for which we skip calling incref/decref entirely, like in Py_RETURN_NONE. This skipping is one of the optional optimizations, so we're entirely in control of if/when to apply it. How much would it slow things back down if it wasn't done for ints at all?

Some more reasoning for not worrying about de-immortalizing in types without this optimization: these objects will be de-immortalized with a refcount around 2^29, and then incref/decref go back to being paired properly. If 2^29 is much higher than the true reference count at de-immortalization, this'll just cause a memory leak at shutdown. And it's probably OK to assume that the true reference count of an object can't be anywhere near 2^29: most of the time, to hold a reference you also need to hold a pointer to the referenced object, and there ain't enough memory for that many pointers. This isn't a formally sound assumption, of course -- you can incref a million times with a single pointer if you pair the decrefs correctly. But it might be why we had no issues with the "int won't overflow" assumption, which would fail with just 4× higher numbers. Of course, this argument would apply to immortalization and 64-bit builds as well. I wonder if there are holes in it :)

Oh, and if the "special-casing immortal objects in tp_dealloc()" way is valid, refcount values 1 and 0 can no longer be treated specially. That's probably not a practical issue for the relevant types, but it's one more thing to think about when applying the optimization.

There's also the other direction to consider: if an old stable-ABI extension does unpaired *increfs* on an immortal object, it'll eventually overflow the refcount. When the refcount is negative, decref will currently crash if built with Py_DEBUG, and I think we want to keep that check/crash. (Note that either Python itself or any extension could be built with Py_DEBUG.) Hopefully we can live with that, and hope anyone running with Py_DEBUG will send a useful bug report. Or is there another bit before the sign this'll mess up?
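The sign-bit concern above can be sketched with a toy model. This is not CPython code; the 32-bit field and the helper names are made up for illustration. The point is that enough unpaired increfs eventually carry into the sign bit, which is exactly what a debug build's decref check would treat as a negative refcount:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (not CPython code): a 32-bit refcount field, viewed as
 * signed by the debug-build sanity check. */
static int refcnt_is_negative(uint32_t refcnt)
{
    /* Equivalent to "(int32_t)refcnt < 0", written without relying on
     * implementation-defined signed conversion. */
    return refcnt > (uint32_t)INT32_MAX;
}

static uint32_t model_incref(uint32_t refcnt)
{
    return refcnt + 1u;  /* unsigned wrap is well-defined in C */
}
```

One more unpaired incref past INT32_MAX is all it takes for the debug check to start firing.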
"periodically reset the refcount for immortal objects (only enable this if a stable ABI extension is imported?)" -- that sounds quite expensive, both at runtime and maintenance-wise.
As I understand it, the plan is to represent an immortal object by setting two high-order bits to 1. The higher bit is the actual test, and the one representing half of that is a safety margin.

When reducing the reference count, CPython already checks whether the refcount's new value is 0. It could instead check whether refcount & ~immortal_bit is 0, which would detect when the safety margin has been reduced to 0 -- and could then add it back in. Since the bit manipulation is not conditional, the only extra branch would occur when an object is about to be deallocated, and that might be rare enough to be an acceptable cost.

(It still doesn't prevent rollover from too many increfs, but ... that should indeed be rare in the wild.)

-jJ
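Jim's scheme can be sketched as a standalone model. The bit positions (bit 30 as the immortal marker, bit 29 as the safety margin) and all names here are assumptions for illustration, not actual CPython values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout following Jim's sketch: bit 30 marks the object
 * immortal, bit 29 is a safety margin that absorbs unpaired decrefs. */
#define IMMORTAL_BIT    (UINT32_C(1) << 30)
#define SAFETY_MARGIN   (UINT32_C(1) << 29)
#define IMMORTAL_REFCNT (IMMORTAL_BIT | SAFETY_MARGIN)

typedef struct { uint32_t ob_refcnt; } obj_t;

/* Returns 1 if the object should be deallocated. */
static int model_decref(obj_t *op)
{
    op->ob_refcnt -= 1;
    /* Unconditionally mask out the immortal bit, then test for zero:
     * for mortal objects this is the usual "refcount hit 0" check; for
     * immortal objects it fires when the safety margin is used up. */
    if ((op->ob_refcnt & ~IMMORTAL_BIT) == 0) {
        if (op->ob_refcnt & IMMORTAL_BIT) {
            op->ob_refcnt = IMMORTAL_REFCNT;  /* top the margin back up */
            return 0;
        }
        return 1;  /* mortal object reached zero: deallocate */
    }
    return 0;
}
```

The mask-and-test replaces the plain zero test, so no extra branch runs on the hot path; the refill only happens in the branch that would otherwise deallocate.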
On 10. 03. 22 3:35, Jim J. Jewett wrote:
"periodically reset the refcount for immortal objects (only enable this if a stable ABI extension is imported?)" -- that sounds quite expensive, both at runtime and maintenance-wise.
As I understand it, the plan is to represent an immortal object by setting two high-order bits to 1. The higher bit is the actual test, and the one representing half of that is a safety margin.
When reducing the reference count, CPython already checks whether the refcount's new value is 0. It could instead check whether refcount & ~immortal_bit is 0, which would detect when the safety margin has been reduced to 0 -- and could then add it back in. Since the bit manipulation is not conditional, the only extra branch would occur when an object is about to be deallocated, and that might be rare enough to be an acceptable cost. (It still doesn't prevent rollover from too many increfs, but ... that should indeed be rare in the wild.)
The problem is that Py_DECREF is a macro, and its old behavior is compiled into some extensions. We can't change this without breaking the stable ABI.
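Petr's point is that the zero test is expanded inline at every call site. A simplified sketch of the classic macro's shape (the real macro differs in detail; `obj_t` and `model_dealloc` are stand-ins):

```c
#include <assert.h>

/* Simplified stand-in for an object header and its deallocator. */
typedef struct { long ob_refcnt; int deallocated; } obj_t;

static void model_dealloc(obj_t *op) { op->deallocated = 1; }

/* Rough shape of the classic Py_DECREF macro.  Because it is a macro,
 * the comparison against 0 was expanded inline into every extension
 * compiled against the old headers -- changing CPython's copy of the
 * macro does not change what those already-built binaries do. */
#define OLD_Py_DECREF(op)                  \
    do {                                   \
        if (--(op)->ob_refcnt == 0)        \
            model_dealloc(op);             \
    } while (0)
```

This is why any new decref behavior has to stay compatible with the old expansion already baked into stable-ABI extension binaries.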
responses inline
-eric
On Wed, Mar 9, 2022 at 8:23 AM Petr Viktorin
"periodically reset the refcount for immortal objects (only enable this if a stable ABI extension is imported?)" -- that sounds quite expensive, both at runtime and maintenance-wise.
Are you talking just about "(only enable this if a stable ABI extension is imported?)"? Such a check could certainly be expensive but it doesn't have to be. However, I'm guessing that you are actually talking about the mechanism to periodically reset the refcount.

The actual periodic reset doesn't seem like it needs to be all that expensive overall. It would just need to be in a place that gets triggered often enough, but not so often that the extra cost of resetting the refcount would be a problem.

One important factor is whether we need to worry about potential de-immortalization for all immortal objects or only for a specific subset, like the most commonly used objects (at least most commonly used by the problematic older stable ABI extensions). Mostly, we only need to be concerned with the objects that are likely to trigger de-immortalization in those extensions. Realistically, there aren't many potential immortal objects that would be exposed to the de-immortalization problem (e.g. None, True, False), so we could limit this workaround to them.

A variety of options come to mind. In each case we would reset the refcount of a given object if it is immortal. (We would also only do so if the refcount actually changed--to avoid cache invalidation and copy-on-write.)

If we need to worry about *all* immortal objects then I see several options:

1. in a single place where stable ABI extensions are likely to pass all objects often enough
2. in a single place where all objects pass through often enough

On the other hand, if we only need to worry about a fixed set of objects, the following options come to mind:

1. in a single place that is likely to be called by older stable ABI extensions
2. in a place that runs often enough, targeting a hard-coded group of immortal objects (common static globals like None)
   * perhaps in the eval breaker code, in exception handling, etc.
3. like (2), but rotate through subsets of the hard-coded group (to reduce the overall cost)
4. like (2), but spread out in type-specific code (e.g. static types could be reset in type_dealloc())

Again, none of those should be in code that runs so often that the overhead would add up.
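A maintenance pass over a hard-coded group of singletons (option 2 for the fixed-set case) could look roughly like this sketch. All names and the refcount value are assumptions for illustration; the conditional write matches the copy-on-write concern mentioned above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMMORTAL_REFCNT UINT32_C(0x60000000)  /* assumed initial value */

typedef struct { uint32_t ob_refcnt; } obj_t;

/* Hypothetical periodic pass over a hard-coded group of immortal
 * singletons (think None, True, False), e.g. run from the eval breaker.
 * The refcount is written back only when it has actually drifted, to
 * avoid cache invalidation and dirtying shared copy-on-write pages. */
static void reset_immortal_refcounts(obj_t *const objs[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (objs[i]->ob_refcnt != IMMORTAL_REFCNT) {
            objs[i]->ob_refcnt = IMMORTAL_REFCNT;
        }
    }
}
```

Rotating through subsets (option 3) would just mean calling this on a different slice of the group each time.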
"provide a runtime flag for disabling immortality" also doesn't sound workable to me. We'd essentially need to run all tests twice every time to make sure it stays working.
Yeah, that makes it not worth it.
"Special-casing immortal objects in tp_dealloc() for the relevant types (but not int, due to frequency?)" sounds promising.
The "relevant types" are those for which we skip calling incref/decref entirely, like in Py_RETURN_NONE. This skipping is one of the optional optimizations, so we're entirely in control of if/when to apply it.
We would definitely do it for those types. NoneType and bool already have a tp_dealloc that calls Py_FatalError() if triggered. The tp_dealloc for str & tuple have special casing for some singletons that does likewise. In PyType_Type.tp_dealloc we have a similar assert for static types. In each case we would instead reset the refcount to the initial immortal value.

Regardless, in practice we may only need to worry (as noted above) about the problem for the most commonly used global objects, so perhaps we could stop there. However, it depends on whether the level of risk would warrant incurring additional potential performance/maintenance costs. What is the likelihood of actual crashes due to pathological de-immortalization in older stable ABI extensions? I don't have a clear answer to offer on that, but I'd only expect it to be a problem if such extensions are used heavily in (very) long-running processes.
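The tp_dealloc change described here can be sketched with a toy model (not the real NoneType slot; the struct, names, and refcount value are made up): instead of a fatal error, the slot restores the immortal refcount, effectively resurrecting the singleton after pathological unpaired decrefs drive it to zero.

```c
#include <assert.h>
#include <stdint.h>

#define IMMORTAL_REFCNT UINT32_C(0x60000000)  /* assumed initial value */

typedef struct obj obj_t;
struct obj {
    uint32_t ob_refcnt;
    void (*tp_dealloc)(obj_t *);
};

/* Today NoneType's tp_dealloc calls Py_FatalError(); under this idea it
 * would instead reset the refcount, undoing the de-immortalization
 * caused by unpaired decrefs in old stable-ABI extensions. */
static void model_none_dealloc(obj_t *op)
{
    op->ob_refcnt = IMMORTAL_REFCNT;
}

static void model_decref(obj_t *op)
{
    if (--op->ob_refcnt == 0) {
        op->tp_dealloc(op);
    }
}
```

The cost is confined to the dealloc path, which only runs in the already-pathological case, so the hot incref/decref paths are unaffected.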
How much would it slow things back down if it wasn't done for ints at all?
I'll look into that. We're talking about the ~260 small ints, so it depends on how much they are used relative to all the other int objects that are used in a program.
Some more reasoning for not worrying about de-immortalizing in types without this optimization: These objects will be de-immortalized with refcount around 2^29, and then incref/decref go back to being paired properly. If 2^29 is much higher than the true reference count at de-immortalization, this'll just cause a memory leak at shutdown. And it's probably OK to assume that the true reference count of an object can't be anywhere near 2^29: most of the time, to hold a reference you also need to have a pointer to the referenced object, and there ain't enough memory for that many pointers. This isn't a formally sound assumption, of course -- you can incref a million times with a single pointer if you pair the decrefs correctly. But it might be why we had no issues with "int won't overflow", an assumption which would fail with just 4× higher numbers.
Yeah, if we're dealing with properly paired incref/decref then the worry about crashing after de-immortalization is mostly gone. The problem is where in the runtime we would simply not call Py_INCREF() on certain objects because we know they are immortal. For instance, Py_RETURN_NONE (outside the older stable ABI) would no longer incref, while the problematic stable ABI extension would keep actually decref'ing until we crash. Again, I'm not sure what the likelihood of this case is. It seems very unlikely to me.
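The imbalance Eric describes can be simulated in a toy model (all names and the refcount value here are illustrative assumptions): the runtime hands out the singleton without an incref because it is known immortal, while an old extension still pairs each result with a real decref, so the count drifts down by one per round trip.

```c
#include <assert.h>
#include <stdint.h>

#define IMMORTAL_REFCNT UINT32_C(0x60000000)  /* assumed initial value */

typedef struct { uint32_t ob_refcnt; } obj_t;

static obj_t model_none = { IMMORTAL_REFCNT };

/* New-style Py_RETURN_NONE: the incref is skipped entirely because the
 * object is known to be immortal. */
static obj_t *return_none_no_incref(void)
{
    return &model_none;
}

/* An old stable-ABI extension still releases its reference with a real
 * decref, so each round trip leaks one count off the immortal value. */
static void old_extension_release(obj_t *op)
{
    op->ob_refcnt -= 1;
}
```

Run long enough, this drift eats through the immortal value, which is the crash scenario being weighed here.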
Of course, this argument would apply to immortalization and 64-bit builds as well. I wonder if there are holes in it :)
With the massive numbers involved on 64-bit builds, the problem is super unlikely, so we don't need to worry. Or perhaps I misunderstood your point?
Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is valid, refcount values 1 and 0 can no longer be treated specially. That's probably not a practical issue for the relevant types, but it's one more thing to think about when applying the optimization.
Given the low chance of the pathological case, the nature of the conditions where it might happen, and the specificity of 0 and 1 amongst all the possible values, I wouldn't consider this a problem.
There's also the other direction to consider: if an old stable-ABI extension does unpaired *increfs* on an immortal object, it'll eventually overflow the refcount. When the refcount is negative, decref will currently crash if built with Py_DEBUG, and I think we want to keep that check/crash. (Note that either Python itself or any extension could be built with Py_DEBUG.) Hopefully we can live with that, and hope anyone running with Py_DEBUG will send a useful bug report.
Yeah, the overflow case is not new. The only difference here (with the specialized tp_dealloc) is that we'd reset the refcount for immortal objects. Of course, we could adjust the check in Py_DECREF() to accommodate both situations and keep crashing like it does now.
Or is there another bit before the sign this'll mess up?
Nope, the sign bit is still highest-order.
On 12. 03. 22 2:45, Eric Snow wrote:
responses inline
I'll snip some discussion for a reason I'll get to later, and get right to the third alternative: [...]
"Special-casing immortal objects in tp_dealloc() for the relevant types (but not int, due to frequency?)" sounds promising.
The "relevant types" are those for which we skip calling incref/decref entirely, like in Py_RETURN_NONE. This skipping is one of the optional optimizations, so we're entirely in control of if/when to apply it.
We would definitely do it for those types. NoneType and bool already have a tp_dealloc that calls Py_FatalError() if triggered. The tp_dealloc for str & tuple have special casing for some singletons that does likewise. In PyType_Type.tp_dealloc we have a similar assert for static types. In each case we would instead reset the refcount to the initial immortal value.

Regardless, in practice we may only need to worry (as noted above) about the problem for the most commonly used global objects, so perhaps we could stop there.
However, it depends on whether the level of risk would warrant incurring additional potential performance/maintenance costs. What is the likelihood of actual crashes due to pathological de-immortalization in older stable ABI extensions? I don't have a clear answer to offer on that, but I'd only expect it to be a problem if such extensions are used heavily in (very) long-running processes.
How much would it slow things back down if it wasn't done for ints at all?
I'll look into that. We're talking about the ~260 small ints, so it depends on how much they are used relative to all the other int objects that are used in a program.
Not only that -- as far as I understand, it's only cases where we know at compile time that a small int is being returned. AFAIK, that would be fast branches of arithmetic code, but not much else.

If not optimizing small ints is OK performance-wise, then everything looks good: we say that the "skip incref/decref" optimization can only be done for types whose instances are *all* immortal, leave it to future discussions to relax the requirement, and PEP 683 is good to go!

With that in mind, I snipped your discussion of the previous alternative. Going with this one wouldn't prevent us from doing something more clever in the future.
Some more reasoning for not worrying about de-immortalizing in types without this optimization: These objects will be de-immortalized with refcount around 2^29, and then incref/decref go back to being paired properly. If 2^29 is much higher than the true reference count at de-immortalization, this'll just cause a memory leak at shutdown. And it's probably OK to assume that the true reference count of an object can't be anywhere near 2^29: most of the time, to hold a reference you also need to have a pointer to the referenced object, and there ain't enough memory for that many pointers. This isn't a formally sound assumption, of course -- you can incref a million times with a single pointer if you pair the decrefs correctly. But it might be why we had no issues with "int won't overflow", an assumption which would fail with just 4× higher numbers.
Yeah, if we're dealing with properly paired incref/decref then the worry about crashing after de-immortalization is mostly gone. The problem is where in the runtime we would simply not call Py_INCREF() on certain objects because we know they are immortal. For instance, Py_RETURN_NONE (outside the older stable ABI) would no longer incref, while the problematic stable ABI extension would keep actually decref'ing until we crash.
Again, I'm not sure what the likelihood of this case is. It seems very unlikely to me.
Of course, this argument would apply to immortalization and 64-bit builds as well. I wonder if there are holes in it :)
With the massive numbers involved on 64-bit builds, the problem is super unlikely, so we don't need to worry. Or perhaps I misunderstood your point?
That's true. However, as we're adjusting incref/decref documentation for this PEP anyway, it looks like we could add “you should keep a pointer around for each reference you hold”, and go from “super unlikely” to “impossible in well-behaved code” :)
Oh, and if the "Special-casing immortal objects in tp_dealloc()" way is valid, refcount values 1 and 0 can no longer be treated specially. That's probably not a practical issue for the relevant types, but it's one more thing to think about when applying the optimization.
Given the low chance of the pathological case, the nature of the conditions where it might happen, and the specificity of 0 and 1 amongst all the possible values, I wouldn't consider this a problem.
+1. But it's worth mentioning that it's not a problem.
participants (3)
- Eric Snow
- Jim J. Jewett
- Petr Viktorin