[Python-Dev] PEP 442 clarification for type hierarchies
Stefan Behnel
stefan_ml at behnel.de
Mon Aug 5 22:30:29 CEST 2013
Antoine Pitrou, 05.08.2013 21:26:
> On Mon, 05 Aug 2013 21:03:33 +0200
> Stefan Behnel wrote:
>> I think the main problem I have with the PEP is this part:
>>
>> """
>> The PEP doesn't change the semantics of:
>> * C extension types with a custom tp_dealloc function.
>> """
>>
>> Meaning, it was designed to explicitly ignore this use case.
>
> It doesn't ignore it. It lets you fix your C extension type to use
> tp_finalize for resource finalization. It also provides the
> PyObject_CallFinalizerFromDealloc() API function to make it easier to
> call tp_finalize from your tp_dealloc.
>
> What the above sentence means is that, if you don't change your legacy
> tp_dealloc, your type will not take advantage of the new facilities.
>
> (you can take a look at the _io module; it was modified to take
> advantage of tp_finalize)
Oh, I'm aware of the backwards compatibility, but I totally *want* to take
advantage of the new feature.
>> * make finalisation run recursively (or iteratively) through all
>> inheritance levels, in a well defined execution environment (e.g. after
>> saving away the exception state)
>
> __init__ and other methods only let the user recurse explicitly.
> __del__ would be a weird exception if it recursed implicitly. Also, it
> would break backwards compatibility for existing uses of __del__.
Hmm, it's a bit unfortunate that tp_finalize() maps so directly to
__del__(), but I think this can be fixed. In any case, each tp_finalize()
function must only ever be called once, so if a subtype inherited the
tp_finalize() slot from its parent, it mustn't be called again. Instead,
the hierarchy would be followed upwards to search for the next
tp_finalize() that's different from the current one, i.e. the function
pointer differs. That means that only the top-most super type would need to
call __del__(), after all tp_finalize() functions in subtypes have run.
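A minimal Python sketch of the lookup described above, offered as a model rather than as CPython's actual behaviour. It assumes single inheritance and represents the C slot as a plain class attribute named tp_finalize; run_finalizers and the three example classes are hypothetical names introduced for illustration. A level whose slot holds the same function as the level below it is skipped, so each distinct finaliser runs exactly once.

```python
def run_finalizers(obj):
    """Call each distinct finaliser once, subtype before supertype."""
    called = []
    last = None
    for klass in type(obj).__mro__:
        # An inherited slot resolves to the parent's function object.
        fin = getattr(klass, "tp_finalize", None)
        if fin is None or fin is last:
            continue  # empty slot, or same "function pointer" as the level below
        fin(obj)
        called.append(fin.__name__)
        last = fin
    return called


def base_finalize(obj):
    obj.log.append("Base")


def leaf_finalize(obj):
    obj.log.append("Leaf")


class Base:
    tp_finalize = base_finalize

    def __init__(self):
        self.log = []


class Middle(Base):
    pass  # inherits Base's slot unchanged


class Leaf(Middle):
    tp_finalize = leaf_finalize


obj = Leaf()
order = run_finalizers(obj)
```

Middle contributes no call of its own here: its slot is the function inherited from Base, which runs once for both levels.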
>> I think it's a mistake that the current implementation calls the
>> finalisation from tp_dealloc(). Instead, both the finalisation and the
>> deallocation should be called externally and independently from the cleanup
>> mechanism behind Py_DECREF(). (There is no problem in CS that can't be
>> solved by adding another level of indirection...)
>
> Why not, but I'm not sure that would solve anything on your side.
Well, it would untangle both phases and make it clearer what needs to call
what. If I call my super type's tp_dealloc(), does it need to call
tp_finalize() or not? And if it does: the one of the object's actual type,
or just its own?
If both phases are split, then the answer is simply: it doesn't and
mustn't, because that's already been taken care of. All it has to care
about is what it's there for: deallocation.
> If it does, would you like to cook a patch? I wonder if there's some
> unexpected issue with doing what you're proposing.
I was somehow expecting this question. There's still the open issue of
module initialisation, though. Not sure which is more important. Given that
this feature has already been merged, I guess it's better to fix it up
before people start making actual use of it, instead of putting work into a
new feature that's not even started yet.
>> An obvious open question is how to deal with exceptions during
>> finalisation. Any break in the execution chain would mean that a part of
>> the type wouldn't be finalised.
>
> Let's come back to pure Python:
>
> class A:
>     def __del__(self):
>         1/0
>
> class B(A):
>     def __del__(self):
>         super().__del__()
>         self.cleanup_resources()
What makes you think it's a good idea to call the parent type's finaliser
before doing the local finalisation, and not the other way round? What if
the subtype needs access to parts of the super type for its cleanup?
In other words, which makes more sense (at the C level):
    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()
Should that order be part of the protocol or not? (well, not for __del__()
I guess, but maybe for tp_finalize()?)
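To make the difference between the two orders concrete, here is a small Python model. It is only a sketch: step, super_first, local_first and run are hypothetical helpers, and the "super" step stands in for the parent's finaliser. In both variants the finally clause guarantees that the second step runs even if the first one raises, so the protocol question is purely about ordering.

```python
def step(name, log, fail_in):
    """Record that a cleanup step ran, then optionally fail."""
    log.append(name)
    if name == fail_in:
        raise RuntimeError(name + " failed")


def super_first(log, fail_in=None):
    try:
        step("super", log, fail_in)
    finally:
        step("local", log, fail_in)


def local_first(log, fail_in=None):
    try:
        step("local", log, fail_in)
    finally:
        step("super", log, fail_in)


def run(order, fail_in=None):
    """Run one ordering and report which steps executed."""
    log = []
    try:
        order(log, fail_in)
    except RuntimeError:
        log.append("error")
    return log


a = run(super_first, fail_in="super")
b = run(local_first, fail_in="local")
```

With local_first, the subtype still sees an intact super type while it cleans up, which is the case the question above is concerned with.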
Coming back to the __del__() vs. tp_finalize() story, if tp_finalize()
first recursed into the super types, the top-most one of which then calls
__del__() and returns, we'd get an execution order that runs Python-level
__del__() methods before C-level tp_finalize() functions, but lose the
subtype-before-supertype execution order for tp_finalize() functions.
That might call for a three-step cleanup:
1) run all Python __del__() methods recursively
2) run all tp_finalize() functions recursively
3) run tp_dealloc() recursively
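The three steps can be modelled in Python roughly as follows. This is a sketch under stated assumptions, not the CPython mechanism: the hook names py_del, c_finalize and c_dealloc are hypothetical stand-ins for __del__(), tp_finalize() and tp_dealloc(), and each level's hook is looked up only in that class's own namespace so that it runs once per level, subtype first.

```python
def three_step_cleanup(obj):
    """Run each phase over the whole hierarchy before starting the next."""
    mro = type(obj).__mro__
    for hook in ("py_del", "c_finalize", "c_dealloc"):
        for klass in mro:
            # Look only at what this level defines itself, not what it inherits.
            func = vars(klass).get(hook)
            if func is not None:
                func(obj)


def make_hook(label):
    def hook(obj):
        obj.trace.append(label)
    return hook


class Base:
    py_del = make_hook("Base.py_del")
    c_finalize = make_hook("Base.c_finalize")
    c_dealloc = make_hook("Base.c_dealloc")

    def __init__(self):
        self.trace = []


class Sub(Base):
    py_del = make_hook("Sub.py_del")
    c_finalize = make_hook("Sub.c_finalize")
    c_dealloc = make_hook("Sub.c_dealloc")


obj = Sub()
three_step_cleanup(obj)
```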
This would allow all three call chains to run in subtype-before-supertype
order, and execute the more sensitive Python methods before the low-level
"I know what I'm doing" C finalisers.
> If you want cleanup_resources() to be called always (despite
> A.__del__() raising), you have to use a try/finally block. There's no
> magic here.
>
> Letting the user call the upper finalizer explicitly lets them choose
> their preferred form of exception handling.
I'm ok with that. I'd just prefer it if each level didn't have to execute
some kind of boilerplate setup+teardown code. The situation differs
depending on whether the finaliser is called from a subtype or as the main
entry point for finalisation.
I guess that's mostly just an optimisation, though. At least Cython could
do that internally by wrapping the finaliser in a boilerplate function
before sticking it into the tp_finalize slot, and otherwise call it
directly if it's known at compile time.
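A rough Python model of that wrapping idea, assuming nothing about what Cython actually generates. The names as_slot_entry, base_finalize_raw and sub_finalize_raw are hypothetical, and the setup/teardown counters stand in for work such as saving and restoring the exception state. Only the function placed in the slot pays for setup and teardown; calls between levels go straight to the raw finalisers.

```python
import functools

events = []
stats = {"setup": 0, "teardown": 0}


def as_slot_entry(raw_finalizer):
    """Wrap a raw finaliser with one-time setup and teardown."""
    @functools.wraps(raw_finalizer)
    def entry(obj):
        stats["setup"] += 1  # stands in for e.g. saving the exception state
        try:
            raw_finalizer(obj)
        finally:
            stats["teardown"] += 1  # stands in for restoring it
    return entry


def base_finalize_raw(obj):
    events.append("base")


def sub_finalize_raw(obj):
    events.append("sub")
    base_finalize_raw(obj)  # target known at compile time: direct call, no wrapper


# Only the main entry point is wrapped before going into the slot.
sub_slot = as_slot_entry(sub_finalize_raw)
sub_slot(object())
```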
Stefan