PEP 442 clarification for type hierarchies (Python-Dev)

Stefan Behnel

4 Aug 2013

7:23 a.m.

Hi,

I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation.

One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended? Couldn't the super type call chain be made a part of the protocol?

Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state.

http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize

Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level.

So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't just set up the execution environment and then call all finalisers of the type hierarchy in bottom-up order.

Stefan

Stefan Behnel

4 Aug

1:24 p.m.

Stefan Behnel, 04.08.2013 09:23:

...

I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation.

One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended? Couldn't the super type call chain be made a part of the protocol?

Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state.

http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize

Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level.

So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't just set up the execution environment and then call all finalisers of the type hierarchy in bottom-up order.

I continued my implementation and found that calling up the base type hierarchy is essentially the same code as calling up the hierarchy for tp_dealloc(), so that was easy to adapt to in Cython and is also more efficient than a generic loop (because it can usually benefit from inlining). So I'm personally ok with leaving the super type calling code to the user side, even though manual implementers may not be entirely happy.

I think it should get explicitly documented how subtypes should deal with a tp_finalize() in (one of the) super types. It's not entirely trivial because the tp_finalize slot is not guaranteed to be filled for a super type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that PyType_Ready() copies it would still hold, though.

For reference, my initial implementation in Cython is here:

https://github.com/cython/cython/commit/6fdb49bd84192089c7e742d46594b59ad643...

I'm currently running Cython's test suite against it to see if everything broke along the way. Will report back as soon as I get everything working.

Stefan

Stefan Behnel

3:59 p.m.

Stefan Behnel, 04.08.2013 15:24:

...

Stefan Behnel, 04.08.2013 09:23:

...
I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation.

One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended? Couldn't the super type call chain be made a part of the protocol?

Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state.

http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize

Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level.

So, essentially, I'm wondering whether PyObject_CallFinalizer() couldn't just set up the execution environment and then call all finalisers of the type hierarchy in bottom-up order.

I continued my implementation and found that calling up the base type hierarchy is essentially the same code as calling up the hierarchy for tp_dealloc(), so that was easy to adapt to in Cython and is also more efficient than a generic loop (because it can usually benefit from inlining). So I'm personally ok with leaving the super type calling code to the user side, even though manual implementers may not be entirely happy.

I think it should get explicitly documented how subtypes should deal with a tp_finalize() in (one of the) super types. It's not entirely trivial because the tp_finalize slot is not guaranteed to be filled for a super type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that PyType_Ready() copies it would still hold, though.

Hmm, it seems to me by now that the only safe way of handling this is to let each tp_dealloc() level in the hierarchy call tp_finalize() through PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc() functions in base types and subtypes.

However, that appears like a rather cumbersome and inefficient design. It also somewhat counters the advantage of having a finalisation step before deallocation, if the finalisers are only called after (partially) cleaning up the subtypes.

ISTM that this feature hasn't been fully thought out...

Stefan

Antoine Pitrou

5 Aug

6:56 p.m.

On Sun, 04 Aug 2013 17:59:57 +0200 Stefan Behnel wrote:

...

...
I continued my implementation and found that calling up the base type hierarchy is essentially the same code as calling up the hierarchy for tp_dealloc(), so that was easy to adapt to in Cython and is also more efficient than a generic loop (because it can usually benefit from inlining). So I'm personally ok with leaving the super type calling code to the user side, even though manual implementers may not be entirely happy.

I think it should get explicitly documented how subtypes should deal with a tp_finalize() in (one of the) super types. It's not entirely trivial because the tp_finalize slot is not guaranteed to be filled for a super type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that PyType_Ready() copies it would still hold, though.

Not only could it be NULL (if no upper type has a finalizer), but it could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in tp_flags). If an API is needed to make this easier, then why not. But I'm not sure anyone other than Cython really has such concerns. Usually, the class hierarchy for C extension types is known at compile-time and therefore you know exactly which upper finalizers to call.

...

Hmm, it seems to me by now that the only safe way of handling this is to let each tp_dealloc() level in the hierarchy call tp_finalize() through PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc() functions in base types and subtypes.

I'm not following you. Why is it "a bit fragile" to call the base tp_finalize from a derived tp_finalize? It should actually be totally safe, since tp_finalize is a regular function called in a safe environment (unlike tp_dealloc and tp_del). Regards Antoine.

Stefan Behnel

7:32 p.m.

Antoine Pitrou, 05.08.2013 20:56:

...

On Sun, 04 Aug 2013 17:59:57 +0200 Stefan Behnel wrote:

...
...
I continued my implementation and found that calling up the base type hierarchy is essentially the same code as calling up the hierarchy for tp_dealloc(), so that was easy to adapt to in Cython and is also more efficient than a generic loop (because it can usually benefit from inlining). So I'm personally ok with leaving the super type calling code to the user side, even though manual implementers may not be entirely happy.

I think it should get explicitly documented how subtypes should deal with a tp_finalize() in (one of the) super types. It's not entirely trivial because the tp_finalize slot is not guaranteed to be filled for a super type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that PyType_Ready() copies it would still hold, though.

Not only could it be NULL (if no upper type has a finalizer), but it could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in tp_flags). If an API is needed to make this easier, then why not. But I'm not sure anyone other than Cython really has such concerns. Usually, the class hierarchy for C extension types is known at compile-time and therefore you know exactly which upper finalizers to call.

Well, you shouldn't have to, though. Otherwise, it would be practically impossible to insert a new finaliser into an existing hierarchy once other people/projects have started inheriting from it. And, sure, Cython makes these things so easy that people actually do them. The Sage math system has type hierarchies that go up to 10 levels deep IIRC. That's a lot of space for future changes.

...

...
Hmm, it seems to me by now that the only safe way of handling this is to let each tp_dealloc() level in the hierarchy call tp_finalize() through PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc() functions in base types and subtypes.

I think I got confused here. PyObject_CallFinalizerFromDealloc() works on the object, not the type. So it can't be used to call anything but the bottom-most tp_finalize().

...

I'm not following you. Why is it "a bit fragile" to call the base tp_finalize from a derived tp_finalize? It should actually be totally safe, since tp_finalize is a regular function called in a safe environment (unlike tp_dealloc and tp_del).

As long as there is no OWTDI (one way to do it), you can't really make safe assumptions about the way a super type's tp_finalize() and tp_dealloc() play together. The details definitely need to be spelled out here.

Stefan

Antoine Pitrou

7:44 p.m.

On Mon, 05 Aug 2013 21:32:54 +0200 Stefan Behnel wrote:

...

...
...
Hmm, it seems to me by now that the only safe way of handling this is to let each tp_dealloc() level in the hierarchy call tp_finalize() through PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc() functions in base types and subtypes.

I think I got confused here. PyObject_CallFinalizerFromDealloc() works on the object, not the type. So it can't be used to call anything but the bottom-most tp_finalize().

Well, the bottom-most tp_finalize() is responsible for calling the upper ones, if it wants to.

...

...
I'm not following you. Why is it "a bit fragile" to call the base tp_finalize from a derived tp_finalize? It should actually be totally safe, since tp_finalize is a regular function called in a safe environment (unlike tp_dealloc and tp_del).

As long as there is no OWTDI (one way to do it), you can't really make safe assumptions about the way a super type's tp_finalize() and tp_dealloc() play together. The details definitely need to be spelled out here.

I'd be glad to make the spec more explicit if needed, but first you need to tell me if the current behaviour is ok, or if you need something else (within the boundaries of backwards compatibility and reasonable expectations, though: i.e. no implicit recursion through the __mro__). Regards Antoine.

Antoine Pitrou

6:51 p.m.

Hi,

On Sun, 04 Aug 2013 09:23:41 +0200 Stefan Behnel wrote:

...

I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation.

That's great to hear. "Safe execution environment" for finalization code is exactly what the PEP is about.

...

One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended?

Yes, it is intended that users have to call super().__del__() in their __del__ implementation, if they want to call the upper-level finalizer. This is exactly the same as in __init__() and (most?) other special functions.
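As a rough illustration of the explicit chaining described above (a sketch that is not from the thread; the class names `Base`, `Silent` and `Chained` are hypothetical, and the immediate finalisation on `del` relies on CPython's reference counting):

```python
import gc

events = []  # records which finalizers actually ran


class Base:
    def __del__(self):
        events.append("Base.__del__")


class Silent(Base):
    def __del__(self):
        # No super() call here, so Base.__del__ never runs for this object.
        events.append("Silent.__del__")


class Chained(Base):
    def __del__(self):
        events.append("Chained.__del__")
        # Explicit call to the upper-level finalizer, as with __init__().
        super().__del__()


def run_demo():
    events.clear()
    silent = Silent()
    del silent      # CPython finalizes as soon as the last reference goes away
    chained = Chained()
    del chained
    gc.collect()    # harmless on CPython; nudges other implementations
    return list(events)


if __name__ == "__main__":
    print(run_demo())
```

Note that `Base.__del__` does not call `super().__del__()` itself, because `object` defines no `__del__` method.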

...

Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state.

http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize

Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level.

I didn't give much thought to this detail. Originally I was simply copying this bit of semantics from tp_dealloc and tp_del, but indeed we could do better. Do you want to open an issue about it? Regards Antoine.

Stefan Behnel

7:03 p.m.

Hi,

I was just continuing in my monologue when you replied, so I'll just drop my response below.

Antoine Pitrou, 05.08.2013 20:51:

...

On Sun, 04 Aug 2013 09:23:41 +0200 Stefan Behnel wrote:

...
I'm currently catching up on PEP 442, which managed to fly completely below my radar so far. It's a really helpful change that could end up fixing a major usability problem that Cython was suffering from: user provided deallocation code now has a safe execution environment (well, at least in Py3.4+). That makes Cython a prime candidate for testing this, and I've just started to migrate the implementation.

That's great to hear. "Safe execution environment" for finalization code is exactly what the PEP is about.

...
One thing that I found to be missing from the PEP is inheritance handling. The current implementation doesn't seem to care about base types at all, so it appears to be the responsibility of the type to call its super type finalisation function. Is that really intended?

Yes, it is intended that users have to call super().__del__() in their __del__ implementation, if they want to call the upper-level finalizer. This is exactly the same as in __init__() and (most?) other special functions.

That's the Python side of things. However, if a subtype overwrites tp_finalize(), then there should be a protocol for making sure the super type's tp_finalize() is called, and that it's being called in the right kind of execution environment.

...

...
Another bit is the exception handling. According to the documentation, tp_finalize() is supposed to first save the current exception state, then do the cleanup, then call WriteUnraisable() if necessary, then restore the exception state.

http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize

Is there a reason why this is left to the user implementation, rather than doing it generically right in PyObject_CallFinalizer() ? That would also make it more efficient to call through the super type hierarchy, I guess. I don't see a need to repeat this exception state swapping at each level.

I didn't give much thought to this detail. Originally I was simply copying this bit of semantics from tp_dealloc and tp_del, but indeed we could do better. Do you want to open an issue about it?

I think the main problem I have with the PEP is this part:

"""
The PEP doesn't change the semantics of:

* C extension types with a custom tp_dealloc function.
"""

Meaning, it was designed to explicitly ignore this use case. That's a mistake, IMHO. If we are to add a new finalisation protocol, why not make it work in the general case (i.e. fix the problem once and for all), instead of restricting it to a special case and leaving the rest to each user to figure out again?

Separating the finalisation from the deallocation is IMO a good idea. It fixes cyclic garbage collection, that's excellent. And it removes the differences between GC and normal refcounting cleanup by clarifying in what states finalisation and deallocation are executed (one safe, one not).

I think what's missing is the following.

* split deallocation into a distinct finalisation and deallocation phase
* make finalisation run recursively (or iteratively) through all inheritance levels, in a well defined execution environment (e.g. after saving away the exception state)
* after successful finalisation, run the deallocation, as before.

My guess is that a recursive finalisation phase where subtype code calls into the supertype is generally more efficient, so I think I'd prefer that.

I think it's a mistake that the current implementation calls the finalisation from tp_dealloc(). Instead, both the finalisation and the deallocation should be called externally and independently from the cleanup mechanism behind Py_DECREF(). (There is no problem in CS that can't be solved by adding another level of indirection...)

An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised. One way to handle this could be to simply assume that the deallocation phase would still clean up anything that's left over. Or the protocol could dictate that each level must swallow its own exceptions and call the super type finaliser with a clean exception state. This might suggest that an external iterative call loop through the finaliser hierarchy has a usability advantage over recursive calls.

Just dropping my idea here.

Stefan
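The "external iterative call loop" idea above can be modelled in pure Python. This is only a sketch under stated assumptions: `call_finalizers` and the per-class `finalize` hook are hypothetical names invented for illustration, not part of PEP 442 or the C API, and the loop stands in for what a generic driver might do at the C level.

```python
def call_finalizers(obj):
    """Hypothetical driver: run each level's own finalizer, most derived first.

    Each level starts with a clean slate. An exception raised at one level is
    recorded and does not stop the remaining levels from being finalized.
    """
    errors = []
    for klass in type(obj).__mro__:
        # Look only at finalizers defined directly on this class, so an
        # inherited one is not run a second time.
        finalizer = vars(klass).get("finalize")
        if finalizer is None:
            continue
        try:
            finalizer(obj)
        except Exception as exc:
            errors.append((klass.__name__, type(exc).__name__))
    return errors


log = []


class Base:
    def finalize(self):
        log.append("Base")


class Middle(Base):
    def finalize(self):
        raise RuntimeError("Middle failed")


class Leaf(Middle):
    def finalize(self):
        log.append("Leaf")


if __name__ == "__main__":
    print(call_finalizers(Leaf()), log)
```

With a loop like this, no level needs its own setup and teardown code, at the cost of taking the call order out of the hands of the individual types.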

Antoine Pitrou

7:26 p.m.

On Mon, 05 Aug 2013 21:03:33 +0200 Stefan Behnel wrote:

...

I think the main problem I have with the PEP is this part:

""" The PEP doesn't change the semantics of: * C extension types with a custom tp_dealloc function. """

Meaning, it was designed to explicitly ignore this use case.

It doesn't ignore it. It lets you fix your C extension type to use tp_finalize for resource finalization. It also provides the PyObject_CallFinalizerFromDealloc() API function to make it easier to call tp_finalize from your tp_dealloc. What the above sentence means is that, if you don't change your legacy tp_dealloc, your type will not take advantage of the new facilities. (you can take a look at the _io module; it was modified to take advantage of tp_finalize)

...

* make finalisation run recursively (or iteratively) through all inheritance levels, in a well defined execution environment (e.g. after saving away the exception state)

__init__ and other methods only let the user recurse explicitly. __del__ would be a weird exception if it recursed implicitly. Also, it would break backwards compatibility for existing uses of __del__.

...

I think it's a mistake that the current implementation calls the finalisation from tp_dealloc(). Instead, both the finalisation and the deallocation should be called externally and independently from the cleanup mechanism behind Py_DECREF(). (There is no problem in CS that can't be solved by adding another level of indirection...)

Why not, but I'm not sure that would solve anything on your side. If it does, would you like to cook a patch? I wonder if there's some unexpected issue with doing what you're proposing.

...

An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised.

Let's come back to pure Python:

    class A:
        def __del__(self):
            1/0

    class B(A):
        def __del__(self):
            super().__del__()
            self.cleanup_resources()

If you want cleanup_resources() to be called always (despite A.__del__() raising), you have to use a try/finally block. There's no magic here.

Letting the user call the upper finalizer explicitly lets them choose their preferred form of exception handling.

Regards

Antoine.
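A runnable variant of the try/finally approach mentioned above might look like the following. It is a sketch, not code from the thread: the `events` and `unraisable` lists are illustrative helpers, and it assumes Python 3.8 or later, where `sys.unraisablehook` receives exceptions that escape from `__del__`.

```python
import sys

events = []      # cleanup steps that actually ran
unraisable = []  # exception types reported by the interpreter


class A:
    def __del__(self):
        1 / 0  # simulated failure in the base finalizer


class B(A):
    def __del__(self):
        try:
            super().__del__()
        finally:
            # Runs even though A.__del__() raised.
            events.append("B resources cleaned up")


def run_demo():
    events.clear()
    unraisable.clear()

    def hook(info):
        # Store only the type; keeping the exception value alive can
        # create reference cycles.
        unraisable.append(info.exc_type)

    old_hook = sys.unraisablehook
    sys.unraisablehook = hook
    try:
        b = B()
        del b  # CPython runs B.__del__ here via reference counting
    finally:
        sys.unraisablehook = old_hook
    return list(events), list(unraisable)


if __name__ == "__main__":
    print(run_demo())
```

The exception still escapes `B.__del__()` after the `finally` block has run, so the interpreter reports it as unraisable rather than silencing it inside the finalizer.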

Stefan Behnel

8:30 p.m.

Antoine Pitrou, 05.08.2013 21:26:

...

On Mon, 05 Aug 2013 21:03:33 +0200 Stefan Behnel wrote:

...
I think the main problem I have with the PEP is this part:

""" The PEP doesn't change the semantics of: * C extension types with a custom tp_dealloc function. """

Meaning, it was designed to explicitly ignore this use case.

It doesn't ignore it. It lets you fix your C extension type to use tp_finalize for resource finalization. It also provides the PyObject_CallFinalizerFromDealloc() API function to make it easier to call tp_finalize from your tp_dealloc.

What the above sentence means is that, if you don't change your legacy tp_dealloc, your type will not take advantage of the new facilities.

(you can take a look at the _io module; it was modified to take advantage of tp_finalize)

Oh, I'm aware of the backwards compatibility, but I totally *want* to take advantage of the new feature.

...

...
* make finalisation run recursively (or iteratively) through all inheritance levels, in a well defined execution environment (e.g. after saving away the exception state)

__init__ and other methods only let the user recurse explicitly. __del__ would be a weird exception if it recursed implicitly. Also, it would break backwards compatibility for existing uses of __del__.

Hmm, it's a bit unfortunate that tp_finalize() maps so directly to __del__(), but I think this can be fixed. In any case, each tp_finalize() function must only ever be called once, so if a subtype inherited the tp_finalize() slot from its parent, it mustn't be called again. Instead, the hierarchy would be followed upwards to search for the next tp_finalize() that's different from the current one, i.e. the function pointer differs. That means that only the top-most super type would need to call __del__(), after all tp_finalize() functions in subtypes have run.

...

...
I think it's a mistake that the current implementation calls the finalisation from tp_dealloc(). Instead, both the finalisation and the deallocation should be called externally and independently from the cleanup mechanism behind Py_DECREF(). (There is no problem in CS that can't be solved by adding another level of indirection...)

Why not, but I'm not sure that would solve anything on your side.

Well, it would untangle both phases and make it clearer what needs to call what. If I call my super type's tp_dealloc(), does it need to call tp_finalize() or not? And if it does: the type's one or just its own one? If both phases are split, then the answer is simply: it doesn't and mustn't, because that's already been taken care of. All it has to care about is what it's there for: deallocation.

...

If it does, would you like to cook a patch? I wonder if there's some unexpected issue with doing what you're proposing.

I was somehow expecting this question. There's still the open issue of module initialisation, though. Not sure which is more important. Given that this feature has already been merged, I guess it's better to fix it up before people start making actual use of it, instead of putting work into a new feature that's not even started yet.

...

...
An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised.

Let's come back to pure Python:

    class A:
        def __del__(self):
            1/0

    class B(A):
        def __del__(self):
            super().__del__()
            self.cleanup_resources()

What makes you think it's a good idea to call the parent type's finaliser before doing the local finalisation, and not the other way round? What if the subtype needs access to parts of the super type for its cleanup?

In other words, which makes more sense (at the C level):

    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()

Should that order be part of the protocol or not? (well, not for __del__() I guess, but maybe for tp_finalize()?)

Coming back to the __del__() vs. tp_finalize() story, if tp_finalize() first recursed into the super types, the top-most one of which then calls __del__() and returns, we'd get an execution order that runs Python-level __del__() methods before C-level tp_finalize() functions, but lose the subtype-before-supertype execution order for tp_finalize() functions.

That might call for a three-step cleanup:

1) run all Python __del__() methods recursively
2) run all tp_finalize() functions recursively
3) run tp_dealloc() recursively

This would allow all three call chains to run in subtype-before-supertype order, and execute the more sensitive Python methods before the low-level "I know what I'm doing" C finalisers.
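The ordering question can be made concrete with a small pure-Python model. Everything here is hypothetical and only meant as a sketch: `Connection`, `Buffered`, and the `finalize` method stand in for a supertype resource, a subtype that depends on it, and a tp_finalize-style hook.

```python
class Connection:
    """Toy resource owned by the base type."""

    def __init__(self):
        self.is_open = True
        self.sent = []

    def send(self, data):
        if not self.is_open:
            raise RuntimeError("connection already closed")
        self.sent.append(data)

    def close(self):
        self.is_open = False


class Base:
    def __init__(self, conn):
        self.conn = conn

    def finalize(self):
        self.conn.close()


class Buffered(Base):
    """Subtype whose cleanup needs the supertype's connection."""

    def __init__(self, conn):
        super().__init__(conn)
        self.buffer = ["pending"]

    def flush(self):
        while self.buffer:
            self.conn.send(self.buffer.pop(0))


class LocalFirst(Buffered):
    def finalize(self):
        try:
            self.flush()            # local cleanup while the base is intact
        finally:
            super().finalize()      # then let the base release its resource


class SuperFirst(Buffered):
    def finalize(self):
        try:
            super().finalize()      # base closes the connection first
        finally:
            self.flush()            # too late: send() raises RuntimeError
```

In this model only the subtype-before-supertype order lets the buffered data reach the connection, which is the case the question about "access to parts of the super type" is concerned with.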

...

If you want cleanup_resources() to be called always (despite A.__del__() raising), you have to use a try/finally block. There's no magic here.

Letting the user call the upper finalizer explicitly lets them choose their preferred form of exception handling.

I'm ok with that. I'd just prefer it if each level didn't have to execute some kind of boilerplate setup+teardown code. It's a different situation if the finaliser is being called from a subtype or if it's being called as the main entry point for finalisation.

I guess that's mostly just an optimisation, though. At least Cython could do that internally by wrapping the finaliser in a boilerplate function before sticking it into the tp_finalize slot, and otherwise call it directly if it's known at compile time.

Stefan

Antoine Pitrou

6 Aug

12:12 p.m.

Le Mon, 05 Aug 2013 22:30:29 +0200, Stefan Behnel a écrit :

...

Hmm, it's a bit unfortunate that tp_finalize() maps so directly to __del__(), but I think this can be fixed. In any case, each tp_finalize() function must only ever be called once, so if a subtype inherited the tp_finalize() slot from its parent, it mustn't be called again.

This is already dealt with by a custom bit in the GC header (cf. _PyGC_IS_FINALIZED, IIRC).
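The observable effect of that per-instance "already finalized" bit can be seen from pure Python. The following is a sketch, not code from the thread; the `Phoenix` class is a hypothetical example, and the behaviour shown is what CPython 3.4 and later are expected to do under PEP 442.

```python
import gc

calls = []      # one entry per __del__ invocation
survivors = []  # used to resurrect the object from its finalizer


class Phoenix:
    def __del__(self):
        calls.append("__del__")
        survivors.append(self)  # resurrect: a new reference now exists


def run_demo():
    calls.clear()
    survivors.clear()
    obj = Phoenix()
    del obj                     # finalizer runs once and resurrects the object
    after_first_drop = len(calls)
    survivors.clear()           # drop the last reference a second time
    gc.collect()
    return after_first_drop, len(calls)


if __name__ == "__main__":
    print(run_demo())
```

On CPython 3.4+ the second drop should deallocate the object without running `__del__` again, so both counts stay at one.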

...

...
...
An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised.

Let's come back to pure Python:

    class A:
        def __del__(self):
            1/0

    class B(A):
        def __del__(self):
            super().__del__()
            self.cleanup_resources()

What makes you think it's a good idea to call the parent type's finaliser before doing the local finalisation, and not the other way round? What if the subtype needs access to parts of the super type for its cleanup?

I'm not saying it's a good idea. I'm just saying that to reason about the C API, it is a good idea to reason about equivalent pure Python code. Since exceptions aren't implicitly silenced in pure Python code, they probably shouldn't in C code.

...

In other words, which makes more sense (at the C level):

    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()

Should that order be part of the protocol or not? (well, not for __del__() I guess, but maybe for tp_finalize()?)

No, it is left to the user's preference. Since tp_finalize() is meant to be equivalent to __del__(), I think it's better if the protocols aren't subtly different (to the extent to which it is possible, of course).

...

Coming back to the __del__() vs. tp_finalize() story, if tp_finalize() first recursed into the super types, the top-most one of which then calls __del__() and returns, we'd get an execution order that runs Python-level __del__() methods before C-level tp_finalize() functions, but lose the subtype-before-supertype execution order for tp_finalize() functions.

Well... to get that, you'd have to subclass a pure Python class with a C extension type. Does that ever happen?

...

That might call for a three-step cleanup:

1) run all Python __del__() methods recursively
2) run all tp_finalize() functions recursively
3) run tp_dealloc() recursively

I don't see any reason why tp_finalize should be distinct from __del__, while e.g. __init__ and tp_init map to the exact same thing.

(you might wonder why tp_finalize isn't called tp_del, but that's because there is already something named tp_del - something which is obsoleted by PEP 442, incidentally ;-)).

Regards

Antoine.

Stefan Behnel

3:18 p.m.

Antoine Pitrou, 06.08.2013 14:12:

...

Le Mon, 05 Aug 2013 22:30:29 +0200, Stefan Behnel a écrit :

...
Hmm, it's a bit unfortunate that tp_finalize() maps so directly to __del__(), but I think this can be fixed. In any case, each tp_finalize() function must only ever be called once, so if a subtype inherited the tp_finalize() slot from its parent, it mustn't be called again.

This is already dealt with by a custom bit in the GC header (cf. _PyGC_IS_FINALIZED, IIRC).

But that's only at an instance level. If a type in the hierarchy inherited the slot function for tp_finalize() from its parent, then the child must skip its parent in the call chain to prevent calling the same slot function twice. No instance flag can help you here.
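The type-level problem can be sketched in pure Python by treating attribute lookup on a class as a stand-in for an inherited slot. This is a model under stated assumptions, not the C implementation: `finalize`, `naive_chain` and `distinct_chain` are hypothetical names, and comparing consecutive entries is only sufficient for single inheritance.

```python
log = []


class Base:
    def finalize(self):
        log.append("Base")


class Middle(Base):
    pass  # inherits Base.finalize, like a slot copied from the parent type


class Leaf(Middle):
    def finalize(self):
        log.append("Leaf")


def naive_chain(obj):
    """Calls whatever each level exposes, so an inherited function runs twice."""
    for klass in type(obj).__mro__:
        func = getattr(klass, "finalize", None)
        if func is not None:
            func(obj)


def distinct_chain(obj):
    """Skips a level whose function is identical to the one just called."""
    previous = None
    for klass in type(obj).__mro__:
        func = getattr(klass, "finalize", None)
        if func is None or func is previous:
            continue
        func(obj)
        previous = func
```

Here `Middle.finalize is Base.finalize` holds, which mirrors the "function pointer differs" check: only a comparison of the functions themselves, not a flag on the instance, tells the two levels apart.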

...

...
...
...
An obvious open question is how to deal with exceptions during finalisation. Any break in the execution chain would mean that a part of the type wouldn't be finalised.

Let's come back to pure Python:

    class A:
        def __del__(self):
            1/0

    class B(A):
        def __del__(self):
            super().__del__()
            self.cleanup_resources()

What makes you think it's a good idea to call the parent type's finaliser before doing the local finalisation, and not the other way round? What if the subtype needs access to parts of the super type for its cleanup?

I'm not saying it's a good idea. I'm just saying that to reason about the C API, it is a good idea to reason about equivalent pure Python code. Since exceptions aren't implicitly silenced in pure Python code, they probably shouldn't in C code.

...
In other words, which makes more sense (at the C level):

    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()

Should that order be part of the protocol or not? (well, not for __del__() I guess, but maybe for tp_finalize()?)

No, it is left to the user's preference. Since tp_finalize() is meant to be equivalent to __del__(), I think it's better if the protocols aren't subtly different (to the extent to which it is possible, of course).

Ok, fine with me. If the calls are done recursively anyway, then the child can decide when to call into its parent.
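
The two orderings discussed above can be compared in a small pure-Python sketch. It is only an illustration: the method name finalize() is a hypothetical stand-in for a C-level tp_finalize slot, and local_cleanup() is the placeholder name used in the pseudo-code. In both variants the try/finally makes sure the second step still runs when the first one raises, and the exception is not silenced.

```python
class Base:
    def __init__(self):
        self.log = []

    def finalize(self):
        # Hypothetical stand-in for the parent type's tp_finalize.
        self.log.append("Base.finalize")


class ParentFirst(Base):
    def finalize(self):
        # Variant 1: parent finaliser first, local cleanup afterwards.
        try:
            super().finalize()
        finally:
            self.local_cleanup()

    def local_cleanup(self):
        self.log.append("ParentFirst.local_cleanup")


class LocalFirst(Base):
    def finalize(self):
        # Variant 2: local cleanup first, parent finaliser afterwards.
        try:
            self.local_cleanup()
        finally:
            super().finalize()

    def local_cleanup(self):
        self.log.append("LocalFirst.local_cleanup")
        raise RuntimeError("local cleanup failed")


def run_finalize(obj):
    # Record whether the exception propagated out of finalize().
    try:
        obj.finalize()
    except RuntimeError as exc:
        obj.log.append("propagated: %s" % exc)
    return obj.log


parent_first_log = run_finalize(ParentFirst())
local_first_log = run_finalize(LocalFirst())
```

In the LocalFirst case the parent finaliser still runs although local_cleanup() raised, and the error reaches the caller afterwards.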

...

...
Coming back to the __del__() vs. tp_finalize() story, if tp_finalize() first recursed into the super types, the top-most one of which then calls __del__() and returns, we'd get an execution order that runs Python-level __del__() methods before C-level tp_finalize() functions, but lose the subtype-before-supertype execution order for tp_finalize() functions.

Well... to get that, you'd have to subclass a pure Python class with a C extension type.

Maybe I'm wrong here. It's the default implementation of tp_finalize() that calls __del__, right? If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first? The one of the Python class or the one of the extension type? Stefan

Antoine Pitrou

3:49 p.m.

On Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel wrote:

...

Antoine Pitrou, 06.08.2013 14:12:

...
On Mon, 05 Aug 2013 22:30:29 +0200, Stefan Behnel wrote:

...
Hmm, it's a bit unfortunate that tp_finalize() maps so directly to __del__(), but I think this can be fixed. In any case, each tp_finalize() function must only ever be called once, so if a subtype inherited the tp_finalize() slot from its parent, it mustn't be called again.

This is already dealt with by a custom bit in the GC header (cf. _PyGC_IS_FINALIZED, IIRC).

But that's only at an instance level. If a type in the hierarchy inherited the slot function for tp_finalize() from its parent, then the child must skip its parent in the call chain to prevent calling the same slot function twice. No instance flag can help you here.

Ah, sorry. I had misunderstood what you were talking about. Yes, you're right, a tp_finalize implementation should avoid calling itself recursively. If there's some C API that can be added to ease it, I'm OK with adding it.
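
The "skip the inherited slot" problem can be modelled in pure Python. This is a rough sketch, not an existing C API: FakeType, call_next_finalize() and the finalize_* functions are hypothetical names, and the slot inheritance is simulated by copying the parent's function when a type does not provide its own.

```python
class FakeType:
    """Hypothetical stand-in for a C type object with a tp_finalize slot."""

    def __init__(self, name, base=None, tp_finalize=None):
        self.name = name
        self.base = base
        # Simulated slot inheritance: reuse the parent's function if unset.
        if tp_finalize is None and base is not None:
            tp_finalize = base.tp_finalize
        self.tp_finalize = tp_finalize


def call_next_finalize(obj_type, current_func, log):
    tp = obj_type
    # Walk up to the first type in the chain that uses current_func.
    while tp is not None and tp.tp_finalize is not current_func:
        tp = tp.base
    # Skip every type that merely inherited current_func.
    while tp is not None and tp.tp_finalize is current_func:
        tp = tp.base
    # Call the next distinct slot function, if there is one.
    if tp is not None and tp.tp_finalize is not None:
        tp.tp_finalize(obj_type, log)


def finalize_a(obj_type, log):
    log.append("finalize_a")
    call_next_finalize(obj_type, finalize_a, log)


def finalize_c(obj_type, log):
    log.append("finalize_c")
    call_next_finalize(obj_type, finalize_c, log)


A = FakeType("A", tp_finalize=finalize_a)
B = FakeType("B", base=A)                      # inherits finalize_a
C = FakeType("C", base=B, tp_finalize=finalize_c)


def run(obj_type):
    log = []
    obj_type.tp_finalize(obj_type, log)
    return log


log_for_c = run(C)
log_for_b = run(B)
```

Although both A and B carry finalize_a in their slot, it runs only once per object in this model, because the walk compares function identity rather than relying on a per-instance flag.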

...

Maybe I'm wrong here. It's the default implementation of tp_finalize() that calls __del__, right?

Yes.

...

If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first?

Then only the Python __del__ gets called. It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called. Regards Antoine.

Stefan Behnel

4:38 p.m.

Antoine Pitrou, 06.08.2013 17:49:

...

On Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel wrote:

...
If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first?

Then only the Python __del__ gets called. It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called.

Ok, but then all I have to do in order to disable C level finalisation for a type is to inherit from it and provide an empty __del__ method. I think that disqualifies the feature for use in Cython. Finalisation at the Python level is nice, but at the C level it's usually vital. I had originally read this PEP as a way to get better guarantees than what dealloc can provide, but your above statement makes it rather the opposite. Stefan
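
The effect described here can be reproduced in pure Python. In this minimal sketch a plain Python base class stands in for the extension type (an assumption made for illustration; the class names are hypothetical), and immediate finalisation on del relies on CPython's reference counting.

```python
import gc

closed = []


class Resource:
    """Stand-in for an extension type whose finaliser does vital cleanup."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        closed.append(self.name)


class Silent(Resource):
    # An empty override: the base class finaliser is never reached.
    def __del__(self):
        pass


class Cooperative(Resource):
    # Explicitly chains to the base class finaliser.
    def __del__(self):
        super().__del__()


for cls, name in [(Resource, "plain"), (Silent, "silent"), (Cooperative, "cooperative")]:
    obj = cls(name)
    del obj

gc.collect()  # only matters on implementations without reference counting
```

Only "plain" and "cooperative" end up in closed; the Silent instance is finalised without the base class cleanup ever running.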

Antoine Pitrou

5:29 p.m.

On Tue, 06 Aug 2013 18:38:51 +0200 Stefan Behnel wrote:

...

Antoine Pitrou, 06.08.2013 17:49:

...
On Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel wrote:

...
If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first?

Then only the Python __del__ gets called. It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called.

Ok, but then all I have to do in order to disable C level finalisation for a type is to inherit from it and provide an empty __del__ method.

I think that disqualifies the feature for use in Cython. Finalisation at the Python level is nice, but at the C level it's usually vital. I had originally read this PEP as a way to get better guarantees than what dealloc can provide, but your above statement makes it rather the opposite.

Anything vital should probably be ensured by tp_dealloc. For example, you might close an fd early in tp_finalize, but also ensure it gets closed in tp_dealloc in case tp_finalize wasn't called. (that said, you can also have fd leaks in pure Python...) Regards Antoine.
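
Pure Python has no equivalent of tp_dealloc, so the "close early, but also guarantee the close" idea can only be approximated there. The sketch below swaps in weakref.finalize as the fallback, which is an assumption of this example and not what is proposed above; FdHolder, Careless and the helper functions are hypothetical names.

```python
import gc
import os
import weakref


def is_open(fd):
    # os.fstat() raises OSError (EBADF) for a closed descriptor.
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


class FdHolder:
    def __init__(self, fd):
        self.fd = fd
        # Fallback close. The callback must not reference self,
        # otherwise the object would be kept alive.
        self._closer = weakref.finalize(self, os.close, fd)

    def __del__(self):
        # Early close; calling the finalizer marks it dead, so the
        # descriptor is closed at most once.
        self._closer()


class Careless(FdHolder):
    # Overrides __del__ without chaining to the base class.
    def __del__(self):
        pass


def still_open_after_delete(cls):
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    holder = cls(read_fd)
    del holder
    gc.collect()  # only matters on implementations without reference counting
    return is_open(read_fd)


base_leaks = still_open_after_delete(FdHolder)
careless_leaks = still_open_after_delete(Careless)
```

In both cases the descriptor ends up closed: through __del__ for FdHolder, and through the weakref.finalize callback for Careless, whose __del__ does nothing.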

Stefan Behnel

8 Aug 2013

4:08 a.m.

Stefan Behnel, 06.08.2013 18:38:

...

Antoine Pitrou, 06.08.2013 17:49:

...
On Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel wrote:

...
If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first?

Then only the Python __del__ gets called. It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called.

Ok, but then all I have to do in order to disable C level finalisation for a type is to inherit from it and provide an empty __del__ method.

Oh, and if the Python subtype calls super().__del__() twice, then there is no longer a guarantee that the finalisers only get executed once, right? I think it's time for at least a very visible warning in the docs that the behaviour is only 'guaranteed' for types that cannot be subtyped from Python, and that Python subtypes are free to break up the call chain in whatever way they like. Stefan

Antoine Pitrou

6:30 a.m.

On Thu, 08 Aug 2013 06:08:55 +0200 Stefan Behnel wrote:

...

Stefan Behnel, 06.08.2013 18:38:

...
Antoine Pitrou, 06.08.2013 17:49:

...
On Tue, 06 Aug 2013 17:18:59 +0200, Stefan Behnel wrote:

...
If a Python class with a __del__ inherits from an extension type that implements tp_finalize(), then whose tp_finalize() will be executed first?

Then only the Python __del__ gets called. It should call super().__del__() manually, to ensure the extension type's tp_finalize gets called.

Ok, but then all I have to do in order to disable C level finalisation for a type is to inherit from it and provide an empty __del__ method.

Oh, and if the Python subtype calls super().__del__() twice, then there is no longer a guarantee that the finalisers only get executed once, right?

The guarantee is that the *interpreter* will call __del__ once. You're free to call it many times yourself, it's just a method. (but super() itself is supposed to do the right thing, if you're using it properly) And, by the way, I'd like to stress again the parallel with __init__: tp_init can also be called several times if the user calls __init__ manually. Regards Antoine.
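
The distinction between interpreter-initiated and manual calls can be seen in a few lines of pure Python. This is a minimal sketch with a hypothetical class name, and it assumes CPython, where del on the last reference finalises the object immediately.

```python
import gc


class Tracked:
    inits = 0
    dels = 0

    def __init__(self):
        Tracked.inits += 1

    def __del__(self):
        Tracked.dels += 1


obj = Tracked()     # the interpreter calls __init__ once
obj.__init__()      # manual call: __init__ is just a method
obj.__del__()       # manual call: __del__ is just a method too
obj.__del__()
manual = (Tracked.inits, Tracked.dels)

del obj             # the interpreter now finalises the object once
gc.collect()        # only matters on implementations without reference counting
final = (Tracked.inits, Tracked.dels)
```

The manual calls do not count against the interpreter's own single finalisation call, so __del__ runs three times in total here.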
