Mailman 3 Handle errors in cleanup code - Python-Dev

Handle errors in cleanup code

Serhiy Storchaka

June 12, 2017

7:10 a.m.

There is an idiomatic Python code: do_something() try: do_something_other() except: undo_something() raise If an error is raised when execute do_something_other(), then we should first restore the state that was before calling do_something(), and then reraise the exception. It is important that the bare "except" (or "except BaseException") is used, because undo_something() should be executed on any exception. And this is one of few cases where using the bare "except" is correct. But if an error is raised when execute undo_something(), it replaces the original exception which become chaining as the __context__ attribute. The problem is that this can change the type of the exception. If do_something_other() raises SystemExit and undo_something() raises KeyError, the final exception has type KeyError. Yury in the comment for PR 2108 [1] suggested more complicated code: do_something() try: do_something_other() except BaseException as ex: try: undo_something() finally: raise ex Does it mean that we should rewrite every chunk of code similar to the above? And if such cumbersome code is necessary and become common, maybe it needs syntax support in the language? Or changing the semantic of exceptions raised in error handlers and finally blocks? [1] https://github.com/python/cpython/pull/2108

Show replies by date

Martin (gzlist)

June 2017

7:31 a.m.

On 12/06/2017, Serhiy Storchaka <storchaka@gmail.com> wrote:

...

But if an error is raised when execute undo_something(), it replaces the original exception which become chaining as the __context__ attribute. The problem is that this can change the type of the exception. If do_something_other() raises SystemExit and undo_something() raises KeyError, the final exception has type KeyError.

For Python 2.7, I've used the following idiom, which always masks errors from the undo: do_something() try: do_something_other() except: try: undo_something() finally: raise Unfortunately, that breaks down on Python 3, as the new exception propogates with the original as context..

...

Does it mean that we should rewrite every chunk of code similar to the above? And if such cumbersome code is necessary and become common, maybe it needs syntax support in the language? Or changing the semantic of exceptions raised in error handlers and finally blocks?

What I want is a way to chain exceptions in the other direction, raising the original error, but attaching a related one. Unfortunately neither __cause__ or __context__ really help. Martin

Nick Coghlan

8:07 a.m.

On 12 June 2017 at 17:31, Martin (gzlist) via Python-Dev <python-dev@python.org> wrote:

...

On 12/06/2017, Serhiy Storchaka <storchaka@gmail.com> wrote:

...
But if an error is raised when execute undo_something(), it replaces the original exception which become chaining as the __context__ attribute. The problem is that this can change the type of the exception. If do_something_other() raises SystemExit and undo_something() raises KeyError, the final exception has type KeyError.

For Python 2.7, I've used the following idiom, which always masks errors from the undo:

do_something() try: do_something_other() except: try: undo_something() finally: raise

Unfortunately, that breaks down on Python 3, as the new exception propogates with the original as context..

Relevant tracker issue for that problem: https://bugs.python.org/issue18861

...

...
Does it mean that we should rewrite every chunk of code similar to the above? And if such cumbersome code is necessary and become common, maybe it needs syntax support in the language? Or changing the semantic of exceptions raised in error handlers and finally blocks?

What I want is a way to chain exceptions in the other direction, raising the original error, but attaching a related one. Unfortunately neither __cause__ or __context__ really help.

Aye, agreed. The key challenge we have is that we're trying to represent the exception state as a linked list, when what we really have once we start taking cleanup errors into account is an exception *tree*. The thing that finally clicked for me in this thread is that if we add contextlib.ResourceSet, and have it add a new attribute to the original exception (e.g. "__cleanup_errors__") with a list of exceptions raised while attempt to cleanup after the earlier exception, then that lets us actually start modelling that tree at runtime. Once we understand the *behaviour* we want, *then* we can consider whether we might want to add a context manager to have any raised exceptions be attached to the currently active exception as cleanup errors rather than as new exceptions in their own right. For example: do_something() try: do_something_other() except BaseException as exc: with contextlib.cleanup(exc) as reraise: # Exceptions raised in here would be added to # exc.__cleanup_errors__, rather than being # propagated normally undo_something() reraise() The need for the "as reraise:"/"reraise()" trick arises from the need to special case the situation where the original exception inherits from Exception, but one of the raised exceptions *doesn't* - we want SystemExit/KeyboardInterrupt/etc to take precedence in that case, and a bare raise statement won't handle that for us (it *could* in a hypothetical future version of Python that's natively `__cleanup_errors__` aware, but that's not going to be useful for existing versions). Since I don't see anything in the discussion so far that *requires* changes to the standard library (aside from "we may want to use this ourselves"), the right place to thrash out the design details is probably contextlib2: https://github.com/jazzband/contextlib2 That's where contextlib.ExitStack was born, and I prefer using it to iterate on context management design concepts, since we can push updates out faster, and if we make bad choices anywhere along the way, they can just sit around in contextlib2, rather than polluting the standard library indefinitely. Cheers, Nick. P.S. trio's MultiError is also likely worth looking into in this context -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Brett Cannon

5:02 p.m.

On Mon, 12 Jun 2017 at 01:08 Nick Coghlan <ncoghlan@gmail.com> wrote:

...

On 12 June 2017 at 17:31, Martin (gzlist) via Python-Dev <python-dev@python.org> wrote:

...
On 12/06/2017, Serhiy Storchaka <storchaka@gmail.com> wrote:

...
But if an error is raised when execute undo_something(), it replaces the original exception which become chaining as the __context__ attribute. The problem is that this can change the type of the exception. If do_something_other() raises SystemExit and undo_something() raises KeyError, the final exception has type KeyError.

For Python 2.7, I've used the following idiom, which always masks errors from the undo:

do_something() try: do_something_other() except: try: undo_something() finally: raise

Unfortunately, that breaks down on Python 3, as the new exception propogates with the original as context..

Relevant tracker issue for that problem: https://bugs.python.org/issue18861

...
...
Does it mean that we should rewrite every chunk of code similar to the above? And if such cumbersome code is necessary and become common, maybe it needs syntax support in the language? Or changing the semantic of exceptions raised in error handlers and finally blocks?

What I want is a way to chain exceptions in the other direction, raising the original error, but attaching a related one. Unfortunately neither __cause__ or __context__ really help.

Aye, agreed. The key challenge we have is that we're trying to represent the exception state as a linked list, when what we really have once we start taking cleanup errors into account is an exception *tree*.

The thing that finally clicked for me in this thread is that if we add contextlib.ResourceSet, and have it add a new attribute to the original exception (e.g. "__cleanup_errors__") with a list of exceptions raised while attempt to cleanup after the earlier exception, then that lets us actually start modelling that tree at runtime.

You might want to go back and read Jake's LWN article for this year's language summit as Nathaniel Smith said he wanted some sort of multi-exception type which would go along with this idea. -Brett

...

Once we understand the *behaviour* we want, *then* we can consider whether we might want to add a context manager to have any raised exceptions be attached to the currently active exception as cleanup errors rather than as new exceptions in their own right.

For example:

do_something() try: do_something_other() except BaseException as exc: with contextlib.cleanup(exc) as reraise: # Exceptions raised in here would be added to # exc.__cleanup_errors__, rather than being # propagated normally undo_something() reraise()

The need for the "as reraise:"/"reraise()" trick arises from the need to special case the situation where the original exception inherits from Exception, but one of the raised exceptions *doesn't* - we want SystemExit/KeyboardInterrupt/etc to take precedence in that case, and a bare raise statement won't handle that for us (it *could* in a hypothetical future version of Python that's natively `__cleanup_errors__` aware, but that's not going to be useful for existing versions).

Since I don't see anything in the discussion so far that *requires* changes to the standard library (aside from "we may want to use this ourselves"), the right place to thrash out the design details is probably contextlib2: https://github.com/jazzband/contextlib2

That's where contextlib.ExitStack was born, and I prefer using it to iterate on context management design concepts, since we can push updates out faster, and if we make bad choices anywhere along the way, they can just sit around in contextlib2, rather than polluting the standard library indefinitely.

Cheers, Nick.

P.S. trio's MultiError is also likely worth looking into in this context

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org

Nathaniel Smith

4:10 a.m.

On Mon, Jun 12, 2017 at 1:07 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

Aye, agreed. The key challenge we have is that we're trying to represent the exception state as a linked list, when what we really have once we start taking cleanup errors into account is an exception *tree*. [...] P.S. trio's MultiError is also likely worth looking into in this context

Huh, yeah, this is some interesting convergent evolution. trio.MultiError is exactly a way of representing a tree of exceptions, though it's designed to do that for "live" exceptions rather than just context chaining. Briefly... the motivation here is that if you have multiple concurrent call-stacks (threads/tasks/whatever-your-abstraction-is-called) running at the same time, then it means you can literally have multiple uncaught exceptions propagating out at the same time, so you need some strategy for dealing with that. One popular option is to avoid the problem by throwing away exceptions that propagate "too far" (e.g., in the stdlib threading module, if an exception hits the top of the call stack of a non-main thread, then it gets printed to stderr and then the program continues normally). Trio takes a different approach: its tasks are arranged in a tree, and if a child task exits with an exception then that exception gets propagated into the parent task. This allows us avoid throwing away exceptions, but it means that we need some way to represent the situation when a parent task has *multiple* live exceptions propagate into it at the same time. That's what trio.MultiError is for. Structurally, MultiError is just an Exception that holds a list of child exceptions, like MultiError([TypeError(), OSError(), KeyboardInterrupt()]) so that they can propagate together. One design decision though is that if you put a MultiError inside a MultiError, it *isn't* collapsed, so it's also legal to have something like MultiError([ TypeError(), MultiError([OSError(), KeyboardInterrupt()]), ]) Semantically, these two MultiErrors are mostly the same; they both represent a situation where there are 3 unhandled exceptions propagating together. The reason for keeping the tree structure is that if the inner MultiError propagated for a while on its own before meeting the TypeError, then it accumulated some traceback and we need somewhere to store that information. (This generally happens when the task tree has multiple levels of nesting.) The other option would be to make two copies of this part of the traceback and attach one copy onto each of the two exceptions inside it, (a) but that's potentially expensive, and (b) if we eventually print the traceback then it's much more readable if we can say "here's where OSError was raised, and where KeyboardInterrupt was raised, and here's where they traveled together" and only print the common frames once. There's some examples of how this works on pages 38-49 of my language summit slides here: https://vorpus.org/~njs/misc/trio-language-summit-2017.pdf And here's the source for the toy example programs that I used in the slides, in case anyone wants to play with them: https://gist.github.com/njsmith/634b596e5765d5ed8b819a4f8e56d306 Then the other piece of the MultiError design is catching them. This is done with a context manager called MultiError.catch, which "maps" an exception handler (represented as a callable) over all the exceptions that propagate through it, and then simplifies the MultiError tree to collapse empty and singleton nodes. Since the exceptions inside a MultiError represent independent, potentially unrelated errors, you definitely don't want to accidentally throw away that KeyboardInterrupt just because your code knows how to handle the OSError. Or if you have something like MultiError([OSError(), TypeError()]) then trio has no idea which of those exceptions might be the error you know how to catch and handle which might be the error that indicates some terrible bug that should abort the program, so neither is allowed to mask the other - you have to decide how to handle them independently. If anyone wants to dig into it the code is here: https://github.com/python-trio/trio/blob/master/trio/_core/_multierror.py Anyway, compared to the __cleanup_errors__ idea: - Both involve a collection object that holds exceptions, but in the MultiError version the collection subclasses BaseException. One consequence is that you can put the exception collection object directly into __context__ or __cause__ instead of using a new attribute. - MultiError allows for a tree structure *within* a single collection of parallel exceptions. (And then of course on top of that each individual exception within the collection can have its own chain attached.) Since the motivation for this is wanting to preserve traceback structure accumulated while the collection was propagating, and __cleanup_errors__ is only intended for "dead" expections that don't propagate, this is solving an issue that __cleanup_errors__ doesn't have. - OTOH, it's not clear to me that you *want* to always stick cleanup errors into a __context__-like attribute where they'll be mostly ignored. Forcing the developer to guess ahead of time whether it's the original error or the cleanup errors that are most important seems pretty, well, error-prone. Maybe it would be more useful to have a version of ExitStack that collects up the errors from inside the block and from cleanup handlers and then raises them all together as a MultiError. (If nothing else, this would let you avoid having to guess which exceptions are more important than others, like you mention with your reraise() idea for trying to prioritize KeyboardInterrupt over other exceptions.)

...

Since I don't see anything in the discussion so far that *requires* changes to the standard library (aside from "we may want to use this ourselves"), the right place to thrash out the design details is so probably contextlib2: https://github.com/jazzband/contextlib2

That's where contextlib.ExitStack was born, and I prefer using it to iterate on context management design concepts, since we can push updates out faster, and if we make bad choices anywhere along the way, they can just sit around in contextlib2, rather than polluting the standard library indefinitely.

I'd also be open to extracting MultiError into a standalone library that trio and contextlib2 both consume, if there was interest in going that way. -n -- Nathaniel J. Smith -- https://vorpus.org

Nick Coghlan

7:10 a.m.

On 13 June 2017 at 14:10, Nathaniel Smith <njs@pobox.com> wrote:

...

On Mon, Jun 12, 2017 at 1:07 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...
Since I don't see anything in the discussion so far that *requires* changes to the standard library (aside from "we may want to use this ourselves"), the right place to thrash out the design details is so probably contextlib2: https://github.com/jazzband/contextlib2

That's where contextlib.ExitStack was born, and I prefer using it to iterate on context management design concepts, since we can push updates out faster, and if we make bad choices anywhere along the way, they can just sit around in contextlib2, rather than polluting the standard library indefinitely.

I'd also be open to extracting MultiError into a standalone library that trio and contextlib2 both consume, if there was interest in going that way.

I think that would make sense, as it occurred to me while reading your post that a construct like MultiError may also be useful when reporting failures from concurrent.futures.wait: https://pythonhosted.org/futures/#concurrent.futures.wait At the moment, it isn't at all clear how best to throw an exception that reports *all* of the raised exceptions in a concurrent map call, rather than just the first failure. So the recurring element we have that's common across all 3 scenarios is the notion of "peer exceptions", where we're unwinding the stack for multiple reasons, and we want to keep track of all of them. The "failed resource cleanup" case then becomes an asymmetric variant of that, where there's one primary exception (the one that triggered the cleanup), and one or more secondary failures. Given MultiError, that could be modelled as a two-tiered tree: class CleanupError(MultiError): pass CleanupError([ original_exc, MultiError([*cleanup_errors]), ]) And if there was no original exception and the resource cleanup failed unprovoked, you'd indicated that by having just a single child rather than two: CleanupError([ MultiError([*cleanup_errors]), ]) Figuring out how to *display* an exception tree coherently is going to be a pain (it's already problematic with just the linked list), but if we can at least model exception trees consistently, then we'd be able to share that display logic, even if the scenarios resulting in MultiErrors varied. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Nathaniel Smith

8:43 a.m.

On Tue, Jun 13, 2017 at 12:10 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

...

reporting failures from concurrent.futures.wait: https://pythonhosted.org/futures/#concurrent.futures.wait

Yeah, and asyncio.gather is another example in the stdlib. Or there's twisted's DeferredList. Trio is unusual in effectively forcing all tasks to be run under a gather(), but the basic concept isn't unique at all.

...

Figuring out how to *display* an exception tree coherently is going to be a pain (it's already problematic with just the linked list), but if we can at least model exception trees consistently, then we'd be able to share that display logic, even if the scenarios resulting in MultiErrors varied.

It's true, it was a pain :-). And I'm sure it can be refined. But I think this is a reasonable first cut: https://github.com/python-trio/trio/blob/9e0df6159e55fe5e389ae5e24f9bbe51e9b... Basically the algorithm there is: if there's a __cause__ or __context__: print it (recursively using this algorithm) print the description line ("this exception was the direct cause" or whatever is appropriate) print the traceback and str() attached to this exception itself for each embedded exception: print "Details of embedded exception {i}:" with extra indentation: print the embedded exception (recursively using this algorithm) (+ some memoization to avoid infinite recursion on loopy structures) Of course really complicated trainwrecks that send exception shrapnel flying everywhere can still be complicated to read, but ... that's why we make the big bucks, I guess. (My fingers initially typoed that as "that's why we make the big bugs". Just throwing that out there.) Currently that code has a hard-coded assumption that the only kind of exception-container is MultiError, but I guess it wouldn't be too hard to extend that into some kind of protocol that your asymmetric CleanupError could also participate in? Like: an abstract exception container has 0 or 1 (predecessor exception, description) pairs [i.e., __cause__ or __context__], plus 0 or more (embedded exception, description) pairs, and then the printing code just walks the resulting tree using the above algorithm? If this all works out at the library level, then one can start to imagine ways that the interpreter could potentially get benefits from participating: - right now, if an exception that already has a __cause__ or __context__ is re-raised, then implicit exception chaining will overwrite the old __cause__ or __context__. Instead, it could check to see if there's already a non-None value there, and if so do exc.__context__ = MultiError([exc.__context__, new_exc]). - standardizing the protocol for walking over an exception container would let us avoid fighting over who owns sys.excepthook, and make it easier for third-party exception printers (like IPython, pytest, etc.) to play along. - MultiError.catch is a bit awkward. If we were going all-in on this then one can imagine some construct like multitry: ... do stuff ... raise MultiError([ValueError(1), MultiError([TypeError(), ValueError(2)])]) multiexcept ValueError as exc: print("caught a ValueError", exc) multiexcept TypeError: print("caught a TypeError and raising RuntimeError") raise RuntimeError which prints caught a ValueError 1 caught a TypeError and raising RuntimeError caught a ValueError 2 and then raises a RuntimeError with its __context__ pointing to the TypeError. ...but of course that's preeeeetty speculative at this point; definitely more python-ideas territory. -n -- Nathaniel J. Smith -- https://vorpus.org <http://vorpus.org>

Nick Coghlan

7:46 a.m.

On 12 June 2017 at 17:10, Serhiy Storchaka <storchaka@gmail.com> wrote:

...

Does it mean that we should rewrite every chunk of code similar to the above? And if such cumbersome code is necessary and become common, maybe it needs syntax support in the language? Or changing the semantic of exceptions raised in error handlers and finally blocks?

No, as in general there's no way for us to reliably tell the difference between intentionally changing the exception type (e.g. KeyError->AttributeError and vice-versa), unintentionally changing the exception type due to a bug in the cleanup code (e.g. SystemExit->KeyError), and the exception type changing due to external events (e.g. KeyError -> KeyboardInterrupt). So in the example given if an expected-and-allowed kind of exception can escape from "undo_something()", it's not really cleanup code, it's normal code, and needs to be refactored more like the way most close() methods work: particular known-to-be-possible exceptions get caught and reported (or silently ignored), but not propagated. That last example also shows that the suggested replacement idiom is incorrect, as it can suppress SystemExit, KeyboardInterrupt, etc. That said, as far as better builtin support for dynamic resource management goes: that would currently be contextlib.ExitStack. I'm also completely open to adding more hooks to let users tweak how that reports exceptions, including adding a contextlib.ResourceSet variant, which *doesn't* semantically nest the callback handling the way ExitStack does, and instead collects any exceptions raised into a new __cleanup_errors__ attribute on the original exception (thus leaving the original exception chain untouched). That would still need some fiddling to work out what to do in the "exception that doesn't inherit from Exception" case (probably clean up all the registered resources anyway then raise the first such exception caught, while still attaching all the others to __cleanup_errors__ on the original exception), but it's likely to be significantly more feasible than doing anything at the syntax level. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Stefan Ring

1:29 p.m.

...

Yury in the comment for PR 2108 [1] suggested more complicated code:

do_something() try: do_something_other() except BaseException as ex: try: undo_something() finally: raise ex

And this is still bad, because it loses the back trace. The way we do it is: do_something() try: do_something_other() except BaseException as ex: tb = sys.exc_info()[2] try: undo_something() finally: raise ex, None, tb

Nathaniel Smith

12:26 a.m.

On Mon, Jun 12, 2017 at 6:29 AM, Stefan Ring <stefanrin@gmail.com> wrote:

...

...
Yury in the comment for PR 2108 [1] suggested more complicated code:

do_something() try: do_something_other() except BaseException as ex: try: undo_something() finally: raise ex

And this is still bad, because it loses the back trace. The way we do it is:

do_something() try: do_something_other() except BaseException as ex: tb = sys.exc_info()[2] try: undo_something() finally: raise ex, None, tb

Are you testing on python 2? On Python 3 just plain 'raise ex' seems to give a sensible traceback for me... -n -- Nathaniel J. Smith -- https://vorpus.org

Stefan Ring

8:32 a.m.

On Tue, Jun 13, 2017 at 2:26 AM, Nathaniel Smith <njs@pobox.com> wrote:

...

On Mon, Jun 12, 2017 at 6:29 AM, Stefan Ring <stefanrin@gmail.com> wrote:

...
...
Yury in the comment for PR 2108 [1] suggested more complicated code:

do_something() try: do_something_other() except BaseException as ex: try: undo_something() finally: raise ex

And this is still bad, because it loses the back trace. The way we do it is:

do_something() try: do_something_other() except BaseException as ex: tb = sys.exc_info()[2] try: undo_something() finally: raise ex, None, tb

Are you testing on python 2? On Python 3 just plain 'raise ex' seems to give a sensible traceback for me...

Yes, on Python 2. Interesting to know that this has changed in Python 3. I'll check this out immediately.

2786

Age (days ago)

2787

Last active (days ago)

List overview

Download

10 comments

6 participants

participants (6)

Brett Cannon
Martin (gzlist)
Nathaniel Smith
Nick Coghlan
Serhiy Storchaka
Stefan Ring

Handle errors in cleanup code

tags

participants (6)