I was trying to debug a problem in some work code and I ran into some interesting oddities with contextlib.ExitStack and other context managers in Python 3.5. This program creates a temporary directory, and I wanted to give it a --keep flag to not automatically delete the tempdir at program exit. I was using an ExitStack to manage a bunch of resources, and the temporary directory is the first thing pushed into the ExitStack. At that point in the program, I check the value of --keep and if it's set, I use ExitStack.pop_all() to clear the stack, and thus, presumably, prevent the temporary directory from being deleted. There's this relevant quote in the contextlib documentation: """ Each instance [of an ExitStack] maintains a stack of registered callbacks that are called in reverse order when the instance is closed (either explicitly or implicitly at the end of a with statement). Note that callbacks are not invoked implicitly when the context stack instance is garbage collected. """ However if I didn't save the reference to the pop_all'd ExitStack, the tempdir would be immediately deleted. If I did save a reference to the pop_all'd ExitStack, the tempdir would live until the saved reference went out of scope and got refcounted away. As best I can tell this happens because TemporaryDirectory.__init__() creates a weakref finalizer which ends up calling the _cleanup() function. Although it's rather difficult to trace, it does appear that when the ExitStack is gc'd, this finalizer gets triggered (via weakref.detach()), thus causing the _cleanup() method to be called and the tmpdir to get deleted. I "fix" this by doing: def __init__(self): tmpdir = TemporaryDirectory() self._tmpdir = (tmpdir.name if keep else self.resources.enter_context(tmpdir)) There must be more to the story because when __init__() exits in the --keep case, tmpdir should have gotten refcounted away and the directory deleted, but it doesn't. I haven't dug down deep enough to figure that out. Now, while I was debugging that behavior, I ran across more interesting bits. I put this in a file to drive some tests: ------snip snip----- with ExitStack() as resources: print('enter context') tmpdir = resources.enter_context(X()) resources.pop_all() print('exit context') ------snip snip----- Let's say X is: class X: def __enter__(self): print('enter Foo') return self def __exit__(self, *args, **kws): print('exit Foo') return False the output is: enter context enter Foo exit context So far so good. A fairly standard context manager class doesn't get its __exit__() called even when the program exits. Let's try this: @contextmanager def X(): print('enter bar') yield print('exit bar') still good: enter context enter bar exit context Let's modify X a little bit to be a more common idiom: @contextmanager def X(): print('enter foo') try: yield finally: print('exit foo') enter context enter foo exit foo exit context Ah, the try-finally changes the behavior! There's probably some documentation somewhere that defines how a generator gets finalized, and that triggers the finally clause, whereas in the previous example, nothing after the yield gets run. I just can't find that anything that would describe the observed behavior. It's all very twisty, and I'm not sure Python is doing anything wrong, but I'm also not sure it's *not* doing anything wrong. ;) In any case, the contextlib documentation quoted above should probably be more liberally sprinkled with salty caveats. Just calling .pop_all() isn't necessarily enough to ensure that resources managed by an ExitStack will survive its garbage collection. Cheers, -Barry
On 18 July 2016 at 23:40, Barry Warsaw <barry@python.org> wrote:
I was trying to debug a problem in some work code and I ran into some interesting oddities with contextlib.ExitStack and other context managers in Python 3.5.
This program creates a temporary directory, and I wanted to give it a --keep flag to not automatically delete the tempdir at program exit. I was using an ExitStack to manage a bunch of resources, and the temporary directory is the first thing pushed into the ExitStack. At that point in the program, I check the value of --keep and if it's set, I use ExitStack.pop_all() to clear the stack, and thus, presumably, prevent the temporary directory from being deleted.
There's this relevant quote in the contextlib documentation:
""" Each instance [of an ExitStack] maintains a stack of registered callbacks that are called in reverse order when the instance is closed (either explicitly or implicitly at the end of a with statement). Note that callbacks are not invoked implicitly when the context stack instance is garbage collected. """
However if I didn't save the reference to the pop_all'd ExitStack, the tempdir would be immediately deleted. If I did save a reference to the pop_all'd ExitStack, the tempdir would live until the saved reference went out of scope and got refcounted away.
As best I can tell this happens because TemporaryDirectory.__init__() creates a weakref finalizer which ends up calling the _cleanup() function. Although it's rather difficult to trace, it does appear that when the ExitStack is gc'd, this finalizer gets triggered (via weakref.detach()), thus causing the _cleanup() method to be called and the tmpdir to get deleted.
This seems to be missing from the documentation, but when you delete a TemporaryDirectory instance without using a context manager or cleanup(), it complains, and then cleans up anyway:
d = TemporaryDirectory() del d /usr/lib/python3.5/tempfile.py:797: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpfzf4g96l'> _warnings.warn(warn_message, ResourceWarning)
Perhaps you will have to use the lower-level mkdtemp() function instead if you want the option of making the “temporary” directory long-lived.
I "fix" this by doing:
def __init__(self): tmpdir = TemporaryDirectory() self._tmpdir = (tmpdir.name if keep else self.resources.enter_context(tmpdir))
There must be more to the story because when __init__() exits in the --keep case, tmpdir should have gotten refcounted away and the directory deleted, but it doesn't. I haven't dug down deep enough to figure that out.
Now, while I was debugging that behavior, I ran across more interesting bits. I put this in a file to drive some tests:
------snip snip----- with ExitStack() as resources: print('enter context') tmpdir = resources.enter_context(X()) resources.pop_all() print('exit context') ------snip snip-----
Let's say X is:
class X: def __enter__(self): print('enter Foo') return self
def __exit__(self, *args, **kws): print('exit Foo') return False
the output is:
enter context enter Foo exit context
So far so good. A fairly standard context manager class doesn't get its __exit__() called even when the program exits. Let's try this:
@contextmanager def X(): print('enter bar') yield print('exit bar')
still good:
enter context enter bar exit context
Let's modify X a little bit to be a more common idiom:
@contextmanager def X(): print('enter foo') try: yield finally: print('exit foo')
enter context enter foo exit foo exit context
Ah, the try-finally changes the behavior! There's probably some documentation somewhere that defines how a generator gets finalized, and that triggers the finally clause, whereas in the previous example, nothing after the yield gets run. I just can't find that anything that would describe the observed behavior.
I suspect the documentation doesn’t spell everything out, but my understanding is that garbage collection of a generator instance effectively calls its close() method, triggering any “finally” and __exit__() handlers. IMO, in some cases if a generator would execute these handlers and gets garbage collected, it is a programming error because the programmer should have explicitly called generator.close(). In these cases, it would be nice to emit a ResourceWarning, just like forgetting to close a file, or delete your temporay directory above. But maybe there are other cases where there is no valid reason to emit a warning. I have hesitated in suggesting this change in the past, but I don’t remember why. One reason is it might be an annoyance with code that also wants to handle non-generator iterators that don’t have a close() method. In a world before “yield from”, imagine having to change: # Rough equivalent of “yield from g(...)” for item in g(...): yield item to: with contextlib.closing(g(...)) as subgen: for item in subgen: yield item It would be even worse if you had to support g() returning some other iterator not supporting close().
It's all very twisty, and I'm not sure Python is doing anything wrong, but I'm also not sure it's *not* doing anything wrong. ;)
In any case, the contextlib documentation quoted above should probably be more liberally sprinkled with salty caveats. Just calling .pop_all() isn't necessarily enough to ensure that resources managed by an ExitStack will survive its garbage collection.
On 19 July 2016 at 09:40, Barry Warsaw <barry@python.org> wrote:
Ah, the try-finally changes the behavior! There's probably some documentation somewhere that defines how a generator gets finalized, and that triggers the finally clause, whereas in the previous example, nothing after the yield gets run. I just can't find that anything that would describe the observed behavior.
For the generator case, their __del__ calls self.close(), and that throws a GeneratorExit exception into the current yield point. Since it's an exception, that will run try/finally clauses and context manager __exit__ methods, but otherwise bypass code after the yield statement.
It's all very twisty, and I'm not sure Python is doing anything wrong, but I'm also not sure it's *not* doing anything wrong. ;)
I think we're a bit inconsistent in how we treat the lazy cleanup for managed resources - sometimes __del__ handles it, sometimes we register a weakref finalizer, sometimes we don't do it at all. That makes it hard to predict precise behaviour without knowing the semantic details of the specific context managers involved in the stack.
In any case, the contextlib documentation quoted above should probably be more liberally sprinkled with salty caveats. Just calling .pop_all() isn't necessarily enough to ensure that resources managed by an ExitStack will survive its garbage collection.
Aye, I'd be open to changes that clarified that even though the ExitStack instance won't invoke any cleanup callbacks explicitly following a pop_all(), *implicit* cleanup from its references to those objects going away may still be triggered if you don't save the result of the pop_all() call somewhere. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (3)
-
Barry Warsaw
-
Martin Panter
-
Nick Coghlan