I normally wouldn't bring something like this up here, except I think
that there is possibility of something to be done--a language
documentation clarification if nothing else, though possibly an actual
code change as well.
I've been having an argument with a colleague over the last couple
days over the proper way order of statements when setting up a
try/finally to perform cleanup of some action. On some level we're
both being stubborn I think, and I'm not looking for resolution as to
who's right/wrong or I wouldn't bring it to this list in the first
place. The original argument was over setting and later restoring
os.environ, but we ended up arguing over
threading.Lock.acquire/release which I think is a more interesting
example of the problem, and he did raise a good point that I do want
to bring up.
My colleague's contention is that given
lock = threading.Lock()
this is simply *wrong*:
whereas this is okay:
Ignoring other details of how threading.Lock is actually implemented,
assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls
release() then as far as I've known ever since Python 2.5 first came
out these two examples are semantically *equivalent*, and I can't find
any way of reading PEP 343 or the Python language reference that would
However, there *is* a difference, and has to do with how signals are
handled, particularly w.r.t. context managers implemented in C (hence
we are talking CPython specifically):
If Lock.__enter__ is a pure Python method (even if it maybe calls some
C methods), and a SIGINT is handled during execution of that method,
then in almost all cases a KeyboardInterrupt exception will be raised
from within Lock.__enter__--this means the suite under the with:
statement is never evaluated, and Lock.__exit__ is never called. You
can be fairly sure the KeyboardInterrupt will be raised from somewhere
within a pure Python Lock.__enter__ because there will usually be at
least one remaining opcode to be evaluated, such as RETURN_VALUE.
Because of how delayed execution of signal handlers is implemented in
the pyeval main loop, this means the signal handler for SIGINT will be
called *before* RETURN_VALUE, resulting in the KeyboardInterrupt
exception being raised. Standard stuff.
However, if Lock.__enter__ is a PyCFunction things are quite
different. If you look at how the SETUP_WITH opcode is implemented,
it first calls the __enter__ method with _PyObjet_CallNoArg. If this
returns NULL (i.e. an exception occurred in __enter__) then "goto
error" is executed and the exception is raised. However if it returns
non-NULL the finally block is set up with PyFrame_BlockSetup and
execution proceeds to the next opcode. At this point a potentially
waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
while inside the with statement's suite, and finally block, and hence
Lock.__exit__ are entered.
Long story short, because Lock.__enter__ is a C function, assuming
that it succeeds normally then
always guarantees that Lock.__exit__ will be called if a SIGINT was
handled inside Lock.__enter__, whereas with
there is at last a small possibility that the SIGINT handler is called
after the CALL_FUNCTION op but before the try/finally block is entered
(e.g. before executing POP_TOP or SETUP_FINALLY). So the end result
is that the lock is held and never released after the
KeyboardInterrupt (whether or not it's handled somehow).
Whereas, again, if Lock.__enter__ is a pure Python function there's
less likely to be any difference (though I don't think the possibility
can be ruled out entirely).
At the very least I think this quirk of CPython should be mentioned
somewhere (since in all other cases the semantic meaning of the
"with:" statement is clear). However, I think it might be possible to
gain more consistency between these cases if pending signals are
checked/handled after any direct call to PyCFunction from within the
Sorry for the tl;dr; any thoughts?