Asynchronous exception handling around with/try statement borders

Hi folks,

I normally wouldn't bring something like this up here, except I think there is the possibility of something to be done--a language documentation clarification if nothing else, though possibly an actual code change as well.

I've been having an argument with a colleague over the last couple of days about the proper order of statements when setting up a try/finally to perform cleanup of some action. On some level we're both being stubborn, I think, and I'm not looking for a resolution as to who's right/wrong or I wouldn't bring it to this list in the first place. The original argument was over setting and later restoring os.environ, but we ended up arguing over threading.Lock.acquire/release, which I think is a more interesting example of the problem, and he did raise a good point that I do want to bring up.

My colleague's contention is that, given

    lock = threading.Lock()

this is simply *wrong*:

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

whereas this is okay:

    with lock:
        do_something()

Ignoring other details of how threading.Lock is actually implemented, and assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls release(), then as far as I've known ever since Python 2.5 first came out these two examples are semantically *equivalent*, and I can't find any way of reading PEP 343 or the Python language reference that would suggest otherwise.

However, there *is* a difference, and it has to do with how signals are handled, particularly w.r.t. context managers implemented in C (hence we are talking about CPython specifically): if Lock.__enter__ is a pure Python method (even if it calls some C methods), and a SIGINT is handled during execution of that method, then in almost all cases a KeyboardInterrupt exception will be raised from within Lock.__enter__--this means the suite under the with: statement is never evaluated, and Lock.__exit__ is never called.
You can be fairly sure the KeyboardInterrupt will be raised from somewhere within a pure Python Lock.__enter__ because there will usually be at least one remaining opcode to be evaluated, such as RETURN_VALUE. Because of how delayed execution of signal handlers is implemented in the ceval main loop, the signal handler for SIGINT will be called *before* RETURN_VALUE, resulting in the KeyboardInterrupt exception being raised. Standard stuff.

However, if Lock.__enter__ is a PyCFunction things are quite different. If you look at how the SETUP_WITH opcode is implemented, it first calls the __enter__ method with _PyObject_CallNoArg. If this returns NULL (i.e. an exception occurred in __enter__) then "goto error" is executed and the exception is raised. However, if it returns non-NULL, the finally block is set up with PyFrame_BlockSetup and execution proceeds to the next opcode. At this point a potentially pending SIGINT is handled, resulting in KeyboardInterrupt being raised while inside the with statement's suite, so the finally block, and hence Lock.__exit__, is entered.

Long story short: because Lock.__enter__ is a C function, assuming that it succeeds normally,

    with lock:
        do_something()

always guarantees that Lock.__exit__ will be called if a SIGINT was handled inside Lock.__enter__, whereas with

    lock.acquire()
    try:
        ...
    finally:
        lock.release()

there is at least a small possibility that the SIGINT handler is called after the CALL_FUNCTION op but before the try/finally block is entered (e.g. before executing POP_TOP or SETUP_FINALLY). So the end result is that the lock is held and never released after the KeyboardInterrupt (whether or not it's handled somehow). Whereas, again, if Lock.__enter__ is a pure Python function there's less likely to be any difference (though I don't think the possibility can be ruled out entirely).
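For illustration, the opcode-level difference described above can be inspected from Python itself with the dis module (a sketch only; exact opcode names vary across CPython versions -- for example CALL_FUNCTION became CALL, and SETUP_FINALLY disappeared with 3.11's zero-cost exceptions):

```python
import dis

def acquire_then_try(lock):
    lock.acquire()   # a signal handler may run after this call returns...
    try:             # ...but before the protected region is entered
        pass
    finally:
        lock.release()

def with_statement(lock):
    with lock:       # __enter__ and the setup of the cleanup handler
        pass         # happen inside a single opcode's implementation

# Print both opcode streams to compare where the unprotected window lies:
# in the first function there is at least one opcode boundary between the
# acquire() call and the start of the try block, and signals are only
# handled between opcodes.
dis.dis(acquire_then_try)
dis.dis(with_statement)
```

Comparing the two disassemblies makes the window between the call opcode and the try-setup opcode visible in the first function, while the with-statement version has no such gap after __enter__ succeeds.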
At the very least I think this quirk of CPython should be mentioned somewhere (since in all other cases the semantic meaning of the "with:" statement is clear). However, I think it might be possible to gain more consistency between these cases if pending signals are checked/handled after any direct call to PyCFunction from within the ceval loop. Sorry for the tl;dr; any thoughts?

On 28 June 2017 at 21:40, Erik Bray <erik.m.bray@gmail.com> wrote:
Technically both are slightly racy with respect to async signals (e.g. KeyboardInterrupt), but the with statement form is less exposed to the problem (since it does more of its work in single opcodes). Nathaniel Smith posted a good write-up of the technical details to the issue tracker based on his work with trio: https://bugs.python.org/issue29988 Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Jun 28, 2017 at 2:26 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Interesting; thanks for pointing this out. Part of me felt like this has to have come up before but my searching didn't bring this up somehow (and even then it's only a couple months old itself). I didn't think about the possible race condition before WITH_CLEANUP_START, but obviously that's a possibility as well. Anyways since this is already acknowledged as a real bug I guess any further followup can happen on the issue tracker. Thanks, Erik

On Wed, Jun 28, 2017 at 3:09 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
On second thought, maybe there is a case to be made w.r.t. making a documentation change about the semantics of the `with` statement: the old-style syntax cannot make any guarantees about atomicity w.r.t. async events. That is, there's no way syntactically in Python to declare that no exception will be raised between "lock.acquire()" and the setup of the "try/finally" blocks. However, if issue 29988 were *fixed* somehow (and I'm not convinced it can't be fixed in the limited case of `with` statements) then there really would be a major semantic difference of the `with` statement, in that it does support this invariant. Then the question is whether that difference should be made a requirement of the language (probably too onerous a requirement?), or just a feature of CPython (which should still be documented one way or the other, IMO). Erik

Erik Bray wrote:
Seems to me this is the behaviour you *want* in this case, otherwise the lock can be acquired and never released. It's disconcerting that it seems to be very difficult to get that behaviour with a pure Python implementation.
IMO that would be going in the wrong direction by making the C case just as broken as the Python case. Instead, I would ask what needs to be done to make this work correctly in the Python case as well as the C case. I don't think it's even possible to write Python code that does this correctly at the moment. What's needed is a way to temporarily mask delivery of asynchronous exceptions for a region of code, but unless I've missed something, no such facility is currently provided.

What would such a facility look like? One possibility would be to model it on the sigsetmask() system call, so there would be a function such as

    mask_async_signals(bool)

that turns delivery of async signals on or off. However, I don't think that would work. To fix the locking case, what we need to do is mask async signals during the locking operation, and only unmask them once the lock has been acquired. We might write a context manager with an __enter__ method like this:

    def __enter__(self):
        mask_async_signals(True)
        try:
            self.acquire()
        finally:
            mask_async_signals(False)

But then we have the same problem again -- if a KeyboardInterrupt occurs after mask_async_signals(False) but before __enter__ returns, the lock won't get released.

Another approach would be to provide a context manager such as

    async_signals_masked(bool)

Then the whole locking operation could be written as

    with async_signals_masked(True):
        lock.acquire()
        try:
            with async_signals_masked(False):
                # do stuff here
                ...
        finally:
            lock.release()

Now there's no possibility for a KeyboardInterrupt to be delivered until we're safely inside the body, but we've lost the ability to capture the pattern in the form of a context manager.

The only way out of this I can think of at the moment is to make the above pattern part of the context manager protocol itself. In other words, async exceptions are always masked while the __enter__ and __exit__ methods are executing, and unmasked while the body is executing. -- Greg
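For what it's worth, on POSIX systems something in the spirit of the hypothetical async_signals_masked() can already be approximated with signal.pthread_sigmask, which blocks OS-level delivery of a signal so that it stays pending until unblocked. This is only a sketch: it is Unix-only, it masks the signal rather than deferring the Python-level handler, and it does nothing about the window inside __enter__/__exit__ that is the real subject here:

```python
import signal
from contextlib import contextmanager

@contextmanager
def sigint_masked():
    # Block OS delivery of SIGINT; a Ctrl-C arriving inside the block
    # stays pending and is delivered once the old mask is restored.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        yield
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Querying the current mask (an empty SIG_BLOCK call returns it) shows SIGINT blocked inside the block and the previous mask restored afterwards.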

On Wed, Jun 28, 2017 at 3:19 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think normally you're right--this is the behavior you would *want*, but not the behavior that's consistent with how Python implements the `with` statement, all else being equal. Though it's still not entirely fair either because if Lock.__enter__ were pure Python somehow, it's possible the exception would be raised either before or after the lock is actually marked as "acquired", whereas in the C implementation acquisition of the lock will always succeed (assuming the lock was free, and no other exceptional conditions) before the signal handler is executed.
You have a point there, but at the same time the Python case, while "broken" insofar as it can lead to broken code, seems correct from the Pythonic perspective. The other possibility would be to actually change the semantics of the `with` statement. Or as you mention below, a way to temporarily mask signals...
Exactly.
I think so too. That's more or less in line with Nick's idea on njs's issue (https://bugs.python.org/issue29988) of an ATOMIC_UNTIL opcode. That's just one implementation possibility. My question would be whether to make that a language-level requirement of the context manager protocol, or just something CPython does... Thanks, Erik

Erik Bray wrote:
I think it should be a language-level requirement, otherwise it's not much use. Note that it's different from some existing CPython-only behaviour such as refcounting, because it's possible to code around those things on other implementations that don't provide the same guarantees, but here there's *no* way to code around it. At the very least, it should be a documented guarantee in CPython, not just something left "up to the implementation". -- Greg

On Wed, Jun 28, 2017 at 6:19 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I agree :-)
It's *almost* possible in some cases, by installing a specialized signal handler which introspects the stack to see if we're in one of these delicate regions of code. See: https://vorpus.org/blog/control-c-handling-in-python-and-trio/#how-trio-hand...

The "almost" is because currently I have a function decorator that marks certain functions as needing protection against async signals, which works by injecting a magic local variable into the function when it starts. The problem is that this can't be done atomically, so if you have an __exit__ method like:

    def __exit__(self, *args):
        # XX
        _this_function_is_protected = True
        self.release()

then it's possible for a signal to arrive at the point marked "XX", and then your lock never gets released. One solution would be: https://bugs.python.org/issue12857

I've also been considering gross things like keeping a global WeakSet of the code objects for all functions that have been registered for async signal protection.

However, trio's solution looks a bit different from what you'd need for a general Python program, because the general strategy is that if a signal arrives at a bad moment, we don't delay it (we can't!); instead, we make a note to deliver it later. For an async framework like trio this is fine, and in fact we need the "deliver the signal later" facility anyway, because we need to handle the case where a signal arrives while the event loop is polling for I/O and there isn't any active task to deliver the signal to. For a generic solution in the interpreter, I agree that it'd probably make more sense to have a way to delay running the signal handler until an appropriate moment.
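The magic-local-variable trick can be prototyped in pure Python: the signal handler walks the call stack looking for the marker before deciding whether to raise. A rough sketch (hypothetical names; trio's actual implementation differs and handles many more cases):

```python
import sys

MARKER = "_this_function_is_protected"

def in_protected_region():
    # Walk up the call stack; the first frame that defines the magic
    # local decides whether we are currently protected.  A real SIGINT
    # handler would use this check to defer raising KeyboardInterrupt.
    frame = sys._getframe(1)
    while frame is not None:
        if MARKER in frame.f_locals:
            return bool(frame.f_locals[MARKER])
        frame = frame.f_back
    return False

def protected_release():
    _this_function_is_protected = True  # the injected marker itself
    return in_protected_region()

def unprotected_release():
    return in_protected_region()
```

The race described above is visible here too: the marker assignment is an ordinary opcode, so a signal handled just before it runs sees the function as unprotected.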
If async_signals_masked is implemented in C and can be used as a decorator, then you could do:

    class MyLock:
        @async_signals_masked(True)
        def __enter__(self):
            ...

        @async_signals_masked(True)
        def __exit__(self, *exc_info):
            ...

However, there's still a problem: in async code, you can yield out of an async-signals-masked section, either because you have an async context manager:

    @async_signals_masked(True)
    async def __aexit__(self, *exc_info):
        ...

or because you're using the context manager directly to protect some delicate code:

    async def acquire(self):
        # Make sure KeyboardInterrupt can't leave us with a half-taken lock
        with async_signals_masked(True):
            if self._already_held:
                await self._lock_released_event
            ...

So what should this async_signals_masked state do when we yield out from under it? If it's a thread-local, then the masking state will "leak" into other async function callstacks (or similarly for regular generators), which is pretty bad. But it can't be just frame-local data either, because we want the masking state to be inherited by any subroutines we call from inside the masked block. This is why trio uses the stack-walking trick: it means that when you use 'yield' to switch callstacks, the async signal masking state gets switched too, automatically and atomically.

So maybe a better way would be to do something more like what trio does. For example, we could have a flag on a function frame that says "this frame (and the code in it) should not be interrupted", and then in the bytecode loop when a signal arrives, walk up the call stack to see if any of these flags are set before running the Python-level signal handler. There's some efficiency and complexity questions here, but maybe it's not too bad (signals are only received rarely, and maybe there are some tricks to reduce the overhead).
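The "flag on the frame" idea can't be prototyped directly in pure Python (frame objects don't accept new attributes), but the WeakSet-of-code-objects variant mentioned earlier can be, and it behaves the same way under stack walking. A sketch with hypothetical helper names:

```python
import sys
import weakref

# Registry of code objects whose frames must not be interrupted.
_protected_codes = weakref.WeakSet()

def protected(func):
    # Decorator: register the function's code object for protection.
    # Registration happens once, at definition time, so there is no
    # per-call window in which the marker is missing.
    _protected_codes.add(func.__code__)
    return func

def stack_is_protected():
    # What a signal handler would check before raising KeyboardInterrupt:
    # is any frame on the current stack running protected code?
    frame = sys._getframe(1)
    while frame is not None:
        if frame.f_code in _protected_codes:
            return True
        frame = frame.f_back
    return False

@protected
def release():
    return stack_is_protected()
```

Unlike the injected-local approach, the protection here covers the whole function from its first opcode, since the code object is registered before the function can ever run.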
This would make me nervous, because context managers are used for all kinds of things, and only some of them involve delicate resource manipulation. Masking async exceptions is a trade-off: if you do it at the wrong place, then you can end up with a program that refuses to respond to control-C, which is pretty frustrating. There are also some rather nasty cases, like I think Popen.__exit__ might block waiting for SIGCHLD to be delivered? And anyway, you still have to solve the problem of how you communicate this state to subroutines called by __(a)enter__ and __(a)exit__, but not let it leak when you yield. Once you solve that I think you have 95% of the machinery you need to make this user-controllable. -n -- Nathaniel J. Smith -- https://vorpus.org

Nathaniel Smith wrote:
That should be easy enough, shouldn't it? When entering a new frame, copy the mask state from the calling frame.
That would make it impossible to temporarily unmask async signals in a region where they're masked. An example of a situation where you might want to do that is in an implementation of lock.acquire(). If the thread needs to block while waiting for the lock to become available, you probably want to allow ctrl-C to interrupt the thread while it's blocked.
The next step I had in mind was to extend the context manager protocol so that the context manager can indicate whether it wants async signals masked, so it would only happen for things like lock that really need it. -- Greg

On Wed, Jun 28, 2017 at 9:14 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Right, that approach would be semantically equivalent to the walking-the-call-stack approach, just it shifts some of the cost around (it makes signals cheaper by making each function call a tiny bit more expensive).
So trio actually does allow this kind of nesting -- for any given frame the special flag can be unset, set to True, or set to False, and the signal handler walks up until it finds the first frame where it's set and uses that. So I guess the equivalent would be two flags, or a little enum field in the frame object, or something like that.
A magic (implemented in C) decorator like @async_signals_masked I think would be the simplest way to do this extension. (Or maybe better call it @no_signal_delivery, because it would have to block all running of signal handlers; the interpreter doesn't know whether a signal handler will raise an exception until it calls it.) -n -- Nathaniel J. Smith -- https://vorpus.org

Nathaniel Smith wrote:
A magic (implemented in C) decorator like @async_signals_masked I think would be the simplest way to do this extension.
I don't have a good feeling about that approach. While implementing the decorator in C might be good enough in CPython to ensure no window of opportunity exists to leak a signal, that might not be true in other Python implementations. -- Greg

On 2017-06-28 07:40, Erik Bray wrote:
Hi folks,
Since the java.lang.Thread.stop() "debacle", it has been obvious that stopping code to run other code is dangerous. KeyboardInterrupt (any interrupt, really) is dangerous. Now, we can probably code a solution, but how about we remove the danger: I suggest we remove interrupts from Python, and make them act more like java.lang.Thread.interrupt(), setting a thread-local bit to indicate an interrupt has occurred. Then we can write explicit code to check for that bit, and raise an exception in a safe place if we wish. This can be done with Python code, or at convenient places in Python's C source itself. I imagine it would be easier to whitelist where interrupts can raise exceptions, rather than blacklisting where they should not.

In the meantime, my solution is to spawn new threads to do the work, while the main thread's sole purpose is to sleep and set the "please stop" flag upon interrupt.
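The workaround in the last paragraph is essentially cooperative cancellation; a minimal sketch using threading.Event (assumed names; real code would set the event from a KeyboardInterrupt handler in the main thread):

```python
import threading

please_stop = threading.Event()

def worker(results):
    # Do the work in a thread, polling the flag at safe points instead
    # of being interrupted asynchronously between arbitrary opcodes.
    for i in range(1_000_000):
        if please_stop.is_set():
            results.append("stopped cleanly at step %d" % i)
            return
        # ... do one unit of work ...

results = []
t = threading.Thread(target=worker, args=(results,))
please_stop.set()      # simulate the main thread catching Ctrl-C
t.start()
t.join()
```

Because the worker only checks the flag between units of work, cleanup always happens at a point the programmer chose, which is exactly the property the async-interrupt model can't guarantee.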

On Fri, Sep 21, 2018 at 8:52 AM Kyle Lahnakoski <klahnakoski@mozilla.com> wrote:
The time machine strikes again! https://docs.python.org/3/c-api/exceptions.html#signal-handling ChrisA

On Fri, Sep 21, 2018 at 12:58 AM Chris Angelico <rosuav@gmail.com> wrote:
Although my original post did not explicitly mention PyErr_CheckSignals() and friends, it had already taken that into account, and it is not a silver bullet, at least w.r.t. the exact issue I raised, which had to do with the behavior of context managers versus the

    setup()
    try:
        do_thing()
    finally:
        cleanup()

pattern, and the question of how signals are handled between Python interpreter opcodes. There is a still-open bug on the issue tracker discussing the exact issue in greater detail: https://bugs.python.org/issue29988

On Tue, Sep 25, 2018 at 1:10 AM Erik Bray <erik.m.bray@gmail.com> wrote:
To be fair, your post not only didn't mention CheckSignals, but it almost perfectly described its behaviour. So I stand by my response. :) I don't think the system needs to be replaced; it ought to be possible to resolve the context manager issue without tossing out the existing code. ChrisA

On Thu, Sep 20, 2018, 3:52 PM Kyle Lahnakoski <klahnakoski@mozilla.com> wrote:
KeyboardInterrupt (any interrupt really) is dangerous. Now, we can probably code a solution, but how about we remove the danger
The other day I accidentally fork-bombed myself with Python os.fork in an infinite loop. Whoops. It seems to me that Python's design philosophy is to make the safe things beautiful and efficient, but not to remove the dangerous things. I'd be supportive of a proposal that makes threading safer without removing capabilities for those that want them.

participants (7): Chris Angelico, Erik Bray, Greg Ewing, Kyle Lahnakoski, Michael Selik, Nathaniel Smith, Nick Coghlan