[Python-ideas] Draft PEP on protecting finally clauses

Andrew Svetlov andrew.svetlov at gmail.com
Sat Apr 7 23:08:50 CEST 2012


I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
Thank you, Paul.

On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
> Hi,
>
> I've finally made a PEP. Any feedback is appreciated.
>
> --
> Paul
>
>
> PEP: XXX
> Title: Protecting cleanup statements from interruptions
> Version: $Revision$
> Last-Modified: $Date$
> Author: Paul Colomiets <paul at colomiets.name>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 06-Apr-2012
> Python-Version: 3.3
>
>
> Abstract
> ========
>
> This PEP proposes a way to protect Python code from being interrupted
> inside a ``finally`` clause or a context manager.
>
>
> Rationale
> =========
>
> Python has two nice ways to do cleanup. One is a ``finally`` statement
> and the other is a context manager (used via the ``with`` statement).
> However, neither of them is protected from interruption by
> ``KeyboardInterrupt`` or ``generator.throw()``. For example::
>
>    lock.acquire()
>    try:
>        print('starting')
>        do_something()
>    finally:
>        print('finished')
>        lock.release()
>
> If ``KeyboardInterrupt`` occurs just after the second ``print`` call
> is executed, the lock will not be released. Similarly, the following
> code using the ``with`` statement is affected::
>
>    from threading import Lock
>
>    class MyLock:
>
>        def __init__(self):
>            self._lock_impl = Lock()
>
>        def __enter__(self):
>            self._lock_impl.acquire()
>            print("LOCKED")
>
>        def __exit__(self, *exc_info):
>            print("UNLOCKING")
>            self._lock_impl.release()
>
>    lock = MyLock()
>    with lock:
>        do_something()
>
> If ``KeyboardInterrupt`` occurs near either of the ``print`` calls,
> the lock will never be released.
>
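The failure described above can be reproduced deterministically. What follows is a minimal sketch, not part of the PEP: the helper ``interrupted_print`` is a hypothetical stand-in for ``print`` that raises ``KeyboardInterrupt`` itself, simulating Ctrl-C arriving during that call.

```python
import threading

lock = threading.Lock()


def interrupted_print(message):
    # Hypothetical stand-in for print(): pretend Ctrl-C arrives during the call.
    raise KeyboardInterrupt


def critical_section():
    lock.acquire()
    try:
        pass  # the protected work would go here
    finally:
        interrupted_print('finished')  # the interrupt lands inside the finally suite
        lock.release()                 # so this line is never reached


try:
    critical_section()
except KeyboardInterrupt:
    pass

# The lock is still held although the finally clause was entered.
still_locked = lock.locked()
```

After the simulated interrupt, ``still_locked`` is true: entering the ``finally`` suite did not guarantee that it ran to completion.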
>
> Coroutine Use Case
> ------------------
>
> A similar case occurs with coroutines. Coroutine libraries usually
> want to interrupt a coroutine with a timeout. There is a
> ``generator.throw()`` method for this use case, but there is no way
> to know whether the coroutine is currently suspended inside a
> ``finally`` clause.
>
> An example that uses yield-based coroutines follows. The code looks
> similar with any of the popular coroutine libraries: Monocle [1]_,
> Bluelet [2]_, or Twisted [3]_. ::
>
>    def run_locked():
>        yield connection.sendall('LOCK')
>        try:
>            yield do_something()
>            yield do_something_else()
>        finally:
>            yield connection.sendall('UNLOCK')
>
>    with timeout(5):
>        yield run_locked()
>
> In the example above, ``yield something`` means: pause the current
> coroutine and execute the coroutine ``something`` until it finishes.
> The library therefore maintains the stack of generators itself. The
> ``connection.sendall`` call waits until the socket is writable and
> then does something similar to what ``socket.sendall`` does.
>
> The ``with`` statement ensures that all of the code is executed within
> a 5-second timeout. It does so by registering a callback in the main
> loop, which calls ``generator.throw()`` on the top-most frame in the
> coroutine stack when the timeout expires.
>
> The ``greenlets`` extension works in a similar way, except that it
> doesn't need ``yield`` to enter a new stack frame. Otherwise the
> considerations are similar.
>
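The coroutine hazard can be shown with a plain generator and no library at all. This is only a sketch: the ``Timeout`` class and the ``log`` list are hypothetical names introduced for illustration, and a real trampoline would drive the generator instead of the manual ``next`` calls.

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw on timeout."""


log = []


def run_locked():
    log.append('LOCK')
    try:
        yield 'working'
    finally:
        yield 'unlocking'     # cleanup is suspended here, e.g. waiting on I/O
        log.append('UNLOCK')  # skipped if throw() arrives during that pause


gen = run_locked()
next(gen)  # runs up to the first yield; 'LOCK' has been logged
next(gen)  # leaves the try body and suspends inside the finally clause
try:
    gen.throw(Timeout())  # the timeout fires while cleanup is in progress
except Timeout:
    pass
```

The caller of ``throw()`` has no way to tell that the generator was suspended inside ``finally``, so ``'UNLOCK'`` is never logged.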
>
> Specification
> =============
>
> Frame Flag 'f_in_cleanup'
> -------------------------
>
> A new flag on the frame object is proposed. It is set to ``True`` if
> this frame is currently executing a ``finally`` suite. Internally it
> must be implemented as a counter of the nested ``finally`` statements
> currently being executed.
>
> The internal counter is also incremented on entering the
> ``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and decremented on
> leaving them. This makes it possible to protect ``__enter__`` and
> ``__exit__`` methods too.
>
>
> Function 'sys.setcleanuphook'
> -----------------------------
>
> A new function for the ``sys`` module is proposed. This function sets
> a callback which is executed every time ``f_in_cleanup`` becomes
> ``False``. The callback gets a ``frame`` as its sole argument, so it
> can find out where it was called from.
>
> The setting is thread-local and is stored inside the ``PyThreadState``
> structure.
>
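The interplay between the nesting counter and the hook can be modelled in pure Python. The sketch below is an assumption-laden toy, not the proposed implementation: ``FrameModel``, ``enter_cleanup`` and ``leave_cleanup`` are hypothetical names, the module-level ``setcleanuphook`` merely imitates the proposed ``sys.setcleanuphook``, and a ``threading.local`` object stands in for the slot in ``PyThreadState``.

```python
import threading

_state = threading.local()  # models the per-thread slot in PyThreadState


def setcleanuphook(callback):
    """Hypothetical pure-Python stand-in for the proposed sys.setcleanuphook."""
    _state.hook = callback


class FrameModel:
    """Toy frame that carries the nesting counter behind f_in_cleanup."""

    def __init__(self):
        self._cleanup_depth = 0

    @property
    def f_in_cleanup(self):
        return self._cleanup_depth > 0

    def enter_cleanup(self):
        # The interpreter would do this on entering a finally suite.
        self._cleanup_depth += 1

    def leave_cleanup(self):
        self._cleanup_depth -= 1
        if self._cleanup_depth == 0:
            # The flag just became False: run the hook, if one is set.
            hook = getattr(_state, 'hook', None)
            if hook is not None:
                hook(self)


calls = []
setcleanuphook(calls.append)

frame = FrameModel()
frame.enter_cleanup()  # outer finally
frame.enter_cleanup()  # nested finally
frame.leave_cleanup()  # still inside the outer finally: hook not called
after_inner = (frame.f_in_cleanup, len(calls))
frame.leave_cleanup()  # counter reaches zero: hook called with the frame
after_outer = (frame.f_in_cleanup, len(calls))
```

Whether the hook should stay installed after it fires is not stated in the draft; this model keeps it installed, which is an assumption.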
>
> Inspect Module Enhancements
> ---------------------------
>
> Two new functions are proposed for the ``inspect`` module:
> ``isframeincleanup`` and ``getcleanupframe``.
>
> ``isframeincleanup``, given a ``frame`` object or a ``generator``
> object as its sole argument, returns the value of the ``f_in_cleanup``
> attribute of the frame itself or of the generator's ``gi_frame``.
>
> ``getcleanupframe``, given a ``frame`` object as its sole argument,
> returns the innermost frame which has a true value of
> ``f_in_cleanup``, or ``None`` if no frame in the stack has the
> attribute set. It starts inspecting from the specified frame and walks
> to outer frames using ``f_back`` pointers, just like
> ``getouterframes`` does.
>
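The two lookups described above amount to a short walk over ``f_back`` links. The following is a sketch under the assumption that frames expose ``f_in_cleanup`` as described; since no real frame has that attribute today, ``types.SimpleNamespace`` objects are used as hypothetical stand-ins for frames and generators.

```python
from types import SimpleNamespace


def isframeincleanup(obj):
    # Accept either a frame-like object or a generator-like object.
    frame = getattr(obj, 'gi_frame', obj)
    return bool(getattr(frame, 'f_in_cleanup', False))


def getcleanupframe(frame):
    # Walk outwards and return the first frame that is in cleanup code.
    while frame is not None:
        if getattr(frame, 'f_in_cleanup', False):
            return frame
        frame = frame.f_back
    return None


# Fake three-frame stack: only the outermost frame is inside a finally suite.
outer = SimpleNamespace(f_in_cleanup=True, f_back=None)
middle = SimpleNamespace(f_in_cleanup=False, f_back=outer)
inner = SimpleNamespace(f_in_cleanup=False, f_back=middle)
fake_generator = SimpleNamespace(gi_frame=outer)
idle = SimpleNamespace(f_in_cleanup=False, f_back=None)

found = getcleanupframe(inner)
```

Here ``found`` is the ``outer`` object, while ``getcleanupframe(idle)`` returns ``None`` because no frame in that stack has the flag set.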
>
> Example
> =======
>
> An example implementation of a ``SIGINT`` handler that interrupts
> safely might look like::
>
>    import inspect, sys, functools
>
>    def sigint_handler(sig, frame):
>        # Not inside any cleanup code: safe to interrupt right away.
>        if inspect.getcleanupframe(frame) is None:
>            raise KeyboardInterrupt()
>        # Otherwise retry once the cleanup code has finished.
>        sys.setcleanuphook(functools.partial(sigint_handler, 0))
>
> A coroutine example is out of the scope of this document, because its
> implementation depends very much on the trampoline (or main loop)
> used by the coroutine library.
>
>
> Unresolved Issues
> =================
>
> Interruption Inside With Statement Expression
> ---------------------------------------------
>
> Given the statement::
>
>    with open(filename):
>        do_something()
>
> Python can be interrupted after ``open`` is called, but before the
> ``SETUP_WITH`` bytecode is executed. There are two possible decisions:
>
> * Protect the expression inside the ``with`` statement. This would
>  need another bytecode, since currently there is no delimiter at the
>  start of the ``with`` expression
>
> * Let the user write a wrapper if they consider it important for their
>  use case. Safe wrapper code might look like the following::
>
>    class FileWrapper(object):
>
>        def __init__(self, filename, mode):
>            self.filename = filename
>            self.mode = mode
>
>        def __enter__(self):
>            self.file = open(self.filename, self.mode)
>            return self.file
>
>        def __exit__(self, *exc_info):
>            self.file.close()
>
>  Alternatively it can be written using the ``contextmanager``
>  decorator::
>
>    from contextlib import contextmanager
>
>    @contextmanager
>    def open_wrapper(filename, mode):
>        file = open(filename, mode)
>        try:
>            yield file
>        finally:
>            file.close()
>
>  This code is safe, as the first part of the generator (before
>  ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the caller
>
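The claim that the pre-``yield`` part of a generator-based context manager runs inside the manager's ``__enter__`` call can be checked on CPython by recording the call stack from within the generator. This is a sketch only: ``open_resource`` and ``callers`` are hypothetical names, and ``sys._getframe`` is a CPython-specific helper.

```python
import sys
from contextlib import contextmanager

callers = []


@contextmanager
def open_resource():
    # Record the names of the functions that are calling into this generator.
    frame = sys._getframe().f_back
    while frame is not None:
        callers.append(frame.f_code.co_name)
        frame = frame.f_back
    yield 'resource'


with open_resource() as resource:
    pass
```

The first recorded caller is ``__enter__``, so any protection applied while ``__enter__`` runs also covers the acquisition code placed before ``yield``.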
>
> Exception Propagation
> ---------------------
>
> Sometimes a ``finally`` block or an ``__enter__/__exit__`` method can
> be exited with an exception. Usually this is not a problem, since a
> more important exception like ``KeyboardInterrupt`` or ``SystemExit``
> should be thrown instead. But it may be nice to be able to keep the
> original exception inside a ``__context__`` attribute. So the cleanup
> hook signature may grow an exception argument::
>
>    def sigint_handler(sig, frame):
>        if inspect.getcleanupframe(frame) is None:
>            raise KeyboardInterrupt()
>        sys.setcleanuphook(retry_sigint)
>
>    def retry_sigint(frame, exception=None):
>        if inspect.getcleanupframe(frame) is None:
>            raise KeyboardInterrupt() from exception
>
> .. note::
>
>    There is no need to have three arguments like in the ``__exit__``
>    method, since exceptions have a ``__traceback__`` attribute in
>    Python 3.x
>
> However, this will set ``__cause__`` for the exception, which is not
> exactly what's intended. So some hidden interpreter logic may be used
> to put a ``__context__`` attribute on every exception raised in the
> cleanup hook.
>
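The difference between ``__cause__`` and ``__context__`` mentioned above is observable in current Python 3. The sketch below uses two hypothetical helper functions to contrast explicit chaining with ``raise ... from`` against the implicit chaining that happens when an exception is raised while another is being handled.

```python
def explicit_chaining():
    try:
        raise ValueError('cleanup failed')
    except ValueError as exc:
        saved = exc
    try:
        # "raise ... from" records the original exception as __cause__.
        raise KeyboardInterrupt() from saved
    except KeyboardInterrupt as interrupt:
        return interrupt


def implicit_chaining():
    try:
        try:
            raise ValueError('cleanup failed')
        except ValueError:
            # Raising while handling records the original as __context__ only.
            raise KeyboardInterrupt()
    except KeyboardInterrupt as interrupt:
        return interrupt


explicit = explicit_chaining()
implicit = implicit_chaining()
```

With explicit chaining ``explicit.__cause__`` holds the ``ValueError``; with implicit chaining ``implicit.__cause__`` is ``None`` and the ``ValueError`` is found in ``implicit.__context__``, which is the behaviour the draft would like the hook to get.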
>
> Interruption Between Acquiring Resource and Try Block
> -----------------------------------------------------
>
> The example from the first section is not totally safe. Let's look
> closer::
>
>    lock.acquire()
>    try:
>        do_something()
>    finally:
>        lock.release()
>
> An interrupt can arrive after ``lock.acquire()`` returns but before
> the ``try`` block is entered. There is no way this can be fixed
> without modifying the code. The actual fix depends very much on the
> use case.
>
> Usually the code can be fixed using a ``with`` statement::
>
>    with lock:
>        do_something()
>
> However, for coroutines you usually can't use the ``with`` statement
> because you need to ``yield`` for both the acquire and release
> operations. So the code might be rewritten as follows::
>
>    try:
>        yield lock.acquire()
>        do_something()
>    finally:
>        yield lock.release()
>
> The actual lock code might need more code to support this use case,
> but the implementation is usually trivial: check whether the lock has
> been acquired, and unlock it if so.
>
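The "check whether the lock has been acquired" idea can be sketched with a toy lock. Everything here is hypothetical and synchronous for brevity: ``CheckedLock`` and ``guarded`` are illustrative names, and the interrupt is simulated by raising ``KeyboardInterrupt`` before ``acquire()`` is reached.

```python
class CheckedLock:
    """Hypothetical lock whose release() is safe to call when not held."""

    def __init__(self):
        self.held = False

    def acquire(self):
        self.held = True

    def release(self):
        # Unlock only if the lock was actually acquired.
        if self.held:
            self.held = False


def guarded(lock, interrupt_before_acquire):
    try:
        if interrupt_before_acquire:
            raise KeyboardInterrupt  # simulated interrupt before acquire()
        lock.acquire()
        # the protected work would go here
    finally:
        lock.release()


lock = CheckedLock()
try:
    guarded(lock, interrupt_before_acquire=True)
except KeyboardInterrupt:
    pass
held_after_interrupt = lock.held

guarded(lock, interrupt_before_acquire=False)
held_after_normal_run = lock.held
```

Because ``release()`` tolerates a lock that was never taken, moving ``acquire()`` inside the ``try`` block is harmless in both paths.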
>
> Setting Interruption Context Inside Finally Itself
> --------------------------------------------------
>
> Some coroutine libraries may need to set a timeout for the finally
> clause itself. For example::
>
>    try:
>        do_something()
>    finally:
>        with timeout(0.5):
>            try:
>                yield do_slow_cleanup()
>            finally:
>                yield do_fast_cleanup()
>
> With current semantics the timeout will either protect the whole
> ``with`` block or nothing at all, depending on the implementation of
> the library. What the author intended is to treat ``do_slow_cleanup``
> as ordinary code, and ``do_fast_cleanup`` as cleanup
> (non-interruptible) code.
>
> A similar case might occur when using greenlets or tasklets.
>
> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
> by calling the cleanup hook on each decrement. A coroutine library may
> then remember the value at timeout start, and compare it on each hook
> execution.
>
> But in practice the example is considered too obscure to take into
> account.
>
>
> Alternative Python Implementations Support
> ==========================================
>
> We consider ``f_in_cleanup`` an implementation detail. The actual
> implementation may have some fake frame-like object passed to the
> signal handler and the cleanup hook, and returned from
> ``getcleanupframe``. The only requirement is that the ``inspect``
> module functions work as expected on these objects. For this reason we
> also allow passing a ``generator`` object to the ``isframeincleanup``
> function, which removes the need to use the ``gi_frame`` attribute.
>
> It may need to be specified that ``getcleanupframe`` must return the
> same object that will be passed to the cleanup hook at its next
> invocation.
>
>
> Alternative Names
> =================
>
> The original proposal had an ``f_in_finally`` flag. The original
> intention was to protect ``finally`` clauses. But as the proposal grew
> to protect ``__enter__`` and ``__exit__`` methods too, the
> ``f_in_cleanup`` name seems better. Although the ``__enter__`` method
> is not a cleanup routine, it at least relates to cleanup done by
> context managers.
>
> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` can
> be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup`` and
> ``get_cleanup_frame``, although the current names follow the
> conventions of their respective modules.
>
>
> Alternative Proposals
> =====================
>
> Propagating 'f_in_cleanup' Flag Automatically
> -----------------------------------------------
>
> This can make ``getcleanupframe`` unnecessary. But for yield-based
> coroutines you need to propagate it yourself. Making it writable leads
> to somewhat unpredictable behavior of ``setcleanuphook``.
>
>
> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
> --------------------------------------------
>
> These bytecodes can be used to protect the expression inside the
> ``with`` statement, as well as making counter increments more explicit
> and easy to debug (visible in a disassembly). Some middle ground might
> be chosen, like ``END_FINALLY`` and ``SETUP_WITH`` implicitly
> decrementing the counter (``END_FINALLY`` is present at the end of
> every ``with`` suite).
>
> However, adding new bytecodes must be considered very carefully.
>
>
> Expose 'f_in_cleanup' as a Counter
> ----------------------------------
>
> The original intention was to expose the minimum needed functionality.
> However, as we consider the frame flag ``f_in_cleanup`` an
> implementation detail, we may expose it as a counter.
>
> Similarly, if we have a counter we may need to have the cleanup hook
> called on every counter decrement. It's unlikely to have much
> performance impact, as nested ``finally`` clauses are an uncommon case.
>
>
> Add code object flag 'CO_CLEANUP'
> ---------------------------------
>
> As an alternative to setting the flag inside the ``SETUP_WITH`` and
> ``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
> When the interpreter starts to execute code with ``CO_CLEANUP`` set,
> it sets ``f_in_cleanup`` for the whole function body. This flag is set
> for the code objects of ``__enter__`` and ``__exit__`` special methods.
> Technically it might be set on any function called ``__enter__`` or
> ``__exit__``.
>
> This seems to be a less clear solution. It also covers the case where
> ``__enter__`` and ``__exit__`` are called manually. This may be
> accepted either as a feature or as an unnecessary side effect
> (unlikely as a bug).
>
> It may also pose a problem when ``__enter__`` or ``__exit__``
> functions are implemented in C, as there is usually no frame to check
> for the ``f_in_cleanup`` flag.
>
>
> Have Cleanup Callback on Frame Object Itself
> ----------------------------------------------
>
> The frame may be extended to have an ``f_cleanup_callback`` which is
> called when ``f_in_cleanup`` is reset to 0. It would help to register
> different callbacks for different coroutines.
>
> Despite its apparent beauty, this solution doesn't add anything, as
> there are two primary use cases:
>
> * Set the callback in a signal handler. The callback is inherently a
>  single one for this case
>
> * Use a single callback per loop for the coroutine use case. And in
>  almost all cases there is only one loop per thread
>
>
> No Cleanup Hook
> ---------------
>
> The original proposal included no cleanup hook specification, as
> there are a few ways to achieve the same using current tools:
>
> * Use ``sys.settrace`` and the ``f_trace`` callback. It may pose some
>  problems for debugging, and has a big performance impact (although
>  interrupting doesn't happen very often)
>
> * Sleep a bit more and try again. For a coroutine library it's easy.
>  For signals it may be achieved using ``signal.alarm``.
>
> Both methods are considered too impractical, so a way to catch the
> exit from a ``finally`` statement is proposed.
>
>
> References
> ==========
>
> .. [1] Monocle
>   https://github.com/saucelabs/monocle
>
> .. [2] Bluelet
>   https://github.com/sampsyo/bluelet
>
> .. [3] Twisted: inlineCallbacks
>   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html
>
> .. [4] Original discussion
>   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
>
> ..
>   Local Variables:
>   mode: indented-text
>   indent-tabs-mode: nil
>   sentence-end-double-space: t
>   fill-column: 70
>   coding: utf-8
>   End:



-- 
Thanks,
Andrew Svetlov


