<div dir="ltr"><div>For what it's worth, as part of prompt_toolkit 2.0, I implemented something very similar to Nathaniel's idea some time ago.<br>It works pretty well, but I don't have a strong opinion against an alternative implementation.<br></div><div><br>- The active context is stored as a monotonically increasing integer.<br></div><div>- For each local, the actual values are stored in a dictionary that maps the context ID to the value. (Could cause a GC issue - I'm not sure.)<br></div><div>- Every time an executor is started, I have to wrap the callable in a context manager that applies the current context to that thread.<br></div><div>- When a new 'Future' is created, I grab the context ID and apply it to the callbacks when the result is set.<br></div><div><br><a href="https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/context.py">https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/context.py</a><br><a href="https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/future.py">https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/future.py</a><br><br></div>FYI: In my case, I did not want to pass the currently active "Application" object around all of the code. But when I started supporting telnet, multiple applications could be alive at once, each with a different I/O backend. Therefore the active application needed to be stored in a kind of execution context.<br><br>When PEP 550 gets approved I'll probably make this compatible. 
It should at least be possible to run prompt_toolkit on the asyncio event loop.<br><div><br></div><div>Jonathan<br></div><div><br><br><br><br><br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-08-13 1:35 GMT+02:00 Nathaniel Smith <span dir="ltr"><<a href="mailto:njs@pobox.com" target="_blank">njs@pobox.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I had an idea for an alternative API that exposes the same<br>

functionality/semantics as the current draft, but that might have some<br>

advantages. It would look like:<br>

<br>

# a "context item" is an object that holds a context-sensitive value<br>

# each call to create_context_item creates a new one<br>

ci = sys.create_context_item()<br>

<br>

# Set the value of this item in the current context<br>

ci.set(value)<br>

<br>

# Get the value of this item in the current context<br>

value = ci.get()<br>

value = ci.get(default)<br>

<br>

# To support async libraries, we need some way to capture the whole context<br>

# But an opaque token representing "all context item values" is enough<br>

state_token = sys.current_context_state_<wbr>token()<br>

sys.set_context_state_token(<wbr>state_token)<br>

coro.cr_state_token = state_token<br>

# etc.<br>

<br>

The advantages are:<br>

- Eliminates the current PEP's issues with namespace collision; every<br>

context item is automatically distinct from all others.<br>

- Eliminates the need for the None-means-del hack.<br>

- Lets the interpreter hide the details of garbage collecting context values.<br>

- Allows for more implementation flexibility. This could be<br>

implemented directly on top of Yury's current prototype. But it could<br>

also, for example, be implemented by storing the context values in a<br>

flat array, where each context item is assigned an index when it's<br>

allocated. In the current draft this is suggested as a possible<br>

extension for particularly performance-sensitive users, but this way<br>

we'd have the option of making everything fast without changing or<br>

extending the API.<br>

<br>

As precedent, this is basically the API that low-level thread-local<br>

storage implementations use; see e.g. pthread_key_create,<br>

pthread_getspecific, pthread_setspecific. (And the<br>

allocate-an-index-in-a-table is the implementation that fast<br>

thread-local storage implementations use too.)<br>
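<br>
A minimal pure-Python sketch of the flat-array idea, assuming a single thread and a module-level state tuple; ``ContextItem`` and the two token functions are hypothetical stand-ins for the ``sys.*`` names above, not an existing API:<br>

```python
import itertools

_MISSING = object()               # sentinel: no value set for this slot
_next_index = itertools.count()   # hands out one array index per context item
_current_state = ()               # immutable tuple acting as the "state token"


class ContextItem:
    """Hypothetical stand-in for sys.create_context_item()."""

    def __init__(self):
        self._index = next(_next_index)

    def get(self, default=None):
        try:
            value = _current_state[self._index]
        except IndexError:
            return default
        return default if value is _MISSING else value

    def set(self, value):
        global _current_state
        slots = list(_current_state)
        # Grow the array with empty slots if this item has never been set.
        slots.extend([_MISSING] * (self._index + 1 - len(slots)))
        slots[self._index] = value
        # Publish a new tuple; previously captured tokens stay unchanged.
        _current_state = tuple(slots)


def current_context_state_token():
    return _current_state


def set_context_state_token(token):
    global _current_state
    _current_state = token


a = ContextItem()
b = ContextItem()
a.set(1)
token = current_context_state_token()   # capture "all context item values"
a.set(2)
b.set('x')
set_context_state_token(token)          # restore the captured state
```

Because every ``set()`` publishes a fresh tuple, a captured token is never mutated behind the holder's back, which is what lets it stay opaque.<br>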

<span class="HOEnZb"><font color="#888888"><br>

-n<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

On Fri, Aug 11, 2017 at 3:37 PM, Yury Selivanov <<a href="mailto:yselivanov.ml@gmail.com">yselivanov.ml@gmail.com</a>> wrote:<br>

> Hi,<br>

><br>

> This is a new PEP to implement Execution Contexts in Python.<br>

><br>

> The PEP is in-flight to <a href="http://python.org" rel="noreferrer" target="_blank">python.org</a>, and in the meanwhile can<br>

> be read on GitHub:<br>

><br>

> <a href="https://github.com/python/peps/blob/master/pep-0550.rst" rel="noreferrer" target="_blank">https://github.com/python/<wbr>peps/blob/master/pep-0550.rst</a><br>

><br>

> (it contains a few diagrams and charts, so please read it there.)<br>

><br>

> Thank you!<br>

> Yury<br>

><br>

><br>

> PEP: 550<br>

> Title: Execution Context<br>

> Version: $Revision$<br>

> Last-Modified: $Date$<br>

> Author: Yury Selivanov <<a href="mailto:yury@magic.io">yury@magic.io</a>><br>

> Status: Draft<br>

> Type: Standards Track<br>

> Content-Type: text/x-rst<br>

> Created: 11-Aug-2017<br>

> Python-Version: 3.7<br>

> Post-History: 11-Aug-2017<br>

><br>

><br>

> Abstract<br>

> ========<br>

><br>

> This PEP proposes a new mechanism to manage execution state--the<br>

> logical environment in which a function, a thread, a generator,<br>

> or a coroutine executes.<br>

><br>

> A few examples of where having a reliable state storage is required:<br>

><br>

> * Context managers like decimal contexts, ``numpy.errstate``,<br>

>   and ``warnings.catch_warnings``;<br>

><br>

> * Storing request-related data such as security tokens and request<br>

>   data in web applications;<br>

><br>

> * Profiling, tracing, and logging in complex and large code bases.<br>

><br>

> The usual solution for storing state is to use a Thread-local Storage<br>

> (TLS), implemented in the standard library as ``threading.local()``.<br>

> Unfortunately, TLS does not work for isolating state of generators or<br>

> asynchronous code because such code shares a single thread.<br>

><br>

><br>

> Rationale<br>

> =========<br>

><br>

> Traditionally a Thread-local Storage (TLS) is used for storing the<br>

> state.  However, the major flaw of using the TLS is that it works only<br>

> for multi-threaded code.  It is not possible to reliably contain the<br>

> state within a generator or a coroutine.  For example, consider<br>

> the following generator::<br>

><br>

>     def calculate(precision, ...):<br>

>         with decimal.localcontext() as ctx:<br>

>             # Set the precision for decimal calculations<br>

>             # inside this block<br>

>             ctx.prec = precision<br>

><br>

>             yield calculate_something()<br>

>             yield calculate_something_else()<br>

><br>

> The decimal context uses TLS to store its state, and because TLS is<br>

> not aware of generators, the state can leak.  The above code will<br>

> not work correctly if a user iterates over the ``calculate()``<br>

> generator with different precisions in parallel::<br>

><br>

>     g1 = calculate(100)<br>

>     g2 = calculate(50)<br>

><br>

>     items = list(zip(g1, g2))<br>

><br>

>     # items[0] will be a tuple of:<br>

>     #   first value from g1 calculated with 100 precision,<br>

>     #   first value from g2 calculated with 50 precision.<br>

>     #<br>

>     # items[1] will be a tuple of:<br>

>     #   second value from g1 calculated with 50 precision,<br>

>     #   second value from g2 calculated with 50 precision.<br>
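<br>
The leak can be reproduced with a small self-contained sketch. As an assumption for illustration, the generator yields the active precision rather than a computed value (the ``calculate_something()`` helpers are not defined in this message), so the interleaving is directly visible:<br>

```python
import decimal


def calculate(precision):
    with decimal.localcontext() as ctx:
        # Set the precision for decimal calculations inside this block
        ctx.prec = precision

        # Report whichever precision is active when the generator runs
        yield decimal.getcontext().prec
        yield decimal.getcontext().prec


g1 = calculate(100)
g2 = calculate(50)

items = list(zip(g1, g2))
print(items)   # [(100, 50), (50, 50)]
```

The second value from ``g1`` is produced with precision 50 because ``g2`` installed its own context while ``g1`` was suspended.<br>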

><br>

> An even scarier example would be using decimals to represent money<br>

> in an async/await application: decimal calculations can suddenly<br>

> lose precision in the middle of processing a request.  Currently,<br>

> bugs like this are extremely hard to find and fix.<br>

><br>

> Another common need for web applications is to have access to the<br>

> current request object, or security context, or, simply, the request<br>

> URL for logging or submitting performance tracing data::<br>

><br>

>     async def handle_http_request(request):<br>

>         context.current_http_request = request<br>

><br>

>         await ...<br>

>         # Invoke your framework code, render templates,<br>

>         # make DB queries, etc, and use the global<br>

>         # 'current_http_request' in that code.<br>

><br>

>         # This isn't currently possible to do reliably<br>

>         # in asyncio out of the box.<br>

><br>

> These examples are just a few out of many, where a reliable way to<br>

> store context data is absolutely needed.<br>

><br>

> The inability to use TLS for asynchronous code has led to a<br>

> proliferation of ad-hoc solutions, which are supported only by<br>

> code that was explicitly written to work with them.<br>

><br>

> The current status quo is that any library, including the standard<br>

> library, that uses a TLS, will likely not work as expected in<br>

> asynchronous code or with generators (see [3]_ as an example issue.)<br>

><br>

> Some languages that have coroutines or generators recommend<br>

> manually passing a ``context`` object to every function; see [1]_,<br>

> which describes the pattern for Go.  This approach, however, has limited<br>

> use for Python, where we have a huge ecosystem that was built to work<br>

> with a TLS-like context.  Moreover, passing the context explicitly<br>

> does not work at all for libraries like ``decimal`` or ``numpy``,<br>

> which use operator overloading.<br>

><br>

> The .NET runtime, which has support for async/await, has a generic<br>

> solution to this problem, called ``ExecutionContext`` (see [2]_).<br>

> On the surface, working with it is very similar to working with a TLS,<br>

> but the former explicitly supports asynchronous code.<br>

><br>

><br>

> Goals<br>

> =====<br>

><br>

> The goal of this PEP is to provide a more reliable alternative to<br>

> ``threading.local()``.  It should be explicitly designed to work with<br>

> Python's execution model, equally supporting threads, generators, and<br>

> coroutines.<br>

><br>

> An acceptable solution for Python should meet the following<br>

> requirements:<br>

><br>

> * Transparent support for code executing in threads, coroutines,<br>

>   and generators with an easy to use API.<br>

><br>

> * Negligible impact on the performance of the existing code or the<br>

>   code that will be using the new mechanism.<br>

><br>

> * Fast C API for packages like ``decimal`` and ``numpy``.<br>

><br>

> Explicit is still better than implicit, hence the new APIs should only<br>

> be used when there is no option to pass the state explicitly.<br>

><br>

> With this PEP implemented, it should be possible to update a context<br>

> manager like the below::<br>

><br>

>     _local = threading.local()<br>

><br>

>     @contextmanager<br>

>     def context(x):<br>

>         old_x = getattr(_local, 'x', None)<br>

>         _local.x = x<br>

>         try:<br>

>             yield<br>

>         finally:<br>

>             _local.x = old_x<br>

><br>

> to a more robust version that can be reliably used in generators<br>

> and async/await code, with a simple transformation::<br>

><br>

>     @contextmanager<br>

>     def context(x):<br>

>         old_x = get_execution_context_item('x'<wbr>)<br>

>         set_execution_context_item('x'<wbr>, x)<br>

>         try:<br>

>             yield<br>

>         finally:<br>

>             set_execution_context_item('x'<wbr>, old_x)<br>

><br>

><br>

> Specification<br>

> =============<br>

><br>

> This proposal introduces a new concept called Execution Context (EC),<br>

> along with a set of Python APIs and C APIs to interact with it.<br>

><br>

> EC is implemented using an immutable mapping.  Every modification<br>

> of the mapping produces a new copy of it.  To illustrate what it<br>

> means let's compare it to how we work with tuples in Python::<br>

><br>

>     a0 = ()<br>

>     a1 = a0 + (1,)<br>

>     a2 = a1 + (2,)<br>

><br>

>     # a0 is an empty tuple<br>

>     # a1 is (1,)<br>

>     # a2 is (1, 2)<br>

><br>

> Manipulating an EC object would be similar::<br>

><br>

>     a0 = EC()<br>

>     a1 = a0.set('foo', 'bar')<br>

>     a2 = a1.set('spam', 'ham')<br>

><br>

>     # a0 is an empty mapping<br>

>     # a1 is {'foo': 'bar'}<br>

>     # a2 is {'foo': 'bar', 'spam': 'ham'}<br>
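<br>
A minimal sketch of such an immutable mapping, assuming a plain ``dict`` that is shallow-copied on every ``set()``; this ``EC`` class is illustrative only and is not the reference implementation:<br>

```python
class EC:
    """Illustrative immutable mapping: set() returns a new object."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def set(self, key, value):
        # Copy the underlying dict, then modify only the copy.
        new_data = dict(self._data)
        new_data[key] = value
        return EC(new_data)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def as_dict(self):
        # Return a copy so callers cannot mutate the internal state.
        return dict(self._data)


a0 = EC()
a1 = a0.set('foo', 'bar')
a2 = a1.set('spam', 'ham')
```

Here ``a0`` and ``a1`` are left untouched by the later ``set()`` calls, which is the property the rest of the design relies on.<br>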

><br>

> In CPython, every thread that can execute Python code has a<br>

> corresponding ``PyThreadState`` object.  It encapsulates important<br>

> runtime information like a pointer to the current frame, and is<br>

> being used by the ceval loop extensively.  We add a new field to<br>

> ``PyThreadState``, called ``exec_context``, which points to the<br>

> current EC object.<br>

><br>

> We also introduce a set of APIs to work with Execution Context.<br>

> In this section we will only cover two functions that are needed to<br>

> explain how Execution Context works.  See the full list of new APIs<br>

> in the `New APIs`_ section.<br>

><br>

> * ``sys.get_execution_context_<wbr>item(key, default=None)``: lookup<br>

>   ``key`` in the EC of the executing thread.  If not found,<br>

>   return ``default``.<br>

><br>

> * ``sys.set_execution_context_<wbr>item(key, value)``: get the<br>

>   current EC of the executing thread.  Add a ``key``/``value``<br>

>   item to it, which will produce a new EC object.  Set the<br>

>   new object as the current one for the executing thread.<br>

>   In pseudo-code::<br>

><br>

>       tstate = PyThreadState_GET()<br>

>       ec = tstate.exec_context<br>

>       ec2 = ec.set(key, value)<br>

>       tstate.exec_context = ec2<br>

><br>

> Note that some important implementation details and optimizations<br>

> are omitted here, and will be covered in later sections of this PEP.<br>

><br>

> Now let's see how Execution Contexts work with regular multi-threaded<br>

> code, generators, and coroutines.<br>

><br>

><br>

> Regular & Multithreaded Code<br>

> ----------------------------<br>

><br>

> For regular Python code, EC behaves just like a thread-local.  Any<br>

> modification of the EC object produces a new one, which is immediately<br>

> set as the current one for the thread state.<br>

><br>

> .. figure:: pep-0550/functions.png<br>

>    :align: center<br>

>    :width: 90%<br>

><br>

>    Figure 1.  Execution Context flow in a thread.<br>

><br>

> As Figure 1 illustrates, if a function calls<br>

> ``set_execution_context_item()<wbr>``, the modification of the execution<br>

> context will be visible to all subsequent calls and to the caller::<br>

><br>

>     def set_foo():<br>

>         set_execution_context_item('<wbr>foo', 'spam')<br>

><br>

>     set_execution_context_item('<wbr>foo', 'bar')<br>

>     print(get_execution_context_<wbr>item('foo'))<br>

><br>

>     set_foo()<br>

>     print(get_execution_context_<wbr>item('foo'))<br>

><br>

>     # will print:<br>

>     #   bar<br>

>     #   spam<br>

><br>

><br>

> Coroutines<br>

> ----------<br>

><br>

> Python :pep:`492` coroutines are used to implement cooperative<br>

> multitasking.  For a Python end-user they are similar to threads,<br>

> especially when it comes to sharing resources or modifying<br>

> the global state.<br>

><br>

> An event loop is needed to schedule coroutines.  Coroutines that<br>

> are explicitly scheduled by the user are usually called Tasks.<br>

> When a coroutine is scheduled, it can schedule other coroutines using<br>

> an ``await`` expression.  In async/await world, awaiting a coroutine<br>

> can be viewed as a different calling convention: Tasks are similar to<br>

> threads, and awaiting on coroutines within a Task is similar to<br>

> calling functions within a thread.<br>

><br>

> By drawing a parallel between regular multithreaded code and<br>

> async/await, it becomes apparent that any modification of the<br>

> execution context within one Task should be visible to all coroutines<br>

> scheduled within it.  Any execution context modifications, however,<br>

> must not be visible to other Tasks executing within the same thread.<br>

><br>

> To achieve this, a small set of modifications to the coroutine object<br>

> is needed:<br>

><br>

> * When a coroutine object is instantiated, it saves a reference to<br>

>   the current execution context object to its ``cr_execution_context``<br>

>   attribute.<br>

><br>

> * Coroutine's ``.send()`` and ``.throw()`` methods are modified as<br>

>   follows (in pseudo-C)::<br>

><br>

>     if coro->cr_isolated_execution_<wbr>context:<br>

>         # Save a reference to the current execution context<br>

>         old_context = tstate->exec_context<br>

><br>

>         # Set our saved execution context as the current<br>

>         # for the current thread.<br>

>         tstate->exec_context = coro->cr_execution_context<br>

><br>

>         try:<br>

>             # Perform the actual `Coroutine.send()` or<br>

>             # `Coroutine.throw()` call.<br>

>             return coro->send(...)<br>

>         finally:<br>

>             # Save a reference to the updated execution_context.<br>

>             # We will need it later, when `.send()` or `.throw()`<br>

>             # are called again.<br>

>             coro->cr_execution_context = tstate->exec_context<br>

><br>

>             # Restore thread's execution context to what it was before<br>

>             # invoking this coroutine.<br>

>             tstate->exec_context = old_context<br>

>     else:<br>

>         # Perform the actual `Coroutine.send()` or<br>

>         # `Coroutine.throw()` call.<br>

>         return coro->send(...)<br>

><br>

> * ``cr_isolated_execution_<wbr>context`` is a new attribute on coroutine<br>

>   objects.  Set to ``True`` by default, it keeps any execution context<br>

>   modifications performed by the coroutine visible only to that<br>

>   coroutine.<br>

><br>

>   When the Python interpreter sees an ``await`` instruction, it flips<br>

>   ``cr_isolated_execution_<wbr>context`` to ``False`` for the coroutine<br>

>   that is about to be awaited.  This makes any changes to the execution<br>

>   context made by nested coroutine calls within a Task visible<br>

>   throughout the Task.<br>

><br>

>   Because the top-level coroutine (Task) cannot be scheduled with<br>

>   ``await`` (in asyncio you need to call ``loop.create_task()`` or<br>

>   ``asyncio.ensure_future()`` to schedule a Task), all execution<br>

>   context modifications are guaranteed to stay within the Task.<br>

><br>

> * We always work with ``tstate->exec_context``.  We use<br>

>   ``coro->cr_execution_context`` only to store coroutine's execution<br>

>   context when it is not executing.<br>

><br>

> Figure 2 below illustrates how execution context mutations work with<br>

> coroutines.<br>

><br>

> .. figure:: pep-0550/coroutines.png<br>

>    :align: center<br>

>    :width: 90%<br>

><br>

>    Figure 2.  Execution Context flow in coroutines.<br>

><br>

> In the above diagram:<br>

><br>

> * When "coro1" is created, it saves a reference to the current<br>

>   execution context "2".<br>

><br>

> * If it makes any change to the context, it will have its own<br>

>   execution context branch "2.1".<br>

><br>

> * When it awaits on "coro2", any subsequent changes made to<br>

>   the execution context are visible to "coro1", but not outside<br>

>   of it.<br>

><br>

> In code::<br>

><br>

>     async def inner_foo():<br>

>         print('inner_foo:', get_execution_context_item('<wbr>key'))<br>

>         set_execution_context_item('<wbr>key', 2)<br>

><br>

>     async def foo():<br>

>         print('foo:', get_execution_context_item('<wbr>key'))<br>

><br>

>         set_execution_context_item('<wbr>key', 1)<br>

>         await inner_foo()<br>

><br>

>         print('foo:', get_execution_context_item('<wbr>key'))<br>

><br>

><br>

>     set_execution_context_item('<wbr>key', 'spam')<br>

>     print('main:', get_execution_context_item('<wbr>key'))<br>

><br>

>     asyncio.get_event_loop().run_<wbr>until_complete(foo())<br>

><br>

>     print('main:', get_execution_context_item('<wbr>key'))<br>

><br>

> which will output::<br>

><br>

>     main: spam<br>

>     foo: spam<br>

>     inner_foo: 1<br>

>     foo: 2<br>

>     main: spam<br>
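<br>
The same visibility rules can be observed with the ``contextvars`` module that later shipped in Python 3.7. Note the assumption: ``contextvars`` is not the ``sys.*`` API proposed in this draft; it is used here only because it isolates asyncio Tasks in the same way, and the values are collected in a list instead of being printed:<br>

```python
import asyncio
import contextvars

# Stand-in for the 'key' execution context item used above.
key = contextvars.ContextVar('key', default=None)
log = []


async def inner_foo():
    log.append(('inner_foo', key.get()))
    key.set(2)


async def foo():
    log.append(('foo', key.get()))

    key.set(1)
    await inner_foo()        # awaited directly: shares the Task's context

    log.append(('foo', key.get()))


key.set('spam')
log.append(('main', key.get()))

asyncio.run(foo())           # foo() runs as a Task with a copied context

log.append(('main', key.get()))
```

The Task sees the value set by its creator, changes made by the awaited coroutine are visible to its caller, and nothing leaks back to the code that started the event loop.<br>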

><br>

> Generator-based coroutines (generators decorated with<br>

> ``types.coroutine`` or ``asyncio.coroutine``) behave exactly as<br>

> native coroutines with regards to execution context management:<br>

> their ``yield from`` expression is semantically equivalent to<br>

> ``await``.<br>

><br>

><br>

> Generators<br>

> ----------<br>

><br>

> Generators in Python, while similar to Coroutines, are used in a<br>

> fundamentally different way.  They are producers of data, and<br>

> they use ``yield`` expression to suspend/resume their execution.<br>

><br>

> A crucial difference between ``await coro`` and ``yield value`` is<br>

> that the former expression guarantees that the ``coro`` will be<br>

> executed to the end, while the latter is producing ``value`` and<br>

> suspending the generator until it gets iterated again.<br>

><br>

> Generators share 99% of their implementation with coroutines, and<br>

> thus have similar new attributes ``gi_execution_context`` and<br>

> ``gi_isolated_execution_<wbr>context``.  Similar to coroutines, generators<br>

> save a reference to the current execution context when they are<br>

> instantiated.  They have the same implementation of ``.send()`` and<br>

> ``.throw()`` methods.<br>

><br>

> The only difference is that<br>

> ``gi_isolated_execution_<wbr>context`` is always set to ``True``, and<br>

> is never modified by the interpreter.  ``yield from o`` expression in<br>

> regular generators that are not decorated with ``types.coroutine``,<br>

> is semantically equivalent to ``for v in o: yield v``.<br>

><br>

> .. figure:: pep-0550/generators.png<br>

>    :align: center<br>

>    :width: 90%<br>

><br>

>    Figure 3.  Execution Context flow in a generator.<br>

><br>

> In the above diagram:<br>

><br>

> * When "gen1" is created, it saves a reference to the current<br>

>   execution context "2".<br>

><br>

> * If it makes any change to the context, it will have its own<br>

>   execution context branch "2.1".<br>

><br>

> * When "gen2" is created, it saves a reference to the current<br>

>   execution context for it -- "2.1".<br>

><br>

> * Any subsequent execution context updates in "gen2" will only<br>

>   be visible to "gen2".<br>

><br>

> * Likewise, any context changes that "gen1" makes after it<br>

>   creates "gen2" will not be visible to "gen2".<br>

><br>

> In code::<br>

><br>

>     def inner_foo():<br>

>         for i in range(3):<br>

>             print('inner_foo:', get_execution_context_item('<wbr>key'))<br>

>             set_execution_context_item('<wbr>key', i)<br>

>             yield i<br>

><br>

><br>

>     def foo():<br>

>         set_execution_context_item('<wbr>key', 'spam')<br>

>         print('foo:', get_execution_context_item('<wbr>key'))<br>

><br>

>         inner = inner_foo()<br>

><br>

>         while True:<br>

>             val = next(inner, None)<br>

>             if val is None:<br>

>                 break<br>

>             yield val<br>

>             print('foo:', get_execution_context_item('<wbr>key'))<br>

><br>

>     set_execution_context_item('<wbr>key', 'ham')<br>

>     print('main:', get_execution_context_item('<wbr>key'))<br>

><br>

>     list(foo())<br>

><br>

>     print('main:', get_execution_context_item('<wbr>key'))<br>

><br>

> which will output::<br>

><br>

>     main: ham<br>

>     foo: spam<br>

>     inner_foo: spam<br>

>     foo: spam<br>

>     inner_foo: 0<br>

>     foo: spam<br>

>     inner_foo: 1<br>

>     foo: spam<br>

>     main: ham<br>

><br>

> As we see, any modification of the execution context in a generator<br>

> is visible only to the generator itself.<br>
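<br>
The isolation rule can be simulated in pure Python. This is a sketch under stated assumptions: a module-level dict stands in for the thread state's current context, and the hypothetical ``IsolatedGenerator`` wrapper plays the role of the modified ``.send()`` logic with the isolation flag always ``True``:<br>

```python
_exec_context = {}   # stand-in for the thread state's current EC


def get_item(key, default=None):
    return _exec_context.get(key, default)


def set_item(key, value):
    global _exec_context
    new_context = dict(_exec_context)   # every modification makes a copy
    new_context[key] = value
    _exec_context = new_context


class IsolatedGenerator:
    """Hypothetical wrapper: swaps the context around each resumption."""

    def __init__(self, gen):
        self._gen = gen
        self._saved = _exec_context     # captured at instantiation

    def __iter__(self):
        return self

    def __next__(self):
        global _exec_context
        old_context = _exec_context
        _exec_context = self._saved
        try:
            return next(self._gen)
        finally:
            # Remember the generator's own context, restore the caller's.
            self._saved = _exec_context
            _exec_context = old_context


log = []


def inner_foo():
    for i in range(3):
        log.append(('inner_foo', get_item('key')))
        set_item('key', i)
        yield i


def foo():
    set_item('key', 'spam')
    log.append(('foo', get_item('key')))
    for val in IsolatedGenerator(inner_foo()):
        yield val
        log.append(('foo', get_item('key')))


set_item('key', 'ham')
log.append(('main', get_item('key')))
list(IsolatedGenerator(foo()))
log.append(('main', get_item('key')))
```

Each generator keeps its own branch of the context between resumptions, so neither ``foo`` nor the top-level code ever observes the values set by ``inner_foo``.<br>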

><br>

> There is one use-case where it is desired for generators to affect<br>

> the surrounding execution context: ``contextlib.contextmanager``<br>

> decorator.  To make the following work::<br>

><br>

>     @contextmanager<br>

>     def context(x):<br>

>         old_x = get_execution_context_item('x'<wbr>)<br>

>         set_execution_context_item('x'<wbr>, x)<br>

>         try:<br>

>             yield<br>

>         finally:<br>

>             set_execution_context_item('x'<wbr>, old_x)<br>

><br>

> we modified ``contextmanager`` to flip<br>

> ``gi_isolated_execution_<wbr>context`` flag to ``False`` on its generator.<br>

><br>

><br>

> Greenlets<br>

> ---------<br>

><br>

> Greenlet is an alternative implementation of cooperative<br>

> scheduling for Python.  Although the greenlet package is not part of<br>

> CPython, popular frameworks like gevent rely on it, and it is<br>

> important that greenlet can be modified to support execution<br>

> contexts.<br>

><br>

> In a nutshell, greenlet's design is very similar to the design of<br>

> generators.  The main difference is that for generators, the stack<br>

> is managed by the Python interpreter.  Greenlet works outside of the<br>

> Python interpreter, and manually saves some ``PyThreadState``<br>

> fields and pushes/pops the C-stack.  Since Execution Context is<br>

> implemented on top of ``PyThreadState``, it's easy to add<br>

> transparent support of it to greenlet.<br>

><br>

><br>

> New APIs<br>

> ========<br>

><br>

> Even though this PEP adds a number of new APIs, please keep in mind<br>

> that most Python users will likely only ever use two of them:<br>

> ``sys.get_execution_context_<wbr>item()`` and<br>

> ``sys.set_execution_context_<wbr>item()``.<br>

><br>

><br>

> Python<br>

> ------<br>

><br>

> 1. ``sys.get_execution_context_<wbr>item(key, default=None)``: lookup<br>

>    ``key`` for the current Execution Context.  If not found,<br>

>    return ``default``.<br>

><br>

> 2. ``sys.set_execution_context_<wbr>item(key, value)``: set<br>

>    ``key``/``value`` item for the current Execution Context.<br>

>    If ``value`` is ``None``, the item will be removed.<br>

><br>

> 3. ``sys.get_execution_context()`<wbr>`: return the current Execution<br>

>    Context object: ``sys.ExecutionContext``.<br>

><br>

> 4. ``sys.set_execution_context(<wbr>ec)``: set the passed<br>

>    ``sys.ExecutionContext`` instance as a current one for the current<br>

>    thread.<br>

><br>

> 5. ``sys.ExecutionContext`` object.<br>

><br>

>    Implementation detail: ``sys.ExecutionContext`` wraps a low-level<br>

>    ``PyExecContextData`` object.  ``sys.ExecutionContext`` has a<br>

>    mutable mapping API, abstracting away the real immutable<br>

>    ``PyExecContextData``.<br>

><br>

>    * ``ExecutionContext()``: construct a new, empty, execution<br>

>      context.<br>

><br>

>    * ``ec.run(func, *args)`` method: run ``func(*args)`` in the<br>

>      ``ec`` execution context.<br>

><br>

>    * ``ec[key]``: lookup ``key`` in ``ec`` context.<br>

><br>

>    * ``ec[key] = value``: assign ``key``/``value`` item to the ``ec``.<br>

><br>

>    * ``ec.get()``, ``ec.items()``, ``ec.values()``, ``ec.keys()``, and<br>

>      ``ec.copy()`` are similar to those of a ``dict`` object.<br>

><br>

><br>

> C API<br>

> -----<br>

><br>

> C API is different from the Python one because it operates directly<br>

> on the low-level immutable ``PyExecContextData`` object.<br>

><br>

> 1. New ``PyThreadState->exec_context`<wbr>` field, pointing to a<br>

>    ``PyExecContextData`` object.<br>

><br>

> 2. ``PyThreadState_<wbr>SetExecContextItem`` and<br>

>    ``PyThreadState_<wbr>GetExecContextItem`` similar to<br>

>    ``sys.set_execution_context_<wbr>item()`` and<br>

>    ``sys.get_execution_context_<wbr>item()``.<br>

><br>

> 3. ``PyThreadState_<wbr>GetExecContext``: similar to<br>

>    ``sys.get_execution_context()`<wbr>`.  Always returns an<br>

>    ``PyExecContextData`` object.  If ``PyThreadState->exec_context`<wbr>`<br>

>    is ``NULL``, a new, empty one will be created and assigned<br>

>    to ``PyThreadState->exec_context`<wbr>`.<br>

><br>

> 4. ``PyThreadState_<wbr>SetExecContext``: similar to<br>

>    ``sys.set_execution_context()`<wbr>`.<br>

><br>

> 5. ``PyExecContext_New``: create a new empty ``PyExecContextData``<br>

>    object.<br>

><br>

> 6. ``PyExecContext_SetItem`` and ``PyExecContext_GetItem``.<br>

><br>

> The exact layout of ``PyExecContextData`` is private, which allows<br>

> us to switch it to a different implementation later.  More on that<br>

> in the `Implementation Details`_ section.<br>

><br>

><br>

> Modifications in Standard Library<br>

> ==============================<wbr>===<br>

><br>

> * ``contextlib.contextmanager`` was updated to flip the new<br>

>   ``gi_isolated_execution_<wbr>context`` attribute on the generator.<br>

><br>

> * ``asyncio.events.Handle`` object now captures the current<br>

>   execution context when it is created, and uses the saved<br>

>   execution context to run the callback (with<br>

>   ``ExecutionContext.run()`` method.)  This makes<br>

>   ``loop.call_soon()`` run callbacks in the execution context<br>

>   in which they were scheduled.<br>

><br>

>   No modifications in ``asyncio.Task`` or ``asyncio.Future`` were<br>

>   necessary.<br>

><br>

> Some standard library modules like ``warnings`` and ``decimal``<br>

> can be updated to use new execution contexts.  This will be considered<br>

> in separate issues if this PEP is accepted.<br>

><br>

><br>

> Backwards Compatibility<br>

> =======================<br>

><br>

> This proposal preserves 100% backwards compatibility.<br>

><br>

><br>

> Performance<br>

> ===========<br>

><br>

> Implementation Details<br>

> ----------------------<br>

><br>

> The new ``PyExecContextData`` object is wrapping a ``dict`` object.<br>

> Any modification requires creating a shallow copy of the dict.<br>

><br>

> While working on the reference implementation of this PEP, we were<br>

> able to optimize ``dict.copy()`` operation **5.5x**, see [4]_ for<br>

> details.<br>

><br>

> .. figure:: pep-0550/dict_copy.png<br>

>    :align: center<br>

>    :width: 100%<br>

><br>

>    Figure 4.<br>

><br>

> Figure 4 shows that the performance of immutable dict implemented<br>

> with shallow copying is, as expected, O(n) for the ``set()`` operation.<br>

> However, this is tolerable until dict has more than 100 items<br>

> (1 ``set()`` takes about a microsecond.)<br>

><br>

> Judging by the number of modules that need EC in Standard Library<br>

> it is likely that real world Python applications will use<br>

> significantly fewer than 100 execution context variables.<br>

><br>

> The important point is that the cost of accessing a key in<br>

> Execution Context is always O(1).<br>

><br>

> If the ``set()`` operation performance is a major concern, we discuss<br>

> alternative approaches that have O(1) or near-O(1) ``set()`` performance<br>

> in `Alternative Immutable Dict Implementation`_, `Faster C API`_, and<br>

> `Copy-on-write Execution Context`_ sections.<br>

><br>

><br>

> Generators and Coroutines<br>

> -------------------------<br>

><br>

> Using a microbenchmark for generators and coroutines from :pep:`492`<br>

> ([12]_), it was possible to observe 0.5 to 1% performance degradation.<br>

><br>

> asyncio echoserver microbenchmarks from the uvloop project [13]_<br>

> showed 1-1.5% performance degradation for asyncio code.<br>

><br>

> asyncpg benchmarks [14]_, which execute more code and are closer to a<br>

> real-world application, did not exhibit any noticeable performance<br>

> change.<br>

><br>

><br>

> Overall Performance Impact<br>

> --------------------------<br>

><br>

> The total number of changed lines in the ceval loop is 2 -- in the<br>

> ``YIELD_FROM`` opcode implementation.  Only performance of generators<br>

> and coroutines can be affected by the proposal.<br>

><br>

> This was confirmed by running Python Performance Benchmark Suite<br>

> [15]_, which demonstrated that there is no difference between<br>

> 3.7 master branch and this PEP reference implementation branch<br>

> (full benchmark results can be found here [16]_.)<br>

><br>

><br>

> Design Considerations<br>

> =====================<br>

><br>

> Alternative Immutable Dict Implementation<br>

> -----------------------------------------<br>

><br>

> Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT)<br>

> to implement high performance immutable collections [5]_, [6]_.<br>

><br>

> Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N)<br>

> performance for both ``set()`` and ``get()`` operations, which will<br>

> be essentially O(1) for relatively small mappings in EC.<br>
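As a rough illustration of why O(log32 N) is close to constant for small mappings, the sketch below counts how many 32-way trie levels are needed to hold N items. It is an idealized estimate that assumes evenly distributed hashes; ``hamt_max_depth`` is a hypothetical helper, not part of any HAMT implementation.

```python
def hamt_max_depth(n_items, branching=32):
    """Smallest number of trie levels whose capacity covers n_items.

    Idealized: assumes hashes spread evenly across all branches.
    """
    depth = 1
    capacity = branching
    while capacity < n_items:
        capacity *= branching
        depth += 1
    return depth


for n in (10, 100, 1000, 1_000_000):
    print(n, hamt_max_depth(n))
```

Under these assumptions, a mapping with a million items needs only four levels, and anything up to 32 items fits in a single level.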

><br>

> To assess if HAMT can be used for Execution Context, we implemented<br>

> it in CPython [7]_.<br>

><br>

> .. figure:: pep-0550/hamt_vs_dict.png<br>

>    :align: center<br>

>    :width: 100%<br>

><br>

>    Figure 5.  Benchmark code can be found here: [9]_.<br>

><br>

> Figure 5 shows that HAMT indeed displays O(1) performance for all<br>

> benchmarked dictionary sizes.  For dictionaries with fewer than 100<br>

> items, HAMT is a bit slower than Python dict/shallow copy.<br>

><br>

> .. figure:: pep-0550/lookup_hamt.png<br>

>    :align: center<br>

>    :width: 100%<br>

><br>

>    Figure 6.  Benchmark code can be found here: [10]_.<br>

><br>

> Figure 6 shows a comparison of lookup costs between Python dict<br>

> and an HAMT immutable mapping.  HAMT lookup time is 30-40% worse<br>

> than Python dict lookups on average, which is a very good result,<br>

> considering how well Python dicts are optimized.<br>

><br>

> Note that, according to [8]_, the HAMT design can be further improved.<br>

><br>

> The bottom line is that the current approach of implementing<br>

> an immutable mapping with a shallow-copied dict will likely perform<br>

> adequately in real-life applications.  The HAMT solution is more<br>

> future-proof, however.<br>

><br>

> The proposed API is designed in such a way that the underlying<br>

> implementation of the mapping can be changed completely without<br>

> affecting the Execution Context `Specification`_, which allows<br>

> us to switch to HAMT at some point if necessary.<br>

><br>

><br>

> Copy-on-write Execution Context<br>

> -------------------------------<br>

><br>

> The implementation of Execution Context in .NET is different from<br>

> this PEP. .NET uses a copy-on-write mechanism and a regular mutable<br>

> mapping.<br>

><br>

> One way to implement this in CPython would be to have two new<br>

> fields in ``PyThreadState``:<br>

><br>

> * ``exec_context`` pointing to the current Execution Context mapping;<br>

> * ``exec_context_copy_on_write`` flag, set to ``0`` initially.<br>

><br>

> The idea is that whenever we are modifying the EC, the copy-on-write<br>

> flag is checked, and if it is set to ``1``, the EC is copied.<br>

><br>

> Modifications to Coroutine and Generator ``.send()`` and ``.throw()``<br>

> methods described in the `Coroutines`_ section will be almost the<br>

> same, except that in addition to the ``gi_execution_context`` they<br>

> will have a ``gi_exec_context_copy_on_write`` flag.  When a coroutine<br>

> or a generator starts, the flag will be set to ``1``.  This will<br>

> ensure that any modification of the EC performed within a coroutine<br>

> or a generator will be isolated.<br>
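The flag check described above can be simulated in pure Python. This is only a sketch: ``ThreadState`` and ``set_item`` are hypothetical stand-ins for the proposed ``PyThreadState`` fields and the C-level setter, and resetting the flag to ``0`` after the copy is an assumption that the text above does not spell out.

```python
class ThreadState:
    """Hypothetical stand-in for the two proposed PyThreadState fields."""

    def __init__(self):
        self.exec_context = {}
        self.exec_context_copy_on_write = 0


def set_item(tstate, key, value):
    # Copy the mapping only when the flag marks it as shared.
    if tstate.exec_context_copy_on_write:
        tstate.exec_context = dict(tstate.exec_context)
        # Assumption: the fresh copy is private, so the flag is cleared.
        tstate.exec_context_copy_on_write = 0
    tstate.exec_context[key] = value


tstate = ThreadState()
set_item(tstate, 'key', 'spam')      # flag is 0: mutated in place
outer = tstate.exec_context

# Entering a generator or coroutine: keep sharing, but raise the flag.
tstate.exec_context_copy_on_write = 1
set_item(tstate, 'key', 'ham')       # flag is 1: copied before mutation
inner = tstate.exec_context
```

After the second call, ``outer`` still maps ``'key'`` to ``'spam'`` while ``inner`` holds ``'ham'``, which is the isolation the flag is meant to provide.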

><br>

> This approach has one advantage:<br>

><br>

> * For Execution Context that contains a large number of items,<br>

>   copy-on-write is a more efficient solution than the shallow-copy<br>

>   dict approach.<br>

><br>

> However, we believe that copy-on-write disadvantages are more<br>

> important to consider:<br>

><br>

> * Copy-on-write behaviour for generators and coroutines makes<br>

>   EC semantics less predictable.<br>

><br>

>   With the immutable EC approach, generators and coroutines always<br>

>   execute in the EC that was current at the moment of their<br>

>   creation.  Any modifications to the outer EC while a generator<br>

>   or a coroutine is executing are not visible to them::<br>

><br>

>     def generator():<br>

>         yield 1<br>

>         print(get_execution_context_item('key'))<br>

>         yield 2<br>

><br>

>     set_execution_context_item('key', 'spam')<br>

>     gen = iter(generator())<br>

>     next(gen)<br>

>     set_execution_context_item('key', 'ham')<br>

>     next(gen)<br>

><br>

>   The above script will always print 'spam' with immutable EC.<br>

><br>

>   With a copy-on-write approach, the above script will print 'ham'.<br>

>   Now, consider that ``generator()`` was refactored to call some<br>

>   library function that uses the Execution Context::<br>

><br>

>     def generator():<br>

>         yield 1<br>

>         some_function_that_uses_decimal_context()<br>

>         print(get_execution_context_item('key'))<br>

>         yield 2<br>

><br>

>   Now, the script will print 'spam', because<br>

>   ``some_function_that_uses_decimal_context`` forced the EC to copy,<br>

>   and the ``set_execution_context_item('key', 'ham')`` line did not<br>

>   affect the ``generator()`` code after all.<br>

><br>

> * Similarly to the previous point, ``sys.ExecutionContext.run()``<br>

>   method will also become less predictable, as<br>

>   ``sys.get_execution_context()`` would still return a reference to<br>

>   the current mutable EC.<br>

><br>

>   We can't modify ``sys.get_execution_context()`` to return a shallow<br>

>   copy of the current EC, because this would seriously harm<br>

>   performance of ``asyncio.call_soon()`` and similar places, where<br>

>   it is important to propagate the Execution Context.<br>

><br>

> * Even though copy-on-write requires shallow-copying the execution<br>

>   context object less frequently, copying will still take place<br>

>   in coroutines and generators, in which case the HAMT approach will<br>

>   perform better for medium to large execution contexts.<br>

><br>

> All in all, we believe that the copy-on-write approach introduces<br>

> very subtle corner cases that could lead to bugs that are<br>

> exceptionally hard to discover and fix.<br>

><br>

> The immutable EC solution in comparison is always predictable and<br>

> easy to reason about.  Therefore we believe that any slight<br>

> performance gain that the copy-on-write solution might offer is not<br>

> worth it.<br>

><br>

><br>

> Faster C API<br>

> ------------<br>

><br>

> Packages like numpy and standard library modules like decimal need<br>

> to frequently query the global state for some local context<br>

> configuration.  It is important that the APIs that they use are as<br>

> fast as possible.<br>

><br>

> The proposed ``PyThreadState_SetExecContextItem`` and<br>

> ``PyThreadState_GetExecContextItem`` functions need to get the<br>

> current thread state with ``PyThreadState_GET()`` (fast) and then<br>

> perform a hash lookup (relatively slow).  We can eliminate the hash<br>

> lookup by adding three additional C API functions:<br>

><br>

> * ``Py_ssize_t PyExecContext_RequestIndex(char *key_name)``:<br>

>   a function similar to the existing ``_PyEval_RequestCodeExtraIndex``<br>

>   introduced in :pep:`523`.  The idea is to request a unique index<br>

>   that can later be used to look up context items.<br>

><br>

>   The ``key_name`` can later be used by ``sys.ExecutionContext`` to<br>

>   introspect items added with this API.<br>

><br>

> * ``PyThreadState_SetExecContextIndexedItem(Py_ssize_t index, PyObject *val)``<br>

>   and ``PyThreadState_GetExecContextIndexedItem(Py_ssize_t index)``<br>

>   to set and get an item by its index, avoiding the cost of hash lookup.<br>

><br>

><br>

> Why does setting a key to None remove the item?<br>

> -----------------------------------------------<br>

><br>

> Consider a context manager::<br>

><br>

>     @contextmanager<br>

>     def context(x):<br>

>         old_x = get_execution_context_item('x')<br>

>         set_execution_context_item('x', x)<br>

>         try:<br>

>             yield<br>

>         finally:<br>

>             set_execution_context_item('x', old_x)<br>

><br>

> Because a ``set_execution_context_item(key, None)`` call removes the<br>

> ``key``, the user doesn't need to write additional code to remove<br>

> the ``key`` if it wasn't in the execution context already.<br>
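A minimal runnable sketch of this behaviour, assuming a module-level dict ``_ec`` as a hypothetical stand-in for the current execution context (the real functions would operate on the thread state):

```python
from contextlib import contextmanager

_ec = {}   # hypothetical stand-in for the current execution context


def get_execution_context_item(key, default=None):
    return _ec.get(key, default)


def set_execution_context_item(key, value):
    # Setting None removes the key instead of storing None.
    if value is None:
        _ec.pop(key, None)
    else:
        _ec[key] = value


@contextmanager
def context(x):
    old_x = get_execution_context_item('x')   # None if 'x' was absent
    set_execution_context_item('x', x)
    try:
        yield
    finally:
        # Restores the old value, or removes 'x' if there was none.
        set_execution_context_item('x', old_x)


with context(1):
    inside = dict(_ec)
after = dict(_ec)
```

Here ``inside`` contains ``'x'`` while ``after`` is empty again, with no sentinel object or explicit delete call needed.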

><br>

> An alternative design with a ``del_execution_context_item()`` method<br>

> would look like the following::<br>

><br>

>     @contextmanager<br>

>     def context(x):<br>

>         not_there = object()<br>

>         old_x = get_execution_context_item('x', not_there)<br>

>         set_execution_context_item('x', x)<br>

>         try:<br>

>             yield<br>

>         finally:<br>

>             if old_x is not_there:<br>

>                 del_execution_context_item('x')<br>

>             else:<br>

>                 set_execution_context_item('x', old_x)<br>

><br>

><br>

> Can we fix ``PyThreadState_GetDict()``?<br>

> ---------------------------------------<br>

><br>

> ``PyThreadState_GetDict`` is a TLS, and some of its existing users<br>

> might depend on it being just a TLS.  Changing its behaviour to follow<br>

> the Execution Context semantics would break backwards compatibility.<br>

><br>

><br>

> PEP 521<br>

> -------<br>

><br>

> :pep:`521` proposes an alternative solution to the problem:<br>

> enhance Context Manager Protocol with two new methods: ``__suspend__``<br>

> and ``__resume__``.  To make it compatible with async/await,<br>

> the Asynchronous Context Manager Protocol will also need to be<br>

> extended with ``__asuspend__`` and ``__aresume__``.<br>

><br>

> This makes it possible to implement context managers like decimal context and<br>

> ``numpy.errstate`` for generators and coroutines.<br>

><br>

> The following code::<br>

><br>

>     class Context:<br>

><br>

>         def __enter__(self):<br>

>             self.old_x = get_execution_context_item('x')<br>

>             set_execution_context_item('x', 'something')<br>

><br>

>         def __exit__(self, *err):<br>

>             set_execution_context_item('x', self.old_x)<br>

><br>

> would become this::<br>

><br>

>     class Context:<br>

><br>

>         def __enter__(self):<br>

>             self.old_x = get_execution_context_item('x')<br>

>             set_execution_context_item('x', 'something')<br>

><br>

>         def __suspend__(self):<br>

>             set_execution_context_item('x', self.old_x)<br>

><br>

>         def __resume__(self):<br>

>             set_execution_context_item('x', 'something')<br>

><br>

>         def __exit__(self, *err):<br>

>             set_execution_context_item('x', self.old_x)<br>

><br>

> Besides complicating the protocol, the implementation will likely<br>

> negatively impact performance of coroutines, generators, and any code<br>

> that uses context managers, and will notably complicate the<br>

> interpreter implementation.  It also does not solve the leaking state<br>

> problem for greenlet/gevent.<br>

><br>

> :pep:`521` also does not provide any mechanism to propagate state<br>

> in a local context, like storing a request object in an HTTP request<br>

> handler to have better logging.<br>

><br>

><br>

> Can Execution Context be implemented outside of CPython?<br>

> --------------------------------------------------------<br>

><br>

> Because async/await code needs an event loop to run it, an EC-like<br>

> solution can be implemented in a limited way for coroutines.<br>

><br>

> Generators, on the other hand, do not have an event loop or<br>

> trampoline, making it impossible to intercept their ``yield`` points<br>

> outside of the Python interpreter.<br>

><br>

><br>

> Reference Implementation<br>

> ========================<br>

><br>

> The reference implementation can be found here: [11]_.<br>

><br>

><br>

> References<br>

> ==========<br>

><br>

> .. [1] <a href="https://blog.golang.org/context" rel="noreferrer" target="_blank">https://blog.golang.org/<wbr>context</a><br>

><br>

> .. [2] <a href="https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx" rel="noreferrer" target="_blank">https://msdn.microsoft.com/en-<wbr>us/library/system.threading.<wbr>executioncontext.aspx</a><br>

><br>

> .. [3] <a href="https://github.com/numpy/numpy/issues/9444" rel="noreferrer" target="_blank">https://github.com/numpy/<wbr>numpy/issues/9444</a><br>

><br>

> .. [4] <a href="http://bugs.python.org/issue31179" rel="noreferrer" target="_blank">http://bugs.python.org/<wbr>issue31179</a><br>

><br>

> .. [5] <a href="https://en.wikipedia.org/wiki/Hash_array_mapped_trie" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/<wbr>Hash_array_mapped_trie</a><br>

><br>

> .. [6] <a href="http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html" rel="noreferrer" target="_blank">http://blog.higher-order.net/<wbr>2010/08/16/assoc-and-clojures-<wbr>persistenthashmap-part-ii.html</a><br>

><br>

> .. [7] <a href="https://github.com/1st1/cpython/tree/hamt" rel="noreferrer" target="_blank">https://github.com/1st1/<wbr>cpython/tree/hamt</a><br>

><br>

> .. [8] <a href="https://michael.steindorfer.name/publications/oopsla15.pdf" rel="noreferrer" target="_blank">https://michael.steindorfer.<wbr>name/publications/oopsla15.pdf</a><br>

><br>

> .. [9] <a href="https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd" rel="noreferrer" target="_blank">https://gist.github.com/1st1/<wbr>9004813d5576c96529527d44c5457d<wbr>cd</a><br>

><br>

> .. [10] <a href="https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e" rel="noreferrer" target="_blank">https://gist.github.com/1st1/<wbr>dbe27f2e14c30cce6f0b5fddfc8c43<wbr>7e</a><br>

><br>

> .. [11] <a href="https://github.com/1st1/cpython/tree/pep550" rel="noreferrer" target="_blank">https://github.com/1st1/<wbr>cpython/tree/pep550</a><br>

><br>

> .. [12] <a href="https://www.python.org/dev/peps/pep-0492/#async-await" rel="noreferrer" target="_blank">https://www.python.org/dev/<wbr>peps/pep-0492/#async-await</a><br>

><br>

> .. [13] <a href="https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py" rel="noreferrer" target="_blank">https://github.com/MagicStack/<wbr>uvloop/blob/master/examples/<wbr>bench/echoserver.py</a><br>

><br>

> .. [14] <a href="https://github.com/MagicStack/pgbench" rel="noreferrer" target="_blank">https://github.com/MagicStack/<wbr>pgbench</a><br>

><br>

> .. [15] <a href="https://github.com/python/performance" rel="noreferrer" target="_blank">https://github.com/python/<wbr>performance</a><br>

><br>

> .. [16] <a href="https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c" rel="noreferrer" target="_blank">https://gist.github.com/1st1/<wbr>6b7a614643f91ead3edf37c4451a6b<wbr>4c</a><br>

><br>

><br>

> Copyright<br>

> =========<br>

><br>

> This document has been placed in the public domain.<br>

> ______________________________<wbr>_________________<br>

> Python-ideas mailing list<br>

> <a href="mailto:Python-ideas@python.org">Python-ideas@python.org</a><br>

> <a href="https://mail.python.org/mailman/listinfo/python-ideas" rel="noreferrer" target="_blank">https://mail.python.org/<wbr>mailman/listinfo/python-ideas</a><br>

> Code of Conduct: <a href="http://python.org/psf/codeofconduct/" rel="noreferrer" target="_blank">http://python.org/psf/<wbr>codeofconduct/</a><br>

<br>

<br>

<br>

</div></div><span class="im HOEnZb">--<br>

Nathaniel J. Smith -- <a href="https://vorpus.org" rel="noreferrer" target="_blank">https://vorpus.org</a><br>

</span><div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br></div>