Mailman 3 August 2017 - Python-ideas

New PEP 550: Execution Context
by Yury Selivanov Aug. 15, 2017

Aug. 15, 2017

Hi, This is a new PEP to implement Execution Contexts in Python. The PEP is in-flight to python.org, and in the meanwhile can be read on GitHub: https://github.com/python/peps/blob/master/pep-0550.rst (it contains a few diagrams and charts, so please read it there.) Thank you! Yury PEP: 550 Title: Execution Context Version: $Revision$ Last-Modified: $Date$ Author: Yury Selivanov <yury(a)magic.io> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Aug-2017 … [View More]Python-Version: 3.7 Post-History: 11-Aug-2017 Abstract ======== This PEP proposes a new mechanism to manage execution state--the logical environment in which a function, a thread, a generator, or a coroutine executes in. A few examples of where having a reliable state storage is required: * Context managers like decimal contexts, ``numpy.errstate``, and ``warnings.catch_warnings``; * Storing request-related data such as security tokens and request data in web applications; * Profiling, tracing, and logging in complex and large code bases. The usual solution for storing state is to use a Thread-local Storage (TLS), implemented in the standard library as ``threading.local()``. Unfortunately, TLS does not work for isolating state of generators or asynchronous code because such code shares a single thread. Rationale ========= Traditionally a Thread-local Storage (TLS) is used for storing the state. However, the major flaw of using the TLS is that it works only for multi-threaded code. It is not possible to reliably contain the state within a generator or a coroutine. For example, consider the following generator:: def calculate(precision, ...): with decimal.localcontext() as ctx: # Set the precision for decimal calculations # inside this block ctx.prec = precision yield calculate_something() yield calculate_something_else() Decimal context is using a TLS to store the state, and because TLS is not aware of generators, the state can leak. The above code will not work correctly, if a user iterates over the ``calculate()`` generator with different precisions in parallel:: g1 = calculate(100) g2 = calculate(50) items = list(zip(g1, g2)) # items[0] will be a tuple of: # first value from g1 calculated with 100 precision, # first value from g2 calculated with 50 precision. # # items[1] will be a tuple of: # second value from g1 calculated with 50 precision, # second value from g2 calculated with 50 precision. An even scarier example would be using decimals to represent money in an async/await application: decimal calculations can suddenly lose precision in the middle of processing a request. Currently, bugs like this are extremely hard to find and fix. Another common need for web applications is to have access to the current request object, or security context, or, simply, the request URL for logging or submitting performance tracing data:: async def handle_http_request(request): context.current_http_request = request await ... # Invoke your framework code, render templates, # make DB queries, etc, and use the global # 'current_http_request' in that code. # This isn't currently possible to do reliably # in asyncio out of the box. These examples are just a few out of many, where a reliable way to store context data is absolutely needed. The inability to use TLS for asynchronous code has lead to proliferation of ad-hoc solutions, limited to be supported only by code that was explicitly enabled to work with them. Current status quo is that any library, including the standard library, that uses a TLS, will likely not work as expected in asynchronous code or with generators (see [3]_ as an example issue.) Some languages that have coroutines or generators recommend to manually pass a ``context`` object to every function, see [1]_ describing the pattern for Go. This approach, however, has limited use for Python, where we have a huge ecosystem that was built to work with a TLS-like context. Moreover, passing the context explicitly does not work at all for libraries like ``decimal`` or ``numpy``, which use operator overloading. .NET runtime, which has support for async/await, has a generic solution of this problem, called ``ExecutionContext`` (see [2]_). On the surface, working with it is very similar to working with a TLS, but the former explicitly supports asynchronous code. Goals ===== The goal of this PEP is to provide a more reliable alternative to ``threading.local()``. It should be explicitly designed to work with Python execution model, equally supporting threads, generators, and coroutines. An acceptable solution for Python should meet the following requirements: * Transparent support for code executing in threads, coroutines, and generators with an easy to use API. * Negligible impact on the performance of the existing code or the code that will be using the new mechanism. * Fast C API for packages like ``decimal`` and ``numpy``. Explicit is still better than implicit, hence the new APIs should only be used when there is no option to pass the state explicitly. With this PEP implemented, it should be possible to update a context manager like the below:: _local = threading.local() @contextmanager def context(x): old_x = getattr(_local, 'x', None) _local.x = x try: yield finally: _local.x = old_x to a more robust version that can be reliably used in generators and async/await code, with a simple transformation:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) Specification ============= This proposal introduces a new concept called Execution Context (EC), along with a set of Python APIs and C APIs to interact with it. EC is implemented using an immutable mapping. Every modification of the mapping produces a new copy of it. To illustrate what it means let's compare it to how we work with tuples in Python:: a0 = () a1 = a0 + (1,) a2 = a1 + (2,) # a0 is an empty tuple # a1 is (1,) # a2 is (1, 2) Manipulating an EC object would be similar:: a0 = EC() a1 = a0.set('foo', 'bar') a2 = a1.set('spam', 'ham') # a0 is an empty mapping # a1 is {'foo': 'bar'} # a2 is {'foo': 'bar', 'spam': 'ham'} In CPython, every thread that can execute Python code has a corresponding ``PyThreadState`` object. It encapsulates important runtime information like a pointer to the current frame, and is being used by the ceval loop extensively. We add a new field to ``PyThreadState``, called ``exec_context``, which points to the current EC object. We also introduce a set of APIs to work with Execution Context. In this section we will only cover two functions that are needed to explain how Execution Context works. See the full list of new APIs in the `New APIs`_ section. * ``sys.get_execution_context_item(key, default=None)``: lookup ``key`` in the EC of the executing thread. If not found, return ``default``. * ``sys.set_execution_context_item(key, value)``: get the current EC of the executing thread. Add a ``key``/``value`` item to it, which will produce a new EC object. Set the new object as the current one for the executing thread. In pseudo-code:: tstate = PyThreadState_GET() ec = tstate.exec_context ec2 = ec.set(key, value) tstate.exec_context = ec2 Note, that some important implementation details and optimizations are omitted here, and will be covered in later sections of this PEP. Now let's see how Execution Contexts work with regular multi-threaded code, generators, and coroutines. Regular & Multithreaded Code ---------------------------- For regular Python code, EC behaves just like a thread-local. Any modification of the EC object produces a new one, which is immediately set as the current one for the thread state. .. figure:: pep-0550/functions.png :align: center :width: 90% Figure 1. Execution Context flow in a thread. As Figure 1 illustrates, if a function calls ``set_execution_context_item()``, the modification of the execution context will be visible to all subsequent calls and to the caller:: def set_foo(): set_execution_context_item('foo', 'spam') set_execution_context_item('foo', 'bar') print(get_execution_context_item('foo')) set_foo() print(get_execution_context_item('foo')) # will print: # bar # spam Coroutines ---------- Python :pep:`492` coroutines are used to implement cooperative multitasking. For a Python end-user they are similar to threads, especially when it comes to sharing resources or modifying the global state. An event loop is needed to schedule coroutines. Coroutines that are explicitly scheduled by the user are usually called Tasks. When a coroutine is scheduled, it can schedule other coroutines using an ``await`` expression. In async/await world, awaiting a coroutine can be viewed as a different calling convention: Tasks are similar to threads, and awaiting on coroutines within a Task is similar to calling functions within a thread. By drawing a parallel between regular multithreaded code and async/await, it becomes apparent that any modification of the execution context within one Task should be visible to all coroutines scheduled within it. Any execution context modifications, however, must not be visible to other Tasks executing within the same thread. To achieve this, a small set of modifications to the coroutine object is needed: * When a coroutine object is instantiated, it saves a reference to the current execution context object to its ``cr_execution_context`` attribute. * Coroutine's ``.send()`` and ``.throw()`` methods are modified as follows (in pseudo-C):: if coro->cr_isolated_execution_context: # Save a reference to the current execution context old_context = tstate->execution_context # Set our saved execution context as the current # for the current thread. tstate->execution_context = coro->cr_execution_context try: # Perform the actual `Coroutine.send()` or # `Coroutine.throw()` call. return coro->send(...) finally: # Save a reference to the updated execution_context. # We will need it later, when `.send()` or `.throw()` # are called again. coro->cr_execution_context = tstate->execution_context # Restore thread's execution context to what it was before # invoking this coroutine. tstate->execution_context = old_context else: # Perform the actual `Coroutine.send()` or # `Coroutine.throw()` call. return coro->send(...) * ``cr_isolated_execution_context`` is a new attribute on coroutine objects. Set to ``True`` by default, it makes any execution context modifications performed by coroutine to stay visible only to that coroutine. When Python interpreter sees an ``await`` instruction, it flips ``cr_isolated_execution_context`` to ``False`` for the coroutine that is about to be awaited. This makes any changes to execution context made by nested coroutine calls within a Task to be visible throughout the Task. Because the top-level coroutine (Task) cannot be scheduled with ``await`` (in asyncio you need to call ``loop.create_task()`` or ``asyncio.ensure_future()`` to schedule a Task), all execution context modifications are guaranteed to stay within the Task. * We always work with ``tstate->exec_context``. We use ``coro->cr_execution_context`` only to store coroutine's execution context when it is not executing. Figure 2 below illustrates how execution context mutations work with coroutines. .. figure:: pep-0550/coroutines.png :align: center :width: 90% Figure 2. Execution Context flow in coroutines. In the above diagram: * When "coro1" is created, it saves a reference to the current execution context "2". * If it makes any change to the context, it will have its own execution context branch "2.1". * When it awaits on "coro2", any subsequent changes it does to the execution context are visible to "coro1", but not outside of it. In code:: async def inner_foo(): print('inner_foo:', get_execution_context_item('key')) set_execution_context_item('key', 2) async def foo(): print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 1) await inner_foo() print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 'spam') print('main:', get_execution_context_item('key')) asyncio.get_event_loop().run_until_complete(foo()) print('main:', get_execution_context_item('key')) which will output:: main: spam foo: spam inner_foo: 1 foo: 2 main: spam Generator-based coroutines (generators decorated with ``types.coroutine`` or ``asyncio.coroutine``) behave exactly as native coroutines with regards to execution context management: their ``yield from`` expression is semantically equivalent to ``await``. Generators ---------- Generators in Python, while similar to Coroutines, are used in a fundamentally different way. They are producers of data, and they use ``yield`` expression to suspend/resume their execution. A crucial difference between ``await coro`` and ``yield value`` is that the former expression guarantees that the ``coro`` will be executed to the end, while the latter is producing ``value`` and suspending the generator until it gets iterated again. Generators share 99% of their implementation with coroutines, and thus have similar new attributes ``gi_execution_context`` and ``gi_isolated_execution_context``. Similar to coroutines, generators save a reference to the current execution context when they are instantiated. The have the same implementation of ``.send()`` and ``.throw()`` methods. The only difference is that ``gi_isolated_execution_context`` is always set to ``True``, and is never modified by the interpreter. ``yield from o`` expression in regular generators that are not decorated with ``types.coroutine``, is semantically equivalent to ``for v in o: yield v``. .. figure:: pep-0550/generators.png :align: center :width: 90% Figure 3. Execution Context flow in a generator. In the above diagram: * When "gen1" is created, it saves a reference to the current execution context "2". * If it makes any change to the context, it will have its own execution context branch "2.1". * When "gen2" is created, it saves a reference to the current execution context for it -- "2.1". * Any subsequent execution context updated in "gen2" will only be visible to "gen2". * Likewise, any context changes that "gen1" will do after it created "gen2" will not be visible to "gen2". In code:: def inner_foo(): for i in range(3): print('inner_foo:', get_execution_context_item('key')) set_execution_context_item('key', i) yield i def foo(): set_execution_context_item('key', 'spam') print('foo:', get_execution_context_item('key')) inner = inner_foo() while True: val = next(inner, None) if val is None: break yield val print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 'spam') print('main:', get_execution_context_item('key')) list(foo()) print('main:', get_execution_context_item('key')) which will output:: main: ham foo: spam inner_foo: spam foo: spam inner_foo: 0 foo: spam inner_foo: 1 foo: spam main: ham As we see, any modification of the execution context in a generator is visible only to the generator itself. There is one use-case where it is desired for generators to affect the surrounding execution context: ``contextlib.contextmanager`` decorator. To make the following work:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) we modified ``contextmanager`` to flip ``gi_isolated_execution_context`` flag to ``False`` on its generator. Greenlets --------- Greenlet is an alternative implementation of cooperative scheduling for Python. Although greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts. In a nutshell, greenlet design is very similar to design of generators. The main difference is that for generators, the stack is managed by the Python interpreter. Greenlet works outside of the Python interpreter, and manually saves some ``PyThreadState`` fields and pushes/pops the C-stack. Since Execution Context is implemented on top of ``PyThreadState``, it's easy to add transparent support of it to greenlet. New APIs ======== Even though this PEP adds a number of new APIs, please keep in mind, that most Python users will likely ever use only two of them: ``sys.get_execution_context_item()`` and ``sys.set_execution_context_item()``. Python ------ 1. ``sys.get_execution_context_item(key, default=None)``: lookup ``key`` for the current Execution Context. If not found, return ``default``. 2. ``sys.set_execution_context_item(key, value)``: set ``key``/``value`` item for the current Execution Context. If ``value`` is ``None``, the item will be removed. 3. ``sys.get_execution_context()``: return the current Execution Context object: ``sys.ExecutionContext``. 4. ``sys.set_execution_context(ec)``: set the passed ``sys.ExecutionContext`` instance as a current one for the current thread. 5. ``sys.ExecutionContext`` object. Implementation detail: ``sys.ExecutionContext`` wraps a low-level ``PyExecContextData`` object. ``sys.ExecutionContext`` has a mutable mapping API, abstracting away the real immutable ``PyExecContextData``. * ``ExecutionContext()``: construct a new, empty, execution context. * ``ec.run(func, *args)`` method: run ``func(*args)`` in the ``ec`` execution context. * ``ec[key]``: lookup ``key`` in ``ec`` context. * ``ec[key] = value``: assign ``key``/``value`` item to the ``ec``. * ``ec.get()``, ``ec.items()``, ``ec.values()``, ``ec.keys()``, and ``ec.copy()`` are similar to that of ``dict`` object. C API ----- C API is different from the Python one because it operates directly on the low-level immutable ``PyExecContextData`` object. 1. New ``PyThreadState->exec_context`` field, pointing to a ``PyExecContextData`` object. 2. ``PyThreadState_SetExecContextItem`` and ``PyThreadState_GetExecContextItem`` similar to ``sys.set_execution_context_item()`` and ``sys.get_execution_context_item()``. 3. ``PyThreadState_GetExecContext``: similar to ``sys.get_execution_context()``. Always returns an ``PyExecContextData`` object. If ``PyThreadState->exec_context`` is ``NULL`` an new and empty one will be created and assigned to ``PyThreadState->exec_context``. 4. ``PyThreadState_SetExecContext``: similar to ``sys.set_execution_context()``. 5. ``PyExecContext_New``: create a new empty ``PyExecContextData`` object. 6. ``PyExecContext_SetItem`` and ``PyExecContext_GetItem``. The exact layout ``PyExecContextData`` is private, which allows to switch it to a different implementation later. More on that in the `Implementation Details`_ section. Modifications in Standard Library ================================= * ``contextlib.contextmanager`` was updated to flip the new ``gi_isolated_execution_context`` attribute on the generator. * ``asyncio.events.Handle`` object now captures the current execution context when it is created, and uses the saved execution context to run the callback (with ``ExecutionContext.run()`` method.) This makes ``loop.call_soon()`` to run callbacks in the execution context they were scheduled. No modifications in ``asyncio.Task`` or ``asyncio.Future`` were necessary. Some standard library modules like ``warnings`` and ``decimal`` can be updated to use new execution contexts. This will be considered in separate issues if this PEP is accepted. Backwards Compatibility ======================= This proposal preserves 100% backwards compatibility. Performance =========== Implementation Details ---------------------- The new ``PyExecContextData`` object is wrapping a ``dict`` object. Any modification requires creating a shallow copy of the dict. While working on the reference implementation of this PEP, we were able to optimize ``dict.copy()`` operation **5.5x**, see [4]_ for details. .. figure:: pep-0550/dict_copy.png :align: center :width: 100% Figure 4. Figure 4 shows that the performance of immutable dict implemented with shallow copying is expectedly O(n) for the ``set()`` operation. However, this is tolerable until dict has more than 100 items (1 ``set()`` takes about a microsecond.) Judging by the number of modules that need EC in Standard Library it is likely that real world Python applications will use significantly less than 100 execution context variables. The important point is that the cost of accessing a key in Execution Context is always O(1). If the ``set()`` operation performance is a major concern, we discuss alternative approaches that have O(1) or close ``set()`` performance in `Alternative Immutable Dict Implementation`_, `Faster C API`_, and `Copy-on-write Execution Context`_ sections. Generators and Coroutines ------------------------- Using a microbenchmark for generators and coroutines from :pep:`492` ([12]_), it was possible to observe 0.5 to 1% performance degradation. asyncio echoserver microbechmarks from the uvloop project [13]_ showed 1-1.5% performance degradation for asyncio code. asyncpg benchmarks [14]_, that execute more code and are closer to a real-world application did not exhibit any noticeable performance change. Overall Performance Impact -------------------------- The total number of changed lines in the ceval loop is 2 -- in the ``YIELD_FROM`` opcode implementation. Only performance of generators and coroutines can be affected by the proposal. This was confirmed by running Python Performance Benchmark Suite [15]_, which demonstrated that there is no difference between 3.7 master branch and this PEP reference implementation branch (full benchmark results can be found here [16]_.) Design Considerations ===================== Alternative Immutable Dict Implementation ----------------------------------------- Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT) to implement high performance immutable collections [5]_, [6]_. Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N) performance for both ``set()`` and ``get()`` operations, which will be essentially O(1) for relatively small mappings in EC. To assess if HAMT can be used for Execution Context, we implemented it in CPython [7]_. .. figure:: pep-0550/hamt_vs_dict.png :align: center :width: 100% Figure 5. Benchmark code can be found here: [9]_. Figure 5 shows that HAMT indeed displays O(1) performance for all benchmarked dictionary sizes. For dictionaries with less than 100 items, HAMT is a bit slower than Python dict/shallow copy. .. figure:: pep-0550/lookup_hamt.png :align: center :width: 100% Figure 6. Benchmark code can be found here: [10]_. Figure 6 below shows comparison of lookup costs between Python dict and an HAMT immutable mapping. HAMT lookup time is 30-40% worse than Python dict lookups on average, which is a very good result, considering how well Python dicts are optimized. Note, that according to [8]_, HAMT design can be further improved. The bottom line is that the current approach with implementing an immutable mapping with shallow-copying dict will likely perform adequately in real-life applications. The HAMT solution is more future proof, however. The proposed API is designed in such a way that the underlying implementation of the mapping can be changed completely without affecting the Execution Context `Specification`_, which allows us to switch to HAMT at some point if necessary. Copy-on-write Execution Context ------------------------------- The implementation of Execution Context in .NET is different from this PEP. .NET uses copy-on-write mechanism and a regular mutable mapping. One way to implement this in CPython would be to have two new fields in ``PyThreadState``: * ``exec_context`` pointing to the current Execution Context mapping; * ``exec_context_copy_on_write`` flag, set to ``0`` initially. The idea is that whenever we are modifying the EC, the copy-on-write flag is checked, and if it is set to ``1``, the EC is copied. Modifications to Coroutine and Generator ``.send()`` and ``.throw()`` methods described in the `Coroutines`_ section will be almost the same, except that in addition to the ``gi_execution_context`` they will have a ``gi_exec_context_copy_on_write`` flag. When a coroutine or a generator starts, the flag will be set to ``1``. This will ensure that any modification of the EC performed within a coroutine or a generator will be isolated. This approach has one advantage: * For Execution Context that contains a large number of items, copy-on-write is a more efficient solution than the shallow-copy dict approach. However, we believe that copy-on-write disadvantages are more important to consider: * Copy-on-write behaviour for generators and coroutines makes EC semantics less predictable. With immutable EC approach, generators and coroutines always execute in the EC that was current at the moment of their creation. Any modifications to the outer EC while a generator or a coroutine is executing are not visible to them:: def generator(): yield 1 print(get_execution_context_item('key')) yield 2 set_execution_context_item('key', 'spam') gen = iter(generator()) next(gen) set_execution_context_item('key', 'ham') next(gen) The above script will always print 'spam' with immutable EC. With a copy-on-write approach, the above script will print 'ham'. Now, consider that ``generator()`` was refactored to call some library function, that uses Execution Context:: def generator(): yield 1 some_function_that_uses_decimal_context() print(get_execution_context_item('key')) yield 2 Now, the script will print 'spam', because ``some_function_that_uses_decimal_context`` forced the EC to copy, and ``set_execution_context_item('key', 'ham')`` line did not affect the ``generator()`` code after all. * Similarly to the previous point, ``sys.ExecutionContext.run()`` method will also become less predictable, as ``sys.get_execution_context()`` would still return a reference to the current mutable EC. We can't modify ``sys.get_execution_context()`` to return a shallow copy of the current EC, because this would seriously harm performance of ``asyncio.call_soon()`` and similar places, where it is important to propagate the Execution Context. * Even though copy-on-write requires to shallow copy the execution context object less frequently, copying will still take place in coroutines and generators. In which case, HAMT approach will perform better for medium to large sized execution contexts. All in all, we believe that the copy-on-write approach introduces very subtle corner cases that could lead to bugs that are exceptionally hard to discover and fix. The immutable EC solution in comparison is always predictable and easy to reason about. Therefore we believe that any slight performance gain that the copy-on-write solution might offer is not worth it. Faster C API ------------ Packages like numpy and standard library modules like decimal need to frequently query the global state for some local context configuration. It is important that the APIs that they use is as fast as possible. The proposed ``PyThreadState_SetExecContextItem`` and ``PyThreadState_GetExecContextItem`` functions need to get the current thread state with ``PyThreadState_GET()`` (fast) and then perform a hash lookup (relatively slow). We can eliminate the hash lookup by adding three additional C API functions: * ``Py_ssize_t PyExecContext_RequestIndex(char *key_name)``: a function similar to the existing ``_PyEval_RequestCodeExtraIndex`` introduced :pep:`523`. The idea is to request a unique index that can later be used to lookup context items. The ``key_name`` can later be used by ``sys.ExecutionContext`` to introspect items added with this API. * ``PyThreadState_SetExecContextIndexedItem(Py_ssize_t index, PyObject *val)`` and ``PyThreadState_GetExecContextIndexedItem(Py_ssize_t index)`` to request an item by its index, avoiding the cost of hash lookup. Why setting a key to None removes the item? ------------------------------------------- Consider a context manager:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) With ``set_execution_context_item(key, None)`` call removing the ``key``, the user doesn't need to write additional code to remove the ``key`` if it wasn't in the execution context already. An alternative design with ``del_execution_context_item()`` method would look like the following:: @contextmanager def context(x): not_there = object() old_x = get_execution_context_item('x', not_there) set_execution_context_item('x', x) try: yield finally: if old_x is not_there: del_execution_context_item('x') else: set_execution_context_item('x', old_x) Can we fix ``PyThreadState_GetDict()``? --------------------------------------- ``PyThreadState_GetDict`` is a TLS, and some of its existing users might depend on it being just a TLS. Changing its behaviour to follow the Execution Context semantics would break backwards compatibility. PEP 521 ------- :pep:`521` proposes an alternative solution to the problem: enhance Context Manager Protocol with two new methods: ``__suspend__`` and ``__resume__``. To make it compatible with async/await, the Asynchronous Context Manager Protocol will also need to be extended with ``__asuspend__`` and ``__aresume__``. This allows to implement context managers like decimal context and ``numpy.errstate`` for generators and coroutines. The following code:: class Context: def __enter__(self): self.old_x = get_execution_context_item('x') set_execution_context_item('x', 'something') def __exit__(self, *err): set_execution_context_item('x', self.old_x) would become this:: class Context: def __enter__(self): self.old_x = get_execution_context_item('x') set_execution_context_item('x', 'something') def __suspend__(self): set_execution_context_item('x', self.old_x) def __resume__(self): set_execution_context_item('x', 'something') def __exit__(self, *err): set_execution_context_item('x', self.old_x) Besides complicating the protocol, the implementation will likely negatively impact performance of coroutines, generators, and any code that uses context managers, and will notably complicate the interpreter implementation. It also does not solve the leaking state problem for greenlet/gevent. :pep:`521` also does not provide any mechanism to propagate state in a local context, like storing a request object in an HTTP request handler to have better logging. Can Execution Context be implemented outside of CPython? -------------------------------------------------------- Because async/await code needs an event loop to run it, an EC-like solution can be implemented in a limited way for coroutines. Generators, on the other hand, do not have an event loop or trampoline, making it impossible to intercept their ``yield`` points outside of the Python interpreter. Reference Implementation ======================== The reference implementation can be found here: [11]_. References ========== .. [1] https://blog.golang.org/context .. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.… .. [3] https://github.com/numpy/numpy/issues/9444 .. [4] http://bugs.python.org/issue31179 .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashma… .. [7] https://github.com/1st1/cpython/tree/hamt .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e .. [11] https://github.com/1st1/cpython/tree/pep550 .. [12] https://www.python.org/dev/peps/pep-0492/#async-await .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.… .. [14] https://github.com/MagicStack/pgbench .. [15] https://github.com/python/performance .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c Copyright ========= This document has been placed in the public domain. [View Less]

11 44

Re: [Python-ideas] New PEP 550: Execution Context
by Yury Selivanov Aug. 13, 2017

Aug. 13, 2017

[replying to the list] On Sun, Aug 13, 2017 at 6:14 AM, Nick Coghlan <ncoghlan(a)gmail.com> wrote: > On 13 August 2017 at 16:01, Yury Selivanov <yselivanov.ml(a)gmail.com> wrote: >> On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan <ncoghlan(a)gmail.com> wrote: >> [..] >>> As Nathaniel suggestion, getting/setting/deleting individual items in >>> the current context would be implemented as methods on the ContextItem >>> objects, allowing … [View More]the return value of "get_context_items" to be a >>> plain dictionary, rather than a special type that directly supported >>> updates to the underlying context. >> >> The current PEP 550 design returns a "snapshot" of the current EC with >> sys.get_execution_context(). >> >> I.e. if you do >> >> ec = sys.get_execution_context() >> ec['a'] = 'b' >> >> # sys.get_execution_context_item('a') will return None >> >> You did get a snapshot and you modified it -- but your modifications >> are not visible anywhere. You can run a function in that modified EC >> with `ec.run(function)` and that function will see that new 'a' key, >> but that's it. There's no "magical" updates to the underlying context. > > In that case, I think "get_execution_context()" is quite misleading as > a name, and is going to be prone to exactly the confusion we currently > have with the mapping returned by locals(), which is that regardless > of whether writes to it affect the target namespace or not, it's going > to be surprising in at least some situations. > > So despite being initially in favour of exposing a mapping-like API at > the Python level, I'm now coming around to Armin Ronacher's point of > view: the copy-on-write semantics for the active context are > sufficiently different from any other mapping type in Python that we > should just avoid the use of __setitem__ and __delitem__ as syntactic > sugar entirely. I agree. I'll be redesigning the PEP to use the following API (please ignore the naming peculiarities, there are so many proposals at this point that I'll just stick to something I have in my head): 1. sys.new_execution_context_key('description') -> sys.ContextItem (or maybe we should just expose the sys.ContextItem type and let people instantiate it?) A key (or "token") to use with the execution context. Besides eliminating the names collision issue, it'll also have a slightly better performance, because its __hash__ method will always return a constant. (Strings cache their __hash__, but other types don't). 2. ContextItem.has(), ContextItem.get(), ContextItem.set(), ContextItem.delete() -- pretty self-explanatory. 3. sys.get_active_context() -> sys.ExecutionContext -- an immutable object, has no methods to modify the context. 3a. sys.ExecutionContext.run(callable, *args) -- run a callable(*args) in some execution context. 3b. sys.ExecutionContext.items() -- an iterator of ContextItem -> value for introspection and debugging purposes. 4. No sys.set_execution_context() method. At this point I'm not sure it's a good idea to allow users to change the current execution context to something else entirely. For use cases like enabling concurrent.futures to run your function within the current EC, you just use the sys.get_active_context()/ExecutionContext.run combination. If anything, we can add this function later. > Instead, we'd lay out the essential primitive operations that *only* > the interpreter can provide and define procedural interfaces for > those, and if anyone wanted to build a higher level object-oriented > interface on top of those primitives, they'd be free to do so, with > the procedural API acting as the abstraction layer that decouples "how > interpreters actually implement it" (e.g. copy-on-write mappings) from > "how libraries and frameworks model it for their own use" (e.g. rich > application context objects). That way, each interpreter would also be > free to define their *internal* object model in whichever way made the > most sense for them, rather than enshrining a point-in-time snaphot of > CPython's preferred implementation model as part of the language > definition. I agree. I like that this idea gives us more flexibility with the exact implementation strategy. [..] > The essential capabilities for active context manipulation would then be: > > - get_active_context_token() > - set_active_context(context_token) As I mentioned above, at this point I'm not entirely sure that we even need "set_active_context". The only useful thing for it that I can imagine is creating a decorator that isolates any changes of the context, but the only usecase for this I see is unittests. But even for unittests, a better solution is to use a decorator that detects keys that were added but not deleted during the test (leaks). > - implicitly saving and reverting the active context around various operations Usually we need to save/revert one particular context item, not the whole context. > - accessing the active context id for suspended coroutines and > generators (so parent contexts can opt-in to seeing changes made in > child contexts) Yes, this might be useful, let's keep it. > > Running commands in a particular context *wouldn't* be a primitive > operation given those building blocks, since you can implement that > for yourself using the above primitives: > > def run_in_context(target_context_token, func, *args, **kwds): > old_context_token = get_active_context_token() > set_active_context(target_context_token) > try: > func(*args, **kwds) > finally: > set_active_context(old_context_token) I'd still prefer to implement this as part of the spec. There are some tricks that I want to use to make ExecutionContext.run() much faster than a pure Python version. This is a highly performance critical part of the PEP -- call_soon in asyncio is a VERY frequent thing. Besides, having ExecutionContext.run eliminates the need to sys.set_active_context() -- again, we need to discuss this, but I see less and less utility for it now. > > The public manipulation API here would be deliberately based on opaque > tokens to make it clear that creating and mutating execution contexts > is entirely within the realm of the interpreter implementation, and > user level code can only control *which* execution context is active > in the current thread, not create arbitrary new execution contexts of > its own (at least, not without writing a CPython-specific C > extension). > > For manipulation of values within the active context, looking at other > comparable APIs, I think the main prior art within the language would > be: > > 1. threading.local(), which uses the descriptor protocol to handle > arbitrary attributes > 2. Cell variable references in function `__closure__` attributes, > which also uses the descriptor protocol by way of the "cell_contents" > attribute > > In 3.7, those two examples are being brought closer by way of > `cell_contents` becoming a read/write attribute: > > >>> def f(i): > ... def g(): > ... nonlocal i > ... return i > ... return g > ... > >>> g = f(0) > >>> g() > 0 > >>> cell = g.__closure__[0] > >>> cell.cell_contents > 0 > >>> cell.cell_contents = 5 > >>> g() > 5 > >>> del cell.cell_contents > >>> g() > Traceback (most recent call last): > ... > NameError: free variable 'i' referenced before assignment in enclosing scope > >>> cell.cell_contents = 0 > >>> g() > 0 > > This is very similar to the way manipulation of entries within a > thread local namespace works, but with each cell containing exactly > one attribute. > > For context items, I agree with Nathaniel that the cell-style > one-value-per-item approach is likely to be the way to go. To > emphasise that changes to that attribute only affect the *active* > context, I think "active_value" would be a good name: > > >>> request_id = > sys.create_context_item("my_web_framework.request_id", "Request > identifier for my_web_framework") > >>> request_id.active_value > Traceback (most recent call last): > ... > RuntimeError: Context item "my_web_framework.request" not set in > context <context token> > >>> request_id.active_value = "12345" > >>> request_id.active_value > '12345' I myself prefer a functional API to to __getattr__. I don't like the "del local.x" syntax. I don't think we are forced to follow the threading.local() API here, aren't we? Yury > > Finally, given opaque context tokens, and context items that worked > like closure cells (only accessing the active context rather than > lexically scoped variables), the one introspection primitive the > *interpreter* would need to provide is either: > > 1. Given a context token, return a mapping from context items to their > defined values in the given context > 2. A way to get a listing of the context items defined in the active context > > Since either of those can be defined in terms of the other, my own > preference goes to the first one, since using it to implement the > second alternative just requires a simple > `sys.get_active_context_token()` call, while implementing the first > one in terms of the second one requires a helper like > `run_in_context()` above to manipulate the active context in the > current thread. > > The first one also makes it fairly straightforward to *diff* a given > context against the active one - get the mappings for both contexts, > check which keys they have in common, compare the values for the > common keys, and then report on > > - keys that appear in one context but not the other > - values which differ between them for common keys > - (optionally) values which are the same for common keys > > Cheers, > Nick. [View Less]

2 2

Re: [Python-ideas] Python-ideas Digest, Vol 129, Issue 44
by 王宣赵 Aug. 13, 2017

Aug. 13, 2017

Thank you for your consideration. 获取 Outlook for Android<https://aka.ms/ghei36> 发件人: python-ideas-request(a)python.org 发送时间: 8月14日星期一 03:14 主题: Python-ideas Digest, Vol 129, Issue 44 收件人: python-ideas(a)python.org Send Python-ideas mailing list submissions to python-ideas(a)python.org To subscribe or unsubscribe via the World Wide Web, visit https://mail.python.org/mailman/listinfo/python-ideas or, via email, send a message with subject or body 'help' to python-ideas-request(a)… [View More]python.org You can reach the person managing the list at python-ideas-owner(a)python.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Python-ideas digest..." Today's Topics: 1. How do you think about these language extensions? (?? ?) 2. Re: New PEP 550: Execution Context (Yury Selivanov) 3. Re: New PEP 550: Execution Context (Nathaniel Smith) ---------------------------------------------------------------------- Message: 1 Date: Sun, 13 Aug 2017 12:49:45 +0000 From: ?? ? To: "python-ideas(a)python.org" Subject: [Python-ideas] How do you think about these language extensions? Message-ID: Content-Type: text/plain; charset="iso-2022-jp" Hi all, I've just finished a language extension for CPython 3.6.x to support some additional grammars like Pattern Matching. And It's compatible with CPython. I'm looking for constructive advice, and I wonder if you will be interested in this one. ? the project address is https://github.com/thautwarm/flowpython) [https://avatars1.githubusercontent.com/u/22536460?v=4&s=400] thautwarm/flowpython github.com flowpython - tasty feature extensions for python(python3). Some examples here: # where syntax from math import pi r = 1 # the radius h = 10 # the height S = (2*S_top + S_side) where: S_top = pi*r**2 S_side = C * h where: C = 2*pi*r # lambda&curry : lambda x: lambda y: lambda z: ret where: ret = x+y ret -= z .x -> .y -> .z -> ret where: ret = x+y ret -= z as-with x def as y def as z def ret where: ret = x+y ret -= z # arrow transform (to avoid endless parentheses and try to be more readable. >> range(5) -> map(.x->x+2, _) -> list(_) >> [2,3,4,5,6] # pattern matching # use "condic" as keyword is for avoiding the conflictions against the standard libraries and packages from third party. "switch" and "match" both lead to conflictions. condic+(type) 1: case a:int => assert a == 1 and type(a) == 1 [>] case 0 => assert 1 > 0 [is not] case 1 => assert 1 is not 1 otherwise => print("nothing") condic+() [1,2,3]: case (a,*b)->b:list => sum(b) +[] case [] => print('empty list') +[==] case (a,b):(1,2) => print("the list is [1,2]") The grammars with more details and examples can be found in https://github.com/thautwarm/flowpython/wiki Does it interest you? If so, you can try it if you have CPython 3.6.x. pip install flowpython python -m flowpython -m enable/disable Here is an example to use flowpython, which gives the permutations of a sequence. from copy import deepcopy permutations = .seq -> seq_seq where: condic+[] seq: case (a, ) => seq_seq = [a,] case (a, b) => seq_seq = [[a,b],[b,a]] case (a,*b) => seq_seq = permutations(b) -> map(.x -> insertAll(x, a), _) -> sum(_, []) where: insertAll = . x, a -> ret where: ret = [ deepcopy(x) -> _.insert(i, a) or _ for i in (len(x) -> range(_+1)) ] If the object permutations are defined, try these codes in console: >> range(3) -> permutations(_) >> [[0, 1, 2], [1, 0, 2], [1, 2, 0], [0, 2, 1], [2, 0, 1], [2, 1, 0]] Does it seem to be interesting? Thanks, Thautwarm -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 2 Date: Sun, 13 Aug 2017 14:44:24 -0400 From: Yury Selivanov To: Nick Coghlan Cc: Nathaniel Smith , "python-ideas(a)python.org" Subject: Re: [Python-ideas] New PEP 550: Execution Context Message-ID: Content-Type: text/plain; charset="UTF-8" I'll start a new thread to discuss is we want this specific semantics change soon (with some updates). Yury ------------------------------ Message: 3 Date: Sun, 13 Aug 2017 12:14:07 -0700 From: Nathaniel Smith To: Yury Selivanov Cc: Nick Coghlan , Python-Ideas Subject: Re: [Python-ideas] New PEP 550: Execution Context Message-ID: Content-Type: text/plain; charset="UTF-8" On Sun, Aug 13, 2017 at 9:57 AM, Yury Selivanov wrote: > 2. ContextItem.has(), ContextItem.get(), ContextItem.set(), > ContextItem.delete() -- pretty self-explanatory. It might make sense to simplify even further and declare that context items are initialized to None to start, and the only operations are set() and get(). And then get() can't fail, b/c there is no "value missing" state. -n -- Nathaniel J. Smith -- https://vorpus.org ------------------------------ Subject: Digest Footer _______________________________________________ Python-ideas mailing list Python-ideas(a)python.org https://mail.python.org/mailman/listinfo/python-ideas ------------------------------ End of Python-ideas Digest, Vol 129, Issue 44 ********************************************* [View Less]

1 0

New PEP 550: Execution Context
by Stefan Krah Aug. 13, 2017

Aug. 13, 2017

Yury Selivanov wrote: > This is a new PEP to implement Execution Contexts in Python. The idea is of course great! A couple of issues for decimal: > Moreover, passing the context explicitly does not work at all for > libraries like ``decimal`` or ``numpy``, which use operator overloading. Instead of "with localcontext() ...", each coroutine can create a new Context() and use its methods, without any loss of functionality. All one loses is the inline operator syntax sugar. I'm … [View More]

2 1

Re: [Python-ideas] New PEP 550: Execution Context
by rymg19＠gmail.com Aug. 12, 2017

Aug. 12, 2017

So, I'm hardly an expert when it comes to things like this, but there are two things about this that don't seem right to me. (Also, I'd love to respond inline, but that's kind of difficult from a mobile phone.) The first is how set/get_execution_context_item take strings. Inevitably, people are going to do things like: CONTEXT_ITEM_NAME = 'foo-bar' ... sys.set_execution_context_item(CONTEXT_ITEM_NAME, 'stuff') IMO it would be nicer if there could be a key object used instead, e.g. my_key = … [View More]sys.execution_context_key('name-here-for-debugging-purposes') sys.set_execution_context_item(my_key, 'stuff') The advantage here would be no need for string constants and no potential naming conflicts (the string passed to the key creator would be used just for debugging, kind of like Thread names). Second thing is this: def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) If this would be done frequently, a context manager would be a *lot* more Pythonic, e.g.: with sys.temp_change_execution_context('x', new_x): # ... -- Ryan (ライアン) Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone elsehttp://refi64.com On Aug 11, 2017 at 5:38 PM, <Yury Selivanov <yselivanov.ml(a)gmail.com>> wrote: Hi, This is a new PEP to implement Execution Contexts in Python. The PEP is in-flight to python.org, and in the meanwhile can be read on GitHub: https://github.com/python/peps/blob/master/pep-0550.rst (it contains a few diagrams and charts, so please read it there.) Thank you! Yury PEP: 550 Title: Execution Context Version: $Revision$ Last-Modified: $Date$ Author: Yury Selivanov <yury(a)magic.io> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Aug-2017 Python-Version: 3.7 Post-History: 11-Aug-2017 Abstract ======== This PEP proposes a new mechanism to manage execution state--the logical environment in which a function, a thread, a generator, or a coroutine executes in. A few examples of where having a reliable state storage is required: * Context managers like decimal contexts, ``numpy.errstate``, and ``warnings.catch_warnings``; * Storing request-related data such as security tokens and request data in web applications; * Profiling, tracing, and logging in complex and large code bases. The usual solution for storing state is to use a Thread-local Storage (TLS), implemented in the standard library as ``threading.local()``. Unfortunately, TLS does not work for isolating state of generators or asynchronous code because such code shares a single thread. Rationale ========= Traditionally a Thread-local Storage (TLS) is used for storing the state. However, the major flaw of using the TLS is that it works only for multi-threaded code. It is not possible to reliably contain the state within a generator or a coroutine. For example, consider the following generator:: def calculate(precision, ...): with decimal.localcontext() as ctx: # Set the precision for decimal calculations # inside this block ctx.prec = precision yield calculate_something() yield calculate_something_else() Decimal context is using a TLS to store the state, and because TLS is not aware of generators, the state can leak. The above code will not work correctly, if a user iterates over the ``calculate()`` generator with different precisions in parallel:: g1 = calculate(100) g2 = calculate(50) items = list(zip(g1, g2)) # items[0] will be a tuple of: # first value from g1 calculated with 100 precision, # first value from g2 calculated with 50 precision. # # items[1] will be a tuple of: # second value from g1 calculated with 50 precision, # second value from g2 calculated with 50 precision. An even scarier example would be using decimals to represent money in an async/await application: decimal calculations can suddenly lose precision in the middle of processing a request. Currently, bugs like this are extremely hard to find and fix. Another common need for web applications is to have access to the current request object, or security context, or, simply, the request URL for logging or submitting performance tracing data:: async def handle_http_request(request): context.current_http_request = request await ... # Invoke your framework code, render templates, # make DB queries, etc, and use the global # 'current_http_request' in that code. # This isn't currently possible to do reliably # in asyncio out of the box. These examples are just a few out of many, where a reliable way to store context data is absolutely needed. The inability to use TLS for asynchronous code has lead to proliferation of ad-hoc solutions, limited to be supported only by code that was explicitly enabled to work with them. Current status quo is that any library, including the standard library, that uses a TLS, will likely not work as expected in asynchronous code or with generators (see [3]_ as an example issue.) Some languages that have coroutines or generators recommend to manually pass a ``context`` object to every function, see [1]_ describing the pattern for Go. This approach, however, has limited use for Python, where we have a huge ecosystem that was built to work with a TLS-like context. Moreover, passing the context explicitly does not work at all for libraries like ``decimal`` or ``numpy``, which use operator overloading. .NET runtime, which has support for async/await, has a generic solution of this problem, called ``ExecutionContext`` (see [2]_). On the surface, working with it is very similar to working with a TLS, but the former explicitly supports asynchronous code. Goals ===== The goal of this PEP is to provide a more reliable alternative to ``threading.local()``. It should be explicitly designed to work with Python execution model, equally supporting threads, generators, and coroutines. An acceptable solution for Python should meet the following requirements: * Transparent support for code executing in threads, coroutines, and generators with an easy to use API. * Negligible impact on the performance of the existing code or the code that will be using the new mechanism. * Fast C API for packages like ``decimal`` and ``numpy``. Explicit is still better than implicit, hence the new APIs should only be used when there is no option to pass the state explicitly. With this PEP implemented, it should be possible to update a context manager like the below:: _local = threading.local() @contextmanager def context(x): old_x = getattr(_local, 'x', None) _local.x = x try: yield finally: _local.x = old_x to a more robust version that can be reliably used in generators and async/await code, with a simple transformation:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) Specification ============= This proposal introduces a new concept called Execution Context (EC), along with a set of Python APIs and C APIs to interact with it. EC is implemented using an immutable mapping. Every modification of the mapping produces a new copy of it. To illustrate what it means let's compare it to how we work with tuples in Python:: a0 = () a1 = a0 + (1,) a2 = a1 + (2,) # a0 is an empty tuple # a1 is (1,) # a2 is (1, 2) Manipulating an EC object would be similar:: a0 = EC() a1 = a0.set('foo', 'bar') a2 = a1.set('spam', 'ham') # a0 is an empty mapping # a1 is {'foo': 'bar'} # a2 is {'foo': 'bar', 'spam': 'ham'} In CPython, every thread that can execute Python code has a corresponding ``PyThreadState`` object. It encapsulates important runtime information like a pointer to the current frame, and is being used by the ceval loop extensively. We add a new field to ``PyThreadState``, called ``exec_context``, which points to the current EC object. We also introduce a set of APIs to work with Execution Context. In this section we will only cover two functions that are needed to explain how Execution Context works. See the full list of new APIs in the `New APIs`_ section. * ``sys.get_execution_context_item(key, default=None)``: lookup ``key`` in the EC of the executing thread. If not found, return ``default``. * ``sys.set_execution_context_item(key, value)``: get the current EC of the executing thread. Add a ``key``/``value`` item to it, which will produce a new EC object. Set the new object as the current one for the executing thread. In pseudo-code:: tstate = PyThreadState_GET() ec = tstate.exec_context ec2 = ec.set(key, value) tstate.exec_context = ec2 Note, that some important implementation details and optimizations are omitted here, and will be covered in later sections of this PEP. Now let's see how Execution Contexts work with regular multi-threaded code, generators, and coroutines. Regular & Multithreaded Code ---------------------------- For regular Python code, EC behaves just like a thread-local. Any modification of the EC object produces a new one, which is immediately set as the current one for the thread state. .. figure:: pep-0550/functions.png :align: center :width: 90% Figure 1. Execution Context flow in a thread. As Figure 1 illustrates, if a function calls ``set_execution_context_item()``, the modification of the execution context will be visible to all subsequent calls and to the caller:: def set_foo(): set_execution_context_item('foo', 'spam') set_execution_context_item('foo', 'bar') print(get_execution_context_item('foo')) set_foo() print(get_execution_context_item('foo')) # will print: # bar # spam Coroutines ---------- Python :pep:`492` coroutines are used to implement cooperative multitasking. For a Python end-user they are similar to threads, especially when it comes to sharing resources or modifying the global state. An event loop is needed to schedule coroutines. Coroutines that are explicitly scheduled by the user are usually called Tasks. When a coroutine is scheduled, it can schedule other coroutines using an ``await`` expression. In async/await world, awaiting a coroutine can be viewed as a different calling convention: Tasks are similar to threads, and awaiting on coroutines within a Task is similar to calling functions within a thread. By drawing a parallel between regular multithreaded code and async/await, it becomes apparent that any modification of the execution context within one Task should be visible to all coroutines scheduled within it. Any execution context modifications, however, must not be visible to other Tasks executing within the same thread. To achieve this, a small set of modifications to the coroutine object is needed: * When a coroutine object is instantiated, it saves a reference to the current execution context object to its ``cr_execution_context`` attribute. * Coroutine's ``.send()`` and ``.throw()`` methods are modified as follows (in pseudo-C):: if coro->cr_isolated_execution_context: # Save a reference to the current execution context old_context = tstate->execution_context # Set our saved execution context as the current # for the current thread. tstate->execution_context = coro->cr_execution_context try: # Perform the actual `Coroutine.send()` or # `Coroutine.throw()` call. return coro->send(...) finally: # Save a reference to the updated execution_context. # We will need it later, when `.send()` or `.throw()` # are called again. coro->cr_execution_context = tstate->execution_context # Restore thread's execution context to what it was before # invoking this coroutine. tstate->execution_context = old_context else: # Perform the actual `Coroutine.send()` or # `Coroutine.throw()` call. return coro->send(...) * ``cr_isolated_execution_context`` is a new attribute on coroutine objects. Set to ``True`` by default, it makes any execution context modifications performed by coroutine to stay visible only to that coroutine. When Python interpreter sees an ``await`` instruction, it flips ``cr_isolated_execution_context`` to ``False`` for the coroutine that is about to be awaited. This makes any changes to execution context made by nested coroutine calls within a Task to be visible throughout the Task. Because the top-level coroutine (Task) cannot be scheduled with ``await`` (in asyncio you need to call ``loop.create_task()`` or ``asyncio.ensure_future()`` to schedule a Task), all execution context modifications are guaranteed to stay within the Task. * We always work with ``tstate->exec_context``. We use ``coro->cr_execution_context`` only to store coroutine's execution context when it is not executing. Figure 2 below illustrates how execution context mutations work with coroutines. .. figure:: pep-0550/coroutines.png :align: center :width: 90% Figure 2. Execution Context flow in coroutines. In the above diagram: * When "coro1" is created, it saves a reference to the current execution context "2". * If it makes any change to the context, it will have its own execution context branch "2.1". * When it awaits on "coro2", any subsequent changes it does to the execution context are visible to "coro1", but not outside of it. In code:: async def inner_foo(): print('inner_foo:', get_execution_context_item('key')) set_execution_context_item('key', 2) async def foo(): print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 1) await inner_foo() print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 'spam') print('main:', get_execution_context_item('key')) asyncio.get_event_loop().run_until_complete(foo()) print('main:', get_execution_context_item('key')) which will output:: main: spam foo: spam inner_foo: 1 foo: 2 main: spam Generator-based coroutines (generators decorated with ``types.coroutine`` or ``asyncio.coroutine``) behave exactly as native coroutines with regards to execution context management: their ``yield from`` expression is semantically equivalent to ``await``. Generators ---------- Generators in Python, while similar to Coroutines, are used in a fundamentally different way. They are producers of data, and they use ``yield`` expression to suspend/resume their execution. A crucial difference between ``await coro`` and ``yield value`` is that the former expression guarantees that the ``coro`` will be executed to the end, while the latter is producing ``value`` and suspending the generator until it gets iterated again. Generators share 99% of their implementation with coroutines, and thus have similar new attributes ``gi_execution_context`` and ``gi_isolated_execution_context``. Similar to coroutines, generators save a reference to the current execution context when they are instantiated. The have the same implementation of ``.send()`` and ``.throw()`` methods. The only difference is that ``gi_isolated_execution_context`` is always set to ``True``, and is never modified by the interpreter. ``yield from o`` expression in regular generators that are not decorated with ``types.coroutine``, is semantically equivalent to ``for v in o: yield v``. .. figure:: pep-0550/generators.png :align: center :width: 90% Figure 3. Execution Context flow in a generator. In the above diagram: * When "gen1" is created, it saves a reference to the current execution context "2". * If it makes any change to the context, it will have its own execution context branch "2.1". * When "gen2" is created, it saves a reference to the current execution context for it -- "2.1". * Any subsequent execution context updated in "gen2" will only be visible to "gen2". * Likewise, any context changes that "gen1" will do after it created "gen2" will not be visible to "gen2". In code:: def inner_foo(): for i in range(3): print('inner_foo:', get_execution_context_item('key')) set_execution_context_item('key', i) yield i def foo(): set_execution_context_item('key', 'spam') print('foo:', get_execution_context_item('key')) inner = inner_foo() while True: val = next(inner, None) if val is None: break yield val print('foo:', get_execution_context_item('key')) set_execution_context_item('key', 'spam') print('main:', get_execution_context_item('key')) list(foo()) print('main:', get_execution_context_item('key')) which will output:: main: ham foo: spam inner_foo: spam foo: spam inner_foo: 0 foo: spam inner_foo: 1 foo: spam main: ham As we see, any modification of the execution context in a generator is visible only to the generator itself. There is one use-case where it is desired for generators to affect the surrounding execution context: ``contextlib.contextmanager`` decorator. To make the following work:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) we modified ``contextmanager`` to flip ``gi_isolated_execution_context`` flag to ``False`` on its generator. Greenlets --------- Greenlet is an alternative implementation of cooperative scheduling for Python. Although greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts. In a nutshell, greenlet design is very similar to design of generators. The main difference is that for generators, the stack is managed by the Python interpreter. Greenlet works outside of the Python interpreter, and manually saves some ``PyThreadState`` fields and pushes/pops the C-stack. Since Execution Context is implemented on top of ``PyThreadState``, it's easy to add transparent support of it to greenlet. New APIs ======== Even though this PEP adds a number of new APIs, please keep in mind, that most Python users will likely ever use only two of them: ``sys.get_execution_context_item()`` and ``sys.set_execution_context_item()``. Python ------ 1. ``sys.get_execution_context_item(key, default=None)``: lookup ``key`` for the current Execution Context. If not found, return ``default``. 2. ``sys.set_execution_context_item(key, value)``: set ``key``/``value`` item for the current Execution Context. If ``value`` is ``None``, the item will be removed. 3. ``sys.get_execution_context()``: return the current Execution Context object: ``sys.ExecutionContext``. 4. ``sys.set_execution_context(ec)``: set the passed ``sys.ExecutionContext`` instance as a current one for the current thread. 5. ``sys.ExecutionContext`` object. Implementation detail: ``sys.ExecutionContext`` wraps a low-level ``PyExecContextData`` object. ``sys.ExecutionContext`` has a mutable mapping API, abstracting away the real immutable ``PyExecContextData``. * ``ExecutionContext()``: construct a new, empty, execution context. * ``ec.run(func, *args)`` method: run ``func(*args)`` in the ``ec`` execution context. * ``ec[key]``: lookup ``key`` in ``ec`` context. * ``ec[key] = value``: assign ``key``/``value`` item to the ``ec``. * ``ec.get()``, ``ec.items()``, ``ec.values()``, ``ec.keys()``, and ``ec.copy()`` are similar to that of ``dict`` object. C API ----- C API is different from the Python one because it operates directly on the low-level immutable ``PyExecContextData`` object. 1. New ``PyThreadState->exec_context`` field, pointing to a ``PyExecContextData`` object. 2. ``PyThreadState_SetExecContextItem`` and ``PyThreadState_GetExecContextItem`` similar to ``sys.set_execution_context_item()`` and ``sys.get_execution_context_item()``. 3. ``PyThreadState_GetExecContext``: similar to ``sys.get_execution_context()``. Always returns an ``PyExecContextData`` object. If ``PyThreadState->exec_context`` is ``NULL`` an new and empty one will be created and assigned to ``PyThreadState->exec_context``. 4. ``PyThreadState_SetExecContext``: similar to ``sys.set_execution_context()``. 5. ``PyExecContext_New``: create a new empty ``PyExecContextData`` object. 6. ``PyExecContext_SetItem`` and ``PyExecContext_GetItem``. The exact layout ``PyExecContextData`` is private, which allows to switch it to a different implementation later. More on that in the `Implementation Details`_ section. Modifications in Standard Library ================================= * ``contextlib.contextmanager`` was updated to flip the new ``gi_isolated_execution_context`` attribute on the generator. * ``asyncio.events.Handle`` object now captures the current execution context when it is created, and uses the saved execution context to run the callback (with ``ExecutionContext.run()`` method.) This makes ``loop.call_soon()`` to run callbacks in the execution context they were scheduled. No modifications in ``asyncio.Task`` or ``asyncio.Future`` were necessary. Some standard library modules like ``warnings`` and ``decimal`` can be updated to use new execution contexts. This will be considered in separate issues if this PEP is accepted. Backwards Compatibility ======================= This proposal preserves 100% backwards compatibility. Performance =========== Implementation Details ---------------------- The new ``PyExecContextData`` object is wrapping a ``dict`` object. Any modification requires creating a shallow copy of the dict. While working on the reference implementation of this PEP, we were able to optimize ``dict.copy()`` operation **5.5x**, see [4]_ for details. .. figure:: pep-0550/dict_copy.png :align: center :width: 100% Figure 4. Figure 4 shows that the performance of immutable dict implemented with shallow copying is expectedly O(n) for the ``set()`` operation. However, this is tolerable until dict has more than 100 items (1 ``set()`` takes about a microsecond.) Judging by the number of modules that need EC in Standard Library it is likely that real world Python applications will use significantly less than 100 execution context variables. The important point is that the cost of accessing a key in Execution Context is always O(1). If the ``set()`` operation performance is a major concern, we discuss alternative approaches that have O(1) or close ``set()`` performance in `Alternative Immutable Dict Implementation`_, `Faster C API`_, and `Copy-on-write Execution Context`_ sections. Generators and Coroutines ------------------------- Using a microbenchmark for generators and coroutines from :pep:`492` ([12]_), it was possible to observe 0.5 to 1% performance degradation. asyncio echoserver microbechmarks from the uvloop project [13]_ showed 1-1.5% performance degradation for asyncio code. asyncpg benchmarks [14]_, that execute more code and are closer to a real-world application did not exhibit any noticeable performance change. Overall Performance Impact -------------------------- The total number of changed lines in the ceval loop is 2 -- in the ``YIELD_FROM`` opcode implementation. Only performance of generators and coroutines can be affected by the proposal. This was confirmed by running Python Performance Benchmark Suite [15]_, which demonstrated that there is no difference between 3.7 master branch and this PEP reference implementation branch (full benchmark results can be found here [16]_.) Design Considerations ===================== Alternative Immutable Dict Implementation ----------------------------------------- Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT) to implement high performance immutable collections [5]_, [6]_. Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N) performance for both ``set()`` and ``get()`` operations, which will be essentially O(1) for relatively small mappings in EC. To assess if HAMT can be used for Execution Context, we implemented it in CPython [7]_. .. figure:: pep-0550/hamt_vs_dict.png :align: center :width: 100% Figure 5. Benchmark code can be found here: [9]_. Figure 5 shows that HAMT indeed displays O(1) performance for all benchmarked dictionary sizes. For dictionaries with less than 100 items, HAMT is a bit slower than Python dict/shallow copy. .. figure:: pep-0550/lookup_hamt.png :align: center :width: 100% Figure 6. Benchmark code can be found here: [10]_. Figure 6 below shows comparison of lookup costs between Python dict and an HAMT immutable mapping. HAMT lookup time is 30-40% worse than Python dict lookups on average, which is a very good result, considering how well Python dicts are optimized. Note, that according to [8]_, HAMT design can be further improved. The bottom line is that the current approach with implementing an immutable mapping with shallow-copying dict will likely perform adequately in real-life applications. The HAMT solution is more future proof, however. The proposed API is designed in such a way that the underlying implementation of the mapping can be changed completely without affecting the Execution Context `Specification`_, which allows us to switch to HAMT at some point if necessary. Copy-on-write Execution Context ------------------------------- The implementation of Execution Context in .NET is different from this PEP. .NET uses copy-on-write mechanism and a regular mutable mapping. One way to implement this in CPython would be to have two new fields in ``PyThreadState``: * ``exec_context`` pointing to the current Execution Context mapping; * ``exec_context_copy_on_write`` flag, set to ``0`` initially. The idea is that whenever we are modifying the EC, the copy-on-write flag is checked, and if it is set to ``1``, the EC is copied. Modifications to Coroutine and Generator ``.send()`` and ``.throw()`` methods described in the `Coroutines`_ section will be almost the same, except that in addition to the ``gi_execution_context`` they will have a ``gi_exec_context_copy_on_write`` flag. When a coroutine or a generator starts, the flag will be set to ``1``. This will ensure that any modification of the EC performed within a coroutine or a generator will be isolated. This approach has one advantage: * For Execution Context that contains a large number of items, copy-on-write is a more efficient solution than the shallow-copy dict approach. However, we believe that copy-on-write disadvantages are more important to consider: * Copy-on-write behaviour for generators and coroutines makes EC semantics less predictable. With immutable EC approach, generators and coroutines always execute in the EC that was current at the moment of their creation. Any modifications to the outer EC while a generator or a coroutine is executing are not visible to them:: def generator(): yield 1 print(get_execution_context_item('key')) yield 2 set_execution_context_item('key', 'spam') gen = iter(generator()) next(gen) set_execution_context_item('key', 'ham') next(gen) The above script will always print 'spam' with immutable EC. With a copy-on-write approach, the above script will print 'ham'. Now, consider that ``generator()`` was refactored to call some library function, that uses Execution Context:: def generator(): yield 1 some_function_that_uses_decimal_context() print(get_execution_context_item('key')) yield 2 Now, the script will print 'spam', because ``some_function_that_uses_decimal_context`` forced the EC to copy, and ``set_execution_context_item('key', 'ham')`` line did not affect the ``generator()`` code after all. * Similarly to the previous point, ``sys.ExecutionContext.run()`` method will also become less predictable, as ``sys.get_execution_context()`` would still return a reference to the current mutable EC. We can't modify ``sys.get_execution_context()`` to return a shallow copy of the current EC, because this would seriously harm performance of ``asyncio.call_soon()`` and similar places, where it is important to propagate the Execution Context. * Even though copy-on-write requires to shallow copy the execution context object less frequently, copying will still take place in coroutines and generators. In which case, HAMT approach will perform better for medium to large sized execution contexts. All in all, we believe that the copy-on-write approach introduces very subtle corner cases that could lead to bugs that are exceptionally hard to discover and fix. The immutable EC solution in comparison is always predictable and easy to reason about. Therefore we believe that any slight performance gain that the copy-on-write solution might offer is not worth it. Faster C API ------------ Packages like numpy and standard library modules like decimal need to frequently query the global state for some local context configuration. It is important that the APIs that they use is as fast as possible. The proposed ``PyThreadState_SetExecContextItem`` and ``PyThreadState_GetExecContextItem`` functions need to get the current thread state with ``PyThreadState_GET()`` (fast) and then perform a hash lookup (relatively slow). We can eliminate the hash lookup by adding three additional C API functions: * ``Py_ssize_t PyExecContext_RequestIndex(char *key_name)``: a function similar to the existing ``_PyEval_RequestCodeExtraIndex`` introduced :pep:`523`. The idea is to request a unique index that can later be used to lookup context items. The ``key_name`` can later be used by ``sys.ExecutionContext`` to introspect items added with this API. * ``PyThreadState_SetExecContextIndexedItem(Py_ssize_t index, PyObject *val)`` and ``PyThreadState_GetExecContextIndexedItem(Py_ssize_t index)`` to request an item by its index, avoiding the cost of hash lookup. Why setting a key to None removes the item? ------------------------------------------- Consider a context manager:: @contextmanager def context(x): old_x = get_execution_context_item('x') set_execution_context_item('x', x) try: yield finally: set_execution_context_item('x', old_x) With ``set_execution_context_item(key, None)`` call removing the ``key``, the user doesn't need to write additional code to remove the ``key`` if it wasn't in the execution context already. An alternative design with ``del_execution_context_item()`` method would look like the following:: @contextmanager def context(x): not_there = object() old_x = get_execution_context_item('x', not_there) set_execution_context_item('x', x) try: yield finally: if old_x is not_there: del_execution_context_item('x') else: set_execution_context_item('x', old_x) Can we fix ``PyThreadState_GetDict()``? --------------------------------------- ``PyThreadState_GetDict`` is a TLS, and some of its existing users might depend on it being just a TLS. Changing its behaviour to follow the Execution Context semantics would break backwards compatibility. PEP 521 ------- :pep:`521` proposes an alternative solution to the problem: enhance Context Manager Protocol with two new methods: ``__suspend__`` and ``__resume__``. To make it compatible with async/await, the Asynchronous Context Manager Protocol will also need to be extended with ``__asuspend__`` and ``__aresume__``. This allows to implement context managers like decimal context and ``numpy.errstate`` for generators and coroutines. The following code:: class Context: def __enter__(self): self.old_x = get_execution_context_item('x') set_execution_context_item('x', 'something') def __exit__(self, *err): set_execution_context_item('x', self.old_x) would become this:: class Context: def __enter__(self): self.old_x = get_execution_context_item('x') set_execution_context_item('x', 'something') def __suspend__(self): set_execution_context_item('x', self.old_x) def __resume__(self): set_execution_context_item('x', 'something') def __exit__(self, *err): set_execution_context_item('x', self.old_x) Besides complicating the protocol, the implementation will likely negatively impact performance of coroutines, generators, and any code that uses context managers, and will notably complicate the interpreter implementation. It also does not solve the leaking state problem for greenlet/gevent. :pep:`521` also does not provide any mechanism to propagate state in a local context, like storing a request object in an HTTP request handler to have better logging. Can Execution Context be implemented outside of CPython? -------------------------------------------------------- Because async/await code needs an event loop to run it, an EC-like solution can be implemented in a limited way for coroutines. Generators, on the other hand, do not have an event loop or trampoline, making it impossible to intercept their ``yield`` points outside of the Python interpreter. Reference Implementation ======================== The reference implementation can be found here: [11]_. References ========== .. [1] https://blog.golang.org/context .. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.… .. [3] https://github.com/numpy/numpy/issues/9444 .. [4] http://bugs.python.org/issue31179 .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashma… .. [7] https://github.com/1st1/cpython/tree/hamt .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e .. [11] https://github.com/1st1/cpython/tree/pep550 .. [12] https://www.python.org/dev/peps/pep-0492/#async-await .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.… .. [14] https://github.com/MagicStack/pgbench .. [15] https://github.com/python/performance .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c Copyright ========= This document has been placed in the public domain. _______________________________________________ Python-ideas mailing list Python-ideas(a)python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/ [View Less]

2 1

Towards harmony with JavaScript?
by Jason H Aug. 12, 2017

Aug. 12, 2017

Before I done my firesuit, I'd like to say that I much prefer python and I rail on JS whenever I can. However these days it is quite common to be doing work in both Python and Javascript. Harmonizing the two would help JS developers pick up the language as well as people like me that are stuck working in JS as well. TIOBE has Python at 5 and JS at 8 https://www.tiobe.com/tiobe-index/ Redmonk: 1 and 1, respectively http://redmonk.com/sogrady/2017/06/08/language-rankings-6-17/ PYPL: 2 and 5 … [View More]

10 25

Generator syntax hooks?
by Soni L. Aug. 11, 2017

Aug. 11, 2017

The generator syntax, (x for x in i if c), currently always creates a new generator. I find this quite inefficient: {x for x in integers if 1000 <= x < 1000000} # never completes, because it's trying to iterate over all integers What if, somehow, object `integers` could hook the generator and produce the equivalent of {x for x in range(1000, 1000000)}, which does complete? What if, (x for x in integers if 1000 <= x < 1000000), was syntax sugar for (x for x in range(1000, … [View More]

13 38

PyPI JSON Metadata Standardization for Mirrors
by Cooper Ry Lees Aug. 11, 2017

Aug. 11, 2017

Hi all, First time emailer, so please be kind. Also, if this is not the right mailing list for PyPA talk, I apologize. Please point me in the right direction if so. The main reason I have emailed here is I believe it may be PEP time to standardize the JSON metadata that PyPI makes available, like what was done for the `'simple API` described in PEP503. I've been doing a bit of work on `bandersnatch` (I didn't name it), which is a PEP 381 mirroring package and wanted to enhance it to also … [View More]

3 2

Mimetypes Include application/json
by Nate. Aug. 10, 2017

Aug. 10, 2017

Hi, A friend and I have hit a funny situation with the `mimetypes.py` library guessing the type for a '.json' file. Is there a reason why '.json' hasn't been added to the mapping? Without `mailcap` installed: [root@de169da8cc46 /]# python3 -m mimetypes build.json I don't know anything about type build.json With `mailcap` installed: [root@de169da8cc46 /]# python3 -m mimetypes build.json type: application/json encoding: None We experimented with adding a mapping for '.json' to 'application/… [View More]

5 5

Pseudo methods
by Paul Laos Aug. 10, 2017

Aug. 10, 2017

Hi folks I was thinking about how sometimes, a function sometimes acts on classes, and behaves very much like a method. Adding new methods to classes existing classes is currently somewhat difficult, and having pseudo methods would make that easier. Code example: (The syntax can most likely be improved upon) def has_vowels(self: str): for vowel in ["a", "e,", "i", "o", "u"]: if vowel in self: return True This allows one to wring `string.has_vowels()` instead of `… [View More]

8 11