
Hi,

Here's the PEP 550 version 2. Thanks to a very active and insightful discussion here on Python-ideas, we've discovered a number of problems with the first version of the PEP. This version is a complete rewrite (only the Abstract, Rationale, and Goals sections were not updated).

The updated PEP is live on python.org: https://www.python.org/dev/peps/pep-0550/

There is no reference implementation at this point, but I'm confident that this version of the spec will have the same extremely low runtime overhead as the first version. Thanks to the new ContextItem design, accessing values in the context is even faster now.

Thank you!

PEP: 550
Title: Execution Context
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2017
Python-Version: 3.7
Post-History: 11-Aug-2017, 15-Aug-2017

Abstract
========

This PEP proposes a new mechanism to manage execution state -- the logical environment in which a function, a thread, a generator, or a coroutine executes.

A few examples of where reliable state storage is required:

* Context managers like decimal contexts, ``numpy.errstate``, and ``warnings.catch_warnings``;

* Storing request-related data such as security tokens and request data in web applications, implementing i18n;

* Profiling, tracing, and logging in complex and large code bases.

The usual solution for storing state is to use Thread-local Storage (TLS), implemented in the standard library as ``threading.local()``. Unfortunately, TLS does not work for the purpose of state isolation for generators or asynchronous code, because such code executes concurrently in a single thread.

Rationale
=========

Traditionally, Thread-local Storage (TLS) is used for storing state. However, the major flaw of TLS is that it works only for multi-threaded code. It is not possible to reliably contain state within a generator or a coroutine. For example, consider the following generator::

    def calculate(precision, ...):
        with decimal.localcontext() as ctx:
            # Set the precision for decimal calculations
            # inside this block
            ctx.prec = precision

            yield calculate_something()
            yield calculate_something_else()

The decimal context uses TLS to store its state, and because TLS is not aware of generators, the state can leak. If a user iterates over two ``calculate()`` generators with different precisions in lockstep using the ``zip()`` built-in, the above code will not work correctly. For example::

    g1 = calculate(precision=100)
    g2 = calculate(precision=50)

    items = list(zip(g1, g2))

    # items[0] will be a tuple of:
    #   first value from g1 calculated with 100 precision,
    #   first value from g2 calculated with 50 precision.
    #
    # items[1] will be a tuple of:
    #   second value from g1 calculated with 50 precision (!!!),
    #   second value from g2 calculated with 50 precision.

An even scarier example would be using decimals to represent money in an async/await application: decimal calculations can suddenly lose precision in the middle of processing a request. Currently, bugs like this are extremely hard to find and fix.

Another common need for web applications is to have access to the current request object, or security context, or, simply, the request URL for logging or submitting performance tracing data::

    async def handle_http_request(request):
        context.current_http_request = request

        await ...
        # Invoke your framework code, render templates,
        # make DB queries, etc, and use the global
        # 'current_http_request' in that code.
        # This isn't currently possible to do reliably
        # in asyncio out of the box.

These examples are just a few out of many where a reliable way to store context data is absolutely needed. The inability to use TLS for asynchronous code has led to a proliferation of ad-hoc solutions, which are limited in scope and do not support all required use cases.

The current status quo is that any library, including the standard library, that uses TLS will likely not work as expected in asynchronous code or with generators (see [3]_ for an example issue.)

Some languages that have coroutines or generators recommend passing a ``context`` object to every function explicitly; see [1]_ describing the pattern for Go. This approach, however, has limited use for Python, where a huge ecosystem was built to work with a TLS-like context. Moreover, passing the context explicitly does not work at all for libraries like ``decimal`` or ``numpy``, which use operator overloading.

The .NET runtime, which has support for async/await, has a generic solution to this problem, called ``ExecutionContext`` (see [2]_). On the surface, working with it is very similar to working with TLS, but the former explicitly supports asynchronous code.

Goals
=====

The goal of this PEP is to provide a more reliable alternative to ``threading.local()``. It should be explicitly designed to work with the Python execution model, equally supporting threads, generators, and coroutines.

An acceptable solution for Python should meet the following requirements:

* Transparent support for code executing in threads, coroutines, and generators, with an easy to use API.

* Negligible impact on the performance of existing code or of code that will be using the new mechanism.

* Fast C API for packages like ``decimal`` and ``numpy``.

Explicit is still better than implicit, hence the new APIs should only be used when there is no acceptable way of passing the state explicitly.

Specification
=============

Execution Context is a mechanism for storing and accessing data specific to a logical thread of execution. We consider OS threads, generators, and chains of coroutines (such as ``asyncio.Task``) to be variants of a logical thread.

In this specification, we will use the following terminology:

* **Local Context**, or LC, is a key/value mapping that stores the context of a logical thread.

* **Execution Context**, or EC, is an OS-thread-specific dynamic stack of Local Contexts.

* **Context Item**, or CI, is an object used to set and get values from the Execution Context.

Please note that throughout the specification we use simple pseudo-code to illustrate how the EC machinery works. The actual algorithms and data structures that we will use to implement the PEP are discussed in the `Implementation Strategy`_ section.

Context Item Object
-------------------

The ``sys.new_context_item(description)`` function creates a new ``ContextItem`` object. The ``description`` parameter is a ``str``, explaining the nature of the context key for introspection and debugging purposes.

``ContextItem`` objects have the following methods and attributes:

* ``.description``: read-only description;

* ``.set(o)`` method: set the value to ``o`` for the context item in the execution context;

* ``.get()`` method: return the current EC value for the context item. Context items are initialized with ``None`` when created, so this method call never fails.
Below is an example of how context items can be used::

    my_context = sys.new_context_item(description='mylib.context')
    my_context.set('spam')

    # Later, to access the value of my_context:
    print(my_context.get())

Thread State and Multi-threaded code
------------------------------------

Execution Context is implemented on top of Thread-local Storage. For every thread there is a separate stack of Local Contexts -- mappings of ``ContextItem`` objects to their values in the LC. New threads always start with an empty EC.

For CPython::

    PyThreadState:
        execution_context: ExecutionContext([
            LocalContext({ci1: val1, ci2: val2, ...}),
            ...
        ])

The ``ContextItem.get()`` and ``.set()`` methods are defined as follows (in pseudo-code)::

    class ContextItem:

        def get(self):
            tstate = PyThreadState_Get()

            for local_context in reversed(tstate.execution_context):
                if self in local_context:
                    return local_context[self]

        def set(self, value):
            tstate = PyThreadState_Get()

            if not tstate.execution_context:
                tstate.execution_context = [LocalContext()]

            tstate.execution_context[-1][self] = value

With the semantics defined so far, the Execution Context can already be used as an alternative to ``threading.local()``::

    def print_foo():
        print(ci.get() or 'nothing')

    ci = sys.new_context_item(description='test')
    ci.set('foo')

    # Will print "foo":
    print_foo()

    # Will print "nothing":
    threading.Thread(target=print_foo).start()

Manual Context Management
-------------------------

Execution Context is generally managed by the Python interpreter, but sometimes it is desirable for the user to take control of it. A few examples of when this is needed:

* running a computation in ``concurrent.futures.ThreadPoolExecutor`` with the current EC;

* reimplementing generators with iterators (more on that later);

* managing contexts in asynchronous frameworks (implementing proper EC support in ``asyncio.Task`` and ``asyncio.loop.call_soon``.)

For these purposes we add a set of new APIs (they will be used in later sections of this specification):

* ``sys.new_local_context()``: create an empty ``LocalContext`` object.

* ``sys.new_execution_context()``: create an empty ``ExecutionContext`` object.

* Both ``LocalContext`` and ``ExecutionContext`` objects are opaque to Python code, and there are no APIs to modify them.

* ``sys.get_execution_context()`` function. The function returns a copy of the current EC: an ``ExecutionContext`` instance. The runtime complexity of the actual implementation of this function can be O(1), but for the purposes of this section it is equivalent to::

      def get_execution_context():
          tstate = PyThreadState_Get()
          return copy(tstate.execution_context)

* ``sys.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs)`` runs ``func(*args, **kwargs)`` in the provided execution context::

      def run_with_execution_context(ec, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_ec = tstate.execution_context

          tstate.execution_context = ExecutionContext(
              ec.local_contexts + [LocalContext()]
          )

          try:
              return func(*args, **kwargs)
          finally:
              tstate.execution_context = old_ec

  Any changes ``func`` makes to the Local Context will be ignored.
  This makes it possible to reuse one ``ExecutionContext`` object for multiple invocations of different functions, without them being able to affect each other's environment::

      ci = sys.new_context_item('example')
      ci.set('spam')

      def func():
          print(ci.get())
          ci.set('ham')

      ec = sys.get_execution_context()

      sys.run_with_execution_context(ec, func)
      sys.run_with_execution_context(ec, func)

      # Will print:
      #   spam
      #   spam

* ``sys.run_with_local_context(lc: LocalContext, func, *args, **kwargs)`` runs ``func(*args, **kwargs)`` in the current execution context using the specified local context.

  Any changes that ``func`` makes to the local context will be persisted in ``lc``. This behaviour is different from the ``run_with_execution_context()`` function, which always creates a new throw-away local context.

  In pseudo-code::

      def run_with_local_context(lc, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_ec = tstate.execution_context

          tstate.execution_context = ExecutionContext(
              old_ec.local_contexts + [lc]
          )

          try:
              return func(*args, **kwargs)
          finally:
              tstate.execution_context = old_ec

  Using the previous example::

      ci = sys.new_context_item('example')
      ci.set('spam')

      def func():
          print(ci.get())
          ci.set('ham')

      ec = sys.get_execution_context()
      lc = sys.new_local_context()

      sys.run_with_local_context(lc, func)
      sys.run_with_local_context(lc, func)

      # Will print:
      #   spam
      #   ham

As an example, let's make a subclass of ``concurrent.futures.ThreadPoolExecutor`` that preserves the execution context for scheduled functions::

    class Executor(concurrent.futures.ThreadPoolExecutor):

        def submit(self, fn, *args, **kwargs):
            context = sys.get_execution_context()

            fn = functools.partial(
                sys.run_with_execution_context, context,
                fn, *args, **kwargs)

            return super().submit(fn)

EC Semantics for Coroutines
---------------------------

Python :pep:`492` coroutines are used to implement cooperative multitasking. For a Python end-user they are similar to threads, especially when it comes to sharing resources or modifying the global state.

An event loop is needed to schedule coroutines. Coroutines that are explicitly scheduled by the user are usually called Tasks. When a coroutine is scheduled, it can schedule other coroutines using an ``await`` expression. In the async/await world, awaiting a coroutine is equivalent to a regular function call in synchronous code. Thus, Tasks are similar to threads.

By drawing a parallel between regular multithreaded code and async/await, it becomes apparent that any modification of the execution context within one Task should be visible to all coroutines scheduled within it. Any execution context modifications, however, must not be visible to other Tasks executing within the same OS thread.

Coroutine Object Modifications
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To achieve this, a small set of modifications to the coroutine object is needed:

* A new ``cr_local_context`` attribute. This attribute is readable and writable for Python code.

* When a coroutine object is instantiated, its ``cr_local_context`` is initialized with an empty Local Context.

* The coroutine's ``.send()`` and ``.throw()`` methods are modified as follows (in pseudo-C)::

      if coro.cr_local_context is not None:
          tstate = PyThreadState_Get()

          tstate.execution_context.push(coro.cr_local_context)

          try:
              # Perform the actual `Coroutine.send()` or
              # `Coroutine.throw()` call.
              return coro.send(...)
          finally:
              coro.cr_local_context = tstate.execution_context.pop()
      else:
          # Perform the actual `Coroutine.send()` or
          # `Coroutine.throw()` call.
          return coro.send(...)
* When the Python interpreter sees an ``await`` instruction, it inspects the ``cr_local_context`` attribute of the coroutine that is about to be awaited. For ``await coro``:

  * If ``coro.cr_local_context`` is the empty ``LocalContext`` object that ``coro`` was created with, the interpreter will set ``coro.cr_local_context`` to ``None``.

  * If ``coro.cr_local_context`` was modified by Python code, the interpreter will leave it as is.

This makes any changes to the execution context made by nested coroutine calls within a Task visible throughout the Task::

    ci = sys.new_context_item('example')

    async def nested():
        ci.set('nested')

    async def main():
        ci.set('main')
        print('before:', ci.get())
        await nested()
        print('after:', ci.get())

    # Will print:
    #   before: main
    #   after: nested

Essentially, coroutines work with Execution Context items similarly to threads, and the ``await`` expression acts like a function call.

This mechanism also works for ``yield from`` in generators decorated with ``@types.coroutine`` or ``@asyncio.coroutine``, which are called "generator-based coroutines" according to :pep:`492`, and should be fully compatible with native async/await coroutines.

Tasks
^^^^^

In asynchronous frameworks like asyncio, coroutines are run by an event loop, and need to be explicitly scheduled (in asyncio coroutines are run by ``asyncio.Task``.)

With the currently defined semantics, the interpreter makes coroutines linked by an ``await`` expression share the same Local Context.

The interpreter, however, is not aware of the Task concept, and cannot help with ensuring that new Tasks started in coroutines use the correct EC::

    current_request = sys.new_context_item(description='request')

    async def child():
        print('current request:', repr(current_request.get()))

    async def handle_request(request):
        current_request.set(request)
        event_loop.create_task(child())

    run(top_coro())

    # Will print:
    #   current request: None

To enable correct Execution Context propagation into Tasks, the asynchronous framework needs to assist the interpreter:

* When ``create_task`` is called, it should capture the current execution context with ``sys.get_execution_context()`` and save it on the Task object.

* When the Task object runs its coroutine object, it should execute the ``.send()`` and ``.throw()`` methods within the captured execution context, using the ``sys.run_with_execution_context()`` function.

With help from the asynchronous framework, the above snippet will run correctly, and the ``child()`` coroutine will be able to access the current request object through the ``current_request`` Context Item.

Event Loop Callbacks
^^^^^^^^^^^^^^^^^^^^

Similarly to Tasks, functions like asyncio's ``loop.call_soon()`` should capture the current execution context with ``sys.get_execution_context()`` and execute callbacks within it with ``sys.run_with_execution_context()``.

This way the following code will work::

    current_request = sys.new_context_item(description='request')

    def log():
        request = current_request.get()
        print(request)

    async def request_handler(request):
        current_request.set(request)
        get_event_loop().call_soon(log)

Generators
----------

Generators in Python, while similar to coroutines, are used in a fundamentally different way. They are producers of data, and they use the ``yield`` expression to suspend/resume their execution.
A crucial difference between ``await coro`` and ``yield value`` is that the former expression guarantees that ``coro`` will be executed to completion, while the latter produces ``value`` and suspends the generator until it gets iterated again.

Generators, similarly to coroutines, have a ``gi_local_context`` attribute, which is set to an empty Local Context when created.

Contrary to coroutines though, the ``yield from o`` expression in generators (that are not generator-based coroutines) is semantically equivalent to ``for v in o: yield v``, therefore the interpreter does not attempt to control their ``gi_local_context``.

EC Semantics for Generators
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Every generator object has its own Local Context that stores only its own local modifications of the context. When a generator is being iterated, its local context will be put on the EC stack of the current thread. This means that the generator will be able to access items from the surrounding context::

    local = sys.new_context_item("local")
    glob = sys.new_context_item("global")

    def generator():
        local.set('inside gen:')
        while True:
            print(local.get(), glob.get())
            yield

    g = generator()

    local.set('hello')
    glob.set('spam')
    next(g)

    local.set('world')
    glob.set('ham')
    next(g)

    # Will print:
    #   inside gen: spam
    #   inside gen: ham

Any changes to the EC in nested generators are invisible to the outer generator::

    local = sys.new_context_item("local")

    def inner_gen():
        local.set('spam')
        yield

    def outer_gen():
        local.set('ham')
        yield from inner_gen()
        print(local.get())

    list(outer_gen())

    # Will print:
    #   ham

Running generators without LC
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similarly to coroutines, generators with ``gi_local_context`` set to ``None`` simply use the outer Local Context.

The ``@contextlib.contextmanager`` decorator uses this mechanism to allow its generator to affect the EC::

    item = sys.new_context_item('test')

    @contextmanager
    def context(x):
        old = item.get()
        item.set(x)
        try:
            yield
        finally:
            item.set(old)

    with context('spam'):
        with context('ham'):
            print(1, item.get())
        print(2, item.get())

    # Will print:
    #   1 ham
    #   2 spam

Implementing Generators with Iterators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Execution Context API makes it possible to fully replicate the EC behaviour imposed on generators with a regular Python iterator class::

    class Gen:

        def __init__(self):
            self.local_context = sys.new_local_context()

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_local_context(
                self.local_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...

Asynchronous Generators
-----------------------

Asynchronous Generators (AG) interact with the Execution Context similarly to regular generators.

They have an ``ag_local_context`` attribute, which, similarly to regular generators, can be set to ``None`` to make them use the outer Local Context. This is used by the new ``contextlib.asynccontextmanager`` decorator.

The EC support of the ``await`` expression is implemented using the same approach as in coroutines, see the `Coroutine Object Modifications`_ section.

Greenlets
---------

Greenlet is an alternative implementation of cooperative scheduling for Python. Although the greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts.

In a nutshell, greenlet design is very similar to the design of generators. The main difference is that for generators, the stack is managed by the Python interpreter.
Greenlet works outside of the Python interpreter, and manually saves some ``PyThreadState`` fields and pushes/pops the C-stack. Thus the ``greenlet`` package can be easily updated to use the new low-level `C API`_ to enable full support of EC.

New APIs
========

Python
------

The Python APIs were designed to completely hide the internal implementation details, but at the same time provide enough control over EC and LC to re-implement all of Python's built-in objects in pure Python.

1. ``sys.new_context_item(description='...')``: create a ``ContextItem`` object used to access/set values in EC.

2. ``ContextItem``:

   * ``.description``: read-only attribute.
   * ``.get()``: return the current value for the item.
   * ``.set(o)``: set the current value in the EC for the item.

3. ``sys.get_execution_context()``: return the current ``ExecutionContext``.

4. ``sys.new_execution_context()``: create a new empty ``ExecutionContext``.

5. ``sys.new_local_context()``: create a new empty ``LocalContext``.

6. ``sys.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs)``.

7. ``sys.run_with_local_context(lc: LocalContext, func, *args, **kwargs)``.

C API
-----

1. ``PyContextItem * PyContext_NewItem(char *desc)``: create a ``PyContextItem`` object.

2. ``PyObject * PyContext_GetItem(PyContextItem *)``: get the current value for the context item.

3. ``int PyContext_SetItem(PyContextItem *, PyObject *)``: set the current value for the context item.

4. ``PyLocalContext * PyLocalContext_New()``: create a new empty ``PyLocalContext``.

5. ``PyExecutionContext * PyExecutionContext_New()``: create a new empty ``PyExecutionContext``.

6. ``PyExecutionContext * PyExecutionContext_Get()``: get the EC for the active thread state.

7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the passed EC object as the current one for the active thread state.

8. ``int PyExecutionContext_SetWithLocalContext(PyExecutionContext *, PyLocalContext *)``: makes it possible to implement the ``sys.run_with_local_context`` Python API.

Implementation Strategy
=======================

LocalContext is a Weak Key Mapping
----------------------------------

Using a weak key mapping for the ``LocalContext`` implementation enables the following properties with regard to garbage collection:

* ``ContextItem`` objects are strongly-referenced only from the application code, not from any of the Execution Context machinery or the values they point to. This means that there are no reference cycles that could extend their lifespan longer than necessary, or prevent their garbage collection.

* Values put in the Execution Context are guaranteed to be kept alive while there is a ``ContextItem`` key referencing them in the thread.

* If a ``ContextItem`` is garbage collected, all of its values will be removed from all contexts, allowing them to be GCed if needed.

* If a thread has ended its execution, its thread state will be cleaned up along with its ``ExecutionContext``, cleaning up all values bound to all Context Items in the thread.

ContextItem.get() Cache
-----------------------

We can add three new fields to the ``PyThreadState`` and ``PyInterpreterState`` structs:

* ``uint64_t PyThreadState->unique_id``: a globally unique thread state identifier (we can add a counter to ``PyInterpreterState`` and increment it when a new thread state is created.)

* ``uint64_t PyInterpreterState->context_item_deallocs``: every time a ``ContextItem`` is GCed, all Execution Contexts in all threads will lose track of it. ``context_item_deallocs`` will simply count all ``ContextItem`` deallocations.
* ``uint64_t PyThreadState->execution_context_ver``: every time a new item is set, or an existing item is updated, or the stack of execution contexts is changed in the thread, we increment this counter.

The above three fields allow implementing a fast cache path in ``ContextItem.get()``, in pseudo-code::

    class ContextItem:

        def get(self):
            tstate = PyThreadState_Get()

            if (self.last_tstate_id == tstate.unique_id and
                    self.last_ver == tstate.execution_context_ver and
                    self.last_deallocs ==
                        tstate.interp.context_item_deallocs):
                return self.last_value

            value = None
            for mapping in reversed(tstate.execution_context):
                if self in mapping:
                    value = mapping[self]
                    break

            self.last_value = value
            self.last_tstate_id = tstate.unique_id
            self.last_ver = tstate.execution_context_ver
            self.last_deallocs = tstate.interp.context_item_deallocs

            return value

This is similar to the trick that the decimal C implementation uses for caching the current decimal context, and will have the same performance characteristics, but available to all Execution Context users.

Approach #1: Use a dict for LocalContext
----------------------------------------

The straightforward way of implementing the proposed EC mechanisms is to create a ``WeakKeyDict`` on top of the Python ``dict`` type.

To implement the ``ExecutionContext`` type we can use a Python ``list`` (or a custom stack implementation with some pre-allocation optimizations).

This approach will have the following runtime complexity:

* O(M) for ``ContextItem.get()``, where ``M`` is the number of Local Contexts in the stack. It is important to note that ``ContextItem.get()`` will implement a cache making the operation O(1) for packages like ``decimal`` and ``numpy``.

* O(1) for ``ContextItem.set()``.

* O(N) for ``sys.get_execution_context()``, where ``N`` is the total number of items in the current **execution** context.

Approach #2: Use HAMT for LocalContext
--------------------------------------

Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT) to implement high performance immutable collections [5]_, [6]_.

Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N) performance for ``set()``, ``get()``, and ``merge()`` operations, which is essentially O(1) for relatively small mappings (read about HAMT performance in CPython in the `Appendix: HAMT Performance`_ section.)

In this approach we use the same design of the ``ExecutionContext`` as in Approach #1, but we use a HAMT-based weak key ``LocalContext`` implementation. With that we will have the following runtime complexity:

* O(M * log\ :sub:`32`\ N) for ``ContextItem.get()``, where ``M`` is the number of Local Contexts in the stack, and ``N`` is the number of items in the EC. The operation will essentially be O(M), because execution contexts are normally not expected to have more than a few dozen items. (``ContextItem.get()`` will have the same caching mechanism as in Approach #1.)

* O(log\ :sub:`32`\ N) for ``ContextItem.set()``, where ``N`` is the number of items in the current **local** context. This will essentially be an O(1) operation most of the time.

* O(log\ :sub:`32`\ N) for ``sys.get_execution_context()``, where ``N`` is the total number of items in the current **execution** context.

Essentially, using HAMT for Local Contexts instead of Python dicts brings down the complexity of ``sys.get_execution_context()`` from O(N) to O(log\ :sub:`32`\ N) because of the more efficient merge algorithm.
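To illustrate why immutability helps here, below is a toy immutable mapping in pure Python (a sketch only; the actual implementation would be a HAMT with O(log\ :sub:`32`\ N) updates, not an O(N) dict copy)::

    class ImmutableMap:

        def __init__(self, data=None):
            self._data = dict(data or {})

        def set(self, key, value):
            # Return a *new* mapping; the original is never mutated.
            # With a HAMT this step is O(log32 N) instead of O(N).
            data = dict(self._data)
            data[key] = value
            return ImmutableMap(data)

        def get(self, key, default=None):
            return self._data.get(key, default)

    lc1 = ImmutableMap()
    lc2 = lc1.set('ci', 'spam')

    # lc1 is unchanged, so anyone holding a reference to it can treat
    # it as a snapshot -- no defensive copying is required.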
Approach #3: Use HAMT and Immutable Linked List
-----------------------------------------------

We can make an alternative ``ExecutionContext`` design by using a linked list. Each ``LocalContext`` in the ``ExecutionContext`` object will be wrapped in a linked-list node.

``LocalContext`` objects will use the HAMT-based weak key implementation described in Approach #2.

Every modification to the current ``LocalContext`` will produce a new version of it, which will be wrapped in a **new linked list node**. Essentially this means that ``ExecutionContext`` is an immutable forest of ``LocalContext`` objects, and can be safely copied by reference in ``sys.get_execution_context()`` (eliminating the expensive "merge" operation.)

With this approach, ``sys.get_execution_context()`` will be an **O(1) operation**.

Summary
-------

We believe that Approach #3 enables an efficient and complete Execution Context implementation, with excellent runtime performance.

`ContextItem.get() Cache`_ enables fast retrieval of context items for performance critical libraries like decimal and numpy.

Fast ``sys.get_execution_context()`` enables efficient management of execution contexts in asynchronous libraries like asyncio.

Design Considerations
=====================

Can we fix ``PyThreadState_GetDict()``?
---------------------------------------

``PyThreadState_GetDict`` is a TLS, and some of its existing users might depend on it being just a TLS. Changing its behaviour to follow the Execution Context semantics would break backwards compatibility.

PEP 521
-------

:pep:`521` proposes an alternative solution to the problem: enhance the Context Manager Protocol with two new methods: ``__suspend__`` and ``__resume__``. To make it compatible with async/await, the Asynchronous Context Manager Protocol will also need to be extended with ``__asuspend__`` and ``__aresume__``.

This would make it possible to implement context managers like decimal context and ``numpy.errstate`` for generators and coroutines.

The following code::

    class Context:

        def __enter__(self):
            self.old_x = get_execution_context_item('x')
            set_execution_context_item('x', 'something')

        def __exit__(self, *err):
            set_execution_context_item('x', self.old_x)

would become this::

    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x

Besides complicating the protocol, the implementation will likely negatively impact the performance of coroutines, generators, and any code that uses context managers, and will notably complicate the interpreter implementation.

:pep:`521` also does not provide any mechanism to propagate state in a local context, like storing a request object in an HTTP request handler to have better logging. Nor does it solve the leaking state problem for greenlet/gevent.

Can Execution Context be implemented outside of CPython?
---------------------------------------------------------

Because async/await code needs an event loop to run it, an EC-like solution can be implemented in a limited way for coroutines.

Generators, on the other hand, do not have an event loop or trampoline, making it impossible to intercept their ``yield`` points outside of the Python interpreter.

Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.
Appendix: HAMT Performance
==========================

To assess whether HAMT can be used for Execution Context, we implemented it in CPython [7]_.

.. figure:: pep-0550-hamt_vs_dict.png
   :align: center
   :width: 100%

   Figure 1. Benchmark code can be found here: [9]_.

Figure 1 shows that HAMT indeed displays O(1) performance for all benchmarked dictionary sizes. For dictionaries with less than 100 items, HAMT is a bit slower than Python dict/shallow copy.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2. Benchmark code can be found here: [10]_.

Figure 2 shows a comparison of lookup costs between Python dict and a HAMT immutable mapping. HAMT lookup time is 30-40% worse than Python dict lookups on average, which is a very good result, considering how well Python dicts are optimized.

Note that, according to [8]_, the HAMT design can be further improved.

Acknowledgments
===============

I thank Elvis Pranskevichus and Victor Petrovykh for countless discussions around the topic and PEP proofreading and edits.

Thanks to Nathaniel Smith for proposing the ``ContextItem`` design [17]_ [18]_, for pushing the PEP towards a more complete design, and for coming up with the idea of having a stack of contexts in the thread state.

Thanks to Nick Coghlan for numerous suggestions and ideas on the mailing list, and for coming up with a case that caused the complete rewrite of the initial PEP version [19]_.

References
==========

.. [1] https://blog.golang.org/context
.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.a...
.. [3] https://github.com/numpy/numpy/issues/9444
.. [4] http://bugs.python.org/issue31179
.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap...
.. [7] https://github.com/1st1/cpython/tree/hamt
.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf
.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd
.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e
.. [11] https://github.com/1st1/cpython/tree/pep550
.. [12] https://www.python.org/dev/peps/pep-0492/#async-await
.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.p...
.. [14] https://github.com/MagicStack/pgbench
.. [15] https://github.com/python/performance
.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c
.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html
.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html
.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046780.html

Copyright
=========

This document has been placed in the public domain.

2017-08-16 1:55 GMT+02:00 Yury Selivanov <yselivanov.ml@gmail.com>:
Minor suggestion: Could we allow something like `sys.new_context_item(description='mylib.context', initial_value='spam')`? That would make it easier for type checkers to infer the type of a ContextItem, and it would save a line of code in the common case.

With this modification, the type of new_context_item would be:

    @overload
    def new_context_item(*, description: str, initial_value: T) -> ContextItem[T]: ...
    @overload
    def new_context_item(*, description: str) -> ContextItem[Any]: ...

If we only allow the second variant, type checkers would need some sort of special casing to figure out that after .set(), .get() will return the same type.
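To sketch what a checker would see under this proposal (a hypothetical stub; ContextItem is not generic in the current draft):

    from typing import Any, Generic, TypeVar, overload

    T = TypeVar('T')

    class ContextItem(Generic[T]):
        def get(self) -> T: ...
        def set(self, value: T) -> None: ...

    @overload
    def new_context_item(*, description: str, initial_value: T) -> ContextItem[T]: ...
    @overload
    def new_context_item(*, description: str) -> ContextItem[Any]: ...
    def new_context_item(*, description, initial_value=None):
        ...

    ci = new_context_item(description='mylib.context', initial_value='spam')
    # With the first overload, a checker infers ci: ContextItem[str],
    # so ci.get() is known to return str.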

On Tue, Aug 15, 2017 at 11:53 PM, Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
This is a really handy feature in general, actually! In fact all of asyncio's thread-locals define initial values (using a trick involving subclassing threading.local), and I recently added this feature to trio.TaskLocal as well just because it's so convenient. However, something that you realize almost immediately when trying to use this is that in many cases, what you actually want is an initial value *factory*. Like, if you write new_context_item(initial_value=[]) then you're going to have a bad time. So, should we support something like new_context_item(initializer=lambda: [])? The semantics are a little bit subtle. I guess it would be something like: if ci.get() goes to find the value and fails at all levels, then we call the factory function and assign its return value to the *deepest* LC, EC[0]. The idea being that we're pretending that the value was there all along in the outermost scope, you just didn't notice before now.
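To make the `initial_value=[]` footgun mentioned above concrete (assuming that hypothetical parameter existed):

    # Every reader that hasn't set the item would share *one* list:
    request_tags = sys.new_context_item(
        description='mylib.tags', initial_value=[])

    def handler_a():
        request_tags.get().append('a')   # mutates the shared default

    def handler_b():
        print(request_tags.get())        # sees ['a'], not []

An initializer=lambda: [] factory, under the semantics sketched above, would instead materialize a fresh list per execution context rather than one shared module-level list.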
I'm not super familiar with PEP 484. Would using a factory function instead of an initial value break this type inference? If you want to automatically infer that whatever type I use to initialize the value is the only type it can ever have, is there a way for users to easily override that? Like could I write something like my_ci: ContextItem[int, str] = new_context_item(initial_value=0) ? -n -- Nathaniel J. Smith -- https://vorpus.org

On 16 August 2017 at 18:37, Nathaniel Smith <njs@pobox.com> wrote:
I actually wondered about this in the context of the PEP saying that "context items are set to None by default", as it isn't clear what that means for the behaviour of sys.new_execution_context(). The PEP states that the latter API creates an "empty" execution context, but the notion of a fresh EC being truly empty conflicts with the notion of all defined context items having a default value of None.

I think your idea resolves that nicely: if context_item.get() failed to find a suitable context entry, it would do:

    base_context = ec.local_contexts[0]
    default_value = sys.run_with_local_context(
        base_context, self.default_factory)
    sys.run_with_local_context(base_context, self.set, default_value)

The default setting for default_factory could then be to raise RuntimeError complaining that the context item isn't set in the current context.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
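Pulling that together, ContextItem.get() might read like this in the PEP's pseudo-code style (default_factory being the hypothetical addition under discussion):

    def get(self):
        tstate = PyThreadState_Get()
        ec = tstate.execution_context

        for local_context in reversed(ec.local_contexts):
            if self in local_context:
                return local_context[self]

        # Fall back to the factory, materializing the value in the
        # outermost ("base") local context so later lookups find it.
        base_context = ec.local_contexts[0]
        default_value = sys.run_with_local_context(
            base_context, self.default_factory)
        sys.run_with_local_context(base_context, self.set, default_value)
        return default_value

with default_factory defaulting to a function that raises RuntimeError.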

On Wed, Aug 16, 2017 at 2:53 AM, Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote: [..]
I think that trying to infer the type of CI values from the default value is not the way to go:

    ci = sys.ContextItem(default=1)

Is CI an int? Likely. Can it be set to None? Maybe, for some use-cases it might be what you want.

The correct way IMO is to extend the typing module:

    ci1: typing.ContextItem[int] = sys.ContextItem(default=1)
    # ci1 is an int, and can't be anything else.

    ci2: typing.ContextItem[typing.Optional[int]] = sys.ContextItem(default=42)
    # ci2 is 42 by default, but can be reset to None.

    ci3: typing.ContextItem[typing.Union[int, str]] = sys.ContextItem(default='spam')
    # ci3 can be an int or str, can't be None.

This is also forward compatible with proposals to add a `default_factory` or `initializer` parameter to ContextItems.

Yury
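For illustration, the typing side could be a plain generic stub (typing.ContextItem does not exist today; this is only a sketch):

    from typing import Generic, Optional, TypeVar, Union

    T = TypeVar('T')

    class ContextItem(Generic[T]):
        # Stub of the hypothetical typing counterpart of sys.ContextItem.
        def __init__(self, default: T) -> None: ...
        def get(self) -> T: ...
        def set(self, value: T) -> None: ...

    ci1: ContextItem[int] = ContextItem(default=1)
    ci2: ContextItem[Optional[int]] = ContextItem(default=42)
    ci3: ContextItem[Union[int, str]] = ContextItem(default='spam')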

On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi,
Here's the PEP 550 version 2.
Awesome! Some of the changes from v1 to v2 might be a bit confusing -- in particular the thing where ExecutionContext is now a stack of LocalContext objects instead of just being a mapping. So here's the big picture as I understand it: In discussions on the mailing list and off-line, we realized that the main reason people use "thread locals" is to implement fake dynamic scoping. Of course, generators/async/await mean that currently it's impossible to *really* fake dynamic scoping in Python -- that's what PEP 550 is trying to fix. So PEP 550 v1 essentially added "generator locals" as a refinement of "thread locals". But... it turns out that "generator locals" aren't enough to properly implement dynamic scoping either! So the goal in PEP 550 v2 is to provide semantics strong enough to *really* get this right. I wrote up some notes on what I mean by dynamic scoping, and why neither thread-locals nor generator-locals can fake it: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb
If you're more familiar with dynamic scoping, then you can think of an LC as a single dynamic scope...
* **Execution Context**, or EC, is an OS-thread-specific dynamic stack of Local Contexts.
...and an EC as a stack of scopes. Looking up a ContextItem in an EC proceeds by checking the first LC (innermost scope), then if it doesn't find what it's looking for it checks the second LC (the next-innermost scope), etc.
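As a model, the lookup behaves like collections.ChainMap over the reversed stack (my analogy, not the PEP's actual implementation):

    from collections import ChainMap

    lc_outer = {'key': 'outer'}
    lc_inner = {}
    ec = [lc_outer, lc_inner]          # innermost LC last

    lookup = ChainMap(*reversed(ec))   # search innermost scope first
    print(lookup['key'])               # 'outer'

    lc_inner['key'] = 'inner'          # innermost LC now shadows
    print(lookup['key'])               # 'inner'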
Two issues here, that both require some expansion of this API to reveal a *bit* more information about the EC structure.

1) For trio's cancel scope use case I described in my last email, I actually need some way to read out all the values on the LocalContext stack. (It would also be helpful if there were some fast way to check the depth of the ExecutionContext stack -- or at least tell whether it's 1 deep or more-than-1 deep. I know that any cancel scopes that are in the bottommost LC will always be attached to the given Task, so I can set up the scope->task mapping once and re-use it indefinitely. OTOH for scopes that are stored in higher LCs, I have to check at every yield whether they're currently in effect. And I want to minimize the per-yield workload as much as possible.)

2) For classic decimal.localcontext context managers, the idea is still that you save/restore the value, so that you can nest multiple context managers without having to push/pop LCs all the time. But the above API is not actually sufficient to implement a proper save/restore, for a subtle reason: if you do

    ci.set(ci.get())

then you just (potentially) moved the value from a lower LC up to the top LC. Here's an example of a case where this can produce user-visible effects: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of...

There are probably a bunch of options for fixing this. But basically we need some API that makes it possible to temporarily set a value in the top LC, and then restore that value to what it was before (either the previous value, or 'unset' to unshadow a value in a lower LC). One simple option would be to make the idiom be something like:

    @contextmanager
    def local_value(new_value):
        state = ci.get_local_state()
        ci.set(new_value)
        try:
            yield
        finally:
            ci.set_local_state(state)

where 'state' is something like a tuple (ci in EC[-1], EC[-1].get(ci)). A downside with this is that it's a bit error-prone (very easy for an unwary user to accidentally use get/set instead of get_local_state/set_local_state). But I'm sure we can come up with something.
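For concreteness, get_local_state/set_local_state (hypothetical names) could be modeled in the PEP's pseudo-code style as:

    _MISSING = object()

    def get_local_state(ci):
        # Capture whether ci is set in the *top* LC, and its value there.
        top_lc = tstate.execution_context[-1]
        return top_lc.get(ci, _MISSING)

    def set_local_state(ci, state):
        top_lc = tstate.execution_context[-1]
        if state is _MISSING:
            top_lc.pop(ci, None)   # unshadow any value in a lower LC
        else:
            top_lc[ci] = state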
If there are enough of these functions then it might make sense to stick them in their own module instead of adding more stuff to sys. I guess worrying about that can wait until the API details are more firm though.
I like all the ideas in this section, but this specific point feels a bit weird. Coroutine objects need a second hidden field somewhere to keep track of whether the object they end up with is the same one they were created with? If I set cr_local_context to something else, and then set it back to the original value, does that trigger the magic await behavior or not? What if I take the initial LocalContext off of one coroutine and attach it to another, does that trigger the magic await behavior? Maybe it would make more sense to have two sentinel values: UNINITIALIZED and INHERIT?
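A sketch of the two-sentinel variant (UNINITIALIZED/INHERIT are not in the PEP; this is pseudo-code in the PEP's style):

    UNINITIALIZED = object()   # fresh coroutine, no decision made yet
    INHERIT = object()         # run in the caller's local context

    # In Coroutine.send()/.throw():
    if coro.cr_local_context is UNINITIALIZED:
        coro.cr_local_context = LocalContext()
    if coro.cr_local_context is INHERIT:
        return coro.send(...)
    else:
        tstate.execution_context.push(coro.cr_local_context)
        try:
            return coro.send(...)
        finally:
            coro.cr_local_context = tstate.execution_context.pop()

    # And 'await coro' would only need:
    #     if coro.cr_local_context is UNINITIALIZED:
    #         coro.cr_local_context = INHERIT
    # -- no hidden "is this still the original LC?" bookkeeping.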
I wonder if it would be useful to have an option to squash this execution context down into a single LocalContext, since we know we'll be using it for a while and once we've copied an ExecutionContext it becomes impossible to tell the difference between one that has lots of internal LocalContexts and one that doesn't. This could also be handy for trio/curio's semantics where they initialize a new task's context to be a shallow copy of the parent task: you could do new_task_coro.cr_local_context = get_current_context().squash() and then skip having to wrap every send() call in a run_in_context.
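A squash() along those lines could be modeled as a merge into a single LC (a sketch, treating LCs as dict-like):

    def squash(ec):
        # Merge outermost-to-innermost so inner values win, producing
        # an equivalent EC that is exactly one LocalContext deep.
        merged = LocalContext()
        for local_context in ec.local_contexts:
            merged.update(local_context)
        return ExecutionContext([merged])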
Hmm. I assume you're simplifying for expository purposes, but 'yield from' isn't the same as 'for v in o: yield v'. In fact PEP 380 says: "Motivation: [...] a piece of code containing a yield cannot be factored out and put into a separate function in the same way as other code. [...] If yielding of values is the only concern, this can be performed without much difficulty using a loop such as 'for v in g: yield v'. However, if the subgenerator is to interact properly with the caller in the case of calls to send(), throw() and close(), things become considerably more difficult. As will be seen later, the necessary code is very complicated, and it is tricky to handle all the corner cases correctly." So it seems to me that the whole idea of 'yield from' is that it's supposed to handle all the tricky bits needed to guarantee that if you take some code out of a generator and refactor it into a subgenerator, then everything works the same as before. This suggests that 'yield from' should do the same magic as 'await', where by default the subgenerator shares the same LocalContext as the parent generator. (And as a bonus it makes things simpler if 'yield from' and 'await' work the same.)
You showed how to make an iterator that acts like a generator. Is it also possible to make an async iterator that acts like an async generator? It's not immediately obvious, because you need to make sure that the local context gets restored each time you re-enter the __anext__ generator. I think it's something like:

    class AIter:
        def __init__(self):
            self._local_context = ...

        # Note: intentionally not async
        def __anext__(self):
            coro = self._real_anext()
            coro.cr_local_context = self._local_context
            return coro

        async def _real_anext(self):
            ...

Does that look right?
I think this can be refined further (and I don't understand context_item_deallocs -- maybe it's a mistake?). AFAICT the things that invalidate a ContextItem's cache are: 1) switching threadstates 2) popping or pushing a non-empty LocalContext off the current threadstate's ExecutionContext 3) calling ContextItem.set() on *that* context item So I'd suggest tracking the thread state id, a counter of how many non-empty LocalContexts have been pushed/popped on this thread state, and a *per ContextItem* counter of how many times set() has been called.
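In pseudo-code, that refinement might look like the following (the field names are made up):

    class ContextItem:

        def __init__(self):
            self.set_count = 0   # bumped on every .set() of *this* item

        def set(self, value):
            tstate = PyThreadState_Get()
            tstate.execution_context[-1][self] = value
            self.set_count += 1

        def get(self):
            tstate = PyThreadState_Get()
            if (self.last_tstate_id == tstate.unique_id and
                    self.last_stack_ver == tstate.lc_stack_counter and
                    self.last_set_count == self.set_count):
                return self.last_value
            # slow path: walk the LC stack, then refresh last_value
            # and the three version stamps
            ...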
While this is mostly true in the strict sense, in practice this PEP is useless if existing thread-local users like decimal and numpy can't migrate to it without breaking backcompat. So maybe this section should discuss that? (For example, one constraint on the design is that we can't provide only a pure push/pop API, even though that's what would be most convenient for context managers like decimal.localcontext or numpy.errstate, because we also need to provide some backcompat story for legacy functions like decimal.setcontext and numpy.seterr.)

-n

-- Nathaniel J. Smith -- https://vorpus.org

On 16 August 2017 at 17:18, Nathaniel Smith <njs@pobox.com> wrote: [Yury wrote]
I'm actually wondering if it may be worth defining a _contextlib module (to export the interpreter level APIs to Python code), and making contextlib the official home of the user facing API. That way we can use contextlib2 to at least attempt to polyfill the coroutine parts of the proposal for 3.5+, even if the implicit generator changes are restricted to 3.7+.
It feels odd to me as well, and I'm wondering if we can actually simplify this by saying: 1. Generator contexts (both sync and async) are isolated by default (__local_context__ = LocalContext()) 2. Coroutine contexts are *not* isolated by default (__local_context__ = None) Running top level task coroutines in separate execution contexts then becomes the responsibility of the event loop, which the PEP already lists as a required change in 3rd party libraries to get this all to work properly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 5:36 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is an interesting twist, and I like it. This will change asyncio.Task from:

    class Task:
        def __init__(self, coro):
            ...
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_execution_context(
                self.exec_context, self.coro.send)

to:

    class Task:
        def __init__(self, coro):
            ...
            self.local_context = sys.new_local_context()

        def step(self):
            sys.run_with_local_context(
                self.local_context, self.coro.send)

And we don't need ceval to do anything for "await", which means that with this approach we won't touch ceval.c at all.

Yury

On Wed, Aug 16, 2017 at 12:51 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
And immediately after I hit "send" I realized that this is a bit more complicated. In order for Tasks to remember the full execution context of where they were created, we need a new method that would allow running with *both* exec and local contexts:

    class Task:
        def __init__(self, coro):
            ...
            self.local_context = sys.new_local_context()
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_contexts(
                self.exec_context, self.local_context,
                self.coro.send)

This is needed for the following PEP example to work properly:

    current_request = sys.new_context_item(description='request')

    async def child():
        print('current request:', repr(current_request.get()))

    async def handle_request(request):
        current_request.set(request)
        event_loop.create_task(child())

    run(top_coro())

See https://www.python.org/dev/peps/pep-0550/#tasks

Yury

On Wed, Aug 16, 2017 at 12:55 PM, Yury Selivanov [..]
Never mind, the actual implementation would be as simple as:

    class Task:
        def __init__(self, coro):
            ...
            coro.cr_local_context = sys.new_local_context()
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_execution_context(
                self.exec_context, self.coro.send)

No need for another "run_with_context" function.

Yury

On 17 August 2017 at 02:55, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I don't think that's entirely true, since you can nest the calls even without a combined API:

    sys.run_with_execution_context(
        self.exec_context,
        sys.run_with_local_context, self.local_context,
        self.coro.send)

Offering a combined API may still make sense for usability and efficiency reasons, but it isn't strictly necessary.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 12:18:23AM -0700, Nathaniel Smith wrote:
I'm still trying to digest this with very little time for it. It *is* slightly confusing. Perhaps it would be possible to name the data structures by their functionality. E.g. if ExecutionContext is a stack, use ExecutionStack? Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope or even DynamicScope. This sounds like bikeshedding, but I find it difficult to have ExecutionContext, ContextItem, LocalContext in addition to the actual decimal.localcontext() and PyDecContext. For example, should PyDecContext inherit from ContextItem? I don't fully understand. :-/ Stefan Krah

On Wed, Aug 16, 2017 at 10:25 AM, Stefan Krah <stefan@bytereef.org> wrote:
I'm -1 on calling this thing a "scope" or "dynamic scope", as I think it will be even more confusing to Python users. When I think of "scoping" I usually think about Python name scopes -- locals, globals, nonlocals, etc. I'm afraid that adding another dimension to this vocabulary won't help anyone. "Context" is an established term for what PEP 550 tries to accomplish. It's used in multiple languages and runtimes, and while researching this topic I didn't see anybody confused with the concept on StackOverflow/etc.
No, you wouldn't be able to extend the ContextItem type. The way for decimal is to simply do the following.

In Python:

    _current_ctx = sys.ContextItem('decimal context')

    # later, when you set the decimal context
    _current_ctx.set(DecimalContext)

    # whenever you need to get the current context
    dc = _current_ctx.get()

In C:

    PyContextItem * _current_ctx = PyContext_NewItem("decimal context");
    if (_current_ctx == NULL) { /* error */ }

    /* later, when you set the decimal context */
    PyDecContextObject *ctx;
    ...
    if (PyContext_SetItem(_current_ctx, (PyObject*)ctx)) { /* error */ }

    /* whenever you need to get the current context */
    PyDecContextObject *ctx = PyContext_GetItem(_current_ctx);
    if (ctx == NULL) { /* error */ }
    if (ctx == Py_None) { /* not initialized, nothing is there */ }

We didn't really discuss C APIs at this point, and it's very likely that they will be adjusted, but the general idea should stay the same. All in all, the complexity of _decimal.c will only decrease with PEP 550, while getting better support for generators/async.

Yury

On Wed, Aug 16, 2017 at 11:00:43AM -0400, Yury Selivanov wrote:
For me a context is a "single thing" that is usually used to thread state through functions. I guess I'd call "environment" what you call "context".
Thanks! This makes it a lot clearer. I'd probably use (stealing Nick's key suggestion):

    PyEnvKey *_current_context_key = PyEnv_NewKey("___DECIMAL_CONTEXT__");

    ...

    PyDecContextObject *ctx = PyEnv_GetItem(_current_context_key);

Stefan Krah

On Wed, Aug 16, 2017 at 12:40:26PM -0400, Yury Selivanov wrote:
Yeah, I usually think about symbol tables. FWIW, I find this terminology quite reasonable: https://hackernoon.com/execution-context-in-javascript-319dd72e8e2c The main points are ExecutionContextStack/FunctionalExecutionContext vs. ExecutionContext/LocalContext. Stefan Krah

On Wed, Aug 16, 2017 at 1:13 PM, Stefan Krah <stefan@bytereef.org> wrote:
Thanks for the link! I think it actually explains the JS language spec wrt how scoping of regular variables is implemented.
The main points are ExecutionContextStack/FunctionalExecutionContext
vs. ExecutionContext/LocalContext.
While I'm trying to avoid using scoping terminology for PEP 550, there's one parallel -- as with regular Python scoping you have global variables and you have local variables. You can use locals() to access your local scope, and globals() to access your global scope. Similarly in PEP 550, you have your LocalContext and ExecutionContext.

We don't want to call ExecutionContext a "Global Context" because it is fundamentally OS-thread-specific (contrary to Python globals). LocalContexts are created for threads, generators, and coroutines, and are really similar to local scoping. Adding more names for local contexts like CoroutineLocalContext and GeneratorLocalContext won't solve anything either. All in all, Local Context is what its name stands for -- it's a local context for your current logical scope, be it a coroutine or a generator.

At this point PEP 550 is very different from ExecutionContext in .NET, but there are still many similarities. That's a +1 to keep its current name.

ExecutionContextStack and ExecutionContextChain reflect the implementation of PEP 550 on some level, but for most Python users they won't mean anything. If they want to learn how the EC works, they just need to read the PEP (or documentation). Otherwise they will just use the ContextKey API and it should just work for them.

So IMO, ExecutionContext and LocalContext are really the best names of all that were proposed so far.

Yury

On 17 August 2017 at 04:38, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
To be honest, the difference between LocalContext and ExecutionContext feels more like the difference between locals() and lexical closure variables than the difference between locals() and globals(). It's just that where the scoping rules are a compile time thing related to lexical closures, PEP 550 is about defining a dynamic context.

On Wed, Aug 16, 2017 at 3:18 AM, Nathaniel Smith <njs@pobox.com> wrote:
Thanks! [..]
Yes. We touched upon this topic in parallel threads, so I'll just briefly mention this here: I deliberately avoided using "scope" in PEP 550 naming, as "scoping" in Python is usually associated with names/globals/locals/nonlocals etc. Adding another "level" of scoping will be very confusing for users (IMO).
We can add an API for returning the full stack of values for a CI:

    ContextItem.iter_stack() -> Iterator
    # or
    ContextItem.get_stack() -> List

Because some of the LCs will be empty, what you'll get is a list with some None values in it, like:

    [None, val1, None, None, val2]

The length of the list will tell you how deep the stack is.
Yeah, this is tricky. The main issue is indeed the confusion of what methods you need to call -- "get/set" or "get_local_state/set_local_state". On some level the problem is very similar to regular Python scoping rules:

1. we have local names
2. we have global names
3. we have the 'nonlocal' modifier

IOW scoping isn't easy, and you need to be conscious of what you do. It's just that we are so used to these scoping rules that they carry a low cognitive effort for us. One of the ideas that I have in mind is to add another level of indirection to separate "global get" from "local set/get":

1. Rename ContextItem to ContextKey (reasoning for that in a parallel thread)
2. Remove the ContextKey.set() method
3. Add a new ContextKey.value() -> ContextValue

    ck = ContextKey()

    with ck.value() as val:
        val.set(spam)
        yield

or:

    val = ck.value()
    val.set(spam)
    try:
        yield
    finally:
        val.clear()

Essentially ContextValue will be the only API to set values in the execution context. ContextKey.get() will be used to get them. Nathaniel, Nick, what do you guys think? [..]
I'm OK with this idea -- pystate.c becomes way too crowded. Maybe we should just put this stuff in _contextlib.c and expose it in the contextlib module.
Yes, I planned to have a second hidden field, as coroutines will have their cr_local_context set to NULL, and that will be their empty LC. So a second internal field is needed to disambiguate NULL meaning an "empty context" from NULL meaning "use the outer local context". I omitted this from the PEP to make it a bit easier to digest, as this seemed to be a low-level implementation detail.
All good questions. I don't like sentinels in general, I'd be more OK with a "gi_isolated_local_context" flag (we're back to square one here). But I don't think we should add it. My thinking is that once you start writing to "gi_local_context" -- all bets are off, and you manage this from now on (meaning that some internal coroutine flag will be set to 1, and the interpreter will never touch the local_context of this coroutine):

1. If you write None -- it means that the generator/coroutine will not have its own LC.
2. If you write your own LC object -- the generator/coroutine will use it.
I think this would be a bit too low-level. I'd prefer to defer solving the "squashing" problem until I have a reference implementation and we can test this. Essentially, this is an optimization problem--the EC implementation can just squash the chain itself, when the chain is longer than 5 LCs. Or something like this. But exposing this at the Python level would be like letting a program tinker with GCC -O flags after it's compiled, IMO. [..]
I see what you are saying here, but 'yield from' for generators is still different from awaits, as you can partially iterate the generator and *then* "yield from" from it:

    def foo():
        g = gen()
        val1 = next(g)
        val2 = next(g)
        # do some computation?
        yield from g
        ...

    def gen():
        # messing with EC between yields
        ...

In general, I still think that 'yield from g' is semantically equivalent to 'for i in g: yield i' for most users.
Yes, seems to be correct.
Now that you highlighted the deallocs counter and I thought about it a bit more I don't think it's needed :) I'll remove it.
Excellent idea, will be in the next version of the PEP.
The main purpose of this section is to tell if some parts of the PEP are breaking some existing code/patterns or if it imposes a significant performance penalty. PEP 550 does neither of these things. If decimal/numpy simply switch to using the new APIs, everything should work as expected for them, with the exception that assigning a new decimal context (without a context manager) will be isolated in generators, which I'd consider a bug fix. We can add a new section to discuss the specifics. Yury
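
To make "simply switch to using the new APIs" concrete, here is a rough sketch of how a pure-Python decimal could migrate, using the draft sys.new_context_item() API from this version of the PEP (the module-level details are illustrative, not an actual patch):

    import sys
    import decimal

    _current_context = sys.new_context_item(description='decimal context')

    def setcontext(ctx):
        # Under PEP 550 this write is isolated in generators -- the "bug fix".
        _current_context.set(ctx)

    def getcontext():
        ctx = _current_context.get()
        if ctx is None:
            # Nothing set in this execution context yet: fall back to a
            # fresh default, mirroring what the TLS version does today.
            ctx = decimal.Context()
            _current_context.set(ctx)
        return ctx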

On 17 August 2017 at 02:36, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I think I don't want to have to try to explain to anyone what happens if I get a context value in my current execution environment and then send that value reference into a different execution context :) So I'd prefer my earlier proposal of:

    # Resolve key in current execution environment
    ck.get_value()

    # Assign to key in current execution context
    ck.set_value(value)

    # Assign to key in specific execution context
    sys.run_with_active_context(ec, ck.set_value, value)

One suggestion I do like is Stefan's one of using "ExecutionContext" to refer to the namespace that ck.set_value() writes to, and then "ExecutionEnvironment" for the whole chain that ck.get_value() reads. Similar to "generator" and "package", we'd still end up with "context" being inherently ambiguous when used without qualification:

- PEP 550 execution context
- exception handling context (for chained exceptions)
- with statement context
- various context objects, like the decimal context

But we wouldn't have two different kinds of context within PEP 550 itself. Instead, we'd have to start disambiguating the word environment:

- PEP 550 execution environment
- process environment (i.e. os.environ)

The analogy between process environments and execution environments wouldn't be exact (since the key-value pairs in process environments are copied eagerly rather than via lazily chained lookups), but once you account for that, the parallels between an operating system level process environment tree and a Python level execution environment tree as proposed in PEP 550 seem like they would be helpful rather than confusing.
Yeah, I'd be OK with that - if we're going to reuse the word, it makes sense to reuse the module to expose the related machinery. That said, if we do go that way *and* we decide to offer a coroutine-only backport, I see an offer of contextlib2 co-maintainership in your future ;)
Given that the field is writable, I think it makes more sense to just choose a suitable default, and then rely on other code changing that default when it's not right.

For generators: set it to an empty context by default; have contextlib.contextmanager (and similar wrappers) clear it, as sketched below.

For coroutines: set it to None by default; have async task managers give top level coroutines their own private context.

No hidden flags, no magic value adjustments, just different defaults for coroutines and generators (including async generators). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
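
A sketch of that @contextmanager tweak, assuming generators grow the writable __local_context__ attribute discussed earlier in the thread (the _GeneratorContextManager wrapper here is a simplified stand-in, not the real contextlib class):

    import functools

    def contextmanager(func):
        @functools.wraps(func)
        def helper(*args, **kwds):
            gen = func(*args, **kwds)
            # A context manager generator exists precisely to mutate its
            # caller's context, so opt it out of the default isolation:
            gen.__local_context__ = None
            return _GeneratorContextManager(gen)  # simplified stand-in
        return helper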

Nathaniel Smith schrieb am 16.08.2017 um 09:18:
+1
I agree with Nathaniel that many projects that can benefit from this feature will need to keep supporting older Python versions as well. In the case of Cython, that's Py2.6+. We already have the problem that the asynchronous finalisation of async generators cannot be supported in older Python versions ("old" as in Py3.5 and before), so we end up with a language feature that people can use in Py2.6, but not completely/safely. I can't say yet how difficult it will be to integrate the new infrastructure that this PEP proposes into a backwards compatible code base, but if there's something we can think of now in order to help projects keep supporting older Python versions in the same code base, given the constraints of their existing APIs and semantics - that would be great. Stefan

On 18 August 2017 at 16:12, Stefan Behnel <stefan_ml@behnel.de> wrote:
One aspect of this that we're considering is to put the Python level API in contextlib rather than in sys. That has the pragmatic benefit that contextlib2 then becomes the natural home for an API backport, and we should be able to get the full *explicit* API working on older versions (even if it means introducing an optional C extension module as a dependency to get that part of the API working fully). To backport the isolation of generators, we'd likely be able to provide a decorator that explicitly isolated generators, but it wouldn't be feasible to backport implicit isolation. The same would go for the various other proposals for implicit isolation - when running on older versions, the general principle would be "if you (or a library/framework you're using) didn't explicitly isolate the execution context, assume it's not isolated". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
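
Such an explicit-isolation decorator might look roughly like this, assuming the backport exposes the PEP's sys.new_local_context() and sys.run_with_local_context() APIs (throw()/close() forwarding is omitted to keep the sketch short):

    import functools
    import sys

    def isolated(genfunc):
        """Run every step of the wrapped generator in its own LC."""
        @functools.wraps(genfunc)
        def wrapper(*args, **kwargs):
            gen = genfunc(*args, **kwargs)
            lc = sys.new_local_context()
            value = None
            while True:
                try:
                    # Each resumption of the generator runs inside 'lc',
                    # so its context writes never leak to the caller.
                    item = sys.run_with_local_context(lc, gen.send, value)
                except StopIteration:
                    return
                value = yield item
        return wrapper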

On Fri, Aug 18, 2017 at 2:12 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
I think it's Cython's quest to try to backport support of all new Python 3.x language features to be 2.6-compatible, which sometimes can be questionable. You can add support of PEP 550 semantics to code that was compiled with Cython, but pure Python code won't be able to support it. This, in my opinion, could cause more confusion than benefit, so for Cython I think the solution is to do nothing in this case. We'll (maybe) backport some functionality to contextlib2. In my opinion, any code that uses contextlib2 in Python should work exactly the same when it's compiled with Cython. Yury

Cool to see this on python-ideas. I'm really looking forward to this (PEP 550 or 521). On Wednesday, August 16, 2017 at 3:19:29 AM UTC-4, Nathaniel Smith wrote:
I agree with Nathaniel that this is an issue with the current API. I don't think it's a good idea to have set and get methods. It would be much better to reflect the underlying ExecutionContext *stack* in the API by exposing a mutating *context manager* on the Context Key object instead of set. For example:

    my_context = sys.new_context_key('my_context')

    options = my_context.get()
    options.some_mutating_method()
    with my_context.mutate(options):
        # Do whatever you want with the mutated context
    # Now, the context is reverted.

Similarly, instead of my_context.set('spam') you would do:

    with my_context.mutate('spam'):
        # Do whatever you want with the mutated context
    # Now, the context is reverted.
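
For illustration, mutate() could be layered on the PEP's get/set primitives along these lines (a naive sketch -- it ignores the save/restore subtlety that comes up later in the thread):

    from contextlib import contextmanager

    @contextmanager
    def mutate(ck, new_value):
        saved = ck.get()   # naive: may hoist a value from an outer LC
        ck.set(new_value)
        try:
            yield new_value
        finally:
            ck.set(saved)  # naive restore; see the save/restore caveat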

On Sat, Aug 19, 2017 at 12:09 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Unfortunately, I don't think we can eliminate the set() operation entirely, because the libraries we want to migrate to using this -- like decimal and numpy -- generally provide set() operations in their public API. (See: decimal.setcontext, numpy.seterr, ...) They're generally not recommended for use in new code, but they do exist and are covered by compatibility guarantees, so we need some way to implement them using the PEP 550 API. OTOH we can certainly provide a context manager like this and make it the obvious convenient thing to use (and which also happens to do the right thing). We could potentially also give the 'set' primitive an ugly name to remind people that it has this pitfall, like make it 'set_in_top_context' or something. -n -- Nathaniel J. Smith -- https://vorpus.org

TLDR: I really like this version, and the tweaks I suggest below are just cosmetic. I figure if there are any major technical traps lurking, you'll find them as you work through updating the reference implementation. On 16 August 2017 at 09:55, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
It may be worth having separate "name" and "description" attributes, similar to __name__ and __doc__ being separate on things like functions. That way, error messages can just show "name", while debuggers and other introspection tools can include a more detailed description.
For ease of introspection, it's probably worth using a common `__local_context__` attribute name across all the different types that support one, and encouraging other object implementations to do the same. This isn't like cr_await and gi_yieldfrom, where we wanted to use different names because they refer to different kinds of objects.
.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046780.html
The threading in pipermail makes it difficult to get from your reply back to my original comment, so it may be better to link directly to the latter: https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

And to be completely explicit about it: I like your proposed approach of leaving it up to iterator developers to decide whether or not to run with a local context. If they don't manipulate any context items, it won't matter, and if they do, it's straightforward to add a suitable call to sys.run_in_local_context(). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
TLDR: I really like this version, and the tweaks I suggest below are just cosmetic.
Thanks, Nick!
FWIW I've implemented 3-5 different variations of PEP 550 (along with HAMT) and I'm fairly confident that the data structures and optimizations will work, so no major traps are really expected there. The risk that we need to manage now is getting the API design "right".
Initially I wanted to have a "sys.new_context_item(name)" signature, but then I thought that some users might be confused about what "name" actually means. In some contexts you might say that the "name" of the CI is the name of the variable it is bound to: IOW, for `foo = CI(name="bar")`, the name is "foo". But some users might think that it's "bar".

OTOH, PEP 550 doesn't have any introspection APIs at this point, and the final version of it will have to have them. If we add something like "sys.get_execution_context_as_dict()", then it would be preferable for CIs to have short name-like descriptions, as opposed to multiline docstrings. So in the end, I think that we should adopt a namedtuple solution, and just make the first "ContextItem" parameter a positional-only "name":

    ContextItem(name: str, /)
We also have cr_code and gi_code, which are used for introspection purposes but refer to CodeObject. I myself don't like the mess the C-style convention created for our Python code (think of what the "dis" and "inspect" modules have to go through), so I'm +0 for having "__local_context__".
Fixed the link, and will update the Acknowledgments section with your paragraph (thanks!) Yury

On 17 August 2017 at 01:22, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Right, hence https://bugs.python.org/issue31230 :) (That suggestion is prompted by the fact that if we'd migrated gi_code to __code__ in 3.0, the same way we migrated func_code, then cr_code and ag_code would almost certainly have followed the same dunder-naming convention, and https://github.com/python/cpython/pull/3077 would never have been necessary)
I'm starting to think this should be __private_context__ (to convey the *intent* of the attribute), rather than naming it after the type that it's expected to store. Thinking about this particular attribute name did prompt the question of how we want PEP 550 to interact with the exec builtin, though, as well as raising some questions around a number of other code execution cases:

1. What is the execution context for top level code in a module?
2. What is the execution context for the import machinery in an import statement?
3. What is the execution context for the import machinery when invoked via importlib?
4. What is the execution context for the import machinery when invoked via the C API?
5. What is the execution context for the import machinery when invoked via the runpy module?
6. What is the execution context for things like the timeit module, templating engines, etc?
7. What is the execution context for codecs and codec error handlers?
8. What is the execution context for __del__ methods and weakref callbacks?
9. What is the execution context for trace hooks and other really low level machinery?
10. What is the execution context for displayhook and excepthook?

I think a number of those (top level module code executed via the import system, the timeit module, templating engines) can be addressed by saying that the exec builtin always creates a completely fresh execution context by default (with no access to the parent's execution context), and will gain a new keyword-only parameter that allows you to specify an execution context to use. That way, exec'ed code will be independent by default, but users of exec() will be able to opt in to handling it like a normal function call by passing in the current context.

The default REPL, the code module and the IDLE shell window would need to be updated so that they use a shared context for evaluating the user supplied code snippets, while keeping their own context separate.

While top-level code would always run in a completely fresh context for imports, the runpy module would expose the same setting as the exec builtin, so the executed code would be isolated by default, but you could opt in to using a particular execution context if you wanted to.

Codecs and codec error handlers I think will be best handled in a way similar to generators, where they have their own private context (so they can't alter the caller's context), but can *read* the caller's context (so the context can be used as a way of providing context-dependent codec settings).

That "read-only" access model also feels like the right option for the import machinery - regardless of whether it's accessed via the import statement, importlib, the C API, or the runpy module, the import machinery should be able to *read* the dynamic context, but not make persistent changes to it.

Since they can be executed at arbitrary points in the code, it feels to me that __del__ methods and weakref callbacks should *always* be executed in a completely pristine execution context, with no access whatsoever to any thread's dynamic context.

I think we should leave the execution context alone for the really low level hooks, and simply point out that yes, these have the ability to do weird things to the execution context, just as they have the power to do weird things to local variables, so they need to be handled with care.
For displayhook and excepthook, I don't have a particularly strong intuition, so my default recommendation would be the read-only access proposed for generators, codecs, and the import machinery. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Aug 18, 2017 at 1:09 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I've been thinking a lot about the terminology, and I have another variant to consider: ExecutionContext is a stack of LogicalContexts. Coroutines/generators will thus have a __logical_context__ attribute. I think that the "logical" term better conveys the meaning than "private" or "dynamic".
Whatever the execution context of the current thread that is importing the code is. Which would usually be the main thread.
Whatever the execution context that invoked the import machinery, be it "__import__()" or "import" statement or "importlib.load_module"
In general, EC behaves just like TLS for all these cases, there's literally no difference.
Speaking of sys.displayhook and sys.stdout -- these APIs are fundamentally incompatible with PEP 550 or any possible context isolation. These things are essentially *global* variables in the sys module, and there's tons of code out there that *expects* them to behave like globals. If a user changes displayhook they expect it to work across all threads. If we want displayhook/sys.stdout to become context-aware, we will need new APIs for them with new properties/expectations. Simply forcing them to use the execution context would be backwards incompatible. PEP 550 won't try to change how displayhooks, excepthooks, trace functions, sys.stdout etc work -- this is out of its scope. We can't refactor half of the sys module as part of one PEP.
"exec" uses outer globals/locals if you don't pass them explicitly -- the code isn't isolated by default. Isolation for "exec" is opt-in: ]]] a = 1 ]]] exec('print(a); b = 2') 1 ]]] b 2 Therefore, with regards to PEP 550, it should execute the code with the current EC/LC. We should also add a new keyword arguments to provide custom LC and EC (same as we do for locals/globals).
I really think that in 3.7 we should just implement PEP 550 with its current scope, and defer system refactorings to 3.8. Many of such refactorings will probably deserve their own PEP, as, for example, changing sys.stdout semantics is a really complex topic. At this point we try to solve a problem of making a replacement for TLS that supports generators and async. Yury

Hi,
* ``sys.get_execution_context()`` function. The function returns a copy of the current EC: an ``ExecutionContext`` instance.
Can you explain the requirement for it being a copy? What do you call a copy exactly? Does it shallow-copy the stack or does it deep copy the context items?
How does this interact with sub-interpreters? (same question for rest of the PEP :-))
* O(N) for ``sys.get_execution_context()``, where ``N`` is the total number of items in the current **execution** context.
Right... but if this is a simple list copy, we are talking about an extremely fast O(N).
(what is "number of items"? number of local contexts? number of individual context items?)
We believe that approach #3 enables an efficient and complete Execution Context implementation, with excellent runtime performance.
What about the maintenance and debugging cost, though?
But, for relatively small mappings, regular dicts would also be fast enough, right? It would be helpful for the PEP to estimate reasonable parameter sizes:

- reasonable number of context items in a local context
- reasonable number of local contexts in an execution stack

Regards Antoine.

On Wed, Aug 16, 2017 at 4:12 PM, Antoine Pitrou <antoine@python.org> wrote:
When the execution context is used to schedule a function call in a thread, or an asyncio callback in the future, we want to take a snapshot of all items in the EC. In general the recommendation will be to store immutable data in the context (same as in the .NET EC implementation, or whenever you have some potentially shared state).
What do you call a copy exactly? Does it shallow-copy the stack or does it deep copy the context items?
Execution Context is conceptually a stack of Local Contexts. Each local context is a weak key mapping. We need a shallow copy of the EC, which is semantically equivalent to the below snippet:

    new_lc = {}
    for lc in execution_context:
        new_lc.update(lc)
    return ExecutionContext(new_lc)
As long as PyThreadState_Get() works with sub-interpreters, all of the PEP machinery will work too.
"Number of items in the current **execution** context" = sum(len(local_context) for local_context in current_execution_context) Yes, even though making a new list + merging all LCs is a relatively fast operation, it will need to be performed on *every* asyncio.call_soon and create_task. The immutable stack/mappings solution simply elminates the problem because you can just copy by reference which is fast. The #3 approach is implementable with regular dicts + copy() too, it will be just slower in some cases (explained below).
Contrary to Python dicts, the implementation scope for the HAMT mapping is much smaller -- we only need get, set, and merge operations. No split dicts, no ordering, etc. With the help of fuzz-testing and our ref-counting test mode I hope that we'll be able to catch most of the bugs. Any solution adds to the total debugging and maintenance cost, but I believe that in this specific case, the benefits outweigh that cost:

1. Sometimes we'll need to merge many dicts in places like asyncio.call_soon or async Task objects.
2. The "set" operation might resize the dict, making it slower.
3. The "dict.copy()" optimization that the PEP mentions won't be able to always help us, as we will likely need to often resize the dict.
If all mappings are relatively small then the answer is close to "yes". We might want to periodically "squash" (or merge, or compact) the chain of Local Contexts, in which case merging dicts will be more expensive than merging HAMTs.
It would be helpful for the PEP to estimate reasonable parameter sizes: - reasonable number of context items in a local context
I assume that the number of context items will be relatively low. It's hard for me to imagine having more than a thousand of them.
- reasonable number of local contexts in an execution stack
In simple multi-threaded code we will only have one local context per execution context. Every time you run a generator or an asynchronous task you push a local context to the stack. Generators will have an optimization -- they will push NULL to the stack, and it will stay NULL until a generator writes to its local context. It's possible to imagine a degenerate case when a generator recurses in, say, a 'decimal context' with-block, which can potentially create a long chain of LCs. Long chains of LCs are not a problem in general -- once the generator is done, it pops its LCs, thus decreasing the stack size. Long chains of LCs might become a problem if, deep into recursion, a generator needs to capture the execution context (say it makes an asyncio.call_soon() call). In which case the solution is simple -- we squash chains that are longer than 5-10-some-predefined-number.

In general, though, the EC is something that is there and you can't really control it. If you have a thousand decimal libraries in your next YouTube-killer website, you will have large numbers of items in your Execution Context. You will inevitably start experiencing slowdowns of your code that you can't even fix (or maybe even explain). In this case, HAMT is a safer bet -- it's a guarantee that you will always have O(log32 N) performance for LC-stack-squashing or set operations. This is the strongest argument in favour of the HAMT mapping -- we implement it and it should work for all use-cases, even the unlikely ones. Yury
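
The squashing described above amounts to merging a long chain of LCs into a single mapping. A dict-based sketch of the idea (the real implementation would merge HAMTs, and the threshold here is arbitrary):

    SQUASH_THRESHOLD = 5   # "5-10-some-predefined-number"

    def maybe_squash(lc_stack):
        # lc_stack is assumed ordered outermost -> innermost, so later
        # updates (inner LCs) correctly shadow earlier ones (outer LCs).
        if len(lc_stack) <= SQUASH_THRESHOLD:
            return lc_stack
        merged = {}
        for lc in lc_stack:
            merged.update(lc)
        return [merged]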

On 21 August 2017 at 07:01, Barry <barry@barrys-emacs.org> wrote:
It's basically borrowed from procedural thread local APIs, which tend to use APIs like "tss_set(key, value)". That said, in a separate discussion, Caleb Hattingh mentioned C#'s AsyncLocal API, and it occurred to me that "context local" might work well as the name of the context access API:

    my_implicit_state = sys.new_context_local('my_state')
    my_implicit_state.set('spam')

    # Later, to access the value of my_implicit_state:
    print(my_implicit_state.get())

That way, we'd have 3 clearly defined kinds of local variables:

* frame locals (the regular kind)
* thread locals (threading.local() et al)
* context locals (PEP 550)

The fact contexts can be nested, and a failed lookup in the active implicit context may then query outer namespaces in the current execution context, would then be directly analogous to the way name lookups are resolved for frame locals. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 23, 2017 at 2:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If we're extending the analogy with thread-locals we should at least consider making each instantiation return a namespace rather than something holding a single value. We have:

    log_state = threading.local()
    log_state.verbose = False

    def action(x):
        if log_state.verbose:
            print(x)

    def make_verbose():
        log_state.verbose = True

It would be nice if we could upgrade this to make it PEP 550-aware so that only the first line needs to change:

    log_state = sys.AsyncLocal("log state")
    # The rest is the same

We might even support the alternative notation where you can provide default values and suggest a schema, similar to threading.local:

    class LogState(threading.local):
        verbose = False

    log_state = LogState(<description>)

(I think that for calls that construct empty instances of various types we should just use the class name rather than some factory function. I also think none of this should live in sys but that's separate.) -- --Guido van Rossum (python.org/~guido)

On Wed, Aug 23, 2017 at 8:41 AM, Guido van Rossum <guido@python.org> wrote:
You can mostly implement this on top of the current PEP 550. Something like:

    _tombstone = object()

    class AsyncLocal:
        def __getattribute__(self, name):
            # if this raises AttributeError, we let it propagate
            key = object.__getattribute__(self, name)
            value = key.get()
            if value is _tombstone:
                raise AttributeError(name)
            return value

        def __setattr__(self, name, value):
            try:
                key = object.__getattribute__(self, name)
            except AttributeError:
                with some_lock:
                    # double-checked locking pattern
                    try:
                        key = object.__getattribute__(self, name)
                    except AttributeError:
                        key = new_context_key()
                        object.__setattr__(self, name, key)
            key.set(value)

        def __delattr__(self, name):
            self.__setattr__(name, _tombstone)

        def __dir__(self):
            # filter out tombstoned values
            return [name for name in object.__dir__(self)
                    if hasattr(self, name)]

Issues:

Minor problem: on threading.local you can use .__dict__ to get the dict. That doesn't work here. But this could be done by returning a mapping proxy type, or maybe it's better not to support it at all -- I don't think it's a big issue.

Major problem: an attribute setting/getting API doesn't give any way to solve the save/restore problem [1]. PEP 550 v3 doesn't have a solution to this yet either, but we know we can do it by adding some methods to context-key. Supporting this in AsyncLocal is kinda awkward, since you can't use methods on the object -- I guess you could have some staticmethods, like AsyncLocal.save_state(my_async_local, name) and AsyncLocal.restore_state(my_async_local, name, value)? In any case this kinda spoils the sense of like "oh it's just an object with attributes, I already know how this works".

Major problem: there are two obvious implementations. The above uses a separate ContextKey for each entry in the dict; the other way would be to have a single ContextKey that holds a dict. They have subtly different semantics. Suppose you have a generator and inside it you assign to my_async_local.a but not to my_async_local.b, then yield, and then the caller assigns to my_async_local.b. Is this visible inside the generator? In the ContextKey-holds-an-attribute approach, the answer is "yes": each AsyncLocal is a bag of independent attributes. In the ContextKey-holds-a-dict approach, the answer is "no": each AsyncLocal is a single container holding a single piece of (complex) state. It isn't obvious to me which of these semantics is preferable -- maybe it is if you're Dutch :-). But there's a danger that either option leaves a bunch of people confused.

(Tangent: in the ContextKey-holds-a-dict approach, currently you have to copy the dict before mutating it every time, b/c PEP 550 currently doesn't provide a way to tell whether the value returned by get() came from the top of the stack, and thus is private to you and can be mutated in place, or somewhere deeper, and thus is shared and shouldn't be mutated. But we should fix that anyway, and anyway copy-then-mutate is a viable approach.)

Observation: I don't think there's any simpler way to implement AsyncLocal other than to start with machinery like what PEP 550 already proposes, and then layer something like the above on top of it. We could potentially hide the layers inside the interpreter and only expose AsyncLocal, but I don't think it really simplifies the implementation any.

Observation: I feel like many users of threading.local -- possibly the majority -- only put a single attribute on each object anyway, so for those users a raw ContextKey API is actually more natural and faster.
For example, looking through the core django repo, I see thread locals in:

- django.utils.timezone._active
- django.utils.translation.trans_real._active
- django.urls.base._prefixes
- django.urls.base._urlconfs
- django.core.cache._caches
- django.urls.resolvers.RegexURLResolver._local
- django.contrib.gis.geos.prototypes.threadsafe.thread_context
- django.contrib.gis.geos.prototypes.io.thread_context
- django.db.utils.ConnectionHandler._connections

Of these 9 thread-local objects, 7 of them have only a single attribute; only the last 2 use multiple attributes. For the first 4, that attribute is even called "value", which seems like a pretty clear indication that the authors found the whole local-as-namespace thing a nuisance to work around rather than something helpful. I also looked at asyncio; it has 2 threading.locals, and they each contain 2 attributes. But the two attributes are always read/written together; to me it would feel more natural to model this as a single ContextKey holding a small dict or tuple instead of something like AsyncLocal.

So tl;dr: I think PEP 550 should just focus on a single object per key, and the subgroup of users who want to convert that to a more threading.local-style interface can do that themselves as efficiently as we could, once they've decided how they want to resolve the semantic issues. -n

[1] https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of...

-- Nathaniel J. Smith -- https://vorpus.org

There's another "major" problem with a threading.local()-like API for PEP 550: the C API. threading.local() in C right now is PyThreadState_GetDict(), which returns a dictionary for the current thread, that can be queried/modified with PyDict_* functions. For PEP 550 this would not work. The advantage of the current ContextKey solution is that the Python API and C API are essentially the same [1]. Another advantage is that ContextKey implements better caching, because it can have only one value cached in it; see [2] for details.

[1] https://www.python.org/dev/peps/pep-0550/#new-apis
[2] https://www.python.org/dev/peps/pep-0550/#contextkey-get-cache

Yury

2017-08-16 1:55 GMT+02:00 Yury Selivanov <yselivanov.ml@gmail.com>:
Minor suggestion: Could we allow something like `sys.new_context_item(description='mylib.context', initial_value='spam')`? That would make it easier for type checkers to infer the type of a ContextItem, and it would save a line of code in the common case. With this modification, the type of new_context_item would be:

    @overload
    def new_context_item(*, description: str, initial_value: T) -> ContextItem[T]: ...
    @overload
    def new_context_item(*, description: str) -> ContextItem[Any]: ...

If we only allow the second variant, type checkers would need some sort of special casing to figure out that after .set(), .get() will return the same type.

On Tue, Aug 15, 2017 at 11:53 PM, Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
This is a really handy feature in general, actually! In fact all of asyncio's thread-locals define initial values (using a trick involving subclassing threading.local), and I recently added this feature to trio.TaskLocal as well just because it's so convenient. However, something that you realize almost immediately when trying to use this is that in many cases, what you actually want is an initial value *factory*. Like, if you write new_context_item(initial_value=[]) then you're going to have a bad time. So, should we support something like new_context_item(initializer=lambda: [])? The semantics are a little bit subtle. I guess it would be something like: if ci.get() goes to find the value and fails at all levels, then we call the factory function and assign its return value to the *deepest* LC, EC[0]. The idea being that we're pretending that the value was there all along in the outermost scope, you just didn't notice before now.
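
To spell out the "bad time" (the initializer= spelling is Nathaniel's suggestion here, not settled API):

    # One list object becomes the shared initial value for *every*
    # context that never sets the item explicitly:
    pending = new_context_item(initial_value=[])
    pending.get().append('x')   # mutation visible to unrelated tasks/threads

    # With a factory, a missing value is materialized fresh in the
    # outermost LC on first use, as described above:
    pending = new_context_item(initializer=list)
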
I'm not super familiar with PEP 484. Would using a factory function instead of an initial value break this type inference? If you want to automatically infer that whatever type I use to initialize the value is the only type it can ever have, is there a way for users to easily override that? Like could I write something like my_ci: ContextItem[int, str] = new_context_item(initial_value=0) ? -n -- Nathaniel J. Smith -- https://vorpus.org

On 16 August 2017 at 18:37, Nathaniel Smith <njs@pobox.com> wrote:
I actually wondered about this in the context of the PEP saying that "context items are set to None by default", as it isn't clear what that means for the behaviour of sys.new_execution_context(). The PEP states that the latter API creates an "empty" execution context, but the notion of a fresh EC being truly empty conflicts with the notion of all defined config items having a default value of None. I think your idea resolves that nicely: if context_item.get() failed to find a suitable context entry, it would do:

    base_context = ec.local_contexts[0]
    default_value = sys.run_with_local_context(
        base_context, self.default_factory)
    sys.run_with_local_context(base_context, self.set, default_value)

The default setting for default_factory could then be to raise RuntimeError complaining that the context item isn't set in the current context. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 2:53 AM, Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote: [..]
I think that trying to infer the type of CI values from the default value is not the way to go:

    ci = sys.ContextItem(default=1)

Is CI an int? Likely. Can it be set to None? Maybe, for some use-cases it might be what you want. The correct way IMO is to extend the typing module:

    ci1: typing.ContextItem[int] = sys.ContextItem(default=1)
    # ci1 is an int, and can't be anything else.

    ci2: typing.ContextItem[typing.Optional[int]] = sys.ContextItem(default=42)
    # ci2 is 42 by default, but can be reset to None.

    ci3: typing.ContextItem[typing.Union[int, str]] = sys.ContextItem(default='spam')
    # ci3 can be an int or str, can't be None.

This is also forward compatible with proposals to add a `default_factory` or `initializer` parameter to ContextItems. Yury

On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi,
Here's the PEP 550 version 2.
Awesome! Some of the changes from v1 to v2 might be a bit confusing -- in particular the thing where ExecutionContext is now a stack of LocalContext objects instead of just being a mapping. So here's the big picture as I understand it: In discussions on the mailing list and off-line, we realized that the main reason people use "thread locals" is to implement fake dynamic scoping. Of course, generators/async/await mean that currently it's impossible to *really* fake dynamic scoping in Python -- that's what PEP 550 is trying to fix. So PEP 550 v1 essentially added "generator locals" as a refinement of "thread locals". But... it turns out that "generator locals" aren't enough to properly implement dynamic scoping either! So the goal in PEP 550 v2 is to provide semantics strong enough to *really* get this right. I wrote up some notes on what I mean by dynamic scoping, and why neither thread-locals nor generator-locals can fake it: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb
If you're more familiar with dynamic scoping, then you can think of an LC as a single dynamic scope...
* **Execution Context**, or EC, is an OS-thread-specific dynamic stack of Local Contexts.
...and an EC as a stack of scopes. Looking up a ContextItem in an EC proceeds by checking the first LC (innermost scope), then if it doesn't find what it's looking for it checks the second LC (the next-innermost scope), etc.
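
In pseudo-code, that lookup is just the following (assuming the EC iterates its LCs innermost-first):

    def lookup(ec, ci, default=None):
        for lc in ec:          # innermost scope first, then outward
            if ci in lc:
                return lc[ci]
        return default         # the PEP's "None by default" behaviour
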
Two issues here, that both require some expansion of this API to reveal a *bit* more information about the EC structure.

1) For trio's cancel scope use case I described earlier, I actually need some way to read out all the values on the LocalContext stack. (It would also be helpful if there were some fast way to check the depth of the ExecutionContext stack -- or at least tell whether it's 1 deep or more-than-1 deep. I know that any cancel scopes that are in the bottommost LC will always be attached to the given Task, so I can set up the scope->task mapping once and re-use it indefinitely. OTOH for scopes that are stored in higher LCs, I have to check at every yield whether they're currently in effect. And I want to minimize the per-yield workload as much as possible.)

2) For classic decimal.localcontext context managers, the idea is still that you save/restore the value, so that you can nest multiple context managers without having to push/pop LCs all the time. But the above API is not actually sufficient to implement a proper save/restore, for a subtle reason: if you do

    ci.set(ci.get())

then you just (potentially) moved the value from a lower LC up to the top LC. Here's an example of a case where this can produce user-visible effects: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of...

There are probably a bunch of options for fixing this. But basically we need some API that makes it possible to temporarily set a value in the top LC, and then restore that value to what it was before (either the previous value, or 'unset' to unshadow a value in a lower LC). One simple option would be to make the idiom be something like:

    @contextmanager
    def local_value(new_value):
        state = ci.get_local_state()
        ci.set(new_value)
        try:
            yield
        finally:
            ci.set_local_state(state)

where 'state' is something like a tuple (ci in EC[-1], EC[-1].get(ci)). A downside with this is that it's a bit error-prone (very easy for an unwary user to accidentally use get/set instead of get_local_state/set_local_state). But I'm sure we can come up with something.
If there are enough of these functions then it might make sense to stick them in their own module instead of adding more stuff to sys. I guess worrying about that can wait until the API details are more firm though.
I like all the ideas in this section, but this specific point feels a bit weird. Coroutine objects need a second hidden field somewhere to keep track of whether the object they end up with is the same one they were created with? If I set cr_local_context to something else, and then set it back to the original value, does that trigger the magic await behavior or not? What if I take the initial LocalContext off of one coroutine and attach it to another, does that trigger the magic await behavior? Maybe it would make more sense to have two sentinel values: UNINITIALIZED and INHERIT?
I wonder if it would be useful to have an option to squash this execution context down into a single LocalContext, since we know we'll be using it for a while and once we've copied an ExecutionContext it becomes impossible to tell the difference between one that has lots of internal LocalContexts and one that doesn't. This could also be handy for trio/curio's semantics where they initialize a new task's context to be a shallow copy of the parent task: you could do new_task_coro.cr_local_context = get_current_context().squash() and then skip having to wrap every send() call in a run_in_context.
Hmm. I assume you're simplifying for expository purposes, but 'yield from' isn't the same as 'for v in o: yield v'. In fact PEP 380 says: "Motivation: [...] a piece of code containing a yield cannot be factored out and put into a separate function in the same way as other code. [...] If yielding of values is the only concern, this can be performed without much difficulty using a loop such as 'for v in g: yield v'. However, if the subgenerator is to interact properly with the caller in the case of calls to send(), throw() and close(), things become considerably more difficult. As will be seen later, the necessary code is very complicated, and it is tricky to handle all the corner cases correctly." So it seems to me that the whole idea of 'yield from' is that it's supposed to handle all the tricky bits needed to guarantee that if you take some code out of a generator and refactor it into a subgenerator, then everything works the same as before. This suggests that 'yield from' should do the same magic as 'await', where by default the subgenerator shares the same LocalContext as the parent generator. (And as a bonus it makes things simpler if 'yield from' and 'await' work the same.)
You showed how to make an iterator that acts like a generator. Is it also possible to make an async iterator that acts like an async generator? It's not immediately obvious, because you need to make sure that the local context gets restored each time you re-enter the __anext__ generator. I think it's something like:

    class AIter:
        def __init__(self):
            self._local_context = ...

        # Note: intentionally not async
        def __anext__(self):
            coro = self._real_anext()
            coro.cr_local_context = self._local_context
            return coro

        async def _real_anext(self):
            ...

Does that look right?
I think this can be refined further (and I don't understand context_item_deallocs -- maybe it's a mistake?). AFAICT the things that invalidate a ContextItem's cache are:

1) switching threadstates
2) popping or pushing a non-empty LocalContext off the current threadstate's ExecutionContext
3) calling ContextItem.set() on *that* context item

So I'd suggest tracking the thread state id, a counter of how many non-empty LocalContexts have been pushed/popped on this thread state, and a *per ContextItem* counter of how many times set() has been called.
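
A sketch of that invalidation scheme (all attribute names here are made up for illustration):

    def cached_get(ci, ts):
        # ts is the current thread state; the tag captures all three
        # invalidation events listed above.
        tag = (id(ts), ts.lc_push_pop_counter, ci.set_counter)
        if ci.cache_tag == tag:
            return ci.cached_value
        value = ec_lookup(ts.execution_context, ci)   # slow path
        ci.cache_tag, ci.cached_value = tag, value
        return value
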
While this is mostly true in the strict sense, in practice this PEP is useless if existing thread-local users like decimal and numpy can't migrate to it without breaking backcompat. So maybe this section should discuss that? (For example, one constraint on the design is that we can't provide only a pure push/pop API, even though that's what would be most convenient for context managers like decimal.localcontext or numpy.errstate, because we also need to provide some backcompat story for legacy functions like decimal.setcontext and numpy.seterr.) -n -- Nathaniel J. Smith -- https://vorpus.org

On 16 August 2017 at 17:18, Nathaniel Smith <njs@pobox.com> wrote: [Yury wrote]
I'm actually wondering if it may be worth defining a _contextlib module (to export the interpreter level APIs to Python code), and making contextlib the official home of the user facing API. That way we can use contextlib2 to at least attempt to polyfill the coroutine parts of the proposal for 3.5+, even if the implicit generator changes are restricted to 3.7+.
It feels odd to me as well, and I'm wondering if we can actually simplify this by saying:

1. Generator contexts (both sync and async) are isolated by default (__local_context__ = LocalContext())
2. Coroutine contexts are *not* isolated by default (__local_context__ = None)

Running top level task coroutines in separate execution contexts then becomes the responsibility of the event loop, which the PEP already lists as a required change in 3rd party libraries to get this all to work properly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 5:36 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is an interesting twist, and I like it. This will change asyncio.Task from:

    class Task:
        def __init__(self, coro):
            ...
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_execution_context(
                self.exec_context, self.coro.send)

to:

    class Task:
        def __init__(self, coro):
            ...
            self.local_context = sys.new_local_context()

        def step(self):
            sys.run_with_local_context(
                self.local_context, self.coro.send)

And we don't need ceval to do anything for "await", which means that with this approach we won't touch ceval.c at all. Yury

On Wed, Aug 16, 2017 at 12:51 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
And immediately after I hit "send" I realized that this is a bit more complicated. In order for Tasks to remember the full execution context of where they were created, we need a new method that would allow us to run with *both* exec and local contexts:

    class Task:
        def __init__(self, coro):
            ...
            self.local_context = sys.new_local_context()
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_contexts(self.exec_context, self.local_context,
                                  self.coro.send)

This is needed for the following PEP example to work properly:

    current_request = sys.new_context_item(description='request')

    async def child():
        print('current request:', repr(current_request.get()))

    async def handle_request(request):
        current_request.set(request)
        event_loop.create_task(child())

    run(top_coro())

See https://www.python.org/dev/peps/pep-0550/#tasks Yury

On Wed, Aug 16, 2017 at 12:55 PM, Yury Selivanov [..]
Never mind, the actual implementation would be as simple as:

    class Task:
        def __init__(self, coro):
            ...
            coro.cr_local_context = sys.new_local_context()
            self.exec_context = sys.get_execution_context()

        def step(self):
            sys.run_with_execution_context(
                self.exec_context, self.coro.send)

No need for another "run_with_context" function. Yury

On 17 August 2017 at 02:55, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I don't think that's entirely true, since you can nest the calls even without a combined API:

    sys.run_with_execution_context(
        self.exec_context,
        sys.run_with_local_context,
        self.local_context,
        self.coro.send)

Offering a combined API may still make sense for usability and efficiency reasons, but it isn't strictly necessary. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 12:18:23AM -0700, Nathaniel Smith wrote:
I'm still trying to digest this with very little time for it. It *is* slightly confusing. Perhaps it would be possible to name the data structures by their functionality. E.g. if ExecutionContext is a stack, use ExecutionStack? Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope or even DynamicScope. This sounds like bikeshedding, but I find it difficult to have ExecutionContext, ContextItem, LocalContext in addition to the actual decimal.localcontext() and PyDecContext. For example, should PyDecContext inherit from ContextItem? I don't fully understand. :-/ Stefan Krah

On Wed, Aug 16, 2017 at 10:25 AM, Stefan Krah <stefan@bytereef.org> wrote:
I'm -1 on calling this thing a "scope" or "dynamic scope", as I think it will be even more confusing to Python users. When I think of "scoping" I usually think about Python name scopes -- locals, globals, nonlocals, etc. I'm afraid that adding another dimension to this vocabulary won't help anyone. "Context" is an established term for what PEP 550 tries to accomplish. It's used in multiple languages and runtimes, and while researching this topic I didn't see anybody confused with the concept on StackOverflow/etc.
No, you wouldn't be able to extend the ContextItem type. The way for decimal is to simply do the following. In Python:

    _current_ctx = sys.ContextItem('decimal context')

    # later, when you set the decimal context
    _current_ctx.set(DecimalContext)

    # whenever you need to get the current context
    dc = _current_ctx.get()

In C:

    PyContextItem * _current_ctx = PyContext_NewItem("decimal context");
    if (_current_ctx == NULL) { /* error */ }

    /* later, when you set the decimal context */
    PyDecContextObject *ctx;
    ...
    if (PyContext_SetItem(_current_ctx, (PyObject*)ctx)) { /* error */ }

    /* whenever you need to get the current context */
    PyDecContextObject *ctx = PyContext_GetItem(_current_ctx);
    if (ctx == NULL) { /* error */ }
    if (ctx == Py_None) { /* not initialized, nothing is there */ }

We didn't really discuss C APIs at this point, and it's very likely that they will be adjusted, but the general idea should stay the same. All in all, the complexity of _decimal.c will only decrease with PEP 550, while getting better support for generators/async. Yury

On Wed, Aug 16, 2017 at 11:00:43AM -0400, Yury Selivanov wrote:
For me a context is a "single thing" that is usually used to thread state through functions. I guess I'd call "environment" what you call "context".
Thanks! This makes it a lot clearer. I'd probably use (stealing Nick's key suggestion):

    PyEnvKey *_current_context_key = PyEnv_NewKey("___DECIMAL_CONTEXT__");
    ...
    PyDecContextObject *ctx = PyEnv_GetItem(_current_ctx_key);

Stefan Krah

On Wed, Aug 16, 2017 at 12:40:26PM -0400, Yury Selivanov wrote:
Yeah, I usually think about symbol tables. FWIW, I find this terminology quite reasonable: https://hackernoon.com/execution-context-in-javascript-319dd72e8e2c The main points are ExecutionContextStack/FunctionalExecutionContext vs. ExecutionContext/LocalContext. Stefan Krah

On Wed, Aug 16, 2017 at 1:13 PM, Stefan Krah <stefan@bytereef.org> wrote:
Thanks for the link! I think it actually explains the JS language spec wrt how scoping of regular variables is implemented.
The main points are ExecutionContextStack/FunctionalExecutionContext vs. ExecutionContext/LocalContext.
While I'm trying to avoid using scoping terminology for PEP 550, there's one parallel -- as with regular Python scoping you have global variables and you have local variables. You can use locals() to access your local scope, and you can use globals() to access your global scope. Similarly in PEP 550, you have your LocalContext and ExecutionContext. We don't want to call ExecutionContext a "Global Context" because it is fundamentally OS-thread-specific (contrary to Python globals). LocalContexts are created for threads, generators, coroutines and are really similar to local scoping. Adding more names for local contexts like CoroutineLocalContext, GeneratorLocalContext won't solve anything either. All in all, Local Context is what its name stands for -- it's a local context for your current logical scope, be it a coroutine or a generator.

At this point PEP 550 is very different from ExecutionContext in .NET, but there are still many similarities. That's a +1 to keep its current name. ExecutionContextStack and ExecutionContextChain reflect the implementation of PEP 550 on some level, but for most Python users they won't mean anything. If they want to learn how the EC works, they just need to read the PEP (or documentation). Otherwise they will just use the ContextKey API and it should just work for them. So IMO, ExecutionContext and LocalContext are really the best names of all that were proposed so far. Yury

On 17 August 2017 at 04:38, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
To be honest, the difference between LocalContext and ExecutionContext feels more like the difference between locals() and lexical closure variables than it does the difference between locals() and globals(). It's just that where the scoping rules are a compile time thing related to lexical closures, PEP 550 is about defining a dynamic context.
In addition to it being different from the way the decimal module already uses the phrase, one of the reasons I don't want to call it a LocalContext is because doing so brings in the suggestion that it is somehow connected to the locals() scope, and it isn't - there are plenty of things (most notably, function calls) that will change the active local namespace, but *won't* change the active execution context.
But unlike locals() itself, it *isn't* linked to a specific frame of execution - it's deliberately designed to be shared *between* frames. If you don't like either of the ExecutionContext/ExecutionEnvironment or ExecutionContext/ExecutionContextChain combinations, how would you feel about ExecutionContext + DynamicContext? Saying that "ck.set_value(value) sets the value corresponding to the given context key in the currently active execution context" is still my preferred terminology for setting values, and I think the following would work well for reading values: ck.get_value() attempts to look up the value for that key in the currently active execution context. If it doesn't find one, it then tries each of the execution contexts in the currently active dynamic context. If it *still* doesn't find one, then it will set the default value in the outermost execution context and then return that value. One thing I like about that phrasing is that we'd be using the word dynamic in exactly the same sense that dynamic scoping uses it, and the dynamic context mechanism would become PEP 550's counterpart to the lexical closure support in Python's normal scoping rules. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
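
Sketched out, the resolution rules proposed above look like this (both contexts are treated as plain mappings, and the helper names are illustrative only):

    def get_value(ck):
        ec = active_execution_context()      # the namespace set_value() writes to
        if ck in ec:
            return ec[ck]
        for outer_ec in active_dynamic_context():   # outward through the chain
            if ck in outer_ec:
                return outer_ec[ck]
        # Not found anywhere: set the default in the outermost execution
        # context, then return it.
        active_dynamic_context()[-1][ck] = ck.default
        return ck.default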

On Fri, Aug 18, 2017 at 6:25 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
I really like DynamicContext -- if you know the classic dynamic/static terminology in language design then it works as a precise technical description, but it also makes sense as plain non-technical English. And it avoids the confusingly overloaded word "scope". Apropos Guido's point about container naming, how about DynamicContext and DynamicContextStack? That's only 3 letters longer than ExecutionContext. -n -- Nathaniel J. Smith -- https://vorpus.org

On 17 August 2017 at 00:25, Stefan Krah <stefan@bytereef.org> wrote:
Agreed, I don't think we have the terminology quite right yet.

For "ContextItem", for example, we may actually be better off calling it "ContextKey", and have the methods be "ck.get_value()" and "ck.set_value()". That would get us closer to the POSIX TSS terminology, and emphasises that the objects themselves are best seen as opaque references to a key that lets you get and set the corresponding value in the active execution context.

I do think we should stick with "context" rather than bringing dynamic scopes into the mix - while dynamic scoping *is* an accurate term for what we're doing at a computer science level, Python itself tends to reserve the term scoping for the way the compiler resolves names, which we're deliberately *not* touching here. Avoiding a naming collision with decimal.localcontext() would also be desirable.

Yury, what do you think about moving the ExecutionContext name to what the PEP currently calls LocalContext, and renaming the current ExecutionContext type to ExecutionContextChain? The latter name then hints at the collections.ChainMap style behaviour of ck.get_value() lookups, without making any particular claims about what the internal implementation data structures actually are.

The run methods could then be sys.run_with_context_chain() (to ignore the current context entirely and use a completely separate context chain) and sys.run_with_active_context() (to append a single execution context onto the end of the current context chain).

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
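The ChainMap behaviour being alluded to can be sketched directly; this is only an analogy for the lookup order, not the proposed implementation:

    from collections import ChainMap

    outer_lc = {'prec': 28}
    inner_lc = {}
    chain = ChainMap(inner_lc, outer_lc)   # lookups search inner, then outer

    assert chain['prec'] == 28             # found in the outer mapping
    inner_lc['prec'] = 50                  # writes only touch the innermost map
    assert chain['prec'] == 50
    assert outer_lc['prec'] == 28          # the outer mapping is untouched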

On Wed, Aug 16, 2017 at 11:03 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
With the confusion around what an "empty ExecutionContext" is and what "ContextItem is set to None by default" means, I tend to agree that "ContextKey" might be a better name. A default for a "ContextKey" means something that will be returned if the lookup failed, plain and simple.
+1, I feel the same about this.
Avoiding a naming collision with decimal.localcontext() would also be desirable.
The ContextItem (or ContextKey) that decimal will be using will be an implementation detail, and it must not be exposed to the public API of the module.
While I think that the naming issue is important, the API that will be used most of the time is ContextItem. That's the name in the spotlight.
sys.run_with_context_chain and sys.run_with_active_context sound *really* confusing to me. Maybe it's because I spent too much time thinking about the current PEP 550 naming. To be honest, I really like Execution Context and Local Context names. I'm curious if other people are confused with them. Yury

I'm also confused by these, because they share the noun part of their name, but their use and meaning is quite different. The PEP defines an EC as a stack of LCs, and (apart from strings :-) it's usually not a good idea to use the same term for a container and its items.
-- --Guido van Rossum (python.org/~guido)

On Wed, Aug 16, 2017 at 3:18 AM, Nathaniel Smith <njs@pobox.com> wrote:
Thanks! [..]
Yes. We touched upon this topic in parallel threads, so I'll just briefly mention this here: I deliberately avoided using "scope" in PEP 550 naming, as "scoping" in Python is usually associated with names/globals/locals/nonlocals etc. Adding another "level" of scoping will be very confusing for users (IMO).
We can add an API for returning the full stack of values for a CI:

    ContextItem.iter_stack() -> Iterator
    # or
    ContextItem.get_stack() -> List

Because some of the LCs will be empty, what you'll get is a list with some None values in it, like:

    [None, val1, None, None, val2]

The length of the list will tell you how deep the stack is.
Yeah, this is tricky. The main issue is indeed the confusion of what methods you need to call -- "get/set" or "get_local_state/set_local_state".

On some level the problem is very similar to regular Python scoping rules:

1. we have local names
2. we have global names
3. we have the 'nonlocal' modifier

IOW scoping isn't easy, and you need to be conscious of what you do. It's just that we are so used to these scoping rules that they have a low cognitive effort for us.

One of the ideas that I have in mind is to add another level of indirection to separate "global get" from "local set/get":

1. Rename ContextItem to ContextKey (reasoning for that in a parallel thread)
2. Remove the ContextKey.set() method
3. Add a new ContextKey.value() -> ContextValue

    ck = ContextKey()

    with ck.value() as val:
        val.set(spam)
        yield

or:

    val = ck.value()
    val.set(spam)
    try:
        yield
    finally:
        val.clear()

Essentially ContextValue will be the only API to set values in the execution context. ContextKey.get() will be used to get them.

Nathaniel, Nick, what do you guys think? [..]
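A minimal sketch of what such a ContextValue wrapper could look like. The underscored hooks on ContextKey that confine writes to the current local context are assumptions made up for illustration; none of these names are in the PEP:

    class ContextValue:
        def __init__(self, key):
            self._key = key

        def set(self, value):
            self._key._set_in_local_context(value)   # assumed low-level hook

        def clear(self):
            self._key._del_from_local_context()      # assumed low-level hook

        def __enter__(self):
            return self

        def __exit__(self, *exc):
            self.clear()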
I'm OK with this idea -- pystate.c becomes way too crowded. Maybe we should just put this stuff in _contextlib.c and expose it in the contextlib module.
Yes, I planned to have a second hidden field, as coroutines will have their cr_local_context set to NULL, and that will be their empty LC. So a second internal field is needed to disambiguate NULL meaning an "empty context" from NULL meaning "use the outer local context". I omitted this from the PEP to make it a bit easier to digest, as this seemed to be a low-level implementation detail.
All good questions. I don't like sentinels in general; I'd be more OK with a "gi_isolated_local_context" flag (we're back to square one here). But I don't think we should add it. My thinking is that once you start writing to "gi_local_context" -- all bets are off, and you manage this from now on (meaning that some internal coroutine flag will be set to 1, and the interpreter will never touch the local_context of this coroutine):

1. If you write None -- it means that the generator/coroutine will not have its own LC.
2. If you write your own LC object -- the generator/coroutine will use it.
I think this would be a bit too low-level. I'd prefer to defer solving the "squashing" problem until I have a reference implementation and we can test this. Essentially, this is an optimization problem -- the EC implementation can just squash the chain itself, when the chain is longer than 5 LCs. Or something like this. But exposing this to the Python level would be like letting a program tinker with GCC -O flags after it's compiled, IMO. [..]
I see what you are saying here, but 'yield from' for generators is still different from awaits, as you can partially iterate the generator and *then* "yield from" from it:

    def foo():
        g = gen()
        val1 = next(g)
        val2 = next(g)
        # do some computation?
        yield from g
        ...

    def gen():
        # messing with EC between yields
        ...

In general, I still think that 'yield from g' is semantically equivalent to 'for i in g: yield i' for most users.
Yes, seems to be correct.
Now that you highlighted the deallocs counter and I thought about it a bit more I don't think it's needed :) I'll remove it.
Excellent idea, will be in the next version of the PEP.
The main purpose of this section is to tell whether some parts of the PEP break existing code/patterns or impose a significant performance penalty. PEP 550 does neither of these things. If decimal/numpy simply switch to using the new APIs, everything should work as expected for them, with the exception that assigning a new decimal context (without a context manager) will be isolated in generators -- which I'd consider a bug fix. We can add a new section to discuss the specifics. Yury
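To make the behaviour change concrete -- a sketch that assumes decimal has been ported to the new APIs (today this code leaks the context change to the caller):

    import decimal

    def calc():
        # a bare setcontext() inside a generator...
        decimal.setcontext(decimal.Context(prec=5))
        yield decimal.Decimal(1) / decimal.Decimal(3)

    g = calc()
    print(next(g))                    # computed with prec=5 inside the generator
    # under PEP 550 the caller's precision here is still the default 28
    print(decimal.getcontext().prec)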

On 17 August 2017 at 02:36, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I think I don't want to have to try to explain to anyone what happens if I get a context value in my current execution environment and then send that value reference into a different execution context :) So I'd prefer my earlier proposal of:

    # Resolve key in current execution environment
    ck.get_value()
    # Assign to key in current execution context
    ck.set_value(value)
    # Assign to key in specific execution context
    sys.run_with_active_context(ec, ck.set_value, value)

One suggestion I do like is Stefan's one of using "ExecutionContext" to refer to the namespace that ck.set_value() writes to, and then "ExecutionEnvironment" for the whole chain that ck.get_value() reads.

Similar to "generator" and "package", we'd still end up with "context" being inherently ambiguous when used without qualification:

- PEP 550 execution context
- exception handling context (for chained exceptions)
- with statement context
- various context objects, like the decimal context

But we wouldn't have two different kinds of context within PEP 550 itself. Instead, we'd have to start disambiguating the word environment:

- PEP 550 execution environment
- process environment (i.e. os.environ)

The analogy between process environments and execution environments wouldn't be exact (since the key-value pairs in process environments are copied eagerly rather than via lazily chained lookups), but once you account for that, the parallels between an operating-system-level process environment tree and a Python-level execution environment tree as proposed in PEP 550 seem like they would be helpful rather than confusing.
Yeah, I'd be OK with that - if we're going to reuse the word, it makes sense to reuse the module to expose the related machinery. That said, if we do go that way *and* we decide to offer a coroutine-only backport, I see an offer of contextlib2 co-maintainership in your future ;)
Given that the field is writable, I think it makes more sense to just choose a suitable default, and then rely on other code changing that default when it's not right.

For generators: set it to an empty context by default, and have contextlib.contextmanager (and similar wrappers) clear it (see the sketch below).

For coroutines: set it to None by default, and have async task managers give top-level coroutines their own private context.

No hidden flags, no magic value adjustments, just different defaults for coroutines and generators (including async generators). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
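A sketch of the kind of wrapper being suggested, relying on the writable gi_local_context attribute discussed in this thread (not yet part of the PEP's public spec):

    def uses_callers_context(genfunc):
        # Opt the generator out of its default isolation so that, e.g.,
        # a @contextmanager-style generator can affect its caller's context.
        def wrapper(*args, **kwds):
            gen = genfunc(*args, **kwds)
            gen.gi_local_context = None   # None => no private local context
            return gen
        return wrapper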

Nathaniel Smith schrieb am 16.08.2017 um 09:18:
+1
I agree with Nathaniel that many projects that can benefit from this feature will need to keep supporting older Python versions as well. In the case of Cython, that's Py2.6+. We already have the problem that the asynchronous finalisation of async generators cannot be supported in older Python versions ("old" as in Py3.5 and before), so we end up with a language feature that people can use in Py2.6, but not completely/safely. I can't say yet how difficult it will be to integrate the new infrastructure that this PEP proposes into a backwards compatible code base, but if there's something we can think of now in order to help projects keep supporting older Python versions in the same code base, given the constraints of their existing APIs and semantics - that would be great. Stefan

On 18 August 2017 at 16:12, Stefan Behnel <stefan_ml@behnel.de> wrote:
One aspect of this that we're considering is to put the Python level API in contextlib rather than in sys. That has the pragmatic benefit that contextlib2 then becomes the natural home for an API backport, and we should be able to get the full *explicit* API working on older versions (even if it means introducing an optional C extension module as a dependency to get that part of the API working fully).

To backport the isolation of generators, we'd likely be able to provide a decorator that explicitly isolates generators, but it wouldn't be feasible to backport implicit isolation. The same would go for the various other proposals for implicit isolation - when running on older versions, the general principle would be "if you (or a library/framework you're using) didn't explicitly isolate the execution context, assume it's not isolated".

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Aug 18, 2017 at 2:12 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
I think it's Cython's quest to try to backport support for all new Python 3.x language features to be 2.6-compatible, which sometimes can be questionable. You can add support for PEP 550 semantics to code that was compiled with Cython, but pure Python code won't be able to support it. This, in my opinion, could cause more confusion than benefit, so for Cython I think the solution is to do nothing in this case. We'll (maybe) backport some functionality to contextlib2. In my opinion, any code that uses contextlib2 in Python should work exactly the same when it's compiled with Cython. Yury

Cool to see this on python-ideas. I'm really looking forward to this -- PEP 550 or 521. On Wednesday, August 16, 2017 at 3:19:29 AM UTC-4, Nathaniel Smith wrote:
I agree with Nathaniel that this is an issue with the current API. I don't think it's a good idea to have set and get methods. It would be much better to reflect the underlying ExecutionContext *stack* in the API by exposing a mutating *context manager* on the ContextKey object instead of set. For example:

    my_context = sys.new_context_key('my_context')

    options = my_context.get()
    options.some_mutating_method()

    with my_context.mutate(options):
        ...  # Do whatever you want with the mutated context
    # Now, the context is reverted.

Similarly, instead of:

    my_context.set('spam')

you would do:

    with my_context.mutate('spam'):
        ...  # Do whatever you want with the mutated context
    # Now, the context is reverted.
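For comparison, such a mutate() helper could plausibly be layered on the PEP's get/set primitives -- a sketch only; note that it inherits the save/restore pitfall discussed elsewhere in this thread:

    import contextlib

    @contextlib.contextmanager
    def mutate(ck, value):
        saved = ck.get()       # may come from an outer LC, not just ours
        ck.set(value)
        try:
            yield value
        finally:
            ck.set(saved)      # "restores" by writing into our own LC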

On Sat, Aug 19, 2017 at 12:09 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
Unfortunately, I don't think we can eliminate the set() operation entirely, because the libraries we want to migrate to using this -- like decimal and numpy -- generally provide set() operations in their public API. (See: decimal.setcontext, numpy.seterr, ...) They're generally not recommended for use in new code, but they do exist and are covered by compatibility guarantees, so we need some way to implement them using the PEP 550 API. OTOH we can certainly provide a context manager like this and make it the obvious convenient thing to use (and which also happens to do the right thing). We could potentially also give the 'set' primitive an ugly name to remind people that it has this pitfall, like make it 'set_in_top_context' or something. -n -- Nathaniel J. Smith -- https://vorpus.org
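Concretely, the compatibility shims might look something like this (a sketch; the key-creation spelling follows the ContextKey rename floated in this thread and is an assumption):

    import sys
    from decimal import Context

    _decimal_ck = sys.new_context_key('decimal context')   # assumed spelling

    def setcontext(ctx):
        # decimal.setcontext(), kept for backwards compatibility
        _decimal_ck.set(ctx)

    def getcontext():
        ctx = _decimal_ck.get()
        if ctx is None:
            ctx = Context()          # lazily create the default context
            _decimal_ck.set(ctx)
        return ctx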

TLDR: I really like this version, and the tweaks I suggest below are just cosmetic. I figure if there are any major technical traps lurking, you'll find them as you work through updating the reference implementation. On 16 August 2017 at 09:55, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
It may be worth having separate "name" and "description" attributes, similar to __name__ and __doc__ being separate on things like functions. That way, error messages can just show "name", while debuggers and other introspection tools can include a more detailed description.
For ease of introspection, it's probably worth using a common `__local_context__` attribute name across all the different types that support one, and encouraging other object implementations to do the same. This isn't like cr_await and gi_yieldfrom, where we wanted to use different names because they refer to different kinds of objects.
.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046780.html
The threading in pipermail makes it difficult to get from your reply back to my original comment, so it may be better to link directly to the latter: https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

And to be completely explicit about it: I like your proposed approach of leaving it up to iterator developers to decide whether or not to run with a local context. If they don't manipulate any context items, it won't matter, and if they do, it's straightforward to add a suitable call to sys.run_in_local_context(). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
TLDR: I really like this version, and the tweaks I suggest below are just cosmetic.
Thanks, Nick!
FWIW I've implemented 3-5 different variations of PEP 550 (along with HAMT) and I'm fairly confident that the datastructures and optimizations will work, so no major traps are really expected there. The risk that we need to manage now is getting the API design "right".
Initially I wanted to have a "sys.new_context_item(name)" signature, but then I thought that some users might be confused about what "name" actually means. In some contexts you might say that the "name" of the CI is the name of the variable it is bound to; IOW, for foo = CI(name="bar"), the name is "foo". But some users might think that it's "bar".

OTOH, PEP 550 doesn't have any introspection APIs at this point, and the final version of it will have to have them. If we add something like "sys.get_execution_context_as_dict()", then it would be preferable for CIs to have short name-like descriptions, as opposed to multiline docstrings.

So in the end, I think that we should adopt a namedtuple solution, and just make the first "ContextItem" parameter a positional-only "name":

    ContextItem(name: str, /)
We also have cr_code and gi_code, which are used for introspection purposes but refer to CodeObject. I myself don't like the mess the C-style convention created for our Python code (think of what the "dis" and "inspect" modules have to go through), so I'm +0 for having "__local_context__".
Fixed the link, and will update the Acknowledgments section with your paragraph (thanks!) Yury

On 17 August 2017 at 01:22, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Right, hence https://bugs.python.org/issue31230 :) (That suggestion is prompted by the fact that if we'd migrated gi_code to __code__ in 3.0, the same way we migrated func_code, then cr_code and ag_code would almost certainly have followed the same dunder-naming convention, and https://github.com/python/cpython/pull/3077 would never have been necessary)
I'm starting to think this should be __private_context__ (to convey the *intent* of the attribute), rather than naming it after the type that it's expected to store.

Thinking about this particular attribute name did prompt the question of how we want PEP 550 to interact with the exec builtin, though, as well as raising some questions around a number of other code execution cases:

1. What is the execution context for top level code in a module?
2. What is the execution context for the import machinery in an import statement?
3. What is the execution context for the import machinery when invoked via importlib?
4. What is the execution context for the import machinery when invoked via the C API?
5. What is the execution context for the import machinery when invoked via the runpy module?
6. What is the execution context for things like the timeit module, templating engines, etc?
7. What is the execution context for codecs and codec error handlers?
8. What is the execution context for __del__ methods and weakref callbacks?
9. What is the execution context for trace hooks and other really low level machinery?
10. What is the execution context for displayhook and excepthook?

I think a number of those (top level module code executed via the import system, the timeit module, templating engines) can be addressed by saying that the exec builtin always creates a completely fresh execution context by default (with no access to the parent's execution context), and will gain a new keyword-only parameter that allows you to specify an execution context to use. That way, exec'ed code will be independent by default, but users of exec() will be able to opt in to handling it like a normal function call by passing in the current context.

The default REPL, the code module and the IDLE shell window would need to be updated so that they use a shared context for evaluating the user supplied code snippets, while keeping their own context separate.

While top-level code would always run in a completely fresh context for imports, the runpy module would expose the same setting as the exec builtin, so the executed code would be isolated by default, but you could opt in to using a particular execution context if you wanted to.

Codecs and codec error handlers I think will be best handled in a way similar to generators, where they have their own private context (so they can't alter the caller's context), but can *read* the caller's context (so the context can be used as a way of providing context-dependent codec settings).

That "read-only" access model also feels like the right option for the import machinery - regardless of whether it's accessed via the import statement, importlib, the C API, or the runpy module, the import machinery should be able to *read* the dynamic context, but not make persistent changes to it.

Since they can be executed at arbitrary points in the code, it feels to me that __del__ methods and weakref callbacks should *always* be executed in a completely pristine execution context, with no access whatsoever to any thread's dynamic context.

I think we should leave the execution context alone for the really low level hooks, and simply point out that yes, these have the ability to do weird things to the execution context, just as they have the power to do weird things to local variables, so they need to be handled with care.
For displayhook and excepthook, I don't have a particularly strong intuition, so my default recommendation would be the read-only access proposed for generators, codecs, and the import machinery. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Aug 18, 2017 at 1:09 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I've been thinking a lot about the terminology, and I have another variant to consider: ExecutionContext is a stack of LogicalContexts. Coroutines/generators will thus have a __logical_context__ attribute. I think that the "logical" term better conveys the meaning than "private" or "dynamic".
Whatever the execution context of the current thread importing the code is -- usually that of the main thread.
Whatever execution context invoked the import machinery, be it "__import__()", the "import" statement, or "importlib.load_module".
In general, the EC behaves just like TLS for all these cases; there's literally no difference.
Speaking of sys.displayhook and sys.stdio -- this API is fundamentally incompatible with PEP 550 or any possible context isolation. These things are essentially *global* variables in the sys module, and there's tons of code out there that *expects* them to behave like globals. If a user changes the displayhook they expect it to work across all threads. If we want to make displayhooks/sys.stdio context-aware, we will need new APIs for them with new properties/expectations. Simply forcing them to use the execution context would be backwards incompatible. PEP 550 won't try to change how displayhooks, excepthooks, trace functions, sys.stdout, etc. work -- this is out of its scope. We can't refactor half of the sys module as part of one PEP.
"exec" uses outer globals/locals if you don't pass them explicitly -- the code isn't isolated by default. Isolation for "exec" is opt-in: ]]] a = 1 ]]] exec('print(a); b = 2') 1 ]]] b 2 Therefore, with regards to PEP 550, it should execute the code with the current EC/LC. We should also add a new keyword arguments to provide custom LC and EC (same as we do for locals/globals).
I really think that in 3.7 we should just implement PEP 550 with its current scope, and defer system refactorings to 3.8. Many of such refactorings will probably deserve their own PEP, as, for example, changing sys.stdout semantics is a really complex topic. At this point we are trying to solve the problem of making a replacement for TLS that supports generators and async code. Yury

Hi,
* ``sys.get_execution_context()`` function. The function returns a copy of the current EC: an ``ExecutionContext`` instance.
Can you explain the requirement for it being a copy? What do you call a copy exactly? Does it shallow-copy the stack or does it deep copy the context items?
How does this interact with sub-interpreters? (same question for rest of the PEP :-))
* O(N) for ``sys.get_execution_context()``, where ``N`` is the total number of items in the current **execution** context.
Right... but if this is a simple list copy, we are talking about an extremely fast O(N):
(what is "number of items"? number of local contexts? number of individual context items?)
We believe that approach #3 enables an efficient and complete Execution Context implementation, with excellent runtime performance.
What about the maintenance and debugging cost, though?
But, for relatively small mappings, regular dicts would also be fast enough, right?

It would be helpful for the PEP to estimate reasonable parameter sizes:

- reasonable number of context items in a local context
- reasonable number of local contexts in an execution stack

Regards Antoine.

On Wed, Aug 16, 2017 at 4:12 PM, Antoine Pitrou <antoine@python.org> wrote:
When the execution context is used to schedule a function call in a thread, or an asyncio callback in the future, we want to take a snapshot of all items in the EC. In general the recommendation will be to store immutable data in the context (same as in the .NET EC implementation, or whenever you have some potentially shared state).
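Roughly what that snapshotting could look like inside an event loop -- a sketch: the sys functions follow the spellings in the PEP v2 draft, while the Handle/scheduling internals here are simplified stand-ins:

    import sys

    def call_soon(loop, callback, *args):
        ec = sys.get_execution_context()    # the O(N) snapshot happens here
        def run():
            # later, run the callback inside the captured context
            sys.run_with_execution_context(ec, callback, *args)
        loop._ready.append(run)             # simplified scheduling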
What do you call a copy exactly? Does it shallow-copy the stack or does it deep copy the context items?
Execution Context is conceptually a stack of Local Contexts. Each local context is a weak key mapping. We need a shallow copy of the EC, which is semantically equivalent to the below snippet:

    new_lc = {}
    for lc in execution_context:
        new_lc.update(lc)
    return ExecutionContext(new_lc)
As long as PyThreadState_Get() works with sub-interpreters, all of the PEP machinery will work too.
"Number of items in the current **execution** context" = sum(len(local_context) for local_context in current_execution_context) Yes, even though making a new list + merging all LCs is a relatively fast operation, it will need to be performed on *every* asyncio.call_soon and create_task. The immutable stack/mappings solution simply elminates the problem because you can just copy by reference which is fast. The #3 approach is implementable with regular dicts + copy() too, it will be just slower in some cases (explained below).
Contrary to Python dicts, the implementation scope for the hamt mapping is much smaller -- we only need get, set, and merge operations. No split dicts, no ordering, etc. With the help of fuzz-testing and our ref-counting test mode I hope that we'll be able to catch most of the bugs.

Any solution adds to the total debugging and maintenance cost, but I believe that in this specific case, the benefits outweigh that cost:

1. Sometimes we'll need to merge many dicts in places like asyncio.call_soon or async Task objects.
2. The "set" operation might resize the dict, making it slower.
3. The "dict.copy()" optimization that the PEP mentions won't always be able to help us, as we will likely need to resize the dict often.
If all mappings are relatively small, then the answer is close to "yes". We might want to periodically "squash" (or merge, or compact) the chain of Local Contexts, in which case merging dicts will be more expensive than merging hamt.
It would be helpful for the PEP to estimate reasonable parameter sizes: - reasonable number of context items in a local context
I assume that the number of context items will be relatively low. It's hard for me to imagine having more than a thousand of them.
- reasonable number of local contexts in an execution stack
In simple multi-threaded code we will only have one local context per execution context. Every time you run a generator or an asynchronous task you push a local context to the stack. Generators will have an optimization -- they will push NULL to the stack, and it will stay NULL until a generator writes to its local context.

It's possible to imagine a degenerate case where a generator recurses inside, say, a decimal context 'with' block, which can potentially create a long chain of LCs.

Long chains of LCs are not a problem in general -- once the generator is done, it pops its LCs, thus decreasing the stack size. Long chains of LCs might become a problem if, deep into the recursion, a generator needs to capture the execution context (say it makes an asyncio.call_soon() call). In that case the solution is simple -- we squash chains that are longer than 5-10-some-predefined-number.

In general, though, the EC is something that is there and you can't really control it. If you have a thousand decimal libraries in your next YouTube-killer website, you will have large numbers of items in your Execution Context. You will inevitably start experiencing slowdowns of your code that you can't even fix (or maybe even explain). In this case, HAMT is a safer bet -- it's a guarantee that you will always have O(log32 N) performance for LC-stack-squashing or set operations. This is the strongest argument in favour of the HAMT mapping -- we implement it and it should work for all use-cases, even the unlikely ones. Yury
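The degenerate case described above can be made concrete with plain decimal code; under the PEP, each recursion level that enters a decimal context would push another non-empty LC:

    import decimal

    def descend(n):
        with decimal.localcontext() as ctx:   # one more LC per level
            ctx.prec = 10 + n
            if n:
                yield from descend(n - 1)
            yield decimal.Decimal(1) / decimal.Decimal(7)

    # Capturing the EC (e.g. an asyncio.call_soon() deep inside the
    # recursion) is what makes a long chain costly; squashing bounds it.
    list(descend(50))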

On 21 August 2017 at 07:01, Barry <barry@barrys-emacs.org> wrote:
It's basically borrowed from procedural thread local APIs, which tend to use APIs like "tss_set(key, value)". That said, in a separate discussion, Caleb Hattingh mentioned C#'s AsyncLocal API, and it occurred to me that "context local" might work well as the name of the context access API:

    my_implicit_state = sys.new_context_local('my_state')
    my_implicit_state.set('spam')

    # Later, to access the value of my_implicit_state:
    print(my_implicit_state.get())

That way, we'd have 3 clearly defined kinds of local variables:

* frame locals (the regular kind)
* thread locals (threading.locals() et al)
* context locals (PEP 550)

The fact that contexts can be nested, and a failed lookup in the active implicit context may then query outer namespaces in the current execution context, would then be directly analogous to the way name lookups are resolved for frame locals.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Aug 23, 2017 at 2:00 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If we're extending the analogy with thread-locals we should at least consider making each instantiation return a namespace rather than something holding a single value. We have:

    log_state = threading.local()
    log_state.verbose = False

    def action(x):
        if log_state.verbose:
            print(x)

    def make_verbose():
        log_state.verbose = True

It would be nice if we could upgrade this to make it PEP 550-aware so that only the first line needs to change:

    log_state = sys.AsyncLocal("log state")
    # The rest is the same

We might even support the alternative notation where you can provide default values and suggest a schema, similar to threading.local:

    class LogState(threading.local):
        verbose = False

    log_state = LogState(<description>)

(I think that for calls that construct empty instances of various types we should just use the class name rather than some factory function. I also think none of this should live in sys but that's separate.) -- --Guido van Rossum (python.org/~guido)

On Wed, Aug 23, 2017 at 8:41 AM, Guido van Rossum <guido@python.org> wrote:
You can mostly implement this on top of the current PEP 550. Something like:

    import threading

    _tombstone = object()
    some_lock = threading.Lock()   # guards lazy creation of context keys

    class AsyncLocal:
        def __getattribute__(self, name):
            # if this raises AttributeError, we let it propagate
            key = object.__getattribute__(self, name)
            value = key.get()
            if value is _tombstone:
                raise AttributeError(name)
            return value

        def __setattr__(self, name, value):
            try:
                key = object.__getattribute__(self, name)
            except AttributeError:
                with some_lock:
                    # double-checked locking pattern
                    try:
                        key = object.__getattribute__(self, name)
                    except AttributeError:
                        key = new_context_key()   # the PEP's key constructor
                        object.__setattr__(self, name, key)
            key.set(value)

        def __delattr__(self, name):
            self.__setattr__(name, _tombstone)

        def __dir__(self):
            # filter out tombstoned values
            return [name for name in object.__dir__(self)
                    if hasattr(self, name)]

Issues:

Minor problem: on threading.local you can use .__dict__ to get the dict. That doesn't work here. But this could be done by returning a mapping proxy type, or maybe it's better not to support it at all -- I don't think it's a big issue.

Major problem: an attribute setting/getting API doesn't give any way to solve the save/restore problem [1]. PEP 550 v3 doesn't have a solution to this yet either, but we know we can do it by adding some methods to context-key. Supporting this in AsyncLocal is kinda awkward, since you can't use methods on the object -- I guess you could have some staticmethods, like AsyncLocal.save_state(my_async_local, name) and AsyncLocal.restore_state(my_async_local, name, value)? In any case this kinda spoils the sense of "oh, it's just an object with attributes, I already know how this works".

Major problem: there are two obvious implementations. The above uses a separate ContextKey for each entry in the dict; the other way would be to have a single ContextKey that holds a dict. They have subtly different semantics. Suppose you have a generator and inside it you assign to my_async_local.a but not to my_async_local.b, then yield, and then the caller assigns to my_async_local.b. Is this visible inside the generator? In the ContextKey-holds-an-attribute approach, the answer is "yes": each AsyncLocal is a bag of independent attributes. In the ContextKey-holds-a-dict approach, the answer is "no": each AsyncLocal is a single container holding a single piece of (complex) state. It isn't obvious to me which of these semantics is preferable -- maybe it is if you're Dutch :-). But there's a danger that either option leaves a bunch of people confused.

(Tangent: in the ContextKey-holds-a-dict approach, currently you have to copy the dict before mutating it every time, b/c PEP 550 currently doesn't provide a way to tell whether the value returned by get() came from the top of the stack, and thus is private to you and can be mutated in place, or somewhere deeper, and thus is shared and shouldn't be mutated. But we should fix that anyway, and anyway copy-then-mutate is a viable approach.)

Observation: I don't think there's any simpler way to implement AsyncLocal other than to start with machinery like what PEP 550 already proposes, and then layer something like the above on top of it. We could potentially hide the layers inside the interpreter and only expose AsyncLocal, but I don't think it really simplifies the implementation any.

Observation: I feel like many users of threading.local -- possibly the majority -- only put a single attribute on each object anyway, so for those users a raw ContextKey API is actually more natural and faster.
For example, looking through the core django repo, I see thread locals in:

- django.utils.timezone._active
- django.utils.translation.trans_real._active
- django.urls.base._prefixes
- django.urls.base._urlconfs
- django.core.cache._caches
- django.urls.resolvers.RegexURLResolver._local
- django.contrib.gis.geos.prototypes.threadsafe.thread_context
- django.contrib.gis.geos.prototypes.io.thread_context
- django.db.utils.ConnectionHandler._connections

Of these 9 thread-local objects, 7 of them have only a single attribute; only the last 2 use multiple attributes. For the first 4, that attribute is even called "value", which seems like a pretty clear indication that the authors found the whole local-as-namespace thing a nuisance to work around rather than something helpful.

I also looked at asyncio; it has 2 threading.locals, and they each contain 2 attributes. But the two attributes are always read/written together; to me it would feel more natural to model this as a single ContextKey holding a small dict or tuple instead of something like AsyncLocal.

So tl;dr: I think PEP 550 should just focus on a single object per key, and the subgroup of users who want to convert that to a more threading.local-style interface can do that themselves as efficiently as we could, once they've decided how they want to resolve the semantic issues. -n

[1] https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of...

-- Nathaniel J. Smith -- https://vorpus.org
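The "single ContextKey holding a small tuple" pattern for asyncio's paired values might look like the sketch below; the key-creation spelling and the asyncio integration details are assumptions, not existing code:

    import sys

    _loop_and_task_ck = sys.new_context_key('asyncio loop+task')  # assumed

    def _set_current(loop, task):
        # the two values are always read/written together, so store a tuple
        _loop_and_task_ck.set((loop, task))

    def current_task():
        state = _loop_and_task_ck.get()
        if state is None:
            return None
        loop, task = state
        return task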

There's another "major" problem with a threading.local()-like API for PEP 550: the C API. threading.local() in C right now is PyThreadState_GetDict(), which returns a dictionary for the current thread that can be queried/modified with PyDict_* functions. For PEP 550 this would not work.

The advantage of the current ContextKey solution is that the Python API and C API are essentially the same [1]. Another advantage is that ContextKey implements better caching, because it can have only one value cached in it; see [2] for details.

[1] https://www.python.org/dev/peps/pep-0550/#new-apis
[2] https://www.python.org/dev/peps/pep-0550/#contextkey-get-cache

Yury
Participants (11): Antoine Pitrou, Barry, Ethan Furman, Guido van Rossum, Jelle Zijlstra, Nathaniel Smith, Neil Girdhar, Nick Coghlan, Stefan Behnel, Stefan Krah, Yury Selivanov