
Hi,

This is the 4th iteration of the PEP that Elvis and I have rewritten
from scratch.

The specification section has been separated from the implementation
section, which makes them easier to follow.  During the rewrite, we
realized that generators and coroutines should work with the EC in
exactly the same way (in prior versions of the PEP, coroutines were
created with no LC).  We also renamed Context Keys to Context
Variables, which seems to be a more appropriate name.

Hopefully this update will resolve the remaining questions about the
specification and the proposed implementation, and will allow us to
focus on refining the API.

Yury


PEP: 550
Title: Execution Context
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>,
        Elvis Pranskevichus <elvis@magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2017
Python-Version: 3.7
Post-History: 11-Aug-2017, 15-Aug-2017, 18-Aug-2017, 25-Aug-2017


Abstract
========

This PEP adds a new generic mechanism of ensuring consistent access
to non-local state in the context of out-of-order execution, such
as in Python generators and coroutines.

Thread-local storage, such as ``threading.local()``, is inadequate
for programs that execute concurrently in the same OS thread.  This
PEP proposes a solution to this problem.


Rationale
=========

Prior to the advent of asynchronous programming in Python, programs
used OS threads to achieve concurrency.  The need for thread-specific
state was solved by ``threading.local()`` and its C-API equivalent,
``PyThreadState_GetDict()``.

A few examples of where thread-local storage (TLS) is commonly relied
upon:

* Context managers like decimal contexts, ``numpy.errstate``,
  and ``warnings.catch_warnings``.

* Request-related data, such as security tokens and request data in
  web applications, language context for ``gettext``, etc.

* Profiling, tracing, and logging in large code bases.
Unfortunately, TLS does not work well for programs which execute
concurrently in a single thread.  A Python generator is the simplest
example of a concurrent program.  Consider the following::

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y**2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)
    items = list(zip(g1, g2))

The expected value of ``items`` is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.11'), Decimal('0.222222'))]

Rather surprisingly, the actual result is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.111111'), Decimal('0.222222'))]

This is because the decimal context is stored as a thread-local, so
concurrent iteration of the ``fractions()`` generator corrupts the
state.  A similar problem exists with coroutines.

Applications also often need to associate certain data with a given
thread of execution.  For example, a web application server commonly
needs access to the current HTTP request object.

The inadequacy of TLS in asynchronous code has led to the
proliferation of ad-hoc solutions, which are limited in scope and do
not support all required use cases.

The current status quo is that any library (including the standard
library) that relies on TLS is likely to be broken when used in
asynchronous code or with generators (see [3]_ for an example
issue.)

Some languages that support coroutines or generators recommend
passing the context manually as an argument to every function, see
[1]_ for an example.  This approach, however, has limited use for
Python, where there is a large ecosystem that was built to work with
a TLS-like context.  Furthermore, libraries like ``decimal`` or
``numpy`` rely on context implicitly in overloaded operator
implementations.

The .NET runtime, which has support for async/await, has a generic
solution for this problem, called ``ExecutionContext`` (see [2]_).
Goals
=====

The goal of this PEP is to provide a more reliable
``threading.local()`` alternative, which:

* provides the mechanism and the API to fix non-local state issues
  with coroutines and generators;

* has no or negligible performance impact on the existing code or
  the code that will be using the new mechanism, including libraries
  like ``decimal`` and ``numpy``.


High-Level Specification
========================

The full specification of this PEP is broken down into three parts:

* High-Level Specification (this section): the description of the
  overall solution.  We show how it applies to generators and
  coroutines in user code, without delving into implementation
  details.

* Detailed Specification: the complete description of new concepts,
  APIs, and related changes to the standard library.

* Implementation Details: the description and analysis of data
  structures and algorithms used to implement this PEP, as well as
  the necessary changes to CPython.

For the purpose of this section, we define *execution context* as an
opaque container of non-local state that allows consistent access to
its contents in the concurrent execution environment.

A *context variable* is an object representing a value in the
execution context.  A new context variable is created by calling the
``new_context_var()`` function.  A context variable object has two
methods:

* ``lookup()``: returns the value of the variable in the current
  execution context;

* ``set()``: sets the value of the variable in the current
  execution context.
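The behaviour described above can be modelled in plain Python with a
per-thread stack of dicts.  This is an illustrative toy only —
``ToyContextVar`` and the ``_ec()`` helper are our own names, not
part of the proposed API, and the real mechanism is implemented
inside the interpreter:

```python
# Toy model of the proposed semantics: one execution context per
# thread, represented as a stack of logical contexts (plain dicts).
import threading

_state = threading.local()

def _ec():
    # Each thread starts with a fresh stack holding one empty LC.
    if not hasattr(_state, 'ec'):
        _state.ec = [{}]
    return _state.ec

class ToyContextVar:
    def __init__(self, name):
        self.name = name

    def lookup(self):
        # Traverse the stack top-to-bottom until the variable
        # is found; None if it is not set anywhere.
        for lc in reversed(_ec()):
            if self in lc:
                return lc[self]
        return None

    def set(self, value):
        # Writes always go to the topmost logical context.
        _ec()[-1][self] = value

var = ToyContextVar('var')

def sub():
    assert var.lookup() == 'main'
    var.set('sub')

def main():
    var.set('main')
    sub()
    # Single-threaded code sees a single shared logical context,
    # so here the variable behaves like a global.
    assert var.lookup() == 'sub'

main()
```

Because the stack lives in ``threading.local()``, a new thread starts
with an empty context, matching the multithreaded example below.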
Regular Single-threaded Code
----------------------------

In regular, single-threaded code that doesn't involve generators or
coroutines, context variables behave like globals::

    var = new_context_var()

    def sub():
        assert var.lookup() == 'main'
        var.set('sub')

    def main():
        var.set('main')
        sub()
        assert var.lookup() == 'sub'


Multithreaded Code
------------------

In multithreaded code, context variables behave like thread locals::

    var = new_context_var()

    def sub():
        assert var.lookup() is None  # The execution context is empty
                                     # for each new thread.
        var.set('sub')

    def main():
        var.set('main')

        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()

        assert var.lookup() == 'main'


Generators
----------

In generators, changes to context variables are local and are not
visible to the caller, but are visible to the code called by the
generator.  Once set in the generator, the context variable is
guaranteed not to change between iterations::

    var = new_context_var()

    def gen():
        var.set('gen')
        assert var.lookup() == 'gen'
        yield 1
        assert var.lookup() == 'gen'
        yield 2

    def main():
        var.set('main')
        g = gen()
        next(g)
        assert var.lookup() == 'main'
        var.set('main modified')
        next(g)
        assert var.lookup() == 'main modified'

Changes to the caller's context variables are visible to the
generator (unless they were also modified inside the generator)::

    var = new_context_var()

    def gen():
        assert var.lookup() == 'var'
        yield 1
        assert var.lookup() == 'var modified'
        yield 2

    def main():
        g = gen()

        var.set('var')
        next(g)

        var.set('var modified')
        next(g)

Now, let's revisit the decimal precision example from the
`Rationale`_ section, and see how the execution context can improve
the situation::

    import decimal

    # create a new context variable
    decimal_prec = new_context_var()

    # Pre-PEP 550 Decimal relies on TLS for its context.
    # This subclass switches the decimal context storage
    # to the execution context for illustration purposes.
    class MyDecimal(decimal.Decimal):
        def __init__(self, value="0"):
            prec = decimal_prec.lookup()
            if prec is None:
                raise ValueError('could not find decimal precision')
            context = decimal.Context(prec=prec)
            super().__init__(value, context=context)

    def fractions(precision, x, y):
        # Normally, this would be set by a context manager,
        # but for simplicity we do this directly.
        decimal_prec.set(precision)
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)
    items = list(zip(g1, g2))

The value of ``items`` is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.11'), Decimal('0.222222'))]

which matches the expected result.


Coroutines and Asynchronous Tasks
---------------------------------

In coroutines, like in generators, context variable changes are
local and are not visible to the caller::

    import asyncio

    var = new_context_var()

    async def sub():
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'

    async def main():
        var.set('main')
        await sub()
        assert var.lookup() == 'main'

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

To establish the full semantics of execution context in coroutines,
we must also consider *tasks*.  A task is the abstraction used by
*asyncio*, and other similar libraries, to manage the concurrent
execution of coroutines.  In the example above, a task is created
implicitly by the ``run_until_complete()`` function.
``asyncio.wait_for()`` is another example of implicit task
creation::

    async def sub():
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')

        # waiting for sub() directly
        await sub()

        # waiting for sub() with a timeout
        await asyncio.wait_for(sub(), timeout=2)

        var.set('main changed')

Intuitively, we expect the assertion in ``sub()`` to hold true in
both invocations, even though the ``wait_for()`` implementation
actually spawns a task, which runs ``sub()`` concurrently with
``main()``.
Thus, tasks **must** capture a snapshot of the current execution
context at the moment of their creation and use it to execute the
wrapped coroutine whenever that happens.  If this is not done, then
innocuous-looking changes like wrapping a coroutine in a
``wait_for()`` call would cause surprising breakage.  This leads to
the following::

    import asyncio

    var = new_context_var()

    async def sub():
        # Sleeping will make sub() run after
        # `var` is modified in main().
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')
        loop.create_task(sub())  # schedules asynchronous execution
                                 # of sub().
        assert var.lookup() == 'main'
        var.set('main changed')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

In the above code we show how ``sub()``, running in a separate task,
sees the value of ``var`` as it was when ``loop.create_task(sub())``
was called.

Like tasks, the intuitive behaviour of callbacks scheduled with
either ``Loop.call_soon()``, ``Loop.call_later()``, or
``Future.add_done_callback()`` is to also capture a snapshot of the
current execution context at the point of scheduling, and use it to
run the callback::

    current_request = new_context_var()

    def log_error(e):
        logging.error('error when handling request %r',
                      current_request.lookup())

    async def render_response():
        ...

    async def handle_get_request(request):
        current_request.set(request)
        try:
            return await render_response()
        except Exception as e:
            get_event_loop().call_soon(log_error, e)
            return '500 - Internal Server Error'


Detailed Specification
======================

Conceptually, an *execution context* (EC) is a stack of logical
contexts.  There is one EC per Python thread.

A *logical context* (LC) is a mapping of context variables to their
values in that particular LC.

A *context variable* is an object representing a value in the
execution context.  A new context variable object is created by
calling the ``sys.new_context_var(name: str)`` function.
The value of the ``name`` argument is not used by the EC machinery,
but may be used for debugging and introspection.

The context variable object has the following methods and
attributes:

* ``name``: the value passed to ``new_context_var()``.

* ``lookup()``: traverses the execution context top-to-bottom, until
  the variable value is found.  Returns ``None`` if the variable is
  not present in the execution context;

* ``set()``: sets the value of the variable in the topmost logical
  context.


Generators
----------

When created, each generator object has an empty logical context
object stored in its ``__logical_context__`` attribute.  This
logical context is pushed onto the execution context at the
beginning of each generator iteration and popped at the end::

    var1 = sys.new_context_var('var1')
    var2 = sys.new_context_var('var2')

    def gen():
        var1.set('var1-gen')
        var2.set('var2-gen')

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]
        n = nested_gen()  # nested_gen_LC is created
        next(n)
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]

        var1.set('var1-gen-mod')
        var2.set('var2-gen-mod')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
        # ]
        next(n)

    def nested_gen():
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC()
        # ]
        assert var1.lookup() == 'var1-gen'
        assert var2.lookup() == 'var2-gen'

        var1.set('var1-nested-gen')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        yield

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        assert var1.lookup() == 'var1-nested-gen'
        assert var2.lookup() == 'var2-gen-mod'

        yield

    # EC = [outer_LC()]

    g = gen()  # gen_LC is created for the generator object `g`
    list(g)

    # EC = [outer_LC()]

The snippet above shows the state of the execution context stack
throughout the generator lifespan.
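The push/pop discipline described above can be sketched in plain
Python.  ``ToyVar``, ``run_step``, and the list-of-dicts stack are
illustrative stand-ins of our own invention, not the proposed
implementation:

```python
# Sketch of the generator rule: each generator owns a logical
# context (a plain dict here) that is pushed onto the EC for the
# duration of every resumption and popped afterwards.
_EC = [{}]  # execution context: a stack of logical contexts

class ToyVar:
    def lookup(self):
        # Search the stack top-to-bottom.
        for lc in reversed(_EC):
            if self in lc:
                return lc[self]
        return None

    def set(self, value):
        _EC[-1][self] = value

def run_step(gen_lc, step):
    # Emulate one generator resumption: push the generator's
    # logical context, run the step, pop it again.
    _EC.append(gen_lc)
    try:
        return step()
    finally:
        _EC.pop()

var = ToyVar()
var.set('outer')

gen_lc = {}  # stands in for a generator's __logical_context__
run_step(gen_lc, lambda: var.set('gen'))      # set inside the "generator"
assert var.lookup() == 'outer'                # not visible to the caller
assert run_step(gen_lc, var.lookup) == 'gen'  # but persists across steps
```

The last two assertions mirror the spec: the write stays contained in
``gen_lc``, yet survives between resumptions of the same generator.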
contextlib.contextmanager
-------------------------

Earlier, we used the following example::

    import decimal

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_prec')

    # ...

    def fractions(precision, x, y):
        decimal_prec.set(precision)
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

Let's extend it by adding a context manager::

    @contextlib.contextmanager
    def precision_context(prec):
        old_prec = decimal_prec.lookup()
        try:
            decimal_prec.set(prec)
            yield
        finally:
            decimal_prec.set(old_prec)

Unfortunately, this would not work straight away, as the
modification to the ``decimal_prec`` variable is contained to the
``precision_context()`` generator, and therefore will not be visible
inside the ``with`` block::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC becomes [{}, {}, {decimal_prec: precision}] in the
            # *precision_context()* generator,
            # but here the EC is still [{}, {}]

            # raises ValueError('could not find decimal precision')!
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

The way to fix this is to set the generator's
``__logical_context__`` attribute to ``None``.  This will cause the
generator to avoid modifying the execution context stack.

We modify the ``contextlib.contextmanager()`` decorator to set
``genobj.__logical_context__`` to ``None`` to produce well-behaved
context managers::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC = [{}, {decimal_prec: precision}]
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

        # EC becomes [{}, {decimal_prec: None}]


asyncio
-------

``asyncio`` uses ``Loop.call_soon``, ``Loop.call_later``, and
``Loop.call_at`` to schedule the asynchronous execution of a
function.  ``asyncio.Task`` uses ``call_soon()`` to further the
execution of the wrapped coroutine.
We modify ``Loop.call_{at,later,soon}`` to accept the new optional
*execution_context* keyword argument, which defaults to a copy of
the current execution context::

    def call_soon(self, callback, *args, execution_context=None):
        if execution_context is None:
            execution_context = sys.get_execution_context()

        # ... some time later

        sys.run_with_execution_context(
            execution_context, callback, args)

The ``sys.get_execution_context()`` function returns a shallow copy
of the current execution context.  By shallow copy here we mean such
a new execution context that:

* lookups in the copy provide the same results as in the original
  execution context, and

* any changes in the original execution context do not affect the
  copy, and

* any changes to the copy do not affect the original execution
  context.

Either of the following satisfy the copy requirements:

* a new stack with shallow copies of logical contexts;

* a new stack with one squashed logical context.

The ``sys.run_with_execution_context(ec, func, *args, **kwargs)``
function runs ``func(*args, **kwargs)`` with *ec* as the execution
context.  The function performs the following steps:

1. Set *ec* as the current execution context stack in the current
   thread.

2. Push an empty logical context onto the stack.

3. Run ``func(*args, **kwargs)``.

4. Pop the logical context from the stack.

5. Restore the original execution context stack.

6. Return or raise the ``func()`` result.

These steps ensure that *ec* cannot be modified by *func*, which
makes ``run_with_execution_context()`` idempotent.

``asyncio.Task`` is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current execution context snapshot.
            self._exec_context = sys.get_execution_context()
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)

        def _step(self, exc=None):
            ...
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)
            ...
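The copy semantics and the six steps above can be sketched over a
list-of-dicts model of the EC.  All names here are illustrative
stand-ins; the real implementation uses immutable structures, as
described in the Implementation section:

```python
# Sketch of get_execution_context() / run_with_execution_context()
# over a toy EC modelled as a stack (list) of logical contexts
# (dicts) for a single thread.
_EC = [{}]

def get_execution_context():
    # "Shallow copy": a new stack with shallow copies of the LCs,
    # so later changes on either side do not affect the other.
    return [dict(lc) for lc in _EC]

def run_with_execution_context(ec, func, *args, **kwargs):
    global _EC
    saved = _EC
    _EC = ec + [{}]  # steps 1-2: install *ec*, push an empty LC
    try:
        return func(*args, **kwargs)  # step 3
    finally:
        _EC = saved  # steps 4-5: drop the LC, restore the old stack

class ToyVar:
    def lookup(self):
        for lc in reversed(_EC):
            if self in lc:
                return lc[self]
        return None

    def set(self, value):
        _EC[-1][self] = value

var = ToyVar()
var.set('original')
snapshot = get_execution_context()  # what a task would capture
var.set('changed')                  # does not affect the snapshot

seen = run_with_execution_context(snapshot, var.lookup)
assert seen == 'original'        # the callback saw the captured value
assert var.lookup() == 'changed' # the current EC was left untouched
```

Writes made by ``func`` land in the empty LC pushed in step 2 and are
discarded afterwards, which is why *ec* itself cannot be modified.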
Generators Transformed into Iterators
-------------------------------------

Any Python generator can be represented as an equivalent iterator.
Compilers like Cython rely on this axiom.  With respect to the
execution context, such an iterator should behave the same way as
the generator it represents.

This means that there needs to be a Python API to create new logical
contexts and run code with a given logical context.

The ``sys.new_logical_context()`` function creates a new empty
logical context.

The ``sys.run_with_logical_context(lc, func, *args, **kwargs)``
function can be used to run functions in the specified logical
context.  The *lc* can be modified as a result of the call.

The ``sys.run_with_logical_context()`` function performs the
following steps:

1. Push *lc* onto the current execution context stack.

2. Run ``func(*args, **kwargs)``.

3. Pop *lc* from the execution context stack.

4. Return or raise the ``func()`` result.

By using ``new_logical_context()`` and
``run_with_logical_context()``, we can replicate the generator
behaviour like this::

    class Generator:

        def __init__(self):
            self.logical_context = sys.new_logical_context()

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...

Let's see how this pattern can be applied to a real generator::

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_precision')

    def gen_series(n, precision):
        decimal_prec.set(precision)

        for i in range(1, n):
            yield MyDecimal(i) / MyDecimal(3)

    # gen_series is equivalent to the following iterator:

    class Series:

        def __init__(self, n, precision):
            # Create a new empty logical context on creation,
            # like the generators do.
            self.logical_context = sys.new_logical_context()

            # run_with_logical_context() pushes
            # self.logical_context onto the execution context stack,
            # runs self._init, and pops self.logical_context
            # from the stack.
            sys.run_with_logical_context(
                self.logical_context, self._init, n, precision)

        def _init(self, n, precision):
            self.i = 1
            self.n = n
            decimal_prec.set(precision)

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # self.logical_context has preserved the precision
            # set in _init(), so there is no need to set it again.
            result = MyDecimal(self.i) / MyDecimal(3)
            self.i += 1
            return result

For regular iterators such an approach to logical context management
is normally not necessary, and it is recommended to set and restore
context variables directly in ``__next__``::

    class Series:

        def __next__(self):
            old_prec = decimal_prec.lookup()
            try:
                decimal_prec.set(self.precision)
                ...
            finally:
                decimal_prec.set(old_prec)


Asynchronous Generators
-----------------------

The execution context semantics in asynchronous generators does not
differ from that of regular generators and coroutines.


Implementation
==============

Execution context is implemented as an immutable linked list of
logical contexts, where each logical context is an immutable weak
key mapping.  A pointer to the currently active execution context is
stored in the OS thread state::

    +-----------------+
    |                 |  ec
    |  PyThreadState  +--------------------------------------+
    |                 |                                      |
    +-----------------+                                      |
                                                             |
       ec_node              ec_node              ec_node     v
    +------+------+      +------+------+      +------+------+
    | NULL |  lc  |<-----| prev |  lc  |<-----| prev |  lc  |
    +------+--+---+      +------+--+---+      +------+--+---+
              |                    |                    |
        LC    v              LC    v              LC    v
    +-------------+      +-------------+      +-------------+
    |  var1: obj1 |      |    EMPTY    |      |  var1: obj4 |
    |  var2: obj2 |      +-------------+      +-------------+
    |  var3: obj3 |
    +-------------+

The choice of an immutable list of immutable mappings as the
fundamental data structure is motivated by the need to efficiently
implement ``sys.get_execution_context()``, which is to be frequently
used by asynchronous tasks and callbacks.
When the EC is immutable, ``get_execution_context()`` can simply
copy the current execution context *by reference*::

    def get_execution_context(self):
        return PyThreadState_Get().ec

Let's review all possible context modification scenarios:

* The ``ContextVariable.set()`` method is called::

    def ContextVar_set(self, val):
        # See a more complete set() definition
        # in the `Context Variables` section.

        tstate = PyThreadState_Get()
        top_ec_node = tstate.ec
        top_lc = top_ec_node.lc
        new_top_lc = top_lc.set(self, val)
        tstate.ec = ec_node(
            prev=top_ec_node.prev,
            lc=new_top_lc)

* The ``sys.run_with_logical_context()`` is called, in which case
  the passed logical context object is appended to the execution
  context::

    def run_with_logical_context(lc, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* The ``sys.run_with_execution_context()`` is called, in which case
  the current execution context is set to the passed execution
  context with a new empty logical context appended to it::

    def run_with_execution_context(ec, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_lc = sys.new_logical_context()
        new_top_ec_node = ec_node(prev=ec, lc=new_lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* Any of ``genobj.send()``, ``genobj.throw()``, or
  ``genobj.close()`` is called on a ``genobj`` generator, in which
  case the logical context recorded in ``genobj`` is pushed onto the
  stack::

    PyGen_New(PyGenObject *gen):
        gen.__logical_context__ = sys.new_logical_context()

    gen_send(PyGenObject *gen, ...):
        tstate = PyThreadState_Get()

        if gen.__logical_context__ is not None:
            old_top_ec_node = tstate.ec
            new_top_ec_node = ec_node(
                prev=old_top_ec_node,
                lc=gen.__logical_context__)

            try:
                tstate.ec = new_top_ec_node
                return _gen_send_impl(gen, ...)
            finally:
                gen.__logical_context__ = tstate.ec.lc
                tstate.ec = old_top_ec_node
        else:
            return _gen_send_impl(gen, ...)

* Coroutines and asynchronous generators share the implementation
  with generators, and the above changes apply to them as well.

In certain scenarios the EC may need to be squashed to limit the
size of the chain.  For example, consider the following corner
case::

    async def repeat(coro, delay):
        await coro()
        await asyncio.sleep(delay)
        loop.create_task(repeat(coro, delay))

    async def ping():
        print('ping')

    loop = asyncio.get_event_loop()
    loop.create_task(repeat(ping, 1))
    loop.run_forever()

In the above code, the EC chain will grow as long as ``repeat()`` is
called.  Each new task will call
``sys.run_with_execution_context()``, which will append a new
logical context to the chain.  To prevent unbounded growth,
``sys.get_execution_context()`` checks if the chain is longer than a
predetermined maximum, and if it is, squashes the chain into a
single LC::

    def get_execution_context():
        tstate = PyThreadState_Get()

        if tstate.ec_len > EC_LEN_MAX:
            squashed_lc = sys.new_logical_context()

            node = tstate.ec
            while node:
                # The LC.merge() method does not replace
                # existing keys.
                squashed_lc = squashed_lc.merge(node.lc)
                node = node.prev

            return ec_node(prev=NULL, lc=squashed_lc)
        else:
            return tstate.ec


Logical Context
---------------

Logical context is an immutable weak key mapping which has the
following properties with respect to garbage collection:

* ``ContextVar`` objects are strongly-referenced only from the
  application code, not from any of the Execution Context machinery
  or values they point to.  This means that there are no reference
  cycles that could extend their lifespan longer than necessary, or
  prevent their collection by the GC.

* Values put in the Execution Context are guaranteed to be kept
  alive while there is a ``ContextVar`` key referencing them in the
  thread.
* If a ``ContextVar`` is garbage collected, all of its values will
  be removed from all contexts, allowing them to be GCed if needed.

* If a thread has ended its execution, its thread state will be
  cleaned up along with its ``ExecutionContext``, cleaning up all
  values bound to all context variables in the thread.

As discussed earlier, we need ``sys.get_execution_context()`` to be
consistently fast regardless of the size of the execution context,
so logical context is necessarily an immutable mapping.

Choosing ``dict`` for the underlying implementation is suboptimal,
because ``LC.set()`` will cause ``dict.copy()``, which is an O(N)
operation, where *N* is the number of items in the LC.

``get_execution_context()``, when squashing the EC, is an O(M)
operation, where *M* is the total number of context variable values
in the EC.

So, instead of ``dict``, we choose Hash Array Mapped Trie (HAMT) as
the underlying implementation of logical contexts.  (Scala and
Clojure use HAMT to implement high performance immutable collections
[5]_, [6]_.)

With HAMT ``.set()`` becomes an O(log N) operation, and
``get_execution_context()`` squashing is more efficient on average
due to structural sharing in HAMT.

See `Appendix: HAMT Performance Analysis`_ for a more elaborate
analysis of HAMT performance compared to ``dict``.
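The immutability contract that makes reference-copying safe can be
illustrated with a minimal stand-in for the LC mapping.
``ImmutableLC`` is our own name; a real implementation would use a
HAMT so that ``set()`` is O(log N), whereas this dict-copy sketch is
O(N) and only demonstrates the contract itself:

```python
# Minimal immutable-mapping sketch: set() returns a *new* mapping
# and never mutates the original, so an EC captured by a task keeps
# seeing the values from the moment of capture.
class ImmutableLC:
    def __init__(self, items=()):
        self._d = dict(items)

    def set(self, key, value):
        # O(N) copy here; a HAMT achieves the same contract
        # in O(log N) via structural sharing.
        new = dict(self._d)
        new[key] = value
        return ImmutableLC(new.items())

    def get(self, key, default=None):
        return self._d.get(key, default)

lc1 = ImmutableLC()
lc2 = lc1.set('var', 1)
lc3 = lc2.set('var', 2)

assert lc1.get('var') is None  # older versions are never mutated,
assert lc2.get('var') == 1     # so captured snapshots stay stable
assert lc3.get('var') == 2
```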
Context Variables
-----------------

The ``ContextVar.lookup()`` and ``ContextVar.set()`` methods are
implemented as follows (in pseudo-code)::

    class ContextVar:

        def lookup(self):
            tstate = PyThreadState_Get()

            ec_node = tstate.ec
            while ec_node:
                if self in ec_node.lc:
                    return ec_node.lc[self]
                ec_node = ec_node.prev

            return None

        def set(self, value):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is not None:
                top_lc = top_ec_node.lc
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=top_ec_node.prev,
                    lc=new_top_lc)
            else:
                top_lc = sys.new_logical_context()
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=NULL,
                    lc=new_top_lc)

For efficient access in performance-sensitive code paths, such as in
``numpy`` and ``decimal``, we add a cache to
``ContextVar.lookup()``, making it an O(1) operation when the cache
is hit.  The cache key is composed from the following:

* The new ``uint64_t PyThreadState->unique_id``, which is a globally
  unique thread state identifier.  It is computed from the new
  ``uint64_t PyInterpreterState->ts_counter``, which is incremented
  whenever a new thread state is created.

* The ``uint64_t ContextVar->version`` counter, which is incremented
  whenever the context variable value is changed in any logical
  context in any thread.

The cache is then implemented as follows::

    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def lookup(self):
            tstate = PyThreadState_Get()

            if (self.last_tstate_id == tstate.unique_id and
                    self.last_version == self.version):
                return self.last_value

            value = self._lookup_uncached()

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_version = self.version

            return value

Note that ``last_value`` is a borrowed reference.  The assumption is
that if the version checks are fine, the object will be alive.  This
allows the values of context variables to be properly garbage
collected.
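The version-check fast path can be exercised in a toy
single-threaded model.  ``CachedVar`` is an illustrative name, and
the real cache key also includes the thread state id, which is
omitted here:

```python
# Toy demonstration of the version-based lookup cache: a lookup is
# served from the cache only while the variable's version counter
# has not changed since the cached value was stored.
class CachedVar:
    def __init__(self):
        self.version = 0
        self._storage = {}         # stands in for the EC machinery
        self._cache = None         # (version, value) pair
        self.uncached_lookups = 0  # instrumentation for the example

    def set(self, value):
        self._storage['value'] = value
        self.version += 1          # invalidates any cached lookup

    def lookup(self):
        if self._cache is not None and self._cache[0] == self.version:
            return self._cache[1]  # O(1) fast path
        # Slow path: in the real design, walk the EC chain.
        self.uncached_lookups += 1
        value = self._storage.get('value')
        self._cache = (self.version, value)
        return value

var = CachedVar()
var.set(10)
assert var.lookup() == 10 and var.lookup() == 10
assert var.uncached_lookups == 1  # the second lookup hit the cache

var.set(11)                       # version bump -> cache miss
assert var.lookup() == 11
assert var.uncached_lookups == 2
```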
This generic caching approach is similar to what the current C
implementation of ``decimal`` does to cache the current decimal
context, and has similar performance characteristics.


Performance Considerations
==========================

Tests of the reference implementation based on the prior revisions
of this PEP have shown 1-2% slowdown on generator microbenchmarks
and no noticeable difference in macrobenchmarks.

The performance of non-generator and non-async code is not affected
by this PEP.


Summary of the New APIs
=======================

Python
------

The following new Python APIs are introduced by this PEP:

1. The ``sys.new_context_var(name: str)`` function to create
   ``ContextVar`` objects.

2. The ``ContextVar`` object, which has:

   * the read-only ``.name`` attribute,

   * the ``.lookup()`` method, which returns the value of the
     variable in the current execution context;

   * the ``.set()`` method, which sets the value of the variable in
     the current execution context.

3. The ``sys.get_execution_context()`` function, which returns a
   copy of the current execution context.

4. The ``sys.new_execution_context()`` function, which returns a new
   empty execution context.

5. The ``sys.new_logical_context()`` function, which returns a new
   empty logical context.

6. The ``sys.run_with_execution_context(ec: ExecutionContext,
   func, *args, **kwargs)`` function, which runs *func* with the
   provided execution context.

7. The ``sys.run_with_logical_context(lc: LogicalContext,
   func, *args, **kwargs)`` function, which runs *func* with the
   provided logical context on top of the current execution context.


C API
-----

1. ``PyContextVar * PyContext_NewVar(char *desc)``: create a
   ``PyContextVar`` object.

2. ``PyObject * PyContext_LookupVar(PyContextVar *)``: return the
   value of the variable in the current execution context.

3. ``int PyContext_SetVar(PyContextVar *, PyObject *)``: set the
   value of the variable in the current execution context.

4. ``PyLogicalContext * PyLogicalContext_New()``: create a new empty
   ``PyLogicalContext``.

5. ``PyExecutionContext * PyExecutionContext_New()``: create a new
   empty ``PyExecutionContext``.

6. ``PyExecutionContext * PyExecutionContext_Get()``: return the
   current execution context.

7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the
   passed EC object as the current one for the active thread state.

8. ``int PyExecutionContext_SetWithLogicalContext(PyExecutionContext *,
   PyLogicalContext *)``: allows implementing the
   ``sys.run_with_logical_context`` Python API.


Design Considerations
=====================

Should ``PyThreadState_GetDict()`` use the execution context?
-------------------------------------------------------------

No.  ``PyThreadState_GetDict`` is based on TLS, and changing its
semantics will break backwards compatibility.


PEP 521
-------

:pep:`521` proposes an alternative solution to the problem, which
extends the context manager protocol with two new methods:
``__suspend__()`` and ``__resume__()``.  Similarly, the asynchronous
context manager protocol is also extended with ``__asuspend__()``
and ``__aresume__()``.

This allows implementing context managers that manage non-local
state, which behave correctly in generators and coroutines.

For example, consider the following context manager, which uses
execution context::

    class Context:

        def __init__(self):
            self.var = new_context_var('var')

        def __enter__(self):
            self.old_x = self.var.lookup()
            self.var.set('something')

        def __exit__(self, *err):
            self.var.set(self.old_x)

An equivalent implementation with PEP 521::

    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x

The downside of this approach is the addition of significant new
complexity to the context manager protocol and the interpreter
implementation.
This approach is also likely to negatively impact the performance of generators and coroutines.

Additionally, the solution in :pep:`521` is limited to context managers, and does not provide any mechanism to propagate state in asynchronous tasks and callbacks.

Can Execution Context be implemented outside of CPython?
--------------------------------------------------------

No. Proper generator behaviour with respect to the execution context requires changes to the interpreter.

Should we update sys.displayhook and other APIs to use EC?
----------------------------------------------------------

APIs like redirecting stdout by overwriting ``sys.stdout``, or specifying new exception display hooks by overwriting the ``sys.displayhook`` function, affect the whole Python process **by design**. Their users assume that the effect of changing them will be visible across OS threads. Therefore we cannot simply make these APIs use the new Execution Context.

That said, we think it is possible to design new APIs that will be context-aware, but that is outside of the scope of this PEP.

Greenlets
---------

Greenlet is an alternative implementation of cooperative scheduling for Python. Although the greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts.

Conceptually, the behaviour of greenlets is very similar to that of generators, which means that similar changes around greenlet entry and exit can be made to add support for execution contexts.

Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.

Appendix: HAMT Performance Analysis
===================================

.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1. Benchmark code can be found here: [9]_.

The above chart demonstrates that:

* HAMT displays near O(1) performance for all benchmarked dictionary sizes.
* ``dict.copy()`` becomes very slow around 100 items.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2. Benchmark code can be found here: [10]_.

Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based immutable mapping. HAMT lookup time is 30-40% slower than Python dict lookups on average, which is a very good result, considering that the latter is very well optimized.

There is research [8]_ showing that there are further possible improvements to the performance of HAMT.

The reference implementation of HAMT for CPython can be found here: [7]_.

Acknowledgments
===============

Thanks to Victor Petrovykh for countless discussions around the topic, and for PEP proofreading and edits.

Thanks to Nathaniel Smith for proposing the ``ContextVar`` design [17]_ [18]_, for pushing the PEP towards a more complete design, and for coming up with the idea of having a stack of contexts in the thread state.

Thanks to Nick Coghlan for numerous suggestions and ideas on the mailing list, and for coming up with a case that caused the complete rewrite of the initial PEP version [19]_.

Version History
===============

1. Initial revision, posted on 11-Aug-2017 [20]_.

2. V2 posted on 15-Aug-2017 [21]_.

   The fundamental limitation that caused a complete redesign of the first version was that it was not possible to implement an iterator that would interact with the EC in the same way as generators (see [19]_.)

   Version 2 was a complete rewrite, introducing new terminology (Local Context, Execution Context, Context Item) and new APIs.

3. V3 posted on 18-Aug-2017 [22]_. Updates:

   * Local Context was renamed to Logical Context. The term "local" was ambiguous and conflicted with local name scopes.

   * Context Item was renamed to Context Key; see the thread with Nick Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.

   * The Context Item get cache design was adjusted, per Nathaniel Smith's idea in [25]_.
   * Coroutines are created without a Logical Context; the ceval loop no longer needs to special-case the ``await`` expression (proposed by Nick Coghlan in [24]_.)

4. V4 posted on 25-Aug-2017: the current version.

   * The specification section has been completely rewritten.

   * Context Key was renamed to Context Var.

   * Removed the distinction between generators and coroutines with respect to logical context isolation.

References
==========

.. [1] https://blog.golang.org/context

.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.a...

.. [3] https://github.com/numpy/numpy/issues/9444

.. [4] http://bugs.python.org/issue31179

.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie

.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap...

.. [7] https://github.com/1st1/cpython/tree/hamt

.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf

.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd

.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e

.. [11] https://github.com/1st1/cpython/tree/pep550

.. [12] https://www.python.org/dev/peps/pep-0492/#async-await

.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.p...

.. [14] https://github.com/MagicStack/pgbench

.. [15] https://github.com/python/performance

.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c

.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html

.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html

.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

.. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01...

.. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c175...

.. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e7...

.. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html

..
[24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html

.. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html

Copyright
=========

This document has been placed in the public domain.

On 26.08.2017 04:19, Ethan Furman wrote:
Why not the same interface as thread-local storage? This has been the question which bothered me from the beginning of PEP 550. I don't understand what inventing a new way of access buys us here. Python has featured regular attribute access for years. It's even simpler than method-based access.

Best,
Sven

On Sat, Aug 26, 2017 at 9:33 AM, Sven R. Kunze <srkunze@mail.de> wrote: [..]
This was covered at length in these threads: https://mail.python.org/pipermail/python-ideas/2017-August/046888.html https://mail.python.org/pipermail/python-ideas/2017-August/046889.html I forgot to add a subsection to "Design Consideration" with a summary of that thread. Will be fixed in the next revision. Yury

On Mon, Aug 28, 2017 at 6:19 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
And it should not be trivial, as the PEP 550 semantics is different from TLS. Using PEP 550 instead of TLS should be carefully evaluated. Please also see this: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int... Yury
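[The difference Yury points at can be made concrete with a small runnable sketch (editorial illustration, not from the PEP): with ``threading.local()``, a value set inside a generator leaks out to the caller as soon as the generator yields, which is exactly the behaviour PEP 550 changes. The names ``tls`` and ``gen`` are made up for the demo.]

```python
import threading

tls = threading.local()

def gen():
    # With TLS, a write made inside the generator is immediately
    # visible to the caller once the generator yields, because the
    # storage is per-OS-thread, not per logical context.
    tls.x = 'gen'
    yield
    yield

tls.x = 'main'
g = gen()
next(g)

# Under the PEP's proposed semantics, a context variable set inside
# the generator would still read 'main' here; with TLS the
# generator's write has leaked out:
assert tls.x == 'gen'
```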

On Fri, Aug 25, 2017 at 10:19 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
All in all, I like it. Nice job.
Thanks!
ContextVar.set(value) method writes the `value` to the *topmost LC*. ContextVar.lookup() method *traverses the stack* until it finds the LC that has a value. "get()" does not reflect this subtle semantics difference. Yury
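[The set/lookup asymmetry Yury describes can be modelled in a few lines of toy Python (a sketch only; the real proposal uses immutable mappings, and ``ec``, ``set_var`` and ``lookup_var`` are invented names standing in for the proposed APIs):]

```python
# Toy model: the execution context as a stack (list) of logical
# contexts (plain dicts), bottom ... top.
ec = [{'var': 'outer'}, {}]

def set_var(name, value):
    # set() writes only to the *topmost* logical context.
    ec[-1][name] = value

def lookup_var(name):
    # lookup() traverses the stack top-to-bottom and returns the
    # first value found, or None if the variable is not set anywhere.
    for lc in reversed(ec):
        if name in lc:
            return lc[name]
    return None

assert lookup_var('var') == 'outer'   # found in an outer LC
set_var('var', 'inner')
assert lookup_var('var') == 'inner'   # the topmost LC now shadows it
assert ec[0]['var'] == 'outer'        # the outer LC is untouched
```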

On 08/26/2017 09:25 AM, Yury Selivanov wrote:
On Fri, Aug 25, 2017 at 10:19 PM, Ethan Furman wrote:
A good point; however, ChainMap, which behaves similarly as far as lookup goes, uses "get" and does not have a "lookup" method. I think we lose more than we gain by changing that method name. -- ~Ethan~
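[Ethan's ``ChainMap`` analogy is easy to verify: reads search the whole chain, while writes go only to the first mapping, which mirrors the lookup/set asymmetry being discussed.]

```python
from collections import ChainMap

top, bottom = {}, {'var': 'outer'}
cm = ChainMap(top, bottom)

# get() searches every mapping in the chain...
assert cm.get('var') == 'outer'

# ...but writes go to the first mapping only.
cm['var'] = 'inner'
assert top == {'var': 'inner'}
assert bottom == {'var': 'outer'}   # the underlying mapping is untouched
```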

On 26.08.2017 19:23, Yury Selivanov wrote:
I like "get" more. ;-) Best, Sven PS: This might be a result of still leaning towards attribute access despite the discussion you referenced. I still don't think complicating and reinventing terminology (which basically results in API names) buys us much. And I am still with Ethan, a context stack is just a ChainMap. Renaming basic methods won't hide that fact. That's my only criticism of the PEP. The rest is fine and useful.

On 27 August 2017 at 03:23, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I don't think "we may want to add extra parameters" is a good reason to omit a conventional `get()` method - I think it's a reason to offer a separate API to handle use cases where the question of *where* the var is set matters (for example, `my_var.is_set()` would indicate whether or not `my_var.set()` has been called in the current logical context without requiring a parameter check for normal lookups that don't care). Cheers, Nick. P.S. And I say that as a reader who correctly guessed why you had changed the method name in the current iteration of the proposal. I'm sympathetic to those reasons, but I think sticking with the conventional API will make this one easier to learn and use :) -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
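[Nick's hypothetical ``is_set()`` is simple to express on a toy stack-of-dicts model (illustrative only; ``ec``, ``is_set`` and ``get`` are invented stand-ins, not proposed APIs): it checks only the topmost logical context, while a conventional ``get()`` searches the whole stack.]

```python
# Toy execution context: a stack of dicts, bottom ... top.
ec = [{'var': 'outer'}, {}]

def is_set(name):
    # True only if the variable was set in the *current*
    # (topmost) logical context.
    return name in ec[-1]

def get(name, default=None):
    # Conventional get(): searches the whole stack top-to-bottom.
    for lc in reversed(ec):
        if name in lc:
            return lc[name]
    return default

assert get('var') == 'outer'   # visible via normal lookup...
assert not is_set('var')       # ...but not set in the current LC
ec[-1]['var'] = 'inner'
assert is_set('var')
```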

On Tue, Aug 29, 2017 at 5:01 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: [..]
Yeah, I agree. We'll switch lookup -> get in the next iteration. Guido's parallel with getattr/setattr/delattr is also useful. getattr can also lookup the attribute in base classes, but we still call it "get". Yury
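[The ``getattr`` parallel Yury mentions is checkable today: attribute lookup searches outward through the MRO, while ``setattr`` writes to the object itself, yet both keep the conventional get/set names.]

```python
class Base:
    x = 'base'

class Child(Base):
    pass

obj = Child()
assert getattr(obj, 'x') == 'base'   # found by searching the MRO
setattr(obj, 'x', 'child')           # written to the instance dict
assert getattr(obj, 'x') == 'child'  # the instance now shadows Base
assert Base.x == 'base'              # the base class is untouched
```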

I agree with David; this PEP has really gotten to a great place and the new organization makes it much easier to understand.
On Aug 25, 2017, at 22:19, Ethan Furman <ethan@stoneleaf.us> wrote:
Why "lookup" and not "get"? Many APIs use "get" and its functionality is well understood.
I have the same question as Sven as to why we can’t have attribute access semantics. I probably asked that before, and you probably answered, so maybe if there’s a specific reason why this can’t be supported, the PEP should include a “rejected ideas” section explaining the choice. That said, if we have to use method lookup, then I agree that `.get()` is a better choice than `.lookup()`. But in that case, would it be possible to add an optional `default=None` argument so that you can specify a marker object for a missing value? I worry that None might be a valid value in some cases, but that currently can’t be distinguished from “missing”. I’d also like a debugging interface, such that I can ask “context_var.get()” and get some easy diagnostics about the resolution order. Cheers, -Barry

On Sat, Aug 26, 2017 at 12:30 PM, Barry Warsaw <barry@python.org> wrote:
Elvis just added it: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int...
That said, if we have to use method lookup, then I agree that `.get()` is a better choice than `.lookup()`. But in that case, would it be possible to add an optional `default=None` argument so that you can specify a marker object for a missing value? I worry that None might be a valid value in some cases, but that currently can’t be distinguished from “missing”.
Nathaniel has a use case where he needs to know if the value is in the topmost LC or not. One way to address that need is to have the following signature for lookup(): lookup(*, default=None, traverse=True) IMO "lookup" is a slightly better name in this particular context. Yury
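[The signature Yury floats can be sketched over the same toy stack-of-dicts model (illustrative only; the real proposal is not a free function over a list): ``traverse=False`` restricts the search to the topmost logical context.]

```python
# Toy execution context: a stack of dicts, bottom ... top.
ec = [{'var': 'outer'}, {}]

def lookup(name, *, default=None, traverse=True):
    # traverse=True searches the whole stack top-to-bottom;
    # traverse=False consults only the topmost logical context.
    lcs = reversed(ec) if traverse else [ec[-1]]
    for lc in lcs:
        if name in lc:
            return lc[name]
    return default

assert lookup('var') == 'outer'                    # full-stack search
assert lookup('var', traverse=False) is None       # topmost LC only
assert lookup('missing', default='dflt') == 'dflt'
```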

On Aug 26, 2017, at 14:15, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Elvis just added it: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int...
Thanks, that’s exactly what I was looking for. Great summary of the issue.
Given that signature (which +1), I agree. You could add keywords for debugging lookup fairly easily too. Cheers, -Barry

I'm convinced by the new section explaining why a single value is better than a namespace. Nonetheless, it would feel more "Pythonic" to me to create a property `ContextVariable.val` whose getter and setter was `.lookup()` and `.set()` (or maybe `._lookup()` and `._set()`). Lookup might require a more complex call signature in rare cases, but the large majority of the time it would simply be `var.val`, and that should be the preferred API IMO. That provides a nice parallel between `var.name` and `var.val` also. On Sat, Aug 26, 2017 at 11:22 AM, Barry Warsaw <barry@python.org> wrote:
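[The property-based spelling suggested above could be layered on top of a lookup()/set() style API by users; a hypothetical sketch (``_FakeVar`` is a stand-in for the proposed ``ContextVar``, and ``VarProxy`` is invented — only the property idea is the point):]

```python
class _FakeVar:
    # Minimal stand-in exposing the proposed method-based API.
    def __init__(self, name):
        self.name = name
        self._value = None

    def lookup(self):
        return self._value

    def set(self, value):
        self._value = value

class VarProxy:
    # Wraps lookup()/set() behind an attribute-style `.val`.
    def __init__(self, var):
        self._var = var

    @property
    def val(self):
        return self._var.lookup()

    @val.setter
    def val(self, value):
        self._var.set(value)

v = VarProxy(_FakeVar('var'))
v.val = 'something'
assert v.val == 'something'
```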

This is now looking really good and I can understand it. One question though. Sometimes creation of a context variable is done with a name argument, other times not. E.g.

    var1 = new_context_var('var1')
    var = new_context_var()

The signature is given as:

    sys.new_context_var(name: str)

But it seems like it should be:

    sys.new_context_var(name: Optional[str]=None)

On Aug 25, 2017 3:35 PM, "Yury Selivanov" <yselivanov.ml@gmail.com> wrote:
Goals
=====

The goal of this PEP is to provide a more reliable ``threading.local()`` alternative, which:

* provides the mechanism and the API to fix non-local state issues with coroutines and generators;

* has no or negligible performance impact on the existing code or the code that will be using the new mechanism, including libraries like ``decimal`` and ``numpy``.

High-Level Specification
========================

The full specification of this PEP is broken down into three parts:

* High-Level Specification (this section): the description of the overall solution. We show how it applies to generators and coroutines in user code, without delving into implementation details.

* Detailed Specification: the complete description of new concepts, APIs, and related changes to the standard library.

* Implementation Details: the description and analysis of data structures and algorithms used to implement this PEP, as well as the necessary changes to CPython.

For the purpose of this section, we define *execution context* as an opaque container of non-local state that allows consistent access to its contents in the concurrent execution environment.

A *context variable* is an object representing a value in the execution context. A new context variable is created by calling the ``new_context_var()`` function.
A context variable object has two methods:

* ``lookup()``: returns the value of the variable in the current execution context;

* ``set()``: sets the value of the variable in the current execution context.

Regular Single-threaded Code
----------------------------

In regular, single-threaded code that doesn't involve generators or coroutines, context variables behave like globals::

    var = new_context_var()

    def sub():
        assert var.lookup() == 'main'
        var.set('sub')

    def main():
        var.set('main')
        sub()
        assert var.lookup() == 'sub'

Multithreaded Code
------------------

In multithreaded code, context variables behave like thread locals::

    var = new_context_var()

    def sub():
        # The execution context is empty for each new thread.
        assert var.lookup() is None
        var.set('sub')

    def main():
        var.set('main')

        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()

        assert var.lookup() == 'main'

Generators
----------

In generators, changes to context variables are local and are not visible to the caller, but are visible to the code called by the generator.
Once set in the generator, the context variable is guaranteed not to change between iterations::

    var = new_context_var()

    def gen():
        var.set('gen')
        assert var.lookup() == 'gen'
        yield 1
        assert var.lookup() == 'gen'
        yield 2

    def main():
        var.set('main')
        g = gen()
        next(g)
        assert var.lookup() == 'main'
        var.set('main modified')
        next(g)
        assert var.lookup() == 'main modified'

Changes to caller's context variables are visible to the generator (unless they were also modified inside the generator)::

    var = new_context_var()

    def gen():
        assert var.lookup() == 'var'
        yield 1
        assert var.lookup() == 'var modified'
        yield 2

    def main():
        g = gen()

        var.set('var')
        next(g)

        var.set('var modified')
        next(g)

Now, let's revisit the decimal precision example from the `Rationale`_ section, and see how the execution context can improve the situation::

    import decimal

    decimal_prec = new_context_var()  # create a new context variable

    # Pre-PEP 550 Decimal relies on TLS for its context.
    # This subclass switches the decimal context storage
    # to the execution context for illustration purposes.
    #
    class MyDecimal(decimal.Decimal):
        def __init__(self, value="0"):
            prec = decimal_prec.lookup()
            if prec is None:
                raise ValueError('could not find decimal precision')
            context = decimal.Context(prec=prec)
            super().__init__(value, context=context)

    def fractions(precision, x, y):
        # Normally, this would be set by a context manager,
        # but for simplicity we do this directly.
        decimal_prec.set(precision)
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    items = list(zip(g1, g2))

The value of ``items`` is::

    [(Decimal('0.33'), Decimal('0.666667')),
     (Decimal('0.11'), Decimal('0.222222'))]

which matches the expected result.
Coroutines and Asynchronous Tasks
---------------------------------

In coroutines, like in generators, context variable changes are local and are not visible to the caller::

    import asyncio

    var = new_context_var()

    async def sub():
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'

    async def main():
        var.set('main')
        await sub()
        assert var.lookup() == 'main'

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

To establish the full semantics of execution context in coroutines, we must also consider *tasks*. A task is the abstraction used by *asyncio*, and other similar libraries, to manage the concurrent execution of coroutines. In the example above, a task is created implicitly by the ``run_until_complete()`` function. ``asyncio.wait_for()`` is another example of implicit task creation::

    async def sub():
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')

        # waiting for sub() directly
        await sub()

        # waiting for sub() with a timeout
        await asyncio.wait_for(sub(), timeout=2)

        var.set('main changed')

Intuitively, we expect the assertion in ``sub()`` to hold true in both invocations, even though the ``wait_for()`` implementation actually spawns a task, which runs ``sub()`` concurrently with ``main()``.

Thus, tasks **must** capture a snapshot of the current execution context at the moment of their creation and use it to execute the wrapped coroutine whenever that happens. If this is not done, then innocuous-looking changes like wrapping a coroutine in a ``wait_for()`` call would cause surprising breakage. This leads to the following::

    import asyncio

    var = new_context_var()

    async def sub():
        # Sleeping will make sub() run after
        # `var` is modified in main().
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')
        loop.create_task(sub())  # schedules asynchronous execution
                                 # of sub().
        assert var.lookup() == 'main'
        var.set('main changed')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

In the above code we show how ``sub()``, running in a separate task, sees the value of ``var`` as it was when ``loop.create_task(sub())`` was called.

Like tasks, the intuitive behaviour of callbacks scheduled with either ``Loop.call_soon()``, ``Loop.call_later()``, or ``Future.add_done_callback()`` is to also capture a snapshot of the current execution context at the point of scheduling, and use it to run the callback::

    current_request = new_context_var()

    def log_error(e):
        logging.error('error when handling request %r',
                      current_request.lookup())

    async def render_response():
        ...

    async def handle_get_request(request):
        current_request.set(request)
        try:
            return await render_response()
        except Exception as e:
            get_event_loop().call_soon(log_error, e)
            return '500 - Internal Server Error'

Detailed Specification
======================

Conceptually, an *execution context* (EC) is a stack of logical contexts. There is one EC per Python thread.

A *logical context* (LC) is a mapping of context variables to their values in that particular LC.

A *context variable* is an object representing a value in the execution context. A new context variable object is created by calling the ``sys.new_context_var(name: str)`` function. The value of the ``name`` argument is not used by the EC machinery, but may be used for debugging and introspection.

The context variable object has the following methods and attributes:

* ``name``: the value passed to ``new_context_var()``.

* ``lookup()``: traverses the execution context top-to-bottom, until the variable value is found. Returns ``None`` if the variable is not present in the execution context;

* ``set()``: sets the value of the variable in the topmost logical context.

Generators
----------

When created, each generator object has an empty logical context object stored in its ``__logical_context__`` attribute.
This logical context is pushed onto the execution context at the beginning of each generator iteration and popped at the end::

    var1 = sys.new_context_var('var1')
    var2 = sys.new_context_var('var2')

    def gen():
        var1.set('var1-gen')
        var2.set('var2-gen')

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]
        n = nested_gen()  # nested_gen_LC is created
        next(n)
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]

        var1.set('var1-gen-mod')
        var2.set('var2-gen-mod')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
        # ]
        next(n)

    def nested_gen():
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC()
        # ]
        assert var1.lookup() == 'var1-gen'
        assert var2.lookup() == 'var2-gen'

        var1.set('var1-nested-gen')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        yield

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        assert var1.lookup() == 'var1-nested-gen'
        assert var2.lookup() == 'var2-gen-mod'

        yield

    # EC = [outer_LC()]

    g = gen()  # gen_LC is created for the generator object `g`

    list(g)

    # EC = [outer_LC()]

The snippet above shows the state of the execution context stack throughout the generator lifespan.

contextlib.contextmanager
-------------------------

Earlier, we've used the following example::

    import decimal

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_prec')

    # ...
    def fractions(precision, x, y):
        decimal_prec.set(precision)
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

Let's extend it by adding a context manager::

    @contextlib.contextmanager
    def precision_context(prec):
        old_prec = decimal_prec.lookup()
        try:
            decimal_prec.set(prec)
            yield
        finally:
            decimal_prec.set(old_prec)

Unfortunately, this would not work straight away, as the modification to the ``decimal_prec`` variable is contained to the ``precision_context()`` generator, and therefore will not be visible inside the ``with`` block::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC becomes [{}, {}, {decimal_prec: precision}] in the
            # *precision_context()* generator,
            # but here the EC is still [{}, {}]

            # raises ValueError('could not find decimal precision')!
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

The way to fix this is to set the generator's ``__logical_context__`` attribute to ``None``. This will cause the generator to avoid modifying the execution context stack.

We modify the ``contextlib.contextmanager()`` decorator to set ``genobj.__logical_context__`` to ``None`` to produce well-behaved context managers::

    def fractions(precision, x, y):
        # EC = [{}, {}]

        with precision_context(precision):
            # EC = [{}, {decimal_prec: precision}]

            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

        # EC becomes [{}, {decimal_prec: None}]

asyncio
-------

``asyncio`` uses ``Loop.call_soon``, ``Loop.call_later``, and ``Loop.call_at`` to schedule the asynchronous execution of a function. ``asyncio.Task`` uses ``call_soon()`` to further the execution of the wrapped coroutine.

We modify ``Loop.call_{at,later,soon}`` to accept the new optional *execution_context* keyword argument, which defaults to a copy of the current execution context::

    def call_soon(self, callback, *args, execution_context=None):
        if execution_context is None:
            execution_context = sys.get_execution_context()

        # ...
        # some time later
        sys.run_with_execution_context(
            execution_context, callback, *args)

The ``sys.get_execution_context()`` function returns a shallow copy of the current execution context. By "shallow copy" we mean a new execution context such that:

* lookups in the copy provide the same results as in the original execution context, and

* any changes in the original execution context do not affect the copy, and

* any changes to the copy do not affect the original execution context.

Either of the following satisfies the copy requirements:

* a new stack with shallow copies of logical contexts;

* a new stack with one squashed logical context.

The ``sys.run_with_execution_context(ec, func, *args, **kwargs)`` function runs ``func(*args, **kwargs)`` with *ec* as the execution context. The function performs the following steps:

1. Set *ec* as the current execution context stack in the current thread.

2. Push an empty logical context onto the stack.

3. Run ``func(*args, **kwargs)``.

4. Pop the logical context from the stack.

5. Restore the original execution context stack.

6. Return or raise the ``func()`` result.

These steps ensure that *ec* cannot be modified by *func*, which makes ``run_with_execution_context()`` idempotent.

``asyncio.Task`` is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current execution context snapshot.
            self._exec_context = sys.get_execution_context()
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)

        def _step(self, exc=None):
            ...
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)
            ...

Generators Transformed into Iterators
-------------------------------------

Any Python generator can be represented as an equivalent iterator. Compilers like Cython rely on this axiom. With respect to the execution context, such an iterator should behave the same way as the generator it represents.
This means that there needs to be a Python API to create new logical contexts and to run code with a given logical context.

The ``sys.new_logical_context()`` function creates a new empty logical context.

The ``sys.run_with_logical_context(lc, func, *args, **kwargs)`` function can be used to run functions in the specified logical context. The *lc* can be modified as a result of the call.

The ``sys.run_with_logical_context()`` function performs the following steps:

1. Push *lc* onto the current execution context stack.

2. Run ``func(*args, **kwargs)``.

3. Pop *lc* from the execution context stack.

4. Return or raise the ``func()`` result.

By using ``new_logical_context()`` and ``run_with_logical_context()``, we can replicate the generator behaviour like this::

    class Generator:

        def __init__(self):
            self.logical_context = sys.new_logical_context()

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...

Let's see how this pattern can be applied to a real generator::

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_precision')

    def gen_series(n, precision):
        decimal_prec.set(precision)

        for i in range(1, n):
            yield MyDecimal(i) / MyDecimal(3)

    # gen_series is equivalent to the following iterator:

    class Series:

        def __init__(self, n, precision):
            # Create a new empty logical context on creation,
            # like the generators do.
            self.logical_context = sys.new_logical_context()

            # run_with_logical_context() pushes
            # self.logical_context onto the execution context
            # stack, runs self._init, and pops
            # self.logical_context from the stack.
return sys.run_with_logical_context( self.logical_context, self._init, n, precision) def _init(self, n, precision): self.i = 1 self.n = n decimal_prec.set(precision) def __iter__(self): return self def __next__(self): return sys.run_with_logical_context( self.logical_context, self._next_impl) def _next_impl(self): decimal_prec.set(self.precision) result = MyDecimal(self.i) / MyDecimal(3) self.i += 1 return result For regular iterators such approach to logical context management is normally not necessary, and it is recommended to set and restore context variables directly in ``__next__``:: class Series: def __next__(self): old_prec = decimal_prec.lookup() try: decimal_prec.set(self.precision) ... finally: decimal_prec.set(old_prec) Asynchronous Generators ----------------------- The execution context semantics in asynchronous generators does not differ from that of regular generators and coroutines. Implementation ============== Execution context is implemented as an immutable linked list of logical contexts, where each logical context is an immutable weak key mapping. A pointer to the currently active execution context is stored in the OS thread state:: +-----------------+ | | ec | PyThreadState +-------------+ | | | +-----------------+ | | ec_node ec_node ec_node v +------+------+ +------+------+ +------+------+ | NULL | lc |<----| prev | lc |<----| prev | lc | +------+--+---+ +------+--+---+ +------+--+---+ | | | LC v LC v LC v +-------------+ +-------------+ +-------------+ | var1: obj1 | | EMPTY | | var1: obj4 | | var2: obj2 | +-------------+ +-------------+ | var3: obj3 | +-------------+ The choice of the immutable list of immutable mappings as a fundamental data structure is motivated by the need to efficiently implement ``sys.get_execution_context()``, which is to be frequently used by asynchronous tasks and callbacks. 
When the EC is immutable, ``get_execution_context()`` can simply copy the current execution context *by reference*::

    def get_execution_context(self):
        return PyThreadState_Get().ec

Let's review all possible context modification scenarios:

* The ``ContextVariable.set()`` method is called::

      def ContextVar_set(self, val):
          # See a more complete set() definition
          # in the `Context Variables` section.

          tstate = PyThreadState_Get()

          top_ec_node = tstate.ec
          top_lc = top_ec_node.lc
          new_top_lc = top_lc.set(self, val)
          tstate.ec = ec_node(
              prev=top_ec_node.prev, lc=new_top_lc)

* ``sys.run_with_logical_context()`` is called, in which case the passed logical context object is appended to the execution context::

      def run_with_logical_context(lc, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_top_ec_node = tstate.ec
          new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)

          try:
              tstate.ec = new_top_ec_node
              return func(*args, **kwargs)
          finally:
              tstate.ec = old_top_ec_node

* ``sys.run_with_execution_context()`` is called, in which case the current execution context is set to the passed execution context with a new empty logical context appended to it::

      def run_with_execution_context(ec, func, *args, **kwargs):
          tstate = PyThreadState_Get()

          old_top_ec_node = tstate.ec
          new_lc = sys.new_logical_context()
          new_top_ec_node = ec_node(prev=ec, lc=new_lc)

          try:
              tstate.ec = new_top_ec_node
              return func(*args, **kwargs)
          finally:
              tstate.ec = old_top_ec_node

* ``genobj.send()``, ``genobj.throw()``, or ``genobj.close()`` is called on a ``genobj`` generator, in which case the logical context recorded in ``genobj`` is pushed onto the stack::

      PyGen_New(PyGenObject *gen):
          gen.__logical_context__ = sys.new_logical_context()

      gen_send(PyGenObject *gen, ...):
          tstate = PyThreadState_Get()

          if gen.__logical_context__ is not None:
              old_top_ec_node = tstate.ec
              new_top_ec_node = ec_node(
                  prev=old_top_ec_node,
                  lc=gen.__logical_context__)

              try:
                  tstate.ec = new_top_ec_node
                  return _gen_send_impl(gen, ...)
              finally:
                  gen.__logical_context__ = tstate.ec.lc
                  tstate.ec = old_top_ec_node
          else:
              return _gen_send_impl(gen, ...)

* Coroutines and asynchronous generators share the implementation with generators, and the above changes apply to them as well.

In certain scenarios the EC may need to be squashed to limit the size of the chain.  For example, consider the following corner case::

    async def repeat(coro, delay):
        await coro()
        await asyncio.sleep(delay)
        loop.create_task(repeat(coro, delay))

    async def ping():
        print('ping')

    loop = asyncio.get_event_loop()
    loop.create_task(repeat(ping, 1))
    loop.run_forever()

In the above code, the EC chain will grow as long as ``repeat()`` is called.  Each new task will call ``sys.run_with_execution_context()``, which will append a new logical context to the chain.  To prevent unbounded growth, ``sys.get_execution_context()`` checks if the chain is longer than a predetermined maximum, and if it is, squashes the chain into a single LC::

    def get_execution_context():
        tstate = PyThreadState_Get()

        if tstate.ec_len > EC_LEN_MAX:
            squashed_lc = sys.new_logical_context()

            ec_node = tstate.ec
            while ec_node:
                # The LC.merge() method does not replace
                # existing keys.
                squashed_lc = squashed_lc.merge(ec_node.lc)
                ec_node = ec_node.prev

            return ec_node(prev=NULL, lc=squashed_lc)
        else:
            return tstate.ec


Logical Context
---------------

Logical context is an immutable weak key mapping which has the following properties with respect to garbage collection:

* ``ContextVar`` objects are strongly-referenced only from the application code, not from any of the Execution Context machinery or values they point to.  This means that there are no reference cycles that could extend their lifespan longer than necessary, or prevent their collection by the GC.

* Values put in the Execution Context are guaranteed to be kept alive while there is a ``ContextVar`` key referencing them in the thread.
* If a ``ContextVar`` is garbage collected, all of its values will be removed from all contexts, allowing them to be GCed if needed.

* If a thread has ended its execution, its thread state will be cleaned up along with its ``ExecutionContext``, cleaning up all values bound to all context variables in the thread.

As discussed earlier, we need ``sys.get_execution_context()`` to be consistently fast regardless of the size of the execution context, so logical context is necessarily an immutable mapping.

Choosing ``dict`` for the underlying implementation is suboptimal, because ``LC.set()`` will cause ``dict.copy()``, which is an O(N) operation, where *N* is the number of items in the LC.

``get_execution_context()``, when squashing the EC, is an O(M) operation, where *M* is the total number of context variable values in the EC.

So, instead of ``dict``, we choose Hash Array Mapped Trie (HAMT) as the underlying implementation of logical contexts.  (Scala and Clojure use HAMT to implement high performance immutable collections [5]_, [6]_.)

With HAMT ``.set()`` becomes an O(log N) operation, and ``get_execution_context()`` squashing is more efficient on average due to structural sharing in HAMT.

See `Appendix: HAMT Performance Analysis`_ for a more elaborate analysis of HAMT performance compared to ``dict``.
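To make the structural-sharing argument concrete, here is a much-simplified persistent mapping in the spirit of a HAMT.  This is an illustrative sketch (fixed depth, tuple nodes, bucket leaves), not the proposed C implementation: ``set()`` copies only the nodes along one hash path and shares every other subtree with the previous version.

```python
# Toy persistent mapping with HAMT-style path copying.
# Real HAMTs also use bitmaps to compress sparse nodes.

BITS = 5                  # 32-way branching, as in a real HAMT
LEVELS = 4                # fixed depth; leaves handle collisions
FANOUT = 1 << BITS
MASK = FANOUT - 1

def _assoc(node, h, level, key, value):
    if level == LEVELS:
        # Leaf: a small bucket of (key, value) pairs absorbs the
        # remaining hash bits and any collisions.
        pairs = [] if node is None else [kv for kv in node if kv[0] != key]
        return tuple(pairs + [(key, value)])
    idx = (h >> (level * BITS)) & MASK
    children = [None] * FANOUT if node is None else list(node)
    children[idx] = _assoc(children[idx], h, level + 1, key, value)
    # Only the nodes on this one path are copied; every other
    # subtree is shared with the previous version of the map.
    return tuple(children)

def _find(node, h, level, key):
    if node is None:
        raise KeyError(key)
    if level == LEVELS:
        for k, v in node:
            if k == key:
                return v
        raise KeyError(key)
    return _find(node[(h >> (level * BITS)) & MASK], h, level + 1, key)

class PMap:
    """Immutable mapping: set() returns a new map in O(depth) copies."""

    def __init__(self, root=None):
        self._root = root

    def set(self, key, value):
        return PMap(_assoc(self._root, hash(key), 0, key, value))

    def get(self, key):
        return _find(self._root, hash(key), 0, key)
```

Each ``set()`` copies at most ``LEVELS`` fixed-size nodes, which is the O(log N)-style behaviour the PEP relies on, as opposed to the O(N) ``dict.copy()``.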
Context Variables
-----------------

The ``ContextVar.lookup()`` and ``ContextVar.set()`` methods are implemented as follows (in pseudo-code)::

    class ContextVar:

        def lookup(self):
            tstate = PyThreadState_Get()

            ec_node = tstate.ec
            while ec_node:
                if self in ec_node.lc:
                    return ec_node.lc[self]
                ec_node = ec_node.prev

            return None

        def set(self, value):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is not None:
                top_lc = top_ec_node.lc
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=top_ec_node.prev, lc=new_top_lc)
            else:
                top_lc = sys.new_logical_context()
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=NULL, lc=new_top_lc)

For efficient access in performance-sensitive code paths, such as in ``numpy`` and ``decimal``, we add a cache to ``ContextVar.lookup()``, making it an O(1) operation when the cache is hit.  The cache key is composed from the following:

* The new ``uint64_t PyThreadState->unique_id``, which is a globally unique thread state identifier.  It is computed from the new ``uint64_t PyInterpreterState->ts_counter``, which is incremented whenever a new thread state is created.

* The ``uint64_t ContextVar->version`` counter, which is incremented whenever the context variable value is changed in any logical context in any thread.

The cache is then implemented as follows::

    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def lookup(self):
            tstate = PyThreadState_Get()

            if (self.last_tstate_id == tstate.unique_id and
                    self.last_version == self.version):
                return self.last_value

            value = self._lookup_uncached()

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_version = self.version

            return value

Note that ``last_value`` is a borrowed reference.  The assumption is that if the version checks are fine, the object will be alive.  This allows the values of context variables to be properly garbage collected.
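A toy, single-threaded sketch of this version-based cache (the real key also includes ``PyThreadState->unique_id``; all names here are illustrative, and a dict stands in for the EC chain walk):

```python
# Toy sketch of a version-counter lookup cache for a context
# variable.  Single-threaded and illustrative only.

class ToyContextVar:
    def __init__(self, name):
        self.name = name
        self.version = 0            # bumped on every set()
        self._cached_version = -1   # version the cache was filled at
        self._cached_value = None
        self._values = {}           # stand-in for the EC chain

    def _lookup_uncached(self):
        # Stand-in for the O(chain length) walk over logical contexts.
        return self._values.get(self)

    def set(self, value):
        self._values[self] = value
        self.version += 1           # invalidates every cached lookup

    def lookup(self):
        if self._cached_version == self.version:
            return self._cached_value           # O(1) fast path
        value = self._lookup_uncached()         # slow path
        self._cached_value = value
        self._cached_version = self.version
        return value
```

Because ``set()`` bumps ``version``, a stale cache entry can never be returned: any write anywhere forces the next ``lookup()`` down the slow path exactly once.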
This generic caching approach is similar to what the current C implementation of ``decimal`` does to cache the current decimal context, and has similar performance characteristics.


Performance Considerations
==========================

Tests of the reference implementation based on the prior revisions of this PEP have shown 1-2% slowdown on generator microbenchmarks and no noticeable difference in macrobenchmarks.

The performance of non-generator and non-async code is not affected by this PEP.


Summary of the New APIs
=======================

Python
------

The following new Python APIs are introduced by this PEP:

1. The ``sys.new_context_var(name: str='...')`` function to create ``ContextVar`` objects.

2. The ``ContextVar`` object, which has:

   * the read-only ``.name`` attribute,

   * the ``.lookup()`` method, which returns the value of the variable in the current execution context;

   * the ``.set()`` method, which sets the value of the variable in the current execution context.

3. The ``sys.get_execution_context()`` function, which returns a copy of the current execution context.

4. The ``sys.new_execution_context()`` function, which returns a new empty execution context.

5. The ``sys.new_logical_context()`` function, which returns a new empty logical context.

6. The ``sys.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs)`` function, which runs *func* with the provided execution context.

7. The ``sys.run_with_logical_context(lc: LogicalContext, func, *args, **kwargs)`` function, which runs *func* with the provided logical context on top of the current execution context.

C API
-----

1. ``PyContextVar * PyContext_NewVar(char *desc)``: create a ``PyContextVar`` object.

2. ``PyObject * PyContext_LookupVar(PyContextVar *)``: return the value of the variable in the current execution context.

3. ``int PyContext_SetVar(PyContextVar *, PyObject *)``: set the value of the variable in the current execution context.

4. ``PyLogicalContext * PyLogicalContext_New()``: create a new empty ``PyLogicalContext``.

5. ``PyExecutionContext * PyExecutionContext_New()``: create a new empty ``PyExecutionContext``.

6. ``PyExecutionContext * PyExecutionContext_Get()``: return the current execution context.

7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the passed EC object as the current one for the active thread state.

8. ``int PyExecutionContext_SetWithLogicalContext(PyExecutionContext *, PyLogicalContext *)``: used to implement the ``sys.run_with_logical_context`` Python API.


Design Considerations
=====================

Should ``PyThreadState_GetDict()`` use the execution context?
-------------------------------------------------------------

No.  ``PyThreadState_GetDict`` is based on TLS, and changing its semantics will break backwards compatibility.

PEP 521
-------

:pep:`521` proposes an alternative solution to the problem, which extends the context manager protocol with two new methods: ``__suspend__()`` and ``__resume__()``.  Similarly, the asynchronous context manager protocol is also extended with ``__asuspend__()`` and ``__aresume__()``.

This allows implementing context managers that manage non-local state, which behave correctly in generators and coroutines.

For example, consider the following context manager, which uses execution state::

    class Context:

        def __init__(self):
            self.var = sys.new_context_var('var')

        def __enter__(self):
            self.old_x = self.var.lookup()
            self.var.set('something')

        def __exit__(self, *err):
            self.var.set(self.old_x)

An equivalent implementation with PEP 521::

    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x

The downside of this approach is the addition of significant new complexity to the context manager protocol and the interpreter implementation.
This approach is also likely to negatively impact the performance of generators and coroutines.

Additionally, the solution in :pep:`521` is limited to context managers, and does not provide any mechanism to propagate state in asynchronous tasks and callbacks.

Can Execution Context be implemented outside of CPython?
--------------------------------------------------------

No.  Proper generator behaviour with respect to the execution context requires changes to the interpreter.

Should we update sys.displayhook and other APIs to use EC?
----------------------------------------------------------

APIs like redirecting stdout by overwriting ``sys.stdout``, or specifying new exception display hooks by overwriting the ``sys.displayhook`` function, affect the whole Python process **by design**.  Their users assume that the effect of changing them will be visible across OS threads.  Therefore we cannot simply make these APIs use the new Execution Context.

That said, we think it is possible to design new APIs that will be context aware, but that is outside of the scope of this PEP.

Greenlets
---------

Greenlet is an alternative implementation of cooperative scheduling for Python.  Although the greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts.

Conceptually, the behaviour of greenlets is very similar to that of generators, which means that similar changes around greenlet entry and exit can be done to add support for execution context.


Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.


Appendix: HAMT Performance Analysis
===================================

.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1.  Benchmark code can be found here: [9]_.

The above chart demonstrates that:

* HAMT displays near O(1) performance for all benchmarked dictionary sizes.
* ``dict.copy()`` becomes very slow around 100 items.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2.  Benchmark code can be found here: [10]_.

Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based immutable mapping.  HAMT lookup time is 30-40% slower than Python dict lookups on average, which is a very good result, considering that the latter is very well optimized.

There is research [8]_ showing that there are further possible improvements to the performance of HAMT.

The reference implementation of HAMT for CPython can be found here: [7]_.


Acknowledgments
===============

Thanks to Victor Petrovykh for countless discussions around the topic and PEP proofreading and edits.

Thanks to Nathaniel Smith for proposing the ``ContextVar`` design [17]_ [18]_, for pushing the PEP towards a more complete design, and for coming up with the idea of having a stack of contexts in the thread state.

Thanks to Nick Coghlan for numerous suggestions and ideas on the mailing list, and for coming up with a case that caused the complete rewrite of the initial PEP version [19]_.


Version History
===============

1. Initial revision, posted on 11-Aug-2017 [20]_.

2. V2 posted on 15-Aug-2017 [21]_.

   The fundamental limitation that caused a complete redesign of the first version was that it was not possible to implement an iterator that would interact with the EC in the same way as generators (see [19]_.)

   Version 2 was a complete rewrite, introducing new terminology (Local Context, Execution Context, Context Item) and new APIs.

3. V3 posted on 18-Aug-2017 [22]_.  Updates:

   * Local Context was renamed to Logical Context.  The term "local" was ambiguous and conflicted with local name scopes.

   * Context Item was renamed to Context Key; see the thread with Nick Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.

   * Context Item get cache design was adjusted, per Nathaniel Smith's idea in [25]_.
   * Coroutines are created without a Logical Context; the ceval loop no longer needs to special-case the ``await`` expression (proposed by Nick Coghlan in [24]_.)

4. V4 posted on 25-Aug-2017: the current version.

   * The specification section has been completely rewritten.

   * Context Key renamed to Context Var.

   * Removed the distinction between generators and coroutines with respect to logical context isolation.


References
==========

.. [1] https://blog.golang.org/context

.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx

.. [3] https://github.com/numpy/numpy/issues/9444

.. [4] http://bugs.python.org/issue31179

.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie

.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html

.. [7] https://github.com/1st1/cpython/tree/hamt

.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf

.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd

.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e

.. [11] https://github.com/1st1/cpython/tree/pep550

.. [12] https://www.python.org/dev/peps/pep-0492/#async-await

.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py

.. [14] https://github.com/MagicStack/pgbench

.. [15] https://github.com/python/performance

.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c

.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html

.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html

.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

.. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01/pep-0550.rst

.. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c175/pep-0550.rst

.. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e7/pep-0550.rst

.. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html

.. [24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html

.. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html


Copyright
=========

This document has been placed in the public domain.

On Sat, Aug 26, 2017 at 12:56 AM, David Mertz <mertz@gnosis.cx> wrote:
This is now looking really good and I can understand it.
Great!
We were very focused on making the High-level Specification as succinct as possible, omitting some API details that are not important for understanding the semantics. "name" argument is not optional and will be required. If it's optional, people will not provide it, making it very hard to introspect the context when we want it. I guess we'll just update the High-level Specification section to use the correct signature of "new_context_var". Yury

Would it be possible/desirable to make the default a unique string value like a UUID or a stringified counter?

On Sat, Aug 26, 2017 at 1:10 PM, David Mertz <mertz@gnosis.cx> wrote:
Would it be possible/desirable to make the default a unique string value like a UUID or a stringified counter?
Sure, or we could just use the id of the ContextVar. In the end, when we want to introspect the EC while debugging, we would see something like this:

    {
        ContextVar(name='518CDD4F-D676-408F-B968-E144F792D055'): 42,
        ContextVar(name='decimal_context'): DecimalContext(precision=2),
        ContextVar(name='7A44D3BE-F7A1-40B7-BE51-7DFFA7E0E02F'): 'spam'
    }

That's why I think it's easier to force users to always specify the name:

    my_var = sys.new_context_var('my_var')

This is similar to namedtuples, and nobody really complains about them.

Yury

On Sat, Aug 26, 2017 at 11:19 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This is similar to namedtuples, and nobody really complains about them.
FWIW, there are plenty of complaints on python-ideas about this (and never a satisfactory solution). :) That said, I don't think it is as big a deal here since the target audience is much smaller. -eric

On Fri, Aug 25, 2017 at 3:32 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I think this change is a bad idea. I think that generally, an async call like 'await async_sub()' should have the equivalent semantics to a synchronous call like 'sync_sub()', except for the part where the former is able to contain yields. Giving every coroutine an LC breaks that equivalence. It also makes it so in async code, you can't necessarily refactor by moving code in and out of subroutines. Like, if we inline 'sub' into 'main', that shouldn't change the semantics, but...

    async def main():
        var.set('main')
        # inlined copy of sub()
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'
        # end of inlined copy
        assert var.lookup() == 'main'  # fails

It also adds non-trivial overhead, because now lookup() is O(depth of async callstack), instead of O(depth of (async) generator nesting), which is generally much smaller.

I think I see the motivation: you want to make `await sub()` and `await ensure_future(sub())` have the same semantics, right? And the latter has to create a Task and split it off into a new execution context, so you want the former to do so as well? But to me this is like saying that we want `sync_sub()` and `thread_pool_executor.submit(sync_sub).result()` to have the same semantics: they mostly do, but if sync_sub() accesses thread-locals then they won't. Oh well. That's perhaps a bit unfortunate, but it doesn't mean we should give every synchronous frame its own thread-locals. (And fwiw I'm still not convinced we should give up on 'yield from' as a mechanism for refactoring generators.)
I found this example confusing -- you talk about sub() and main() running concurrently, but ``wait_for`` blocks main() until sub() has finished running, right? Is this just supposed to show that there should be some sort of inheritance across tasks, and then the next example is to show that it has to be a copy rather than sharing the actual object? (This is just an issue of phrasing/readability.)
It occurs to me that both this and the way generators/coroutines expose their logical context means that logical context objects are semantically mutable. This could create weird effects if someone attaches the same LC to two different generators, or tries to use it simultaneously in two different threads, etc. We should have a little interlock like generators' ag_running, where an LC keeps track of whether it's currently in use and if you try to push the same LC onto two ECs simultaneously then it errors out.
I'm pretty sure you need to also invalidate on context push/pop. Consider:

    def gen():
        var.set("gen")
        var.lookup()  # cache now holds "gen"
        yield
        print(var.lookup())

    def main():
        var.set("main")
        g = gen()
        next(g)
        # This should print "main", but it's the same thread and the
        # last call to set() was the one inside gen(), so we get the
        # cached "gen" instead
        print(var.lookup())
        var.set("no really main")
        var.lookup()  # cache now holds "no really main"
        next(g)  # should print "gen" but instead prints "no really main"
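One way to close this hole is to fold a generation counter, bumped on every logical-context push/pop, into the cache key. A toy sketch with hypothetical names (not the PEP's C implementation), modeling the EC as a plain list of dicts:

```python
# Toy sketch: cache key = (var.version, ec_generation), where
# ec_generation is bumped on every logical-context push/pop.

_ec_stack = [{}]        # chain of logical contexts, innermost last
_ec_generation = 0

def push_lc(lc):
    global _ec_generation
    _ec_stack.append(lc)
    _ec_generation += 1     # invalidates cached lookups

def pop_lc():
    global _ec_generation
    _ec_generation += 1     # invalidates cached lookups
    return _ec_stack.pop()

class Var:
    def __init__(self):
        self.version = 0
        self._cache = None  # (version, generation, value)

    def set(self, value):
        _ec_stack[-1][self] = value
        self.version += 1

    def lookup(self):
        if self._cache is not None:
            version, gen, value = self._cache
            if version == self.version and gen == _ec_generation:
                return value                # O(1) cache hit
        for lc in reversed(_ec_stack):      # the slow chain walk
            if self in lc:
                value = lc[self]
                break
        else:
            value = None
        self._cache = (self.version, _ec_generation, value)
        return value
```

With the generation in the key, the pop after the generator yields invalidates the stale "gen" entry, so the next ``var.lookup()`` walks the chain again instead of returning the cached inner value.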
I think you missed a s/get/lookup/ here :-) -n -- Nathaniel J. Smith -- https://vorpus.org

On Saturday, August 26, 2017 2:34:29 AM EDT Nathaniel Smith wrote:
If we could easily, we'd give each _normal function_ its own logical context as well. What we are talking about here is variable scope leaking up the call stack. I think this is a bad pattern. For decimal context-like uses of the EC you should always use a context manager. For uses like Web request locals, you always have a top function that sets the context vars.
What we want is for `await sub()` to be equivalent to `await asyncio.wait_for(sub())` and to `await asyncio.gather(sub())`.

Imagine we allow context var changes to leak out of `async def`. It's easy to write code that relies on this:

    async def init():
        var.set('foo')

    async def main():
        await init()
        assert var.lookup() == 'foo'

If we change `await init()` to `await asyncio.wait_for(init())`, the code will break (and in the real world, possibly very subtly).
You would hit the cache in lookup() most of the time.

Elvis

On Sat, Aug 26, 2017 at 7:58 AM, Elvis Pranskevichus <elprans@gmail.com> wrote:
I mean... you could do that. It'd be easy to do technically, right? But it would make the PEP useless, because then projects like decimal and numpy couldn't adopt it without breaking backcompat, meaning they couldn't adopt it at all. The backcompat argument isn't there in the same way for async code, because it's new and these functions have generally been broken there anyway. But it's still kinda there in spirit: there's a huge amount of collective knowledge about how (synchronous) Python code works, and IMO async code should match that whenever possible.
It's perfectly reasonable to have a script where you call decimal.setcontext or np.seterr somewhere at the top to set the defaults for the rest of the script. Yeah, maybe it'd be a bit cleaner to use a 'with' block wrapped around main(), and certainly in a complex app you want to stick to that, but Python isn't just used for complex apps :-). I foresee confused users trying to figure out why np.seterr suddenly stopped working when they ported their app to use async.

This also seems like it makes some cases much trickier. Like, say you have an async context manager that wants to manipulate a context local. If you write 'async def __aenter__', you just lost -- it'll be isolated. I think you have to write some awkward thing like:

    def __aenter__(self):
        coro = self._real_aenter()
        coro.__logical_context__ = None
        return coro

It would be really nice if libraries like urllib3/requests supported async as an option, but it's difficult because they can't drop support for synchronous operation and python 2, and we want to keep a single codebase. One option I've been exploring is to write them in "synchronous style" but with async/await keywords added, and then generating a py2-compatible version with a script that strips out async/await etc. (Like a really simple 3to2 that just works at the token level.) One transformation you'd want to apply is replacing __aenter__ -> __enter__, but this gets much more difficult if we have to worry about elaborate transformations like the above...

If I have an async generator, and I set its __logical_context__ to None, then do I also have to set this attribute on every coroutine returned from calling __anext__/asend/athrow/aclose?
I don't feel like there's any need to make gather() have exactly the same semantics as a regular call -- it's pretty clearly a task-spawning primitive that runs all of the given coroutines concurrently, so it makes sense that it would have task-spawning semantics rather than call semantics. wait_for is a more unfortunate case; there's really no reason for it to create a Task at all, except that asyncio made the decision to couple cancellation and Tasks, so if you want one then you're stuck with the other. Yury's made some comments about stealing Trio's cancellation system and adding it to asyncio -- I don't know how serious he was. If he did then it would let you use timeouts without creating a new Task, and this problem would go away. OTOH if you stick with pushing a new LC on every coroutine call, then that makes Trio's cancellation system way slower, because it has to walk the whole stack of LCs on every yield to register/unregister each cancel scope. PEP 550v4 makes that stack much deeper, plus breaks the optimization I was planning to use to let us mostly skip this entirely. (To be clear, this isn't the main reason I think these semantics are a bad idea -- the main reason is that I think async and sync code should have the same semantics. But it definitely doesn't help that it creates obstacles to improving asyncio/improving on asyncio.)
But instead you're making it so that it will break if the user adds/removes async/await keywords:

    def init():
        var.set('foo')

    def main():
        init()
You've just reduced the cache hit rate too, because the cache gets invalidated on every push/pop. Presumably you'd optimize this to skip invalidating if the LC that gets pushed/popped is empty, so this isn't as catastrophic as it might initially look, but you still have to invalidate all the cached variables every time any variable gets touched and then you return from a function. Which might happen quite a bit if, for example, using timeouts involves touching the LC :-). -n -- Nathaniel J. Smith -- https://vorpus.org

On Sun, Aug 27, 2017 at 6:08 AM, Stefan Krah <stefan@bytereef.org> wrote:
TBH Nathaniel's argument isn't entirely correct. With the semantics defined in PEP 550 v4, you can still set the decimal context on top of your file, in your async functions, etc.

This will work:

    decimal.setcontext(ctx)

    def foo():
        # use decimal with context=ctx

and this:

    def foo():
        decimal.setcontext(ctx)
        # use decimal with context=ctx

and this:

    def bar():
        # use decimal with context=ctx

    def foo():
        decimal.setcontext(ctx)
        bar()

and this:

    def bar():
        decimal.setcontext(ctx)

    def foo():
        bar()
        # use decimal with context=ctx

and this:

    decimal.setcontext(ctx)

    async def foo():
        # use decimal with context=ctx

and this:

    async def bar():
        # use decimal with context=ctx

    async def foo():
        decimal.setcontext(ctx)
        await bar()

The only thing that will not work is this (ex1):

    async def bar():
        decimal.setcontext(ctx)

    async def foo():
        await bar()
        # use decimal with context=ctx

The reason why this one example worked in PEP 550 v3 and doesn't work in v4 is that we want to avoid random code breakage if you wrap your coroutine in a task, like here (ex2):

    async def bar():
        decimal.setcontext(ctx)

    async def foo():
        await wait_for(bar(), 1)
        # use decimal with context=ctx

We want (ex1) and (ex2) to work the same way always. That's the only difference in semantics between v3 and v4, and it's the only sane one, because implicit task creation is an extremely subtle detail that most users aren't aware of. We can't have semantics that let you easily break your code by adding a timeout in one await.

Speaking of (ex1), there's an example that didn't work in any PEP 550 version:

    def bar():
        decimal.setcontext(ctx)
        yield

    async def foo():
        list(bar())
        # use decimal with context=ctx

In the above code, the bar() generator sets some decimal context, and it will not leak outside of it. This semantics is one of the goals of PEP 550. The last change just unifies this semantics for coroutines, generators, and asynchronous generators, which is a good thing.

Yury
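The no-leak rule described here can be modeled with a few lines of present-day Python: a toy chain of dicts plus a wrapper that runs each generator step with its own logical context pushed (illustrative names, not the PEP API):

```python
# Toy model of "context changes don't leak out of generators":
# each step of a wrapped generator runs with its own LC on top.

_stack = [{}]          # chain of "logical contexts", innermost last

def set_ctx(value):
    _stack[-1]['ctx'] = value      # writes go to the top LC

def get_ctx():
    for lc in reversed(_stack):    # lookup walks the chain down
        if 'ctx' in lc:
            return lc['ctx']
    return None

def with_own_lc(genfunc):
    """Run every step of the generator with its own LC pushed."""
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        lc = {}                    # the generator's logical context
        while True:
            _stack.append(lc)
            try:
                item = next(gen)
            except StopIteration:
                return
            finally:
                _stack.pop()
            yield item
    return wrapper
```

Here a wrapped generator's ``set_ctx()`` lands in its own LC and vanishes from view once that LC is popped, while values set outside remain visible from inside -- the two directions of propagation discussed in this thread.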

On Sun, Aug 27, 2017 at 11:19:20AM -0400, Yury Selivanov wrote:
Okay, so if I understand this correctly, we actually will not have dynamic scoping for regular functions: bar() has returned, so the new context would not be found on the stack with proper dynamic scoping.
Here we do have dynamic scoping.
What about this?

    async def bar():
        setcontext(Context(prec=1))
        for i in range(10):
            await asyncio.sleep(1)
            yield i

    async def foo():
        async for i in bar():
            # ctx.prec=1?
            print(Decimal(100) / 3)

I'm searching for some abstract model to reason about the scopes.

Stefan Krah

On Mon, Aug 28, 2017 at 7:19 AM, Stefan Krah <stefan@bytereef.org> wrote:
Correct. Although I would avoid associating PEP 550 with dynamic scoping entirely, as we never intended to implement it. [..]
Whatever is set in coroutines, generators, and async generators does not leak out. In the above example, "prec=1" will only be set inside "bar()", and "foo()" will not see that. (Same will happen for a regular function and a generator). Yury

On Mon, Aug 28, 2017 at 11:23:12AM -0400, Yury Selivanov wrote:
Good, I agree it does not make sense.
But the state "leaks in" as per your previous example:

    async def bar():
        # use decimal with context=ctx

    async def foo():
        decimal.setcontext(ctx)
        await bar()

IMHO it shouldn't with coroutine-local-storage (let's call it CLS). So, as I see it, there's still some mixture between dynamic scoping and CLS, because in this example bar() is allowed to search the stack.

Stefan Krah

On Mon, Aug 28, 2017 at 11:52 AM, Stefan Krah <stefan@bytereef.org> wrote: [..]
The whole proposal would then be mostly useless. If we forget about dynamic scoping (I don't know why it's being brought up all the time, TBH; nobody uses it, almost no language implements it), the current proposal is well balanced and solves multiple problems. Three points are listed in the Rationale section:

* Context managers like decimal contexts, numpy.errstate, and warnings.catch_warnings.

* Request-related data, such as security tokens and request data in web applications, language context for gettext, etc.

* Profiling, tracing, and logging in large code bases.

Two of them require context propagation *down* the stack of coroutines. What the latest PEP 550 revision does is prohibit context propagation *up* the stack in coroutines (a requirement to make async code refactorable and easy to reason about).

Propagation of context *up* the stack in regular code is allowed with threading.local(), and everybody is used to it. Doing that for coroutines doesn't work, because of the reasons covered here: https://www.python.org/dev/peps/pep-0550/#coroutines-and-asynchronous-tasks

Yury
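[Editor's note: the up-the-stack propagation Yury describes for regular code can be seen with today's thread-local decimal context. A minimal, runnable sketch; the function names are illustrative only:]

```python
import decimal

def set_precision():
    # A plain synchronous call: the thread-local context change
    # propagates "up" to the caller, as everybody is used to.
    decimal.setcontext(decimal.Context(prec=2))

def compute():
    set_precision()
    return decimal.Decimal(1) / decimal.Decimal(3)

print(compute())  # 0.33
```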

On 08/28/2017 09:12 AM, Yury Selivanov wrote:
If we forget about dynamic scoping (I don't know why it's being brought up all the time, TBH; nobody uses it, almost no language implements it)
Probably because it's not lexical scoping, and possibly because it's possible for a function to be running with one EC on one call, and a different EC on the next -- hence, the EC it's using is dynamically determined. It seems to me the biggest difference between "true" dynamic scoping and what PEP 550 implements is the granularity: i.e. not every single function gets its own LC, just a select few: generators, async stuff, etc. Am I right? (No CS degree here.) If not, what are the differences?

--
~Ethan~

On Mon, Aug 28, 2017 at 12:43 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Sounds right to me. If PEP 550 was about adding true dynamic scoping, we couldn't use it as a suitable context management solution for libraries like decimal. For example, converting decimal/numpy to use new APIs would be a totally backwards-incompatible change. I still prefer using a "better TLS" analogy for PEP 550. We'll likely add a section summarizing differences between threading.local() and new APIs (as suggested by Eric Snow). Yury

On Mon, Aug 28, 2017 at 12:12:00PM -0400, Yury Selivanov wrote:
Because a) it was brought up by proponents of the PEP early on in the python-ideas discussion, and b) people desperately want a mental model of what is going on. :-)
* Context managers like decimal contexts, numpy.errstate, and warnings.catch_warnings.
The decimal context works like this:

1) There is a default context template (user settable).

2) Whenever the first operation *in a new thread* occurs, the thread-local context is initialized with a copy of the template.

I don't find it very intuitive if setcontext() is somewhat local in coroutines but they don't participate in some form of CLS. You have to think about things like "what happens in a fresh thread when a coroutine calls setcontext() before any other decimal operation has taken place".

So perhaps Nathaniel is right that the PEP is not so useful for numpy and decimal backwards compat.

Stefan Krah
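[Editor's note: Stefan's two-step description matches the stdlib decimal module today -- getcontext() in a fresh thread copies the DefaultContext template. A small runnable demonstration:]

```python
import decimal
import threading

# 1) The user-settable template for new threads.
decimal.DefaultContext.prec = 6

seen = {}

def worker():
    # 2) The first getcontext() in a new thread initializes the
    # thread-local context from a copy of DefaultContext.
    seen["prec"] = decimal.getcontext().prec

t = threading.Thread(target=worker)
t.start()
t.join()
print(seen["prec"])  # 6
```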

On Mon, Aug 28, 2017 at 1:33 PM, Stefan Krah <stefan@bytereef.org> wrote: [..]
I'm sorry, I don't follow you here. PEP 550 semantics: setcontext() in regular code would set the context for the whole thread; setcontext() in a coroutine/generator/async generator would set the context for all the code it calls.
So perhaps Nathaniel is right that the PEP is not so useful for numpy and decimal backwards compat.
Nathaniel's argument is pretty weak as I see it. He argues that some people would take the following code:

    def bar():
        # set decimal context

    def foo():
        bar()
        # use the decimal context set in bar()

and blindly convert it to async/await:

    async def bar():
        # set decimal context

    async def foo():
        await bar()
        # use the decimal context set in bar()

And that it's a problem that it will stop working. But almost nobody converts code by simply slapping async/await on top of it -- things don't work this way. It was never a goal for async/await or asyncio, or even trio/curio. Porting code to async/await almost always requires a thoughtful rewrite.

In async/await, the above code is an *anti-pattern*. It's super fragile and can break by adding a timeout around "await bar()". There's no workaround here.

Asynchronous code is fundamentally non-local and a more complex topic on its own, with its own concepts: asynchronous tasks, timeouts, cancellation, etc. Fundamentally: (synchronous code) != (asynchronous code) - (async/await).

Yury

Yury Selivanov wrote:
Maybe not, but it will also affect refactoring of code that is *already* using async/await, e.g. taking

    async def foobar():
        # set decimal context
        # use the decimal context we just set

and refactoring it as above. Given that one of the main motivations for yield-from (and subsequently async/await) was so that you *can* perform that kind of refactoring easily, that does indeed seem like a problem to me.

It seems to me that individual generators/coroutines shouldn't automatically get a context of their own; they should have to explicitly ask for one.

--
Greg

On Mon, Aug 28, 2017 at 6:22 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
There's no code that already uses async/await and decimal context managers/setters. Any such code is broken right now, because decimal context set in one coroutine affects them all. Your example would work only if foobar() is the only coroutine in your program.
With the current PEP 550 semantics w.r.t. generators you can still refactor them. The following code would work as expected:

    def nested_gen():
        # use some_context

    def gen():
        with some_context():
            yield from nested_gen()

    list(gen())

I am saying that the following should not work:

    def nested_gen():
        set_some_context()
        yield

    def gen():
        # some_context is not set
        yield from nested_gen()
        # use some_context ???

    list(gen())

IOW, any context set in a generator should not leak to the caller, ever. This is the whole point of the PEP.

As for async/await, see this: https://mail.python.org/pipermail/python-dev/2017-August/149022.html

Yury
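[Editor's note: Yury's first pattern already works with today's thread-local decimal context, because the outer generator's with-block is active while the subgenerator runs. A concrete, runnable version with the hypothetical some_context replaced by decimal.localcontext:]

```python
import decimal
from decimal import Decimal

def nested_gen():
    # Runs under whatever context the caller has established.
    yield Decimal(1) / Decimal(3)

def gen():
    with decimal.localcontext() as ctx:
        ctx.prec = 2
        # The with-block is still active while nested_gen() runs,
        # so the subgenerator sees prec=2.
        yield from nested_gen()

print(list(gen()))  # [Decimal('0.33')]
```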

On Mon, Aug 28, 2017 at 6:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Consider the following generator:

    def gen():
        with decimal.context(...):
            yield

We don't want gen's context to leak to the outer scope -- that's one of the reasons why PEP 550 exists. Even if we do this:

    g = gen()
    next(g)
    # the decimal.context won't leak out of gen

So a Python user would have a mental model: context set in generators doesn't leak.

Now, let's consider a "broken" generator:

    def gen():
        decimal.context(...)
        yield

If we iterate gen() with next(), it still won't leak its context. But if "yield from" has the semantics that you want -- "yield from" being just like a function call -- then "yield from gen()" will corrupt the context of the caller.

I simply want consistency. It's easier for everybody to say that generators never leak their context changes to the outer scope, rather than saying that "generators can sometimes leak their context".

Yury

On Mon, Aug 28, 2017 at 7:16 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Adding to the above: there's a fundamental reason why we can't make "yield from" transparent to EC modifications. While we want "yield from" to have semantics close to a function call, in some situations we simply can't. Because you can manually iterate a generator and then 'yield from' it, you can have this weird 'partial-function-call' semantics. For example:

    var = new_context_var()

    def gen():
        var.set(42)
        yield
        yield

Now, we can partially iterate the generator (1):

    def main():
        g = gen()
        next(g)
        # we don't want 'g' to leak its EC changes,
        # so var.get() is None here.
        assert var.get() is None

and then we can "yield from" it (2):

    def main():
        g = gen()
        next(g)
        # we don't want 'g' to leak its EC changes,
        # so var.get() is None here.
        assert var.get() is None
        yield from g
        # at this point it's too late for us to let var leak into
        # main().__logical_context__

For (1) we want the context change to be isolated. For (2) you say that the context change should propagate to the caller. But it's impossible: 'g' already has its own LC({var: 42}), and we can't merge it with the LC of main().

"await" is fundamentally different, because it's not possible to partially iterate the coroutine before awaiting it (asyncio will break if you call "coro.send(None)" manually).

Yury
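[Editor's note: the 'partial-function-call' shape Yury describes comes from the fact that one generator object can be half-consumed with next() and then delegated to with yield from. A minimal sketch of just those mechanics, with no context variables involved:]

```python
def gen():
    yield 1
    yield 2
    yield 3

def outer():
    g = gen()
    first = next(g)   # manually consume part of the generator...
    assert first == 1
    yield from g      # ...then delegate the rest with yield from

# outer() only re-yields what was left after the manual next().
print(list(outer()))  # [2, 3]
```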

Yury Selivanov wrote:
While we want "yield from" to have semantics close to a function call,
That's not what I said! I said that "yield from foo()" should have semantics close to a function call. If you separate the "yield from" from the "foo()", then of course you can get different behaviours. But that's beside the point, because I'm not suggesting that generators should behave differently depending on when or if you use "yield from" on them.
For (1) we want the context change to be isolated. For (2) you say that the context change should propagate to the caller.
No, I'm saying that the context change should *always* propagate to the caller, unless you do something explicit within the generator to prevent it. I have some ideas on what that something might be, which I'll post later. -- Greg

On Tue, Aug 29, 2017 at 7:36 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
BTW we already have mechanisms to always propagate context to the caller -- just use threading.local() or a global variable. PEP 550 is for situations when you explicitly don't want to propagate the state. Anyways, I'm curious to hear your ideas. Yury

On 30 August 2017 at 10:18, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Writing an "update_parent_context" decorator is also trivial (and will work for both sync and async generators):

    def update_parent_context(gf):
        @functools.wraps(gf)
        def wrapper(*args, **kwds):
            gen = gf(*args, **kwds)
            gen.__logical_context__ = None
            return gen
        return wrapper

The PEP already covers that approach when it talks about the changes to contextlib.contextmanager to get context changes to propagate automatically. With contextvars getting its own module, it would also be straightforward to simply include that decorator as part of its API, so folks won't need to write their own.

While I'm not sure how much practical use it will see, I do think it's important to preserve the *ability* to transparently refactor generators using yield from - I'm just OK with such a refactoring becoming "yield from update_parent_context(subgen())" instead of the current "yield from subgen()" (as I think *not* updating the parent context is a better default than updating it).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 30 August 2017 at 16:40, Nick Coghlan <ncoghlan@gmail.com> wrote:
Oops, I got mixed up between whether I thought this should be a decorator or an explicitly called helper function. One option would be to provide both:

    def update_parent_context(gen):
        """Configure a generator-iterator to update its caller's context variables"""
        gen.__logical_context__ = None
        return gen

    def updates_parent_context(gf):
        """Wrap a generator function's instances with update_parent_context"""
        @functools.wraps(gf)
        def wrapper(*args, **kwds):
            return update_parent_context(gf(*args, **kwds))
        return wrapper

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Yury Selivanov wrote:
BTW we already have mechanisms to always propagate context to the caller -- just use threading.local() or a global variable.
But then you don't have a way to *not* propagate the context change when you don't want to. Here's my suggestion: make an explicit distinction between creating a new binding for a context var and updating an existing one. So instead of two API calls there would be three:

    contextvar.new(value)   # Creates a new binding only
                            # visible to this frame and
                            # its callees

    contextvar.set(value)   # Updates existing binding in
                            # context inherited from caller

    contextvar.get()        # Retrieves the current binding

If we assume an extension to the decimal module so that decimal.localcontext is a context var, we can now do this:

    async def foo():
        # Establish a new context for this task
        decimal.localcontext.new(decimal.Context())
        # Delegate changing the context
        await bar()
        # Do some calculations
        yield 17 * math.pi + 42

    async def bar():
        # Change context for caller
        decimal.localcontext.prec = 5

--
Greg

On Wed, Aug 30, 2017 at 8:55 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Interesting. Question: how would you write a context manager with contextvar.new?

    var = new_context_var()

    class CM:
        def __enter__(self):
            var.new(42)

    with CM():
        print(var.get() or 'None')

My understanding is that the above code will print "None", because "var.new()" makes 42 visible only to callees of __enter__.

But if I use "set()" in CM.__enter__, presumably, it will traverse the stack of LCs to the very bottom and set "var=42" in it. Right?

If so, how can we fix the example in the PEP 550 Rationale: https://www.python.org/dev/peps/pep-0550/#rationale where we zip() the "fractions()" generator? With current PEP 550 semantics that's trivial: https://www.python.org/dev/peps/pep-0550/#generators

Yury
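[Editor's note: the fractions() example Yury points to is runnable today, and with the current thread-local decimal context it produces exactly the corrupted result described in the PEP's Rationale:]

```python
import decimal
from decimal import Decimal

def fractions(precision, x, y):
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        yield Decimal(x) / Decimal(y)
        yield Decimal(x) / Decimal(y**2)

g1 = fractions(precision=2, x=1, y=3)
g2 = fractions(precision=6, x=2, y=3)
items = list(zip(g1, g2))

# Interleaved iteration corrupts g1's precision: its second value is
# computed under g2's prec=6 context instead of prec=2.
print(items[1][0])  # 0.111111, not the expected 0.11
```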

Yury Selivanov wrote:
If you tie the introduction of a new scope for context vars to generators, as PEP 550 currently does, then this isn't a problem. But I'm trying to avoid doing that.

The basic issue is that, ever since yield-from, "generator" and "task" are not synonymous. When you use a generator to implement an iterator, you probably want it to behave as a distinct task with its own local context. But a generator used with yield-from isn't a task of its own; it's just part of another task, and there is nothing built into Python that lets you tell the difference automatically.

So I'm now thinking that the introduction of a new local context should also be explicit. Suppose we have these primitives:

    push_local_context()
    pop_local_context()

Now introducing a temporary decimal context looks like:

    push_local_context()
    decimal.localcontextvar.new(decimal.getcontext().copy())
    decimal.localcontextvar.prec = 5
    do_some_calculations()
    pop_local_context()

Since calls (either normal or generator) no longer automatically result in a new local context, we can easily factor this out into a context manager:

    class LocalDecimalContext():

        def __enter__(self):
            push_local_context()
            ctx = decimal.getcontext().copy()
            decimal.localcontextvar.new(ctx)
            return ctx

        def __exit__(self, *exc):
            pop_local_context()

Usage:

    with LocalDecimalContext() as ctx:
        ctx.prec = 5
        do_some_calculations()

--
Greg
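[Editor's note: Greg's primitives can be modeled as a stack of dict "local contexts". The sketch below is a toy, single-threaded model of his hypothetical API (push_local_context, pop_local_context, and the three-call ContextVar from his earlier message), not PEP 550's proposed implementation:]

```python
# Toy model: a stack of "local contexts", innermost last.
_lc_stack = [{}]

def push_local_context():
    _lc_stack.append({})

def pop_local_context():
    _lc_stack.pop()

class ContextVar:
    def new(self, value):
        # Bind in the innermost LC only: visible to this frame and its callees.
        _lc_stack[-1][self] = value

    def set(self, value):
        # Update an existing binding inherited from the caller, if any;
        # otherwise bind in the outermost LC.
        for lc in reversed(_lc_stack):
            if self in lc:
                lc[self] = value
                return
        _lc_stack[0][self] = value

    def get(self, default=None):
        for lc in reversed(_lc_stack):
            if self in lc:
                return lc[self]
        return default

var = ContextVar()
var.set(1)              # no binding yet: lands in the outermost LC
push_local_context()
var.new(2)              # shadows the outer binding
var.set(3)              # updates the inner binding, not the outer one
assert var.get() == 3
pop_local_context()
print(var.get())        # 1 -- the outer binding is untouched
```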

On Tue, Sep 5, 2017 at 4:59 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Greg, have you seen this new section: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-cha... ? It has a couple of examples that illustrate some issues with the "But a generator used with yield-from isn't a task of its own, it's just part of another task" reasoning. In principle, we can modify PEP 550 to make 'yield from' transparent to context changes. The interpreter can just reset g.__logical_context__ to None whenever 'g' is being 'yield-frommed'. The key issue is that there are a couple of edge cases where this semantics is problematic. The bottom line is that it's easier to reason about context when it's guaranteed that context changes are always isolated in generators, no matter what. I think this semantics actually makes the refactoring easier. Please take a look at the linked section.
This will have some performance implications and make the API way more complex. But I'm not convinced yet that real-life code needs the semantics you want.

This will work with the current PEP 550 design:

    def g():
        with DecimalContext() as ctx:
            ctx.prec = 5
            yield from do_some_calculations()  # will run with the correct ctx

The only thing that won't work is this:

    def do_some_calculations():
        ctx = DecimalContext()
        ctx.prec = 10
        decimal.setcontext(ctx)
        yield

    def g():
        yield from do_some_calculations()
        # Context changes in do_some_calculations() will not leak to g()

In the above example, do_some_calculations() deliberately tries to leak context changes (by not using a context manager). And I consider it a feature that PEP 550 does not allow generators to leak state. If you write code that uses 'with' statements consistently, you will never even know that context changes are isolated in generators.

Yury

Yury Selivanov wrote:
Greg, have you seen this new section: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-cha...
That section seems to be addressing the idea of a generator behaving differently depending on whether you use yield-from on it. I never suggested that, and I'm still not suggesting it.
I don't see a lot of value in trying to automagically isolate changes to global state *only* in generators. Under PEP 550, if you want to e.g. change the decimal context temporarily in a non-generator function, you're still going to have to protect those changes using a with-statement or something equivalent. I don't see why the same thing shouldn't apply to generators. It seems to me that it will be *more* confusing to give generators this magical ability to avoid with-statements.
This will have some performance implications and make the API way more complex.
I can't see how it would have any significant effect on performance. The implementation would be very similar to what's currently described in the PEP. You'll have to elaborate on how you think it would be less efficient.

As for complexity, push_local_context() and pop_local_context() would be considered low-level primitives that you wouldn't often use directly. Most of the time they would be hidden inside context managers. You could even have a context manager just for applying them:

    with new_local_context():
        # go nuts with context vars here
But I'm not convinced yet that real-life code needs the semantics you want.
And I'm not convinced that it needs as much magic as you want.
If you write code that uses 'with' statements consistently, you will never even know that context changes are isolated in generators.
But if you write code that uses context managers consistently, and those context managers know about and handle local contexts properly, generators don't *need* to isolate their context automatically. -- Greg

Another comment from a bystander's point of view: it looks like the discussions of API design and implementation are a bit entangled here. This is much better in the current version of the PEP, but there is still a _feeling_ that some design decisions are influenced by the implementation strategy.

As I currently see it, the "philosophy" at large is like this: there are different levels of coupling between concurrently executing code:

* processes: practically not coupled, designed to be long-running

* threads: more tightly coupled, designed to be less long-lived; context is managed by threading.local, which is not inherited on "forking"

* tasks: tightly coupled, designed to be short-lived; context will be managed by PEP 550, and context is inherited on "forking"

This seems right to me. Normal generators fall out of this "scheme", and it looks like their behavior is determined by the fact that coroutines are implemented as generators. What I think might help is to add a few more motivational examples to the design section of the PEP.

--
Ivan

On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
Literally the first motivating example at the beginning of the PEP ('def fractions ...') involves only generators, not coroutines, and only works correctly if generators get special handling. (In fact, I'd be curious to see how Greg's {push,pop}_local_context could handle this case.) The implementation strategy changed radically between v1 and v2 because of considerations around generator (not coroutine) semantics. I'm not sure what more it can do to dispel these feelings :-).

-n

--
Nathaniel J. Smith -- https://vorpus.org

On Wed, Sep 6, 2017 at 12:13 PM, Nathaniel Smith <njs@pobox.com> wrote:
Just to mention that this is now closely related to the discussion on my proposal on python-ideas. BTW, that proposal is now submitted as PEP 555 on the peps repo. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 6 September 2017 at 11:13, Nathaniel Smith <njs@pobox.com> wrote:
And this is probably what confuses people. As I understand it, tasks/coroutines are among the primary motivations for the PEP, but they appear somewhere later. There are four potential ways to see the PEP:

1) Generators are broken*, and therefore coroutines are broken; we want to fix the latter, therefore we fix the former.

2) Coroutines are broken; we want to fix them and let's also fix generators while we are at it.

3) Generators are broken; we want to fix them and let's also fix coroutines while we are at it.

4) Generators and coroutines are broken in similar ways; let us fix them as consistently as we can.

As I understand it, the PEP is based on option (4); please correct me if I am wrong. Therefore maybe this should be said more directly, and maybe then we should show _in addition_ a task example in the rationale, show how it is broken, and explain that they are broken in slightly different ways (since the expected semantics is a bit different).

--
Ivan

* here and below by broken I mean "broken" (sometimes behaving in a non-intuitive way, and lacking some functionality we would like them to have)

On Wed, Sep 6, 2017 at 5:58 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
Ivan, generators and coroutines are fundamentally different objects (even though they share the implementation). The only common thing is that they both allow for out-of-order execution of code in the same OS thread. The PEP explains the semantic difference of the EC in the High-level Specification in detail, literally on the 2nd page of the PEP. I don't see any benefit in reshuffling the rationale section.

Yury

Nathaniel Smith wrote:
I've given a decimal-based example, but it was a bit scattered. Here's a summary and an application to the fractions example.

I'm going to assume that the decimal module has been modified to keep the current context in a context var, and that getcontext() and setcontext() access that context var. The decimal.localcontext context manager is also redefined as:

    class localcontext():

        def __enter__(self):
            push_local_context()
            ctx = getcontext().copy()
            setcontext(ctx)
            return ctx

        def __exit__(self, *exc):
            pop_local_context()

Now we can write the fractions generator as:

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

You may notice that this is exactly what you would write today for the same task...

--
Greg
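[Editor's note: Greg's closing point -- that this is exactly what you would write today -- holds because the stdlib decimal.localcontext() already saves and restores the thread's context on exit:]

```python
import decimal
from decimal import Decimal

decimal.setcontext(decimal.Context(prec=28))

with decimal.localcontext() as ctx:
    ctx.prec = 5
    inside = Decimal(1) / Decimal(3)   # computed with prec=5

print(inside)                          # 0.33333
print(decimal.getcontext().prec)       # 28 -- the previous context is restored
```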

On Wed, Sep 6, 2017 at 5:00 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
1. So essentially this means that we will have one "local context" per context manager storing one value.

2. If somebody makes a mistake and calls "push_local_context" without a corresponding "pop_local_context", you will have unbounded growth of LCs (this happens in Koos' proposal too, btw).

3. Users will need to know way more to correctly use the mechanism.

So far, neither you nor Koos can give us a realistic example which illustrates why we should suffer the implications of (1), (2), and (3).

Yury

Yury Selivanov wrote:
1. So essentially this means that we will have one "local context" per context manager storing one value.
I can't see that being a major problem. Context vars will (I hope!) be very rare things, and needing to change a bunch of them in one function ought to be rarer still. But if you do, it would be easy to provide a context manager whose sole effect is to introduce a new context:

    with new_local_context():
        cvar1.set(something)
        cvar2.set(otherthing)
        ...
2. If somebody makes a mistake and calls "push_local_context" without a corresponding "pop_local_context"
You wouldn't normally call them directly, they would be encapsulated in carefully-written context managers. If you do use them, you're taking responsibility for using them correctly. If it would make you feel happier, they could be named _push_local_context and _pop_local_context to emphasise that they're not intended for everyday use.
3. Users will need to know way more to correctly use the mechanism.
Most users will simply be using already-provided context managers, which they're *already used to doing*. So they won't have to know anything more than they already do. See my last decimal example, which required *no change* to existing correct user code.
And you haven't given a realistic example that convinces me your proposed with-statement-elimination feature would be of significant benefit. -- Greg

Nathaniel Smith wrote:
I can't say the changes have dispelled any feelings on my part. The implementation suggested in the PEP seems very complicated and messy. There are garbage collection issues, which it proposes using weak references to mitigate. There is also apparently some issue with long chains building up and having to be periodically collapsed. None of this inspires confidence that we have the basic design right. My approach wouldn't have any of those problems. The implementation would be a lot simpler. -- Greg

On Wed, Sep 6, 2017 at 5:06 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
"Messy" and "complicated" doesn't sound like valuable feedback :(

There are no "garbage collection issues", sorry. The issue that we use weak references for is the same issue why threading.local() uses them:

    def foo():
        var = ContextVar()
        var.set(1)

    for _ in range(10**6):
        foo()

If 'var' were strongly referenced, we would have a bunch of them.
Cool. Yury

Yury Selivanov wrote:
Erk. This is not how I envisaged context vars being used. What I thought you would do is this:

    my_context_var = ContextVar()

    def foo():
        my_context_var.set(1)

This problem would also not arise if context vars simply had names instead of being magic key objects:

    def foo():
        contextvars.set("mymodule.myvar", 1)

That's another thing I think would be an improvement, but it's orthogonal to what we're talking about here and would be best discussed separately.

--
Greg

On Thu, Sep 7, 2017 at 10:54 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
There are lots of things in this discussion that I should have commented on, but here's one related to this.

PEP 555 does not have the resource-management issue described above and needs no additional tricks to achieve that:

    # using PEP 555
    def foo():
        var = contextvars.Var()
        with var.assign(1):
            # do something [*]

    for _ in range(10**6):
        foo()

Every time foo is called, a new context variable is created, but that's perfectly fine, and lightweight. As soon as the context manager exits, there are no references to the Assignment object returned by var.assign(1), and as soon as foo() returns, there are no references to var, so everything should get cleaned up nicely.

And regarding string keys, they have pros and cons, and they can be added easily, so let's not go there now.

--
Koos

[*] (nit-picking) without closures that would keep the var reference alive

On Thursday, September 7, 2017 3:54:15 AM EDT Greg Ewing wrote:
On the contrary, using simple names (PEP 550 V1 was actually doing that) is a regression. It opens up namespace clashing issues. Imagine you have a variable named "foo", and then some library you import also decides to use the name "foo", what then? That's one of the reasons why we do `local = threading.local()` instead of `threading.set_local("foo", 1)`. Elvis

On Wednesday, September 6, 2017 8:06:36 PM EDT Greg Ewing wrote:
I might have missed something, but your claim doesn't make any sense to me. All you've proposed is to replace the implicit and guaranteed push_lc()/pop_lc() around each generator with explicit LC stack management. You *still* need to retain and switch the current stack on every generator send() and throw(). Everything else written out in PEP 550 stays relevant as well.

As for the "long chains building up", your approach is actually much worse. The absence of a guaranteed context fence around generators would mean that contextvar context managers will *have* to push LCs whether really needed or not. Consider the following (naive) way of computing the N-th Fibonacci number:

    def fib(n):
        with decimal.localcontext():
            if n == 0:
                return 0
            elif n == 1:
                return 1
            else:
                return fib(n - 1) + fib(n - 2)

Your proposal can cause the LC stack to grow incessantly even in simple cases, and will affect code that doesn't even use generators.

A great deal of effort was put into PEP 550, and the matter discussed is far from trivial. What you see as "complicated and messy" is actually the result of us carefully considering the solutions to real-world problems, and then the implications of those solutions (including the worst-case scenarios.)

Elvis
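[Editor's note: Elvis's fib() example runs as-is with today's decimal module; each recursive call saves and restores a context even though no generator is involved, which is the per-call cost he is pointing at:]

```python
import decimal

def fib(n):
    # Every call enters/exits a decimal context, generator or not.
    with decimal.localcontext():
        if n == 0:
            return 0
        elif n == 1:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

print(fib(10))  # 55
```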

Ivan Levkivskyi wrote:
This is what I disagree with. Generators don't implement coroutines, they implement *parts* of coroutines. We want "task local storage" that behaves analogously to thread local storage. But PEP 550 as it stands doesn't give us that; it gives something more like "function local storage" for certain kinds of function. -- Greg

On Wed, Sep 6, 2017 at 4:27 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
The PEP gives you Task Local Storage, where Task is:

1. your single-threaded code
2. a generator
3. an async task

If you correctly use context managers, PEP 550 works intuitively and similarly to how one would think threading.local() should work. The only example you (and Koos) can come up with is this:

    def generator():
        set_decimal_context()
        yield

    next(generator())
    # decimal context is not set

    # or
    yield from generator()
    # decimal context is still not set

I consider the above a feature.

Yury
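[Editor's note: the leak Yury wants to prevent is easy to demonstrate with today's thread-local decimal context, where a suspended generator's setcontext() does escape to the caller:]

```python
import decimal

decimal.setcontext(decimal.Context(prec=28))

def generator():
    decimal.setcontext(decimal.Context(prec=3))
    yield

g = generator()
next(g)

# With thread-local storage the change leaks out of the suspended
# generator; under PEP 550 the caller would still see prec=28.
print(decimal.getcontext().prec)  # 3
```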

Yury Selivanov wrote:
My version works *more* similarly to thread-local storage, IMO. Currently, if you change the decimal context without using a with-statement or something equivalent, you *don't* expect the change to be confined to the current function or sub-generator or async sub-task. All I'm asking for is one consistent rule: If you want a context change encapsulated, use a with-statement. If you don't, don't. Not only is this rule simpler than yours, it's the *same* rule that we have now, so there is less for users to learn. -- Greg

On Wed, Sep 6, 2017 at 12:07 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Greg, just to make sure that we are talking about the same thing, could you please show an example (using the current PEP 550 API/semantics) of something that in your opinion should work differently for generators? Yury

On Wed, Sep 6, 2017 at 10:07 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Regarding this, I think yield from should have the same semantics as iterating over the generator with next/send, and PEP 555 has no issues with this.
Exactly. To state it clearly: PEP 555 does not have this issue. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I think yield from should have the same semantics as iterating over the generator with next/send, and PEP 555 has no issues with this.
I think the onus is on you and Greg to show a realistic example that shows why this is necessary. So far all the argumentation about this has been of the form "if you have code that currently does this (example using foo) and you refactor it in using yield from (example using bar), and if you were relying on context propagation back out of calls, then it should still propagate out." This feels like a very abstract argument. I have a feeling that context state propagating out of a call is used relatively rarely -- it must work for cases where you refactor something that changes context inline into a utility function (e.g. decimal.setcontext()), but I just can't think of a realistic example where coroutines (either of the yield-from variety or of the async/def form) would be used for such a utility function. A utility function that sets context state but also makes a network call just sounds like asking for trouble! -- --Guido van Rossum (python.org/~guido)

On Wed, Sep 6, 2017 at 8:16 PM, Guido van Rossum <guido@python.org> wrote:
Well, regarding this part, it's just that things like

    for obj in gen:
        yield obj

often get modernized into

    yield from gen

And realistic examples of that include pretty much any normal use of yield from.

> So far all the argumentation about this has been of the form "if you have
So here's a realistic example, with the semantics of PEP 550 applied to a decimal.setcontext() kind of thing, but it could be anything using var.set(value):

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            for data in buf:
                if data.tag == "NEW_PRECISION":
                    setcontext(context_based_on(data))
                else:
                    yield compute(data)

Code smells? Yes, but maybe you often see much worse things, so let's say it's fine. But then, if you refactor it into a subgenerator like this:

    def process_data_buffer(buffer):
        for data in buffer:
            if data.tag == "NEW_PRECISION":
                setcontext(context_based_on(data))
            else:
                yield compute(data)

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            yield from process_data_buffer(buf)

Now, if setcontext uses PEP 550 semantics, the refactoring broke the code, because a generator introduces a scope barrier by adding a LogicalContext on the stack, and setcontext is only local to the process_data_buffer subgenerator. But the programmer is puzzled, because with regular functions it had worked just fine in a similar situation before they learned about generators:

    def process_data_buffer(buffer, output):
        for data in buffer:
            if data.tag == "NEW_PRECISION":
                setcontext(context_based_on(data))
            else:
                output.append(compute(data))

    def process_data_buffers(buffers):
        output = []
        setcontext(default_context)
        for buf in buffers:
            process_data_buffer(buf, output)
        return output

In fact, this code had another problem, namely that the context state is leaked out of process_data_buffers, because PEP 550 leaks context state out of functions, but not out of generators. But we can easily imagine that the unit tests for process_data_buffers *do* pass. But let's look at a user of the functionality:

    def get_total():
        return sum(process_data_buffers(get_buffers()))

    setcontext(somecontext)
    value = get_total() * compute_factor()

Now the code is broken, because setcontext(somecontext) has no effect, because get_total() leaks out another context.
Not to mention that our data buffer source now has control over the behavior of compute_factor(). But if one is lucky, the last line was written as

    value = compute_factor() * get_total()

And hooray, the code works! (Except for perhaps the code that is run after this.) Now this was of course a completely fictional example, and hopefully I didn't introduce any bugs or syntax errors other than the ones I described. I haven't seen code like this anywhere, but somehow we caught the problems anyway. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I know that that's the pattern, but everybody just shows the same foo/bar example.
And realistic examples of that include pretty much any normal use of yield from.
There aren't actually any "normal" uses of yield from. The vast majority of uses of yield from are in coroutines written using yield from.
Yeah, so my claim is that this is simply a non-problem, and you've pretty much just proved that by failing to come up with pointers to actual code that would suffer from this. Clearly you're not aware of any such code. -- --Guido van Rossum (python.org/~guido)

On Wed, Sep 6, 2017 at 11:55 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A real-code example: make it possible to implement decimal.setcontext() on top of PEP 550 semantics. I still feel that there's some huge misunderstanding in the discussion: PEP 550 does not promote "not using context managers". It simply implements a low-level mechanism to make it possible to implement context managers for generators/coroutines/etc. Whether this API is used to write context managers or not is completely irrelevant to the discussion. How does threading.local() promote or demote the use of context managers? The answer: it doesn't. The same answer applies to PEP 550, which is a similar mechanism. Yury

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
Thank you for the example, Koos. FWIW I agree it is a "completely fictional example". There are two ways we can easily adapt PEP 550 to follow your semantics:

1. Set gen.__logical_context__ to None when it is being 'yield from'-ed.
2. Merge gen.__logical_context__ with the outer LC when the generator is iterated to the end.

But I still really dislike the examples you and Greg show to us. They are not typical or real-world examples, they are showcases of ways to abuse contexts. I still think that giving Python programmers one strong rule: "context mutation is always isolated in generators" makes it easier to reason about the EC and write maintainable code. Yury

Yury Selivanov wrote:
Whereas I think it makes code *harder* to reason about, because to take advantage of it you need to be acutely aware of whether the code you're working on is in a generator/coroutine or not. It seems simpler to me to have one rule for all kinds of functions: If you're making a temporary change to contextual state, always encapsulate it in a with statement. -- Greg

Guido van Rossum wrote:
Yuri has already found one himself, the __aenter__ and __aexit__ methods of an async context manager.
A utility function that sets context state but also makes a network call just sounds like asking for trouble!
I'm coming from the other direction. It seems to me that it's not very useful to allow with-statements to be skipped in certain very restricted circumstances. The only situation in which you will be able to take advantage of this is if the context change is being made in a generator or coroutine, and it is to apply to the whole body of that generator or coroutine. If you're in an ordinary function, you'll still have to use a context manager. If you only want the change to apply to part of the body, you'll still have to use a context manager. It would be simpler to just tell people to always use a context manager, wouldn't it? -- Greg

On Wed, Sep 6, 2017 at 11:26 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
__aenter__ is not a generator and there's no 'yield from' there. Coroutines (within an async task) leak state just like regular functions (within a thread). Your argument is to allow generators to leak context changes (right?). AFAIK we don't use generators to implement __enter__ or __aenter__ (generators decorated with @types.coroutine or @asyncio.coroutine are coroutines, according to PEP 492). So this is irrelevant.
Can you clarify what do you mean by "with-statements to be skipped"? This language is not used in PEP 550 or in Python documentation. I honestly don't understand what it means.
Yes, PEP 550 wants people to always use context managers! Which will work as you expect them to work for coroutines, generators, and regular functions. At this point I suspect you have some wrong idea about some specification detail of PEP 550. I understand what Koos is talking about, but I really don't follow you. Using the "with-statements to be skipped" language is very confusing and doesn't help me understand you. Yury

Yury Selivanov wrote:
If I understand correctly, instead of using a context manager, your fractions example could be written like this:

    def fractions(precision, x, y):
        ctx = decimal.getcontext().copy()
        decimal.setcontext(ctx)
        ctx.prec = precision
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y ** 2)

and it would work without leaking changes to the decimal context, despite the fact that it doesn't use a context manager or do anything else to explicitly put back the old context. Am I right about that? This is what I mean by "skipping context managers" -- that it's possible in some situations to get by without using a context manager, by taking advantage of the implicit local context push that happens whenever a generator is started up. Now, there are two possibilities:

1) You take advantage of this, and don't use context managers in some or all of the places where you don't need to. You seem to agree that this would be a bad idea.

2) You ignore it and always use a context manager, in which case it's not strictly necessary for the implicit context push to occur, since the relevant context managers can take care of it.

So there doesn't seem to be any great advantage to the automatic context push, and it has some disadvantages, such as yield-from not quite working as expected in some situations. Also, it seems that every generator is going to incur the overhead of allocating a logical_context even when it doesn't actually change any context vars, which most generators won't. -- Greg
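The implicit per-generator push Greg describes can be modelled in a few lines of ordinary Python. This is a made-up toy, not the PEP's API: `_ec`, `ec_get`, `ec_set`, and the `isolating` decorator are all invented here, with a list of dicts standing in for the EC/LC machinery (and round() standing in for decimal arithmetic).

```python
_ec = [{}]  # toy execution context: a stack of logical contexts

def ec_get(name, default=None):
    # look up a variable from the innermost LC outwards
    for lc in reversed(_ec):
        if name in lc:
            return lc[name]
    return default

def ec_set(name, value):
    _ec[-1][name] = value

def isolating(genfunc):
    """Resume the wrapped generator with its own LC pushed on the EC."""
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        lc = {}  # this generator's private logical context

        def run():
            while True:
                _ec.append(lc)
                try:
                    item = next(gen)
                except StopIteration:
                    return
                finally:
                    _ec.pop()
                yield item

        return run()
    return wrapper

@isolating
def fractions(precision, x, y):
    ec_set('prec', precision)  # note: no context manager used
    yield round(x / y, ec_get('prec'))
    yield round(x / y ** 2, ec_get('prec'))

g = fractions(2, 1.0, 3.0)
print(next(g))         # 0.33
print(ec_get('prec'))  # None -- the change did not leak to the caller
```

This is exactly the behavior under discussion: the generator's change is visible across its own resumptions but invisible outside, without a with-statement.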

On 09/07/2017 03:37 AM, Greg Ewing wrote:
The disagreement seems to be whether a LogicalContext should be created implicitly vs explicitly (or opt-out vs opt-in). As a user trying to track down a decimal context change not propagating, I would not suspect the above code of automatically creating a LogicalContext and isolating the change, whereas Greg's context manager version is abundantly clear. The implicit vs explicit argument comes down, I think, to resource management: some resources in Python are automatically managed (memory), and some are not (files) -- which type should LCs be? -- ~Ethan~

On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote:
You are confusing resource management with the isolation mechanism. PEP 550 contextvars are analogous to threading.local(), which the PEP makes very clear from the outset. threading.local(), the isolation mechanism, is *implicit*. decimal.localcontext() is an *explicit* resource manager that relies on threading.local() magic. PEP 550 simply provides a threading.local() alternative that works in tasks and generators. That's it! Elvis

On 09/07/2017 06:41 AM, Elvis Pranskevichus wrote:
On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote:
I might be, and I wouldn't be surprised. :) On the other hand, one can look at isolation as being a resource.
threading.local(), the isolation mechanism, is *implicit*.
I don't think so. You don't get threading.local() unless you call it -- that makes it explicit.
The concern is *how* PEP 550 provides it: - explicitly, like threading.local(): has to be set up manually, preferably with a context manager - implicitly: it just happens under certain conditions -- ~Ethan~

On Thursday, September 7, 2017 10:06:14 AM EDT Ethan Furman wrote:
You literally replace threading.local() with contextvars.ContextVar():

    import threading

    _decimal_context = threading.local()

    def set_decimal_context(ctx):
        _decimal_context.context = ctx

becomes:

    import contextvars

    _decimal_context = contextvars.ContextVar('decimal.Context')

    def set_decimal_context(ctx):
        _decimal_context.set(ctx)

Elvis
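A runnable version of that migration, with a hypothetical read side added: `get_decimal_context` is invented here for symmetry, and the `ContextVar.set()`/`ContextVar.get()` calls are the ones shown above, as provided by the contextvars module that eventually shipped with Python 3.7.

```python
import contextvars

_decimal_context = contextvars.ContextVar('decimal.Context')

def set_decimal_context(ctx):
    _decimal_context.set(ctx)

def get_decimal_context(default=None):
    # ContextVar.get() raises LookupError if no value was ever set,
    # unless a default is supplied
    return _decimal_context.get(default)

set_decimal_context({'prec': 28})
print(get_decimal_context())  # {'prec': 28}
```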

I write it in a new thread, but I also want to write it here -- I need a time out in this discussion so I can think about it more. -- --Guido van Rossum (python.org/~guido)

On 7 September 2017 at 07:06, Ethan Furman <ethan@stoneleaf.us> wrote:
A recurring point of confusion with the threading.local() analogy seems to be that there are actually *two* pieces to that analogy:

* threading.local() <-> contextvars.ContextVar
* PyThreadState_GetDict() <-> LogicalContext

(See https://github.com/python/cpython/blob/a6a4dc816d68df04a7d592e0b6af8c7ecc4d4... for the definition of PyThreadState_GetDict.) For most practical purposes as a *user* of thread locals, the involvement of PyThreadState and the state dict is a completely hidden implementation detail. However, every time you create a new thread, you're implicitly getting a new Python thread state, and hence a new thread state dict, and hence a new set of thread local values. Similarly, as a *user* of context variables, you'll generally be able to ignore the manipulation of the execution context going on behind the scenes - you'll just get, set, and delete individual context variables without worrying too much about exactly where and how they're stored. PEP 550 itself doesn't have that luxury, though, since in addition to defining how users will access and update these values, it *also* needs to define how the interpreter will implicitly manage the execution context for threads and generators, and how event loops (including asyncio as the reference implementation) are going to be expected to manage the execution context explicitly when scheduling coroutines. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Sep 07, 2017 at 09:41:10AM -0400, Elvis Pranskevichus wrote:
If there only were a name that would make it explicit, like TaskLocalStorage. ;) Seriously, the problem with 'context' is that it is:

a) A predefined set of state values like in the Decimal (I think also the OpenSSL) context. But such a context is put inside another context (the ExecutionContext).

b) A theoretical concept from typed lambda calculus (in the context 'gamma' the variable 'v' has type 't'). But this concept would be associated with lexical scope and would extend to functions (not only tasks and generators).

c) ``man 3 setcontext``. A replacement for setjmp/longjmp. Somewhat related in that it could be used to implement coroutines.

d) The .NET flowery language. I did not fully understand what the .NET ExecutionContext and its 2881 implicit flow rules are. ... Stefan Krah

On Thursday, September 7, 2017 6:37:58 AM EDT Greg Ewing wrote:
The advantage is that context managers don't need to *always* allocate and push an LC. [1]
By default, generators reference an empty LogicalContext object that is allocated once (like the None object). We can do that because LCs are immutable. Elvis [1] https://mail.python.org/pipermail/python-dev/2017-September/149265.html

Elvis Pranskevichus wrote:
Ah, I see. That wasn't clear from the implementation, where

    gen.__logical_context__ = contextvars.LogicalContext()

looks like it's creating a new one. However, there's another thing: it looks like every time a generator is resumed/suspended, an execution context node is created/discarded. -- Greg

There is one thing I misunderstood. Since generators and coroutines are almost exactly the same underneath, I had thought that the automatic logical_context creation for generators was also going to apply to coroutines, but from reading the PEP again it seems that's not the case. Somehow I missed that the first time. Sorry about that. So, context vars do behave like "task local storage" for asyncio Tasks, which is good. The only issue is whether a generator should be considered an "ad-hoc task" for this purpose. I can see your reasons for thinking that it should be. I can also understand your thinking that the yield-from issue is such an obscure corner case that it's not worth worrying about, especially since there is a workaround available (setting __logical_context__ to None) if needed. I'm not sure how I feel about that now. I agree that it's an obscure case, but the workaround seems even more obscure, and is unlikely to be found by anyone who isn't closely familiar with the inner workings. I think I'd be happier if there were a higher-level way of applying this workaround, such as a decorator:

    @subgenerator
    def g():
        ...

Then the docs could say "If you want a generator to *not* have its own task local storage, wrap it with @subgenerator." By the way, I think "Task Local Storage" would be a much better title for this PEP. It instantly conveys the basic idea in a way that "Execution Context" totally fails to do. It might also serve as a source for some better terminology for parts of the implementation, such as TaskLocalStorage and TaskLocalStorageStack instead of logical_context and execution_context. I found the latter terms almost devoid of useful meaning when trying to understand the implementation. -- Greg

There are a couple of things in the PEP I'm confused about:

1) Under "Generators" it says:

    once set in the generator, the context variable is guaranteed not to change between iterations;

This suggests that you're not allowed to set() a given context variable more than once in a given generator, but some of the examples seem to contradict that. So I'm not sure what this is trying to say.

2) I don't understand why the logical_contexts have to be immutable. If every task or generator that wants its own task-local storage has its own logical_context instance, why can't it be updated in-place? -- Greg

On 09/07/2017 04:39 AM, Greg Ewing wrote:
I believe I can answer this part: the guarantee is that

- the context variable will not be changed while the yield is in effect -- or, said another way, while the generator is suspended;
- the context variable will not be changed by subgenerators;
- the context variable /may/ be changed by normal functions/class methods (since calling them would be part of the iteration).

-- ~Ethan~

On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
It would be great if you or Greg could show a couple of real-world examples showing the "issue" (with the current PEP 550 APIs/semantics). PEP 550 treats coroutines and generators as objects that support out of order execution. OS threads are similar to them in some ways. I find it questionable to try to enforce context management rules we have for regular functions to generators/coroutines. I don't really understand the "refactoring" argument you and Greg are talking about all the time. PEP 555 still doesn't clearly explain how exactly it is different from PEP 550. Because 555 was posted *after* 550, I think that it's PEP 555 that should have that comparison. Yury

On Wed, Sep 6, 2017 at 8:22 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote: [...]
PEP 550 treats coroutines and generators as objects that support out of order execution.
Out of order? More like interleaved.
555 was *posted* as a pep after 550, yes. And yes, there could be a comparison, especially now that PEP 550 semantics seem to have converged, so PEP 555 does not have to adapt the comparison to PEP 550 changes. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

Yury Selivanov wrote:
Here's one way that refactoring could trip you up. Start with this:

    async def foo():
        calculate_something()
        # in a coroutine, so we can be lazy and not use a cm
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

And factor part of it out (into an *ordinary* function!)

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()

    def calculate_something_else_with_5_digits():
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

Now we add some more calculation to the end of foo():

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()
        calculate_more_stuff()

Here we didn't intend calculate_more_stuff() to be done with prec=5, but we forgot that calculate_something_else_with_5_digits() changes the precision and *doesn't restore it*, because we didn't add a context manager to it. If we hadn't been lazy and had used a context manager in the first place, that wouldn't have happened. Summary: I think that skipping context managers in some circumstances is a bad habit that shouldn't be encouraged. -- Greg
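Greg's trap is reproducible with the real decimal module, and synchronous code is enough to show it (the calculate_* names above are placeholders; this sketch keeps only the part that touches the context):

```python
import decimal

def calculate_with_5_digits():
    # "lazy" version: copies and installs a context with no CM
    ctx = decimal.getcontext().copy()
    ctx.prec = 5
    decimal.setcontext(ctx)
    return decimal.Decimal(1) / decimal.Decimal(3)

decimal.getcontext().prec = 28
calculate_with_5_digits()
print(decimal.getcontext().prec)  # 5 -- the caller's precision leaked

# The context-manager version restores the precision on exit:
decimal.getcontext().prec = 28
with decimal.localcontext() as ctx:
    ctx.prec = 5
    decimal.Decimal(1) / decimal.Decimal(3)
print(decimal.getcontext().prec)  # 28
```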

On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Where exactly does PEP 550 encourage users to be "lazy and not use a cm"? PEP 550 provides a mechanism for implementing context managers! What is this example supposed to show?
How is PEP 550 at fault for somebody being lazy and not using a context manager? PEP 550 has a hard requirement to make it possible for decimal/other libraries to start using its APIs and stay backwards compatible, so it allows the `decimal.setcontext(ctx)` function to be implemented. We are fixing things here. When you are designing a new library/API, you can use CMs and only CMs. It's up to you, as a library author; PEP 550 does not limit you. And when you use CMs, there are no "problems" with 'yield from' or anything in PEP 550.
Summary: I think that skipping context managers in some circumstances is a bad habit that shouldn't be encouraged.
PEP 550 does not encourage coding without context managers. It does, in fact, solve the problem of reliably storing context to make writing context managers possible. To reiterate: it provides a mechanism to set a variable within the current logical thread, like storing a current request in an async HTTP handler. Or to implement `decimal.setcontext`. But you are free to use it to only implement context managers in your library. Yury

On 09/06/2017 11:57 PM, Yury Selivanov wrote:
On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing wrote:
That using a CM is not required, and tracking down a bug caused by not using a CM can be difficult.
How is PEP 550 at fault for somebody being lazy and not using a context manager?
Because PEP 550 makes a CM unnecessary in the simple (common?) case, hiding the need for a CM in not-so-simple cases. For comparison: in Python 3 we are now warned about files that have been left open (because explicitly closing files was unnecessary in CPython due to an implementation detail) -- the solution? make files context managers whose __exit__ closes the file.
I appreciate that the scientific and number-crunching communities have been a major driver of enhancements for Python (such as rich comparisons and, more recently, matrix operators), but I don't think an enhancement for them that makes life more difficult for the rest is a net win. -- ~Ethan~

On Wed, Aug 30, 2017 at 2:36 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
FYI, I've been sketching an alternative solution that addresses these kinds of things. I've been hesitant to post about it, partly because of the PEP550-based workarounds that Nick, Nathaniel, Yury etc. have been describing, and partly because that might be a major distraction from other useful discussions, especially because I wasn't completely sure yet about whether my approach has some fatal flaw compared to PEP 550 ;). —Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Aug 30, 2017 at 9:44 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote: [..]
The only alternative design that I considered for PEP 550 and ultimately rejected was to have the following thread-specific mapping:

    {
        var1: [stack of values for var1],
        var2: [stack of values for var2],
    }

So the idea is that when we set a value for the variable in some frame, we push it to its stack. When the frame is done, we pop it. This is a classic approach (called Shallow Binding) to implement dynamic scope. The fatal flaw that made me reject this approach was the CM protocol (__enter__). Specifically, context managers need to be able to control values in outer frames, and this is where this approach becomes super messy. Yury
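The rejected shallow-binding design can be sketched in a few lines (a toy: the dict-of-stacks and the var_set/var_get/var_unset helpers are invented here, standing in for the per-thread mapping Yury describes):

```python
_stacks = {}  # var name -> stack of values (the thread-specific mapping)

def var_set(name, value):
    _stacks.setdefault(name, []).append(value)

def var_get(name, default=None):
    stack = _stacks.get(name)
    return stack[-1] if stack else default

def var_unset(name):
    _stacks[name].pop()

def inner_frame():
    # a frame pushes on entry and pops on exit
    var_set('prec', 5)
    try:
        return var_get('prec')
    finally:
        var_unset('prec')

var_set('prec', 28)
print(inner_frame())    # 5 -- the inner value shadows the outer one
print(var_get('prec'))  # 28 -- restored once the frame is done
```

The messiness Yury mentions shows up as soon as a context manager's __enter__ pushes a value: the push happens in __enter__'s own frame, but the value must outlive that frame for the whole with-block body, so the simple push-on-entry/pop-on-exit discipline breaks down.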

Can Execution Context be implemented outside of CPython
I know I'm well late to the game and a bit dense, but where in the PEP is the justification for this assertion? I ask because we built something to solve the same problem in Twisted some time ago: https://bitbucket.org/hipchat/txlocal . We were able to leverage generator/coroutine decorators to preserve state without modifying the runtime. Given that this problem only exists in runtimes that multiplex coroutines on a single thread, and the fact that coroutine execution engines only exist in user space, why doesn't it make more sense to leave this to a library that engines like asyncio and Twisted are responsible for standardising on? On Wed, Aug 30, 2017, 09:40 Yury Selivanov <yselivanov.ml@gmail.com> wrote:

On Wed, Aug 30, 2017 at 1:39 PM, Kevin Conway <kevinjacobconway@gmail.com> wrote:
To work with coroutines we have asyncio/twisted or other frameworks. They create async tasks and manage them. Generators, OTOH, don't have a framework that runs them; they are managed by the Python interpreter. So it's not possible to implement a *complete context solution* that equally supports generators and coroutines outside of the interpreter. Another problem is that every framework has its own local context solution. Twisted has one, gevent has another. But libraries like numpy and decimal can't use them to store their local context data, because they are non-standard. That's why we need to solve this problem once, in Python directly. Yury

On Wed, Aug 30, 2017 at 5:36 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Anyway, thanks to these efforts, your proposal has become somewhat more competitive compared to mine ;). I'll post mine as soon as I find the time to write everything down. My intention is before next week. —Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

Yury Selivanov wrote:
That's understandable, but fixing that problem shouldn't come at the expense of breaking the ability to refactor generator code or async code without changing its semantics. I'm not convinced that it has to, either. In this example, the with-statement is the thing that should be establishing a new nested context. Yielding and re-entering the generator should only be swapping around between existing contexts.
The following non-generator code is "broken" in exactly the same way:

    def foo():
        decimal.context(...)
        do_some_decimal_calculations()
        # Context has now been changed for the caller
I simply want consistency.
So do I! We just have different ideas about what consistency means here.
No, generators should *always* leak their context changes to exactly the same extent that normal functions do. If you don't want to leak a context change, you should use a with statement. What you seem to be suggesting is that generators shouldn't leak context changes even when you *don't* use a with-statement. If you're going to do that, you'd better make sure that the same thing applies to regular functions, otherwise you've introduced an inconsistency. -- Greg

On Tue, Aug 29, 2017 at 5:45 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
What you seem to be suggesting is that generators shouldn't leak context changes even when you *don't* use a with-statement.
Yes, generators shouldn't leak context changes regardless of what changes the context inside them, and how:

    var = new_context_var()

    def gen():
        old_val = var.get()
        try:
            var.set('blah')
            yield
            yield
            yield
        finally:
            var.set(old_val)

With the above code, when you do "next(gen())", without PEP 550 it would leak the state and corrupt the state of the caller; a "finally" block (or a "with" block) wouldn't help you here. That's the problem the PEP fixes. The EC interaction with generators is explained in great detail here: https://www.python.org/dev/peps/pep-0550/#id4 We explain the motivation behind desiring a working context-local solution for generators in the Rationale section: https://www.python.org/dev/peps/pep-0550/#rationale Basically half of the PEP is about isolating context in generators.
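The leak Yury describes is easy to reproduce today with a plain dict standing in for the thread-local state (no PEP 550 machinery involved; the names follow the snippet above):

```python
state = {'var': 'original'}  # stands in for a thread-local

def gen():
    old_val = state['var']
    try:
        state['var'] = 'blah'
        yield 1
        yield 2
    finally:
        state['var'] = old_val

g = gen()
next(g)
# The generator is now suspended: its finally block has NOT run yet,
# so the caller observes the mutated state.
print(state['var'])  # 'blah'
g.close()            # GeneratorExit makes the finally block run
print(state['var'])  # 'original'
```

Until the generator is closed or exhausted, the caller runs with the corrupted value, which is exactly the partial-execution hazard the PEP targets.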
Regular functions cannot pause/resume their execution, so they can't leak an inconsistent context change due to out of order or partial execution. PEP 550 positions itself as a replacement for TLS, and clearly defines its semantics for regular functions in a single thread, regular functions in multithreaded code, generators, and asynchronous code (async/await). Everything is specified in the High-level Specification section. I wouldn't call slightly differently defined semantics for generators/coroutines/functions an "inconsistency" -- they just have a different EC semantics given how different they are from each other. Drawing a parallel between 'yield from' and function calls is possible, but we shouldn't forget that you can 'yield from' a half-iterated generator. Yury

On Tue, Aug 29, 2017 at 06:01:40PM -0400, Yury Selivanov wrote:
What I don't find so consistent is that the async universe is guarded with async {def, for, with, ...}, but in this proposal regular context managers and context setters implicitly adapt their behavior. So, pedantically, having a language extension like

    async set(var, value)
    x = async get(var)

and making async-safe context managers explicit

    async with decimal.localcontext():
        ...

would feel more consistent. I know generators are a problem, but even allowing something like "async set" in generators would be a step up. Stefan Krah

On Tue, Aug 29, 2017 at 7:06 PM, Stefan Krah <stefan@bytereef.org> wrote:
But regular context managers work just fine with asynchronous code. Not all of them have some local state. For example, you could have a context manager to time how long the code wrapped into it executes:

    async def foo():
        with timing():
            await ...

We use asynchronous context managers only when they need to do asynchronous operations in their __aenter__ and __aexit__ (like DB transaction begin/rollback/commit). Requiring "await" to set a value for a context variable would force us to write specialized async CMs for cases where a sync CM would do just fine. This, in turn, would make it impossible to use some sync libraries in async code. But there's nothing wrong in using numpy/numpy.errstate in a coroutine. I want to be able to copy/paste their examples into my async code and I'd expect it to just work -- that's the point of the PEP. async/await already requires libraries that involve IO to have separate APIs. Let's not make the situation worse by asking people to use an asynchronous version of PEP 550 even though it's not really needed. Yury

On 08/28/2017 04:19 AM, Stefan Krah wrote:
If I understand correctly, ctx.prec is whatever the default is, because foo comes before bar on the stack, and after the current value for i is grabbed bar is no longer executing, and therefore no longer on the stack. I hope I'm right. ;) -- ~Ethan~

A question appeared here about a simple mental model for PEP 550. It looks much clearer now, than in the first version, but I still would like to clarify: can one say that PEP 550 just provides more fine-grained version of threading.local(), that works not only per thread, but even per coroutine within the same thread? -- Ivan On 28 August 2017 at 17:29, Yury Selivanov <yselivanov.ml@gmail.com> wrote:

On Sat, Aug 26, 2017 at 2:34 AM, Nathaniel Smith <njs@pobox.com> wrote:
That exception is why the semantics cannot be equivalent.
I'll cover the refactoring argument later in this email. [..]
I don't think it's non-trivial though: First, we have a cache in ContextVar which makes lookup O(1) for any tight code that uses libraries like decimal and numpy. Second, most of the LCs in the chain will be empty, so even the uncached lookup will still be fast. Third, you will usually have your "with my_context()" block right around your code (or within a few awaits' distance), otherwise it will be hard to reason about what the context is. And if, occasionally, you have a single "var.lookup()" call that won't be cached, the cost of it will still be measured in microseconds. Finally, the easy-to-follow semantics is the main argument for the change (even at the cost of making "get()" a bit slower in corner cases).
Yes.
This example is very similar to:

    await sub()

and

    await create_task(sub())

So it's really about making the semantics for coroutines be predictable.
(And fwiw I'm still not convinced we should give up on 'yield from' as a mechanism for refactoring generators.)
I don't get this "refactoring generators" and "refactoring coroutines" argument. Suppose you have this code:

    def gen():
        i = 0
        for _ in range(3):
            i += 1
            yield i
        for _ in range(5):
            i += 1
            yield i

You can't refactor gen() by simply copying/pasting parts of its body into a separate generator:

    def count3():
        for _ in range(3):
            i += 1
            yield

    def gen():
        i = 0
        yield from count3()
        for _ in range(5):
            i += 1
            yield i

The above won't work for some obvious reasons: 'i' is a nonlocal variable for the 'count3' block of code. Almost exactly the same thing will happen with the current PEP 550 specification, which is a *good* thing. 'yield from' and 'await' are not about refactoring. They can be used for splitting large generators/coroutines into a set of smaller ones, sure. But there's *no* magical, always working, refactoring mechanism that allows doing that blindly.
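Running the naive copy/paste refactor described above confirms the point -- it fails as soon as count3() is resumed, because 'i' was never bound in the new frame:

```python
# The refactored generators from the example: 'i' is local to gen(),
# so count3() raises at its first resumption.
def count3():
    for _ in range(3):
        i += 1      # 'i' was never assigned in this frame
        yield i

def gen():
    i = 0
    yield from count3()
    for _ in range(5):
        i += 1
        yield i

try:
    list(gen())
except UnboundLocalError as e:
    print("refactoring failed:", e)
```

This is the same "local state does not follow the code across frames" behavior that PEP 550 deliberately mirrors for context variables.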
Right. Before we continue, let me make sure we are on the same page here:

    await asyncio.wait_for(sub(), timeout=2)

can be refactored into:

    task = asyncio.wait_for(sub(), timeout=2)
    # sub() is scheduled now, and a "loop.call_soon" call has been
    # made to advance it soon.
    await task

Now, if we look at the following example (1):

    async def foo():
        await bar()

The "bar()" coroutine will execute within "foo()". If we add a timeout logic (2):

    async def foo():
        await wait_for(bar(), 1)

The "bar()" coroutine will execute outside of "foo()", and "foo()" will only wait for the result of that execution. Now, Async Tasks capture the context when they are created -- that's the only sane option they have. If coroutines don't have their own LC, "bar()" in examples (1) and (2) would interact with the execution context differently! And this is something that we can't let happen, as it would force asyncio users to think about the EC every time they want to wrap a coroutine into a task. [..]
Correct. Both LC and EC objects will be wrapped into "shell" objects before being exposed to the end user. run_with_logical_context() will mutate the user-visible LC object (keeping the underlying LC immutable, of course). Ideally, we would want run_with_logical_context to have the following signature:

    result, updated_lc = run_with_logical_context(lc, callable)

But because "callable" can raise an exception this would not work.
Yeah, you're right. Thanks!
Fixed! Yury

Hi, I'm aware that the current implementation is not final, but I already adapted the coroutine changes for Cython to allow for some initial integration testing with real external (i.e. non-Python coroutine) targets. I haven't adapted the tests yet, so the changes are currently unused and mostly untested. https://github.com/scoder/cython/tree/pep550_exec_context I also left some comments in the github commits along the way. Stefan

On Sat, Aug 26, 2017 at 6:22 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Huge thanks for thinking about how this proposal will work for Cython and trying it out. Although I must warn you that the last reference implementation is very outdated, and the implementation we will end up with will be very different (think a total rewrite from scratch). Yury

Hi, thanks, on the whole this is *much* easier to understand. I'll add some comments on the decimal examples. The thing is, decimal is already quite tricky and people do read PEPs long after they have been accepted, so they should probably reflect best practices. On Fri, Aug 25, 2017 at 06:32:22PM -0400, Yury Selivanov wrote:
"Many people (wrongly) expect the values of ``items`` to be::" ;)
[(Decimal('0.33'), Decimal('0.666667')), (Decimal('0.11'), Decimal('0.222222'))]
I'm not sure why this approach has limited use for decimal:

    from decimal import *

    def fractions(precision, x, y):
        ctx = Context(prec=precision)
        yield ctx.divide(Decimal(x), Decimal(y))
        yield ctx.divide(Decimal(x), Decimal(y**2))

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)
    print(list(zip(g1, g2)))

This is the first thing I'd do when writing async-safe code. Again, people do read PEPs. So if an asyncio programmer without any special knowledge of decimal reads the PEP, he probably assumes that localcontext() is currently the only option, while the safer and easy-to-reason-about context methods exist.
As I understand it, the example creates a context with a custom precision and attempts to use that context to create a Decimal. This doesn't switch the actual decimal context. Secondly, the precision in the context argument to the Decimal() constructor has no effect --- the context there is only used for error handling. Lastly, if the constructor *did* use the precision, one would have to be careful about double rounding when using MyDecimal(). I get that this is supposed to be for illustration only, but please let's be careful about what people might take away from that code.
I think it'll work, but can we agree on hard numbers like max 2% slowdown for the non-threaded case and 4% for applications that only use threads? I'm a bit cautious because other C-extension state-managing PEPs didn't come close to these figures. Stefan Krah

On Sat, Aug 26, 2017 at 7:45 AM, Stefan Krah <stefan@bytereef.org> wrote:
Hi,
thanks, on the whole this is *much* easier to understand.
Thanks!
Agree. [..]
Because you have to know the limitations of implicit decimal context to make this choice. Most people don't (at least from my experience).
This is the first thing I'd do when writing async-safe code.
Because you know the decimal module very well :)
I agree.
In the next iteration of the PEP we'll remove decimal examples and replace them with something with simpler semantics. This is clearly the best choice now.
I'd be *very* surprised if we see any noticeable slowdown at all. The way ContextVars will implement caching is very similar to the trick you use now. Yury

On Sat, Aug 26, 2017 at 12:21:44PM -0400, Yury Selivanov wrote:
I'd also be surprised, but what do we do if the PEP is accepted and for some yet unknown reason the implementation turns out to be 12-15% slower? The slowdown related to the module-state/heap-type PEPs wasn't immediately obvious either; it would be nice to have actual figures before the PEP is accepted. Stefan Krah

Thanks for the update. Comments in-line below. -eric On Fri, Aug 25, 2017 at 4:32 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
With threads we have a directed graph of execution, rooted at the root thread, branching with each new thread and merging with each .join(). Each thread gets its own copy of each threading.local, regardless of the relationship between branches (threads) in the execution graph.

With async (and generators) we also have a directed graph of execution, rooted in the calling thread, branching with each new async call. Currently there is no equivalent to threading.local for the async execution graph. This proposal involves adding such an equivalent.

However, the proposed solution isn't quite equivalent, right? It adds a concept of lookup on the chain of namespaces, traversing up the execution graph back to the root. threading.local does not do this. Furthermore, you can have more than one threading.local per thread.
From what I read in the PEP, each node in the execution graph has (at most) one Execution Context.
The PEP doesn't really say much about these differences from threadlocals, including a rationale. FWIW, I think such a COW mechanism could be useful. However, it does add complexity to the feature. So a clear explanation in the PEP of why it's worth it would be valuable.
#1-4 are consistent with a single EC per Python thread. However, #5-7 imply that more than one EC per thread is supported, but only one is active in the current execution stack (notably the EC is rooted at the calling frame). threading.local provides a much simpler mechanism but does not support the chained context (COW) semantics...
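Eric's description of threading.local() -- each thread gets its own independent copy, with no chaining back to the creating thread -- is easy to verify:

```python
# A new thread does not inherit threading.local attributes from the
# thread that created it, and its writes are invisible to the creator.
import threading

local = threading.local()
local.value = 'main'
seen = {}

def worker():
    seen['has_value'] = hasattr(local, 'value')  # no inheritance
    local.value = 'worker'
    seen['worker_value'] = local.value

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen['has_value'])   # False: no inheritance, no chained lookup
print(local.value)         # 'main': unaffected by the worker thread
```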

Hi Eric, On Sat, Aug 26, 2017 at 1:25 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Correct.
Currently, the PEP covers the proposed mechanism in-depth, explaining why every detail of the spec is the way it is. But I think it'd be valuable to highlight differences from theading.local() in a separate section. We'll think about adding one. Yury

On Sat, Aug 26, 2017 at 10:25 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
You might be interested in these notes I wrote to motivate why we need a chain of namespaces, and why simple "async task locals" aren't sufficient: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb They might be a bit verbose to include directly in the PEP, but Yury/Elvis, feel free to steal whatever if you think it'd be useful. -n -- Nathaniel J. Smith -- https://vorpus.org

On Sat, Aug 26, 2017 at 3:09 PM, Nathaniel Smith <njs@pobox.com> wrote:
Thanks, Nathaniel! That helped me understand the rationale, though I'm still unconvinced chained lookup is necessary for the stated goal of the PEP. (The rest of my reply is not specific to Nathaniel.)

tl;dr Please:

* make the chained lookup aspect of the proposal more explicit (and distinct) in the beginning sections of the PEP (or drop chained lookup).
* explain why normal frames do not get to take advantage of chained lookup (or allow them to).

--------------------

If I understood right, the problem is that we always want context vars resolved relative to the current frame and then to the caller's frame (and on up the call stack). For generators, "caller" means the frame that resumed the generator. Since we don't know what frame will resume the generator beforehand, we can't simply copy the current LC when a generator is created and bind it to the generator's frame.

However, I'm still not convinced that's the semantics we need. The key statement is "and then to the caller's frame (and on up the call stack)", i.e. chained lookup. On the linked page Nathaniel explained the position (quite clearly, thank you) using sys.exc_info() as an example of async-local state. I posit that that example isn't particularly representative of what we actually need.

Isn't the point of the PEP to provide an async-safe alternative to threading.local()? Any existing code using threading.local() would not expect any kind of chained lookup since threads don't have any. So introducing chained lookup in the PEP is unnecessary and consequently not ideal since it introduces significant complexity.

As the PEP is currently written, chained lookup is a key part of the proposal, though it does not explicitly express this. I suppose this is where my confusion has been. At this point I think I understand one rationale for the chained lookup functionality; it takes advantage of the cooperative scheduling characteristics of generators, et al.
Unlike with threads, a programmer can know the context under which a generator will be resumed. Thus it may be useful to the programmer to allow (or expect) the resumed generator to fall back to the calling context. However, given the extra complexity involved, is there enough evidence that such capability is sufficiently useful? Could chained lookup be addressed separately (in another PEP)? Also, wouldn't it be equally useful to support chained lookup for function calls? Programmers have the same level of knowledge about the context stack with function calls as with generators. I would expect evidence in favor of chained lookups for generators to also favor the same for normal function calls. -eric

On Mon, Aug 28, 2017 at 3:14 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
There's a lot of Python code out there, and it's hard to know what it all wants :-). But I don't think we should get hung up on matching threading.local() -- no-one sits down and says "okay, what my users want is for me to write some code that uses a thread-local", i.e., threading.local() is a mechanism, not an end-goal.

My hypothesis is that in most cases, when people reach for threading.local(), it's because they have some "contextual" variable, and they want to be able to do things like set it to a value that affects all and only the code that runs inside a 'with' block. So far the only way to approximate this in Python has been to use threading.local(), but chained lookup would work even better.

As evidence for this hypothesis: something like chained lookup is important for exc_info() [1] and for Trio's cancellation semantics, and I'm pretty confident that it's what users naturally expect for use cases like 'with decimal.localcontext(): ...' or 'with numpy.errstate(...): ...'. And it works fine for cases like Flask's request-locals that get set once near the top of a callstack and then treated as read-only by most of the code.

I'm not aware of any alternative to chained lookup that fulfills all of these use cases -- are you? And I'm not aware of any use cases that require something more than threading.local() but less than chained lookup -- are you?

[1] I guess I should say something about including sys.exc_info() as evidence that chained lookup is useful, given that CPython probably won't share code between its PEP 550 implementation and its sys.exc_info() implementation. I'm mostly citing it as evidence that this is a real kind of need that can arise when writing programs -- if it happens once, it'll probably happen again.
But I can also imagine that other implementations might want to share code here, and it's certainly nice if the Python-the-language spec can just say "exc_info() has semantics 'as if' it were implemented using PEP 550 storage" and leave it at that. Plus it's kind of rude for the interpreter to claim semantics for itself that it won't let anyone else implement :-).
The important difference between generators/coroutines and normal function calls is that with normal function calls, the link between the caller and callee is fixed for the entire lifetime of the inner frame, so there's no way for the context to shift under your feet. If all we had were normal function calls, then (green-) thread locals using the save/restore trick would be enough to handle all the use cases above -- it's only for generators/coroutines where the save/restore trick breaks down. This means that pushing/popping LCs when crossing into/out of a generator frame is the minimum needed to get the desired semantics, and it keeps the LC stack small (important since lookups can be O(n) in the worst case), and it minimizes the backcompat breakage for operations like decimal.setcontext() where people *do* expect to call it in a subroutine and have the effects be visible in the caller. -n -- Nathaniel J. Smith -- https://vorpus.org

On Mon, Aug 28, 2017 at 6:07 PM, Nathaniel Smith <njs@pobox.com> wrote:
I like this way of looking at things. Does this have any bearing on asyncio.Task? To me those look more like threads than like generators. Or possibly they should inherit the lookup chain from the point when the Task was created, but not be affected at all by the lookup chain in place when they are executed. FWIW we *could* have a policy that OS threads also inherit the lookup chain from their creator, but I doubt that's going to fly with backwards compatibility. I guess my general (hurried, sorry) view is that we're at a good point where we have a small number of mechanisms but are still debating policies on how those mechanisms should be used. (The basic mechanism is chained lookup and the policies are about how the chains are fit together for various language/library constructs.) -- --Guido van Rossum (python.org/~guido)

On 8/28/2017 6:50 PM, Guido van Rossum wrote:
Since LC is new, how could such a policy affect backwards compatibility? The obvious answer would be that some use cases that presently use other mechanisms that "should" be ported to using LC would have to be careful in how they do the port, but discussion seems to indicate that they would have to be careful in how they do the port anyway. One of the most common examples is the decimal context. IIUC, each thread gets its initial decimal context from a global template, rather than inheriting from its parent thread. Porting decimal context to LC then, in the event of OS threads inheriting the lookup chain from their creator, would take extra work for compatibility: setting the decimal context from the global template (a step it must already take) rather than accepting the inheritance. It might be appropriate that an updated version of decimal that uses LC would offer the option of inheriting the decimal context from the parent thread, or using the global template, as an enhancement.
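The point about decimal's per-thread initial context is observable today: a new thread starts from the default context template (prec=28, assuming the template has not been modified), not from its parent thread's context:

```python
# New threads get a fresh copy of decimal's default context template,
# not the creating thread's (modified) context.
import decimal
import threading

results = {}

def worker():
    results['thread_prec'] = decimal.getcontext().prec

decimal.getcontext().prec = 10   # changes only this thread's context
t = threading.Thread(target=worker)
t.start()
t.join()

print(decimal.getcontext().prec)   # 10 in the main thread
print(results['thread_prec'])      # 28, from the global template
```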

On Mon, Aug 28, 2017 at 9:50 PM, Guido van Rossum <guido@python.org> wrote:
We explain why tasks have to inherit the lookup chain from the point where they are created in the PEP (in the new High-level Specification section): https://www.python.org/dev/peps/pep-0550/#coroutines-and-asynchronous-tasks In short, without inheriting the chain we can't wrap coroutines into tasks (wrapping an awaited coroutine in wait_for() would break the code if we didn't inherit the chain). In the latest version (v4) we made all coroutines have their own Logical Context, which, as we discovered today, makes us unable to set context variables in __aenter__ coroutines. This will be fixed in the next version.
Backwards compatibility is indeed an issue. Inheriting the chain for threads would mean another difference between PEP 550 and 'threading.local()', which could cause backwards incompatible behaviour for decimal/numpy when they are updated to the new APIs. For decimal, for example, we could use the following pattern to fall back to the default decimal context for ECs (threads) that don't have it set:

    ctx = decimal_var.get(default=default_decimal_ctx)

We can also add an 'initializer' keyword-argument to 'new_context_var' to specify a callable that will be used to give a default value to the var. Another issue is that with the current C API, we can only inherit the EC for threads started with 'threading.Thread'. There's no reliable way to inherit the chain if a thread was initialized by a C extension. IMO, inheriting the lookup chain in threads makes sense when we use them for pools, like concurrent.futures.ThreadPoolExecutor. When threads are used as long-running subprograms, inheriting the chain should be an opt-in. Yury

Hi,
Is it by design that the execution context for new threads is empty, or should it be possible to set it to some initial value? Like e.g.:

    var = new_context_var('init')

    def sub():
        assert var.lookup() == 'init'
        var.set('sub')

    def main():
        var.set('main')
        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()
        assert var.lookup() == 'main'

Thanks, --francis

On 26.08.2017 04:19, Ethan Furman wrote:
Why not the same interface as thread-local storage? This is the question that has bothered me from the beginning of PEP 550. I don't understand what inventing a new way of access buys us here. Python has featured regular attribute access for years. It's even simpler than method-based access. Best, Sven

On Sat, Aug 26, 2017 at 9:33 AM, Sven R. Kunze <srkunze@mail.de> wrote: [..]
This was covered at length in these threads: https://mail.python.org/pipermail/python-ideas/2017-August/046888.html https://mail.python.org/pipermail/python-ideas/2017-August/046889.html I forgot to add a subsection to "Design Consideration" with a summary of that thread. Will be fixed in the next revision. Yury

On Mon, Aug 28, 2017 at 6:19 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
And it should not be trivial, as the PEP 550 semantics is different from TLS. Using PEP 550 instead of TLS should be carefully evaluated. Please also see this: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int... Yury

On Fri, Aug 25, 2017 at 10:19 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
All in all, I like it. Nice job.
Thanks!
The ContextVar.set(value) method writes the `value` to the *topmost LC*. The ContextVar.lookup() method *traverses the stack* until it finds an LC that has a value. "get()" does not reflect this subtle semantic difference. Yury
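The set()/lookup() semantics described here can be shown with a toy model -- illustrative only, not the proposed implementation. A variable's set() writes to the topmost logical context, while lookup() traverses the stack of contexts:

```python
# Toy model: the LC stack is a list of dicts, last item is topmost.
class ToyVar:
    def __init__(self, stack):
        self._stack = stack

    def set(self, value):
        self._stack[-1][self] = value        # topmost LC only

    def lookup(self):
        for lc in reversed(self._stack):     # traverse the stack
            if self in lc:
                return lc[self]
        return None

stack = [{}, {}]          # outer LC, topmost LC
var = ToyVar(stack)
stack[0][var] = 'outer'   # value set in an outer context
print(var.lookup())       # 'outer', found by traversal
var.set('top')            # writes only to the topmost LC
print(var.lookup())       # 'top'
print(stack[0][var])      # 'outer', the outer value is untouched
```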

On 08/26/2017 09:25 AM, Yury Selivanov wrote:
On Fri, Aug 25, 2017 at 10:19 PM, Ethan Furman wrote:
A good point; however, ChainMap, which behaves similarly as far as lookup goes, uses "get" and does not have a "lookup" method. I think we lose more than we gain by changing that method name. -- ~Ethan~
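Ethan's ChainMap parallel can be seen directly: get() traverses the chain of maps, while writes only touch the first map, which is essentially the behavior PEP 550 proposes for context variables:

```python
from collections import ChainMap

top = {}
outer = {'prec': 6}
cm = ChainMap(top, outer)

print(cm.get('prec'))   # 6, found by traversing to the outer map
cm['prec'] = 2          # writes go to the first (topmost) map only
print(cm['prec'])       # 2
print(outer['prec'])    # 6, the outer map is untouched
```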

On 26.08.2017 19:23, Yury Selivanov wrote:
I like "get" more. ;-) Best, Sven PS: This might be a result of still leaning towards attribute access despite the discussion you referenced. I still don't think complicating and reinventing terminology (which basically results in API names) buys us much. And I am still with Ethan, a context stack is just a ChainMap. Renaming basic methods won't hide that fact. That's my only criticism of the PEP. The rest is fine and useful.

On 27 August 2017 at 03:23, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I don't think "we may want to add extra parameters" is a good reason to omit a conventional `get()` method - I think it's a reason to offer a separate API to handle use cases where the question of *where* the var is set matters (for example, `my_var.is_set()` would indicate whether or not `my_var.set()` has been called in the current logical context without requiring a parameter check for normal lookups that don't care). Cheers, Nick. P.S. And I say that as a reader who correctly guessed why you had changed the method name in the current iteration of the proposal. I'm sympathetic to those reasons, but I think sticking with the conventional API will make this one easier to learn and use :) -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Aug 29, 2017 at 5:01 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: [..]
Yeah, I agree. We'll switch lookup -> get in the next iteration. Guido's parallel with getattr/setattr/delattr is also useful. getattr can also lookup the attribute in base classes, but we still call it "get". Yury

I agree with David; this PEP has really gotten to a great place and the new organization makes it much easier to understand.
On Aug 25, 2017, at 22:19, Ethan Furman <ethan@stoneleaf.us> wrote:
Why "lookup" and not "get" ? Many APIs use "get" and it's functionality is well understood.
I have the same question as Sven as to why we can’t have attribute access semantics. I probably asked that before, and you probably answered, so maybe if there’s a specific reason why this can’t be supported, the PEP should include a “rejected ideas” section explaining the choice. That said, if we have to use method lookup, then I agree that `.get()` is a better choice than `.lookup()`. But in that case, would it be possible to add an optional `default=None` argument so that you can specify a marker object for a missing value? I worry that None might be a valid value in some cases, but that currently can’t be distinguished from “missing”. I’d also like a debugging interface, such that I can ask “context_var.get()” and get some easy diagnostics about the resolution order. Cheers, -Barry

On Sat, Aug 26, 2017 at 12:30 PM, Barry Warsaw <barry@python.org> wrote:
Elvis just added it: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int...
That said, if we have to use method lookup, then I agree that `.get()` is a better choice than `.lookup()`. But in that case, would it be possible to add an optional `default=None` argument so that you can specify a marker object for a missing value? I worry that None might be a valid value in some cases, but that currently can’t be distinguished from “missing”.
Nathaniel has a use case where he needs to know if the value is in the topmost LC or not. One way to address that need is to have the following signature for lookup():

    lookup(*, default=None, traverse=True)

IMO "lookup" is a slightly better name in this particular context. Yury

On Aug 26, 2017, at 14:15, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Elvis just added it: https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int...
Thanks, that’s exactly what I was looking for. Great summary of the issue.
Given that signature (which +1), I agree. You could add keywords for debugging lookup fairly easily too. Cheers, -Barry

I'm convinced by the new section explaining why a single value is better than a namespace. Nonetheless, it would feel more "Pythonic" to me to create a property `ContextVariable.val` whose getter and setter was `.lookup()` and `.set()` (or maybe `._lookup()` and `._set()`). Lookup might require a more complex call signature in rare cases, but the large majority of the time it would simply be `var.val`, and that should be the preferred API IMO. That provides a nice parallel between `var.name` and `var.val` also. On Sat, Aug 26, 2017 at 11:22 AM, Barry Warsaw <barry@python.org> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
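David's suggestion could be layered on top of the proposed methods; here is a sketch in which the ContextVariable class and its plain-attribute storage are stand-ins for the PEP's actual machinery:

```python
# Hypothetical attribute-style wrapper: a 'val' property delegating
# to the proposed lookup()/set() methods.  The storage is a plain
# attribute standing in for the real execution-context chain.
class ContextVariable:
    def __init__(self, name=None):
        self.name = name
        self._value = None   # stand-in for EC storage

    def lookup(self):
        return self._value

    def set(self, value):
        self._value = value

    @property
    def val(self):
        """Attribute-style read, delegating to lookup()."""
        return self.lookup()

    @val.setter
    def val(self, value):
        self.set(value)

var = ContextVariable('precision')
var.val = 6
print(var.val)   # 6, same as var.lookup()
```

The rare cases needing extra arguments (defaults, no traversal) would still call lookup() directly.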

This is now looking really good and I can understand it. One question though. Sometimes creation of a context variable is done with a name argument, other times not. E.g.:

    var1 = new_context_var('var1')
    var = new_context_var()

The signature is given as:

    sys.new_context_var(name: str)

But it seems like it should be:

    sys.new_context_var(name: Optional[str]=None)

On Aug 25, 2017 3:35 PM, "Yury Selivanov" <yselivanov.ml@gmail.com> wrote: [..]
Goals
=====

The goal of this PEP is to provide a more reliable ``threading.local()`` alternative, which:

* provides the mechanism and the API to fix non-local state issues with coroutines and generators;

* has no or negligible performance impact on the existing code or the code that will be using the new mechanism, including libraries like ``decimal`` and ``numpy``.

High-Level Specification
========================

The full specification of this PEP is broken down into three parts:

* High-Level Specification (this section): the description of the overall solution.  We show how it applies to generators and coroutines in user code, without delving into implementation details.

* Detailed Specification: the complete description of new concepts, APIs, and related changes to the standard library.

* Implementation Details: the description and analysis of data structures and algorithms used to implement this PEP, as well as the necessary changes to CPython.

For the purpose of this section, we define *execution context* as an opaque container of non-local state that allows consistent access to its contents in the concurrent execution environment.

A *context variable* is an object representing a value in the execution context.  A new context variable is created by calling the ``new_context_var()`` function.
A context variable object has two methods:

* ``lookup()``: returns the value of the variable in the current execution context;

* ``set()``: sets the value of the variable in the current execution context.

Regular Single-threaded Code
----------------------------

In regular, single-threaded code that doesn't involve generators or coroutines, context variables behave like globals::

    var = new_context_var()

    def sub():
        assert var.lookup() == 'main'
        var.set('sub')

    def main():
        var.set('main')
        sub()
        assert var.lookup() == 'sub'

Multithreaded Code
------------------

In multithreaded code, context variables behave like thread locals::

    var = new_context_var()

    def sub():
        assert var.lookup() is None  # The execution context is empty
                                     # for each new thread.
        var.set('sub')

    def main():
        var.set('main')

        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()

        assert var.lookup() == 'main'

Generators
----------

In generators, changes to context variables are local and are not visible to the caller, but are visible to the code called by the generator.
Once set in the generator, the context variable is guaranteed not to change between iterations:: var = new_context_var() def gen(): var.set('gen') assert var.lookup() == 'gen' yield 1 assert var.lookup() == 'gen' yield 2 def main(): var.set('main') g = gen() next(g) assert var.lookup() == 'main' var.set('main modified') next(g) assert var.lookup() == 'main modified' Changes to caller's context variables are visible to the generator (unless they were also modified inside the generator):: var = new_context_var() def gen(): assert var.lookup() == 'var' yield 1 assert var.lookup() == 'var modified' yield 2 def main(): g = gen() var.set('var') next(g) var.set('var modified') next(g) Now, let's revisit the decimal precision example from the `Rationale`_ section, and see how the execution context can improve the situation:: import decimal decimal_prec = new_context_var() # create a new context variable # Pre-PEP 550 Decimal relies on TLS for its context. # This subclass switches the decimal context storage # to the execution context for illustration purposes. # class MyDecimal(decimal.Decimal): def __init__(self, value="0"): prec = decimal_prec.lookup() if prec is None: raise ValueError('could not find decimal precision') context = decimal.Context(prec=prec) super().__init__(value, context=context) def fractions(precision, x, y): # Normally, this would be set by a context manager, # but for simplicity we do this directly. decimal_prec.set(precision) yield MyDecimal(x) / MyDecimal(y) yield MyDecimal(x) / MyDecimal(y**2) g1 = fractions(precision=2, x=1, y=3) g2 = fractions(precision=6, x=2, y=3) items = list(zip(g1, g2)) The value of ``items`` is:: [(Decimal('0.33'), Decimal('0.666667')), (Decimal('0.11'), Decimal('0.222222'))] which matches the expected result. 
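The generator semantics described above can be emulated today with a small toy model.  Everything here is hypothetical illustration, not the proposed API: ``ContextVar``, ``lookup()``, ``set_var()``, the ``_ec`` stack, and the ``isolated()`` wrapper are stand-ins for machinery that this PEP puts inside the interpreter, and the sketch ignores threads and coroutines entirely:

```python
# Toy model of the PEP's generator semantics: a process-wide stack of
# "logical contexts" (plain dicts), plus a wrapper that runs every
# step of a generator with the generator's own logical context pushed.

class ContextVar:
    def __init__(self, name):
        self.name = name

_ec = [{}]  # the "execution context": a stack of logical contexts

def lookup(var):
    # Traverse the stack top-to-bottom, like ContextVar.lookup().
    for lc in reversed(_ec):
        if var in lc:
            return lc[var]
    return None

def set_var(var, value):
    # Set the value in the topmost logical context, like ContextVar.set().
    _ec[-1][var] = value

def isolated(genfunc):
    # Give the generator a private logical context; push it before
    # every resumption and pop it on every suspension.
    def wrapper(*args, **kwargs):
        gen = genfunc(*args, **kwargs)
        private_lc = {}
        value = None
        while True:
            _ec.append(private_lc)
            try:
                try:
                    item = gen.send(value)
                except StopIteration:
                    return
            finally:
                _ec.pop()
            value = yield item
    return wrapper

var = ContextVar('var')

@isolated
def gen():
    set_var(var, 'gen')
    yield lookup(var)
    yield lookup(var)

set_var(var, 'main')
g = gen()
assert next(g) == 'gen'       # the generator sees its own value...
assert lookup(var) == 'main'  # ...but the caller's value is untouched
```

This mirrors the behaviour shown above: values set inside the generator shadow the caller's values while the generator runs, and never leak out.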
Coroutines and Asynchronous Tasks
---------------------------------

In coroutines, like in generators, context variable changes are local and are not visible to the caller::

    import asyncio

    var = new_context_var()

    async def sub():
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'

    async def main():
        var.set('main')
        await sub()
        assert var.lookup() == 'main'

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

To establish the full semantics of execution context in coroutines, we must also consider *tasks*.  A task is the abstraction used by *asyncio*, and other similar libraries, to manage the concurrent execution of coroutines.  In the example above, a task is created implicitly by the ``run_until_complete()`` function.  ``asyncio.wait_for()`` is another example of implicit task creation::

    async def sub():
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')

        # waiting for sub() directly
        await sub()

        # waiting for sub() with a timeout
        await asyncio.wait_for(sub(), timeout=2)

        var.set('main changed')

Intuitively, we expect the assertion in ``sub()`` to hold true in both invocations, even though the ``wait_for()`` implementation actually spawns a task, which runs ``sub()`` concurrently with ``main()``.

Thus, tasks **must** capture a snapshot of the current execution context at the moment of their creation and use it to execute the wrapped coroutine whenever that happens.  If this is not done, then innocuous-looking changes like wrapping a coroutine in a ``wait_for()`` call would cause surprising breakage.  This leads to the following::

    import asyncio

    var = new_context_var()

    async def sub():
        # Sleeping will make sub() run after
        # `var` is modified in main().
        await asyncio.sleep(1)
        assert var.lookup() == 'main'

    async def main():
        var.set('main')
        loop.create_task(sub())  # schedules asynchronous execution
                                 # of sub()
        assert var.lookup() == 'main'
        var.set('main changed')

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

In the above code we show how ``sub()``, running in a separate task, sees the value of ``var`` as it was when ``loop.create_task(sub())`` was called.

As with tasks, the intuitive behaviour of callbacks scheduled with ``Loop.call_soon()``, ``Loop.call_later()``, or ``Future.add_done_callback()`` is to capture a snapshot of the current execution context at the point of scheduling, and to use it to run the callback::

    current_request = new_context_var()

    def log_error(e):
        logging.error('error when handling request %r',
                      current_request.lookup())

    async def render_response():
        ...

    async def handle_get_request(request):
        current_request.set(request)
        try:
            return await render_response()
        except Exception as e:
            get_event_loop().call_soon(log_error, e)
            return '500 - Internal Server Error'

Detailed Specification
======================

Conceptually, an *execution context* (EC) is a stack of logical contexts.  There is one EC per Python thread.

A *logical context* (LC) is a mapping of context variables to their values in that particular LC.

A *context variable* is an object representing a value in the execution context.  A new context variable object is created by calling the ``sys.new_context_var(name: str)`` function.  The value of the ``name`` argument is not used by the EC machinery, but may be used for debugging and introspection.

The context variable object has the following methods and attributes:

* ``name``: the value passed to ``new_context_var()``.

* ``lookup()``: traverses the execution context top-to-bottom, until the variable value is found.  Returns ``None`` if the variable is not present in the execution context;

* ``set()``: sets the value of the variable in the topmost logical context.

Generators
----------

When created, each generator object has an empty logical context object stored in its ``__logical_context__`` attribute.
This logical context is pushed onto the execution context at the beginning of each generator iteration and popped at the end::

    var1 = sys.new_context_var('var1')
    var2 = sys.new_context_var('var2')

    def gen():
        var1.set('var1-gen')
        var2.set('var2-gen')

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]
        n = nested_gen()  # nested_gen_LC is created
        next(n)
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
        # ]

        var1.set('var1-gen-mod')
        var2.set('var2-gen-mod')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
        # ]
        next(n)

    def nested_gen():
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC()
        # ]
        assert var1.lookup() == 'var1-gen'
        assert var2.lookup() == 'var2-gen'

        var1.set('var1-nested-gen')
        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        yield

        # EC = [
        #     outer_LC(),
        #     gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
        #     nested_gen_LC({var1: 'var1-nested-gen'})
        # ]
        assert var1.lookup() == 'var1-nested-gen'
        assert var2.lookup() == 'var2-gen-mod'

        yield

    # EC = [outer_LC()]

    g = gen()  # gen_LC is created for the generator object `g`
    list(g)

    # EC = [outer_LC()]

The snippet above shows the state of the execution context stack throughout the generator lifespan.

contextlib.contextmanager
-------------------------

Earlier, we used the following example::

    import decimal

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_prec')

    # ...

    def fractions(precision, x, y):
        decimal_prec.set(precision)
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y**2)

Let's extend it by adding a context manager::

    @contextlib.contextmanager
    def precision_context(prec):
        old_prec = decimal_prec.lookup()
        try:
            decimal_prec.set(prec)
            yield
        finally:
            decimal_prec.set(old_prec)

Unfortunately, this would not work straight away, as the modification to the ``decimal_prec`` variable is confined to the ``precision_context()`` generator, and therefore will not be visible inside the ``with`` block::

    def fractions(precision, x, y):
        # EC = [{}, {}]
        with precision_context(precision):
            # EC becomes [{}, {}, {decimal_prec: precision}] in the
            # *precision_context()* generator,
            # but here the EC is still [{}, {}]

            # raises ValueError('could not find decimal precision')!
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)

The way to fix this is to set the generator's ``__logical_context__`` attribute to ``None``.  This will cause the generator to avoid modifying the execution context stack.

We modify the ``contextlib.contextmanager()`` decorator to set ``genobj.__logical_context__`` to ``None`` to produce well-behaved context managers::

    def fractions(precision, x, y):
        # EC = [{}, {}]
        with precision_context(precision):
            # EC = [{}, {decimal_prec: precision}]
            yield MyDecimal(x) / MyDecimal(y)
            yield MyDecimal(x) / MyDecimal(y**2)
        # EC becomes [{}, {decimal_prec: None}]

asyncio
-------

``asyncio`` uses ``Loop.call_soon``, ``Loop.call_later``, and ``Loop.call_at`` to schedule the asynchronous execution of a function.  ``asyncio.Task`` uses ``call_soon()`` to further the execution of the wrapped coroutine.

We modify ``Loop.call_{at,later,soon}`` to accept the new optional *execution_context* keyword argument, which defaults to the copy of the current execution context::

    def call_soon(self, callback, *args, execution_context=None):
        if execution_context is None:
            execution_context = sys.get_execution_context()

        # ... some time later
        sys.run_with_execution_context(
            execution_context, callback, args)

The ``sys.get_execution_context()`` function returns a shallow copy of the current execution context.  By shallow copy here we mean such a new execution context that:

* lookups in the copy provide the same results as in the original execution context, and

* any changes in the original execution context do not affect the copy, and

* any changes to the copy do not affect the original execution context.

Either of the following satisfies the copy requirements:

* a new stack with shallow copies of logical contexts;

* a new stack with one squashed logical context.

The ``sys.run_with_execution_context(ec, func, *args, **kwargs)`` function runs ``func(*args, **kwargs)`` with *ec* as the execution context.  The function performs the following steps:

1. Set *ec* as the current execution context stack in the current thread.

2. Push an empty logical context onto the stack.

3. Run ``func(*args, **kwargs)``.

4. Pop the logical context from the stack.

5. Restore the original execution context stack.

6. Return or raise the ``func()`` result.

These steps ensure that *ec* cannot be modified by *func*, which makes ``run_with_execution_context()`` idempotent.

``asyncio.Task`` is modified as follows::

    class Task:
        def __init__(self, coro):
            ...
            # Get the current execution context snapshot.
            self._exec_context = sys.get_execution_context()

            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)

        def _step(self, exc=None):
            ...
            self._loop.call_soon(
                self._step,
                execution_context=self._exec_context)
            ...

Generators Transformed into Iterators
-------------------------------------

Any Python generator can be represented as an equivalent iterator.  Compilers like Cython rely on this axiom.  With respect to the execution context, such an iterator should behave the same way as the generator it represents.
This means that there needs to be a Python API to create new logical contexts and run code with a given logical context.

The ``sys.new_logical_context()`` function creates a new empty logical context.

The ``sys.run_with_logical_context(lc, func, *args, **kwargs)`` function can be used to run functions in the specified logical context.  The *lc* can be modified as a result of the call.

The ``sys.run_with_logical_context()`` function performs the following steps:

1. Push *lc* onto the current execution context stack.

2. Run ``func(*args, **kwargs)``.

3. Pop *lc* from the execution context stack.

4. Return or raise the ``func()`` result.

By using ``new_logical_context()`` and ``run_with_logical_context()``, we can replicate the generator behaviour like this::

    class Generator:

        def __init__(self):
            self.logical_context = sys.new_logical_context()

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            # Actual __next__ implementation.
            ...

Let's see how this pattern can be applied to a real generator::

    # create a new context variable
    decimal_prec = sys.new_context_var('decimal_precision')

    def gen_series(n, precision):
        decimal_prec.set(precision)

        for i in range(1, n):
            yield MyDecimal(i) / MyDecimal(3)

    # gen_series is equivalent to the following iterator:

    class Series:

        def __init__(self, n, precision):
            # Create a new empty logical context on creation,
            # like the generators do.
            self.logical_context = sys.new_logical_context()

            # run_with_logical_context() pushes
            # self.logical_context onto the execution context stack,
            # runs self._init, and pops self.logical_context
            # from the stack.
            sys.run_with_logical_context(
                self.logical_context, self._init, n, precision)

        def _init(self, n, precision):
            self.i = 1
            self.n = n
            self.precision = precision
            decimal_prec.set(precision)

        def __iter__(self):
            return self

        def __next__(self):
            return sys.run_with_logical_context(
                self.logical_context, self._next_impl)

        def _next_impl(self):
            decimal_prec.set(self.precision)
            result = MyDecimal(self.i) / MyDecimal(3)
            self.i += 1
            return result

For regular iterators, such an approach to logical context management is normally not necessary, and it is recommended to set and restore context variables directly in ``__next__``::

    class Series:

        def __next__(self):
            old_prec = decimal_prec.lookup()
            try:
                decimal_prec.set(self.precision)
                ...
            finally:
                decimal_prec.set(old_prec)

Asynchronous Generators
-----------------------

The execution context semantics in asynchronous generators do not differ from those of regular generators and coroutines.

Implementation
==============

Execution context is implemented as an immutable linked list of logical contexts, where each logical context is an immutable weak key mapping.  A pointer to the currently active execution context is stored in the OS thread state::

    +-----------------+
    |                 |  ec
    |  PyThreadState  +----------------------------+
    |                 |                            |
    +-----------------+                            |
                                                   |
       ec_node            ec_node            ec_node
                                                   v
    +------+------+     +------+------+     +------+------+
    | NULL |  lc  |<----| prev |  lc  |<----| prev |  lc  |
    +------+--+---+     +------+--+---+     +------+--+---+
              |                   |                   |
    LC        v         LC        v         LC        v
    +-------------+     +-------------+     +-------------+
    | var1: obj1  |     |    EMPTY    |     | var1: obj4  |
    | var2: obj2  |     +-------------+     +-------------+
    | var3: obj3  |
    +-------------+

The choice of the immutable list of immutable mappings as a fundamental data structure is motivated by the need to efficiently implement ``sys.get_execution_context()``, which is to be frequently used by asynchronous tasks and callbacks.
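This structure can be sketched in pure Python.  The following is a toy model only: the helper names (``lc_with``, ``ec_set``, ``ec_lookup``) are hypothetical, plain dicts stand in for the HAMT-based mappings, and the real machinery lives in the interpreter:

```python
from collections import namedtuple
from types import MappingProxyType

# ec_node mirrors the diagram above: a linked list of immutable mappings.
ec_node = namedtuple('ec_node', 'prev lc')

EMPTY_LC = MappingProxyType({})

def lc_with(lc, var, value):
    # "Updating" an immutable mapping produces a new mapping; the old
    # one is shared, untouched, by any snapshot that still points to it.
    new = dict(lc)
    new[var] = value
    return MappingProxyType(new)

def ec_set(ec, var, value):
    # Like ContextVar.set(): replace only the topmost node.
    return ec_node(prev=ec.prev, lc=lc_with(ec.lc, var, value))

def ec_lookup(ec, var):
    # Like ContextVar.lookup(): walk the chain top-to-bottom.
    node = ec
    while node is not None:
        if var in node.lc:
            return node.lc[var]
        node = node.prev
    return None

# Because nodes are never mutated, taking a snapshot is free:
ec1 = ec_node(prev=None, lc=lc_with(EMPTY_LC, 'var', 'old'))
snapshot = ec1                   # "get_execution_context()": by reference
ec2 = ec_set(ec1, 'var', 'new')  # a later modification

assert ec_lookup(snapshot, 'var') == 'old'  # snapshot is unaffected
assert ec_lookup(ec2, 'var') == 'new'
```

The snapshot-by-reference trick at the end is exactly what makes this layout attractive for tasks and callbacks.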
When the EC is immutable, ``get_execution_context()`` can simply copy the current execution context *by reference*::

    def get_execution_context(self):
        return PyThreadState_Get().ec

Let's review all possible context modification scenarios:

* The ``ContextVar.set()`` method is called::

    def ContextVar_set(self, val):
        # See a more complete set() definition
        # in the `Context Variables` section.

        tstate = PyThreadState_Get()

        top_ec_node = tstate.ec
        top_lc = top_ec_node.lc
        new_top_lc = top_lc.set(self, val)
        tstate.ec = ec_node(
            prev=top_ec_node.prev, lc=new_top_lc)

* ``sys.run_with_logical_context()`` is called, in which case the passed logical context object is appended to the execution context::

    def run_with_logical_context(lc, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* ``sys.run_with_execution_context()`` is called, in which case the current execution context is set to the passed execution context with a new empty logical context appended to it::

    def run_with_execution_context(ec, func, *args, **kwargs):
        tstate = PyThreadState_Get()

        old_top_ec_node = tstate.ec
        new_lc = sys.new_logical_context()
        new_top_ec_node = ec_node(prev=ec, lc=new_lc)

        try:
            tstate.ec = new_top_ec_node
            return func(*args, **kwargs)
        finally:
            tstate.ec = old_top_ec_node

* One of ``genobj.send()``, ``genobj.throw()``, or ``genobj.close()`` is called on a ``genobj`` generator, in which case the logical context recorded in ``genobj`` is pushed onto the stack::

    PyGen_New(PyGenObject *gen):
        gen.__logical_context__ = sys.new_logical_context()

    gen_send(PyGenObject *gen, ...):
        tstate = PyThreadState_Get()

        if gen.__logical_context__ is not None:
            old_top_ec_node = tstate.ec
            new_top_ec_node = ec_node(
                prev=old_top_ec_node,
                lc=gen.__logical_context__)

            try:
                tstate.ec = new_top_ec_node
                return _gen_send_impl(gen, ...)
            finally:
                gen.__logical_context__ = tstate.ec.lc
                tstate.ec = old_top_ec_node
        else:
            return _gen_send_impl(gen, ...)

* Coroutines and asynchronous generators share the implementation with generators, and the above changes apply to them as well.

In certain scenarios the EC may need to be squashed to limit the size of the chain.  For example, consider the following corner case::

    async def repeat(coro, delay):
        await coro()
        await asyncio.sleep(delay)
        loop.create_task(repeat(coro, delay))

    async def ping():
        print('ping')

    loop = asyncio.get_event_loop()
    loop.create_task(repeat(ping, 1))
    loop.run_forever()

In the above code, the EC chain will grow as long as ``repeat()`` is called.  Each new task will call ``sys.run_with_execution_context()``, which will append a new logical context to the chain.  To prevent unbounded growth, ``sys.get_execution_context()`` checks if the chain is longer than a predetermined maximum, and if it is, squashes the chain into a single LC::

    def get_execution_context():
        tstate = PyThreadState_Get()

        if tstate.ec_len > EC_LEN_MAX:
            squashed_lc = sys.new_logical_context()

            node = tstate.ec
            while node:
                # The LC.merge() method does not replace
                # existing keys.
                squashed_lc = squashed_lc.merge(node.lc)
                node = node.prev

            return ec_node(prev=NULL, lc=squashed_lc)
        else:
            return tstate.ec

Logical Context
---------------

Logical context is an immutable weak key mapping which has the following properties with respect to garbage collection:

* ``ContextVar`` objects are strongly-referenced only from the application code, not from any of the Execution Context machinery or values they point to.  This means that there are no reference cycles that could extend their lifespan longer than necessary, or prevent their collection by the GC.

* Values put in the Execution Context are guaranteed to be kept alive while there is a ``ContextVar`` key referencing them in the thread.
* If a ``ContextVar`` is garbage collected, all of its values will be removed from all contexts, allowing them to be GCed if needed.

* If a thread has ended its execution, its thread state will be cleaned up along with its ``ExecutionContext``, cleaning up all values bound to all context variables in the thread.

As discussed earlier, we need ``sys.get_execution_context()`` to be consistently fast regardless of the size of the execution context, so logical context is necessarily an immutable mapping.

Choosing ``dict`` for the underlying implementation is suboptimal, because ``LC.set()`` will cause ``dict.copy()``, which is an O(N) operation, where *N* is the number of items in the LC.

``get_execution_context()``, when squashing the EC, is an O(M) operation, where *M* is the total number of context variable values in the EC.

So, instead of ``dict``, we choose Hash Array Mapped Trie (HAMT) as the underlying implementation of logical contexts.  (Scala and Clojure use HAMT to implement high performance immutable collections [5]_, [6]_.)

With HAMT ``.set()`` becomes an O(log N) operation, and ``get_execution_context()`` squashing is more efficient on average due to structural sharing in HAMT.

See `Appendix: HAMT Performance Analysis`_ for a more elaborate analysis of HAMT performance compared to ``dict``.
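To see concretely why a plain ``dict`` is a poor fit, here is what an immutable-style update costs with dicts.  This is an illustration only (``immutable_set`` is a hypothetical helper, and the proposed implementation uses a HAMT instead):

```python
import timeit

def immutable_set(d, key, value):
    # An "immutable" update on a dict must copy all N existing
    # items before adding the new one -- this is the O(N) cost
    # of LC.set() with a dict-backed logical context.
    new = dict(d)
    new[key] = value
    return new

small = {i: i for i in range(10)}
large = {i: i for i in range(10_000)}

t_small = timeit.timeit(lambda: immutable_set(small, -1, 0), number=1000)
t_large = timeit.timeit(lambda: immutable_set(large, -1, 0), number=1000)

# The update leaves the original untouched (the whole point of
# immutability), but its cost grows linearly with the mapping size;
# a HAMT update is O(log N) thanks to structural sharing.
assert -1 not in small
assert immutable_set(small, -1, 0)[-1] == 0
```

The gap between ``t_small`` and ``t_large`` grows with the mapping size, which is exactly the scaling problem the HAMT avoids.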
Context Variables
-----------------

The ``ContextVar.lookup()`` and ``ContextVar.set()`` methods are implemented as follows (in pseudo-code)::

    class ContextVar:

        def lookup(self):
            tstate = PyThreadState_Get()

            ec_node = tstate.ec
            while ec_node:
                if self in ec_node.lc:
                    return ec_node.lc[self]
                ec_node = ec_node.prev

            return None

        def set(self, value):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is not None:
                top_lc = top_ec_node.lc
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=top_ec_node.prev, lc=new_top_lc)
            else:
                top_lc = sys.new_logical_context()
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=NULL, lc=new_top_lc)

For efficient access in performance-sensitive code paths, such as in ``numpy`` and ``decimal``, we add a cache to ``ContextVar.lookup()``, making it an O(1) operation when the cache is hit.  The cache key is composed from the following:

* The new ``uint64_t PyThreadState->unique_id``, which is a globally unique thread state identifier.  It is computed from the new ``uint64_t PyInterpreterState->ts_counter``, which is incremented whenever a new thread state is created.

* The ``uint64_t ContextVar->version`` counter, which is incremented whenever the context variable value is changed in any logical context in any thread.

The cache is then implemented as follows::

    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def lookup(self):
            tstate = PyThreadState_Get()

            if (self.last_tstate_id == tstate.unique_id and
                    self.last_version == self.version):
                return self.last_value

            value = self._lookup_uncached()

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_version = self.version

            return value

Note that ``last_value`` is a borrowed reference.  The assumption is that if the version checks are fine, the object will be alive.  This allows the values of context variables to be properly garbage collected.
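The versioning idea behind the cache can be modeled in pure Python.  This is a sketch under simplifying assumptions: thread-state ids and borrowed references are omitted, a plain dict stands in for the EC traversal, and ``CachedVar`` is a hypothetical name:

```python
class CachedVar:
    # Toy model of the version-based lookup cache described above.
    def __init__(self, storage):
        self._storage = storage      # stands in for the EC machinery
        self.version = 0
        self._cached_version = -1    # sentinel: nothing cached yet
        self._cached_value = None
        self.uncached_lookups = 0    # instrumentation for this example

    def set(self, value):
        self._storage[self] = value
        self.version += 1            # invalidates every cached value

    def lookup(self):
        if self._cached_version == self.version:
            return self._cached_value          # O(1) fast path
        self.uncached_lookups += 1
        value = self._storage.get(self)        # the slow EC traversal
        self._cached_value = value
        self._cached_version = self.version
        return value

storage = {}
var = CachedVar(storage)
var.set(42)
assert var.lookup() == 42
assert var.lookup() == 42
assert var.uncached_lookups == 1   # second lookup hit the cache
var.set(43)
assert var.lookup() == 43          # version bump invalidated the cache
assert var.uncached_lookups == 2
```

The real design additionally keys the cache on the thread state id, so that a value cached in one thread is never returned in another.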
This generic caching approach is similar to what the current C implementation of ``decimal`` does to cache the current decimal context, and has similar performance characteristics.

Performance Considerations
==========================

Tests of the reference implementation based on the prior revisions of this PEP have shown 1-2% slowdown on generator microbenchmarks and no noticeable difference in macrobenchmarks.

The performance of non-generator and non-async code is not affected by this PEP.

Summary of the New APIs
=======================

Python
------

The following new Python APIs are introduced by this PEP:

1. The ``sys.new_context_var(name: str='...')`` function to create ``ContextVar`` objects.

2. The ``ContextVar`` object, which has:

   * the read-only ``.name`` attribute,

   * the ``.lookup()`` method which returns the value of the variable in the current execution context;

   * the ``.set()`` method which sets the value of the variable in the current execution context.

3. The ``sys.get_execution_context()`` function, which returns a copy of the current execution context.

4. The ``sys.new_execution_context()`` function, which returns a new empty execution context.

5. The ``sys.new_logical_context()`` function, which returns a new empty logical context.

6. The ``sys.run_with_execution_context(ec: ExecutionContext, func, *args, **kwargs)`` function, which runs *func* with the provided execution context.

7. The ``sys.run_with_logical_context(lc: LogicalContext, func, *args, **kwargs)`` function, which runs *func* with the provided logical context on top of the current execution context.

C API
-----

1. ``PyContextVar * PyContext_NewVar(char *desc)``: create a ``PyContextVar`` object.

2. ``PyObject * PyContext_LookupVar(PyContextVar *)``: return the value of the variable in the current execution context.

3. ``int PyContext_SetVar(PyContextVar *, PyObject *)``: set the value of the variable in the current execution context.

4.
``PyLogicalContext * PyLogicalContext_New()``: create a new empty ``PyLogicalContext``.

5. ``PyExecutionContext * PyExecutionContext_New()``: create a new empty ``PyExecutionContext``.

6. ``PyExecutionContext * PyExecutionContext_Get()``: return the current execution context.

7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the passed EC object as the current for the active thread state.

8. ``int PyExecutionContext_SetWithLogicalContext(PyExecutionContext *, PyLogicalContext *)``: allows implementing the ``sys.run_with_logical_context`` Python API.

Design Considerations
=====================

Should ``PyThreadState_GetDict()`` use the execution context?
-------------------------------------------------------------

No.  ``PyThreadState_GetDict`` is based on TLS, and changing its semantics will break backwards compatibility.

PEP 521
-------

:pep:`521` proposes an alternative solution to the problem, which extends the context manager protocol with two new methods: ``__suspend__()`` and ``__resume__()``.  Similarly, the asynchronous context manager protocol is also extended with ``__asuspend__()`` and ``__aresume__()``.

This allows implementing context managers that manage non-local state, which behave correctly in generators and coroutines.

For example, consider the following context manager, which uses execution state::

    class Context:

        def __init__(self):
            self.var = new_context_var('var')

        def __enter__(self):
            self.old_x = self.var.lookup()
            self.var.set('something')

        def __exit__(self, *err):
            self.var.set(self.old_x)

An equivalent implementation with PEP 521::

    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x

The downside of this approach is the addition of significant new complexity to the context manager protocol and the interpreter implementation.
This approach is also likely to negatively impact the performance of generators and coroutines.

Additionally, the solution in :pep:`521` is limited to context managers, and does not provide any mechanism to propagate state in asynchronous tasks and callbacks.

Can Execution Context be implemented outside of CPython?
--------------------------------------------------------

No.  Proper generator behaviour with respect to the execution context requires changes to the interpreter.

Should we update sys.displayhook and other APIs to use EC?
----------------------------------------------------------

APIs like redirecting stdout by overwriting ``sys.stdout``, or specifying new exception display hooks by overwriting the ``sys.displayhook`` function are affecting the whole Python process **by design**.  Their users assume that the effect of changing them will be visible across OS threads.  Therefore we cannot just make these APIs use the new Execution Context.

That said, we think it is possible to design new APIs that will be context aware, but that is outside of the scope of this PEP.

Greenlets
---------

Greenlet is an alternative implementation of cooperative scheduling for Python.  Although the greenlet package is not part of CPython, popular frameworks like gevent rely on it, and it is important that greenlet can be modified to support execution contexts.

Conceptually, the behaviour of greenlets is very similar to that of generators, which means that similar changes around greenlet entry and exit can be done to add support for execution context.

Backwards Compatibility
=======================

This proposal preserves 100% backwards compatibility.

Appendix: HAMT Performance Analysis
===================================

.. figure:: pep-0550-hamt_vs_dict-v2.png
   :align: center
   :width: 100%

   Figure 1.  Benchmark code can be found here: [9]_.

The above chart demonstrates that:

* HAMT displays near O(1) performance for all benchmarked dictionary sizes.
* ``dict.copy()`` becomes very slow around 100 items.

.. figure:: pep-0550-lookup_hamt.png
   :align: center
   :width: 100%

   Figure 2.  Benchmark code can be found here: [10]_.

Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based immutable mapping.  HAMT lookup time is 30-40% slower than Python dict lookups on average, which is a very good result, considering that the latter is very well optimized.

There is research [8]_ showing that there are further possible improvements to the performance of HAMT.

The reference implementation of HAMT for CPython can be found here: [7]_.

Acknowledgments
===============

Thanks to Victor Petrovykh for countless discussions around the topic and PEP proofreading and edits.

Thanks to Nathaniel Smith for proposing the ``ContextVar`` design [17]_ [18]_, for pushing the PEP towards a more complete design, and coming up with the idea of having a stack of contexts in the thread state.

Thanks to Nick Coghlan for numerous suggestions and ideas on the mailing list, and for coming up with a case that caused the complete rewrite of the initial PEP version [19]_.

Version History
===============

1. Initial revision, posted on 11-Aug-2017 [20]_.

2. V2 posted on 15-Aug-2017 [21]_.

   The fundamental limitation that caused a complete redesign of the first version was that it was not possible to implement an iterator that would interact with the EC in the same way as generators (see [19]_.)

   Version 2 was a complete rewrite, introducing new terminology (Local Context, Execution Context, Context Item) and new APIs.

3. V3 posted on 18-Aug-2017 [22]_.  Updates:

   * Local Context was renamed to Logical Context.  The term "local" was ambiguous and conflicted with local name scopes.

   * Context Item was renamed to Context Key, see the thread with Nick Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.

   * Context Item get cache design was adjusted, per Nathaniel Smith's idea in [25]_.
* Coroutines are created without a Logical Context; ceval loop no longer needs to special case the ``await`` expression (proposed by Nick Coghlan in [24]_.)

4. V4 posted on 25-Aug-2017: the current version.

   * The specification section has been completely rewritten.

   * Context Key renamed to Context Var.

   * Removed the distinction between generators and coroutines with respect to logical context isolation.

References
==========

.. [1] https://blog.golang.org/context

.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx

.. [3] https://github.com/numpy/numpy/issues/9444

.. [4] http://bugs.python.org/issue31179

.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie

.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html

.. [7] https://github.com/1st1/cpython/tree/hamt

.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf

.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd

.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e

.. [11] https://github.com/1st1/cpython/tree/pep550

.. [12] https://www.python.org/dev/peps/pep-0492/#async-await

.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py

.. [14] https://github.com/MagicStack/pgbench

.. [15] https://github.com/python/performance

.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c

.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html

.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html

.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

.. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d01/pep-0550.rst

.. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c175/pep-0550.rst

.. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e7/pep-0550.rst
.. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html

.. [24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html

.. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html

Copyright
=========

This document has been placed in the public domain.

On Sat, Aug 26, 2017 at 12:56 AM, David Mertz <mertz@gnosis.cx> wrote:
This is now looking really good and I can understand it.
Great!
We were very focused on making the High-level Specification as succinct as possible, omitting some API details that are not important for understanding the semantics.

The "name" argument is not optional and will be required. If it's optional, people will not provide it, making it very hard to introspect the context when we want it. I guess we'll just update the High-level Specification section to use the correct signature of "new_context_var".

Yury

Would it be possible/desirable to make the default a unique string value like a UUID or a stringified counter?

On Sat, Aug 26, 2017 at 1:10 PM, David Mertz <mertz@gnosis.cx> wrote:
Would it be possible/desirable to make the default a unique string value like a UUID or a stringified counter?
Sure, or we could just use the id of the ContextVar. In the end, when we want to introspect the EC while debugging, we would see something like this:

    {
      ContextVar(name='518CDD4F-D676-408F-B968-E144F792D055'): 42,
      ContextVar(name='decimal_context'): DecimalContext(precision=2),
      ContextVar(name='7A44D3BE-F7A1-40B7-BE51-7DFFA7E0E02F'): 'spam'
    }

That's why I think it's easier to force users to always specify the name:

    my_var = sys.new_context_var('my_var')

This is similar to namedtuples, and nobody really complains about them.

Yury

On Sat, Aug 26, 2017 at 11:19 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This is similar to namedtuples, and nobody really complains about them.
FWIW, there are plenty of complaints on python-ideas about this (and never a satisfactory solution). :) That said, I don't think it is as big a deal here since the target audience is much smaller. -eric

On Fri, Aug 25, 2017 at 3:32 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I think this change is a bad idea. I think that generally, an async call like 'await async_sub()' should have the equivalent semantics to a synchronous call like 'sync_sub()', except for the part where the former is able to contain yields. Giving every coroutine an LC breaks that equivalence. It also makes it so that in async code you can't necessarily refactor by moving code in and out of subroutines. Like, if we inline 'sub' into 'main', that shouldn't change the semantics, but...

    async def main():
        var.set('main')
        # inlined copy of sub()
        assert var.lookup() == 'main'
        var.set('sub')
        assert var.lookup() == 'sub'
        # end of inlined copy
        assert var.lookup() == 'main'  # fails

It also adds non-trivial overhead, because now lookup() is O(depth of async callstack), instead of O(depth of (async) generator nesting), which is generally much smaller.

I think I see the motivation: you want to make "await sub()" and "await ensure_future(sub())" have the same semantics, right? And the latter has to create a Task and split it off into a new execution context, so you want the former to do so as well? But to me this is like saying that we want "sync_sub()" and "thread_pool_executor.submit(sync_sub).result()" to have the same semantics: they mostly do, but if sync_sub() accesses thread-locals then they won't. Oh well. That's perhaps a bit unfortunate, but it doesn't mean we should give every synchronous frame its own thread-locals. (And fwiw I'm still not convinced we should give up on 'yield from' as a mechanism for refactoring generators.)
I found this example confusing -- you talk about sub() and main() running concurrently, but ``wait_for`` blocks main() until sub() has finished running, right? Is this just supposed to show that there should be some sort of inheritance across tasks, and then the next example is to show that it has to be a copy rather than sharing the actual object? (This is just an issue of phrasing/readability.)
It occurs to me that both this and the way generators/coroutines expose their logical context means that logical context objects are semantically mutable. This could create weird effects if someone attaches the same LC to two different generators, or tries to use it simultaneously in two different threads, etc. We should have a little interlock like async generators' ag_running, where an LC keeps track of whether it's currently in use, and if you try to push the same LC onto two ECs simultaneously then it errors out.
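The interlock described above could look roughly like this (a toy sketch under invented names -- `LogicalContext`, `push_lc`, and the `in_use` flag are illustrations, not PEP 550 API):

```python
class LogicalContext:
    """Toy logical context with an ag_running-style in-use flag."""
    def __init__(self):
        self.vars = {}
        self.in_use = False  # True while pushed onto some EC

class ExecutionContext:
    """Toy execution context: a stack of logical contexts."""
    def __init__(self):
        self._stack = []

    def push_lc(self, lc):
        if lc.in_use:
            # The same mutable LC is already active elsewhere: error out
            # instead of silently sharing state between two stacks.
            raise RuntimeError("logical context already in use")
        lc.in_use = True
        self._stack.append(lc)

    def pop_lc(self):
        lc = self._stack.pop()
        lc.in_use = False
        return lc

ec1, ec2 = ExecutionContext(), ExecutionContext()
lc = LogicalContext()
ec1.push_lc(lc)
try:
    ec2.push_lc(lc)  # second simultaneous push is rejected
    conflict = False
except RuntimeError:
    conflict = True
```

Once `ec1` pops the LC, pushing it onto `ec2` succeeds -- the same "one user at a time" discipline that ag_running enforces for async generators.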
I'm pretty sure you need to also invalidate on context push/pop. Consider:

    def gen():
        var.set("gen")
        var.lookup()  # cache now holds "gen"
        yield
        print(var.lookup())

    def main():
        var.set("main")
        g = gen()
        next(g)
        # This should print "main", but it's the same thread and the
        # last call to set() was the one inside gen(), so we get the
        # cached "gen" instead
        print(var.lookup())
        var.set("no really main")
        var.lookup()  # cache now holds "no really main"
        next(g)  # should print "gen" but instead prints "no really main"
I think you missed a s/get/lookup/ here :-) -n -- Nathaniel J. Smith -- https://vorpus.org

On Saturday, August 26, 2017 2:34:29 AM EDT Nathaniel Smith wrote:
If we could easily, we'd give each _normal function_ its own logical context as well. What we are talking about here is variable scope leaking up the call stack. I think this is a bad pattern. For decimal context-like uses of the EC you should always use a context manager. For uses like Web request locals, you always have a top-level function that sets the context vars.
What we want is for `await sub()` to be equivalent to `await asyncio.wait_for(sub())` and to `await asyncio.gather(sub())`.

Imagine we allow context var changes to leak out of `async def`. It's easy to write code that relies on this:

    async def init():
        var.set('foo')

    async def main():
        await init()
        assert var.lookup() == 'foo'

If we change `await init()` to `await asyncio.wait_for(init())`, the code will break (and in the real world, possibly very subtly).
You would hit the cache in lookup() most of the time.

Elvis

On Sat, Aug 26, 2017 at 7:58 AM, Elvis Pranskevichus <elprans@gmail.com> wrote:
I mean... you could do that. It'd be easy to do technically, right? But it would make the PEP useless, because then projects like decimal and numpy couldn't adopt it without breaking backcompat, meaning they couldn't adopt it at all. The backcompat argument isn't there in the same way for async code, because it's new and these functions have generally been broken there anyway. But it's still kinda there in spirit: there's a huge amount of collective knowledge about how (synchronous) Python code works, and IMO async code should match that whenever possible.
It's perfectly reasonable to have a script where you call decimal.setcontext or np.seterr somewhere at the top to set the defaults for the rest of the script. Yeah, maybe it'd be a bit cleaner to use a 'with' block wrapped around main(), and certainly in a complex app you want to stick to that, but Python isn't just used for complex apps :-). I foresee confused users trying to figure out why np.seterr suddenly stopped working when they ported their app to use async.

This also seems like it makes some cases much trickier. Like, say you have an async context manager that wants to manipulate a context local. If you write 'async def __aenter__', you just lost -- it'll be isolated. I think you have to write some awkward thing like:

    def __aenter__(self):
        coro = self._real_aenter()
        coro.__logical_context__ = None
        return coro

It would be really nice if libraries like urllib3/requests supported async as an option, but it's difficult because they can't drop support for synchronous operation and Python 2, and we want to keep a single codebase. One option I've been exploring is to write them in "synchronous style" but with async/await keywords added, and then generating a py2-compatible version with a script that strips out async/await etc. (Like a really simple 3to2 that just works at the token level.) One transformation you'd want to apply is replacing __aenter__ -> __enter__, but this gets much more difficult if we have to worry about elaborate transformations like the above...

If I have an async generator, and I set its __logical_context__ to None, then do I also have to set this attribute on every coroutine returned from calling __anext__/asend/athrow/aclose?
I don't feel like there's any need to make gather() have exactly the same semantics as a regular call -- it's pretty clearly a task-spawning primitive that runs all of the given coroutines concurrently, so it makes sense that it would have task-spawning semantics rather than call semantics.

wait_for is a more unfortunate case; there's really no reason for it to create a Task at all, except that asyncio made the decision to couple cancellation and Tasks, so if you want one then you're stuck with the other. Yury's made some comments about stealing Trio's cancellation system and adding it to asyncio -- I don't know how serious he was. If he does, then it would let you use timeouts without creating a new Task, and this problem would go away.

OTOH if you stick with pushing a new LC on every coroutine call, then that makes Trio's cancellation system way slower, because it has to walk the whole stack of LCs on every yield to register/unregister each cancel scope. PEP 550 v4 makes that stack much deeper, plus breaks the optimization I was planning to use to let us mostly skip this entirely. (To be clear, this isn't the main reason I think these semantics are a bad idea -- the main reason is that I think async and sync code should have the same semantics. But it definitely doesn't help that it creates obstacles to improving asyncio/improving on asyncio.)
But instead you're making it so that it will break if the user adds/removes async/await keywords:

    def init():
        var.set('foo')

    def main():
        init()
You've just reduced the cache hit rate too, because the cache gets invalidated on every push/pop. Presumably you'd optimize this to skip invalidating if the LC that gets pushed/popped is empty, so this isn't as catastrophic as it might initially look, but you still have to invalidate all the cached variables every time any variable gets touched and then you return from a function. Which might happen quite a bit if, for example, using timeouts involves touching the LC :-). -n -- Nathaniel J. Smith -- https://vorpus.org

On Sun, Aug 27, 2017 at 6:08 AM, Stefan Krah <stefan@bytereef.org> wrote:
TBH Nathaniel's argument isn't entirely correct. With the semantics defined in PEP 550 v4, you can still set the decimal context at the top of your file, in your async functions, etc.

This will work:

    decimal.setcontext(ctx)

    def foo():
        # use decimal with context=ctx

and this:

    def foo():
        decimal.setcontext(ctx)
        # use decimal with context=ctx

and this:

    def bar():
        # use decimal with context=ctx

    def foo():
        decimal.setcontext(ctx)
        bar()

and this:

    def bar():
        decimal.setcontext(ctx)

    def foo():
        bar()
        # use decimal with context=ctx

and this:

    decimal.setcontext(ctx)

    async def foo():
        # use decimal with context=ctx

and this:

    async def bar():
        # use decimal with context=ctx

    async def foo():
        decimal.setcontext(ctx)
        await bar()

The only thing that will not work is this (ex1):

    async def bar():
        decimal.setcontext(ctx)

    async def foo():
        await bar()
        # use decimal with context=ctx

The reason why this one example worked in PEP 550 v3 and doesn't work in v4 is that we want to avoid random code breakage if you wrap your coroutine in a task, like here (ex2):

    async def bar():
        decimal.setcontext(ctx)

    async def foo():
        await wait_for(bar(), 1)
        # use decimal with context=ctx

We want (ex1) and (ex2) to work the same way always. That's the only difference in semantics between v3 and v4, and it's the only sane one, because implicit task creation is an extremely subtle detail that most users aren't aware of. We can't have semantics that let you easily break your code by adding a timeout in one await.

Speaking of (ex1), there's an example that didn't work in any PEP 550 version:

    def bar():
        decimal.setcontext(ctx)
        yield

    async def foo():
        list(bar())
        # use decimal with context=ctx

In the above code, the bar() generator sets some decimal context, and it will not leak outside of it. This semantics is one of PEP 550's goals. The last change just unifies this semantics for coroutines, generators, and asynchronous generators, which is a good thing.

Yury

On Sun, Aug 27, 2017 at 11:19:20AM -0400, Yury Selivanov wrote:
Okay, so if I understand this correctly we actually will not have dynamic scoping for regular functions: bar() has returned, so the new context would not be found on the stack with proper dynamic scoping.
Here we do have dynamic scoping.
What about this?

    async def bar():
        setcontext(Context(prec=1))
        for i in range(10):
            await asyncio.sleep(1)
            yield i

    async def foo():
        async for i in bar():
            # ctx.prec=1?
            print(Decimal(100) / 3)

I'm searching for some abstract model to reason about the scopes.

Stefan Krah

On Mon, Aug 28, 2017 at 7:19 AM, Stefan Krah <stefan@bytereef.org> wrote:
Correct. Although I would avoid associating PEP 550 with dynamic scoping entirely, as we never intended to implement it. [..]
Whatever is set in coroutines, generators, and async generators does not leak out. In the above example, "prec=1" will only be set inside "bar()", and "foo()" will not see that. (Same will happen for a regular function and a generator). Yury

On Mon, Aug 28, 2017 at 11:23:12AM -0400, Yury Selivanov wrote:
Good, I agree it does not make sense.
But the state "leaks in" as per your previous example:

    async def bar():
        # use decimal with context=ctx

    async def foo():
        decimal.setcontext(ctx)
        await bar()

IMHO it shouldn't with coroutine-local storage (let's call it CLS). So, as I see it, there's still some mixture between dynamic scoping and CLS, because in this example bar() is allowed to search the stack.

Stefan Krah

On Mon, Aug 28, 2017 at 11:52 AM, Stefan Krah <stefan@bytereef.org> wrote: [..]
The whole proposal will then be mostly useless. If we forget about dynamic scoping (I don't know why it's being brought up all the time, TBH; nobody uses it, almost no language implements it), the current proposal is well balanced and solves multiple problems. Three points listed in the rationale section:

* Context managers like decimal contexts, numpy.errstate, and warnings.catch_warnings.

* Request-related data, such as security tokens and request data in web applications, language context for gettext etc.

* Profiling, tracing, and logging in large code bases.

Two of them require context propagation *down* the stack of coroutines. What the latest PEP 550 revision does is prohibit context propagation *up* the stack in coroutines (it's a requirement to make async code refactorable and easy to reason about). Propagation of context "up" the stack in regular code is allowed with threading.local(), and everybody is used to it. Doing that for coroutines doesn't work, because of the reasons covered here:

https://www.python.org/dev/peps/pep-0550/#coroutines-and-asynchronous-tasks

Yury
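The "propagation up the stack" that threading.local() permits in regular code can be shown in a few lines (a plain illustration of the status quo, not PEP 550 API):

```python
import threading

tls = threading.local()

def callee():
    # A thread-local set inside a callee...
    tls.token = "set-in-callee"

def caller():
    callee()
    # ...remains visible after the callee returns: the change
    # propagated "up" the stack to the caller.
    return tls.token
```

PEP 550 keeps this behavior for regular functions and removes it only for coroutines and generators.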

On 08/28/2017 09:12 AM, Yury Selivanov wrote:
If we forget about dynamic scoping (I don't know why it's being brought up all the time, TBH; nobody uses it, almost no language implements it)
Probably because it's not lexical scoping, and possibly because it's possible for a function to be running with one EC on one call, and a different EC on the next -- hence, the EC it's using is dynamically determined. It seems to me the biggest difference between "true" dynamic scoping and what PEP 550 implements is the granularity: i.e. not every single function gets its own LC, just a select few: generators, async stuff, etc. Am I right? (No CS degree here.) If not, what are the differences?

-- ~Ethan~

On Mon, Aug 28, 2017 at 12:43 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Sounds right to me. If PEP 550 was about adding true dynamic scoping, we couldn't use it as a suitable context management solution for libraries like decimal. For example, converting decimal/numpy to use new APIs would be a totally backwards-incompatible change. I still prefer using a "better TLS" analogy for PEP 550. We'll likely add a section summarizing differences between threading.local() and new APIs (as suggested by Eric Snow). Yury

On Mon, Aug 28, 2017 at 12:12:00PM -0400, Yury Selivanov wrote:
Because a) it was brought up by proponents of the PEP early on python-ideas, b) people desperately want a mental model of what is going on. :-)
* Context managers like decimal contexts, numpy.errstate, and warnings.catch_warnings.
The decimal context works like this:

1) There is a default context template (user settable).

2) Whenever the first operation *in a new thread* occurs, the thread-local context is initialized with a copy of the template.

I don't find it very intuitive if setcontext() is somewhat local in coroutines but they don't participate in some form of CLS. You have to think about things like "what happens in a fresh thread when a coroutine calls setcontext() before any other decimal operation has taken place".

So perhaps Nathaniel is right that the PEP is not so useful for numpy and decimal backwards compat.

Stefan Krah
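Steps 1) and 2) can be mimicked with a toy thread-local context (the names mirror decimal's public API, but the bodies here are simplified illustrations, not decimal's real implementation):

```python
import threading

class Context:
    """Toy stand-in for decimal.Context."""
    def __init__(self, prec=28):
        self.prec = prec

# 1) A user-settable default context template.
DefaultContext = Context()
_local = threading.local()

def getcontext():
    # 2) First use in a new thread initializes the thread-local
    # context with a copy of the template.
    try:
        return _local.context
    except AttributeError:
        _local.context = Context(prec=DefaultContext.prec)
        return _local.context

def setcontext(ctx):
    _local.context = ctx

# A worker thread starts from a fresh copy of the template; its
# setcontext() call stays invisible to the main thread.
seen = []
def worker():
    seen.append(getcontext().prec)  # fresh copy of the template
    setcontext(Context(prec=5))     # thread-local change only

t = threading.Thread(target=worker)
t.start()
t.join()
```

After `worker` runs, the main thread's `getcontext().prec` is still the template value, which is exactly the per-thread isolation that breaks down once coroutines share one thread.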

On Mon, Aug 28, 2017 at 1:33 PM, Stefan Krah <stefan@bytereef.org> wrote: [..]
I'm sorry, I don't follow you here. PEP 550 semantics: setcontext() in regular code would set the context for the whole thread; setcontext() in a coroutine/generator/async generator would set the context for all the code it calls.
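These two rules can be mimicked with a toy stack of logical contexts, where lookups walk the stack top-down -- values set in a frame are visible to everything it calls, but vanish when the frame's LC is popped (`push_lc`, `set_var`, and `lookup` are invented names for illustration, not PEP 550 API):

```python
# Toy execution context: a stack of per-frame logical contexts.
# The bottom LC plays the role of the thread-wide state.
ec = [{"ctx": "thread-default"}]

def push_lc():
    ec.append({})    # entering a generator/coroutine gets a fresh LC

def pop_lc():
    ec.pop()         # leaving it discards any changes

def set_var(name, value):
    ec[-1][name] = value

def lookup(name):
    # Walk from the innermost LC outwards: callees see callers' values.
    for lc in reversed(ec):
        if name in lc:
            return lc[name]
    raise LookupError(name)

push_lc()
set_var("ctx", "set-in-coroutine")
seen_inside = lookup("ctx")   # visible to this frame and all its callees
pop_lc()
seen_after = lookup("ctx")    # the change did not leak back out
```

Regular-code setcontext() corresponds to writing into the bottom mapping (visible thread-wide); coroutine setcontext() corresponds to writing into a pushed LC (visible only downward).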
So perhaps Nathaniel is right that the PEP is not so useful for numpy and decimal backwards compat.
Nathaniel's argument is pretty weak as I see it. He argues that some people would take the following code:

    def bar():
        # set decimal context

    def foo():
        bar()
        # use the decimal context set in bar()

and blindly convert it to async/await:

    async def bar():
        # set decimal context

    async def foo():
        await bar()
        # use the decimal context set in bar()

And that it's a problem that it will stop working. But almost nobody converts code by simply slapping async/await on top of it -- things don't work this way. It was never a goal for async/await or asyncio, or even trio/curio. Porting code to async/await almost always requires a thoughtful rewrite.

In async/await, the above code is an *anti-pattern*. It's super fragile and can break by adding a timeout around "await bar()". There's no workaround here.

Asynchronous code is fundamentally non-local and a more complex topic on its own, with its own concepts: Asynchronous Tasks, timeouts, cancellation, etc. Fundamentally: "(synchronous code) != (asynchronous code) - (async/await)".

Yury

Yury Selivanov wrote:
Maybe not, but it will also affect refactoring of code that is *already* using async/await, e.g. taking

    async def foobar():
        # set decimal context
        # use the decimal context we just set

and refactoring it as above. Given that one of the main motivations for yield-from (and subsequently async/await) was so that you *can* perform that kind of refactoring easily, that does indeed seem like a problem to me.

It seems to me that individual generators/coroutines shouldn't automatically get a context of their own; they should have to explicitly ask for one.

-- Greg

On Mon, Aug 28, 2017 at 6:22 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
There's no code that already uses async/await and decimal context managers/setters. Any such code is broken right now, because decimal context set in one coroutine affects them all. Your example would work only if foobar() is the only coroutine in your program.
With the current PEP 550 semantics w.r.t. generators you can still refactor them. The following code would work as expected:

    def nested_gen():
        # use some_context

    def gen():
        with some_context():
            yield from nested_gen()

    list(gen())

I'm saying that the following should not work:

    def nested_gen():
        set_some_context()
        yield

    def gen():
        # some_context is not set
        yield from nested_gen()
        # use some_context ???

    list(gen())

IOW, any context set in generators should not leak to the caller, ever. This is the whole point of the PEP.

As for async/await, see this: https://mail.python.org/pipermail/python-dev/2017-August/149022.html

Yury

On Mon, Aug 28, 2017 at 6:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Consider the following generator:

    def gen():
        with decimal.context(...):
            yield

We don't want gen's context to leak to the outer scope -- that's one of the reasons why PEP 550 exists. Even if we do this:

    g = gen()
    next(g)
    # the decimal.context won't leak out of gen

So a Python user would have a mental model: context set in generators doesn't leak.

Now, let's consider a "broken" generator:

    def gen():
        decimal.context(...)
        yield

If we iterate gen() with next(), it still won't leak its context. But if "yield from" has the semantics that you want -- "yield from" being just like a function call -- then calling "yield from gen()" will corrupt the context of the caller.

I simply want consistency. It's easier for everybody to say that generators never leak their context changes to the outer scope, rather than saying that "generators can sometimes leak their context".

Yury

On Mon, Aug 28, 2017 at 7:16 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Adding to the above: there's a fundamental reason why we can't make "yield from" transparent to EC modifications. While we want "yield from" to have semantics close to a function call, in some situations we simply can't. Because you can manually iterate a generator and then 'yield from' it, you can have this weird 'partial-function-call' semantics. For example:

    var = new_context_var()

    def gen():
        var.set(42)
        yield
        yield

Now, we can partially iterate the generator (1):

    def main():
        g = gen()
        next(g)
        # we don't want 'g' to leak its EC changes,
        # so var.get() is None here.
        assert var.get() is None

and then we can "yield from" it (2):

    def main():
        g = gen()
        next(g)
        # we don't want 'g' to leak its EC changes,
        # so var.get() is None here.
        assert var.get() is None

        yield from g
        # at this point it's too late for us to let var leak into
        # main().__logical_context__

For (1) we want the context change to be isolated. For (2) you say that the context change should propagate to the caller. But it's impossible: 'g' already has its own LC({var: 42}), and we can't merge it with the LC of "main()".

"await" is fundamentally different, because it's not possible to partially iterate the coroutine before awaiting it (asyncio will break if you call "coro.send(None)" manually).

Yury

Yury Selivanov wrote:
While we want "yield from" to have semantics close to a function call,
That's not what I said! I said that "yield from foo()" should have semantics close to a function call. If you separate the "yield from" from the "foo()", then of course you can get different behaviours. But that's beside the point, because I'm not suggesting that generators should behave differently depending on when or if you use "yield from" on them.
For (1) we want the context change to be isolated. For (2) you say that the context change should propagate to the caller.
No, I'm saying that the context change should *always* propagate to the caller, unless you do something explicit within the generator to prevent it. I have some ideas on what that something might be, which I'll post later. -- Greg

On Tue, Aug 29, 2017 at 7:36 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
BTW we already have mechanisms to always propagate context to the caller -- just use threading.local() or a global variable. PEP 550 is for situations when you explicitly don't want to propagate the state. Anyways, I'm curious to hear your ideas. Yury

On 30 August 2017 at 10:18, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Writing an "update_parent_context" decorator is also trivial (and will work for both sync and async generators):

    def update_parent_context(gf):
        @functools.wraps(gf)
        def wrapper(*args, **kwds):
            gen = gf(*args, **kwds)
            gen.__logical_context__ = None
            return gen
        return wrapper

The PEP already covers that approach when it talks about the changes to contextlib.contextmanager to get context changes to propagate automatically. With contextvars getting its own module, it would also be straightforward to simply include that decorator as part of its API, so folks won't need to write their own.

While I'm not sure how much practical use it will see, I do think it's important to preserve the *ability* to transparently refactor generators using yield from -- I'm just OK with such a refactoring becoming "yield from update_parent_context(subgen())" instead of the current "yield from subgen()" (as I think *not* updating the parent context is a better default than updating it).

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 30 August 2017 at 16:40, Nick Coghlan <ncoghlan@gmail.com> wrote:
Oops, I got mixed up between whether I thought this should be a decorator or an explicitly called helper function. One option would be to provide both:

    def update_parent_context(gen):
        """Configures a generator-iterator to update its caller's context variables"""
        gen.__logical_context__ = None
        return gen

    def updates_parent_context(gf):
        """Wraps a generator function's instances with update_parent_context"""
        @functools.wraps(gf)
        def wrapper(*args, **kwds):
            return update_parent_context(gf(*args, **kwds))
        return wrapper

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Yury Selivanov wrote:
BTW we already have mechanisms to always propagate context to the caller -- just use threading.local() or a global variable.
But then you don't have a way to *not* propagate the context change when you don't want to. Here's my suggestion: make an explicit distinction between creating a new binding for a context var and updating an existing one. So instead of two API calls there would be three:

    contextvar.new(value)   # Creates a new binding only
                            # visible to this frame and
                            # its callees

    contextvar.set(value)   # Updates existing binding in
                            # context inherited from caller

    contextvar.get()        # Retrieves the current binding

If we assume an extension to the decimal module so that decimal.localcontext is a context var, we can now do this:

    async def foo():
        # Establish a new context for this task
        decimal.localcontext.new(decimal.Context())
        # Delegate changing the context
        await bar()
        # Do some calculations
        yield 17 * math.pi + 42

    async def bar():
        # Change context for caller
        decimal.localcontext.prec = 5

-- Greg

On Wed, Aug 30, 2017 at 8:55 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Interesting. Question: how would you write a context manager with contextvar.new?

    var = new_context_var()

    class CM:
        def __enter__(self):
            var.new(42)

    with CM():
        print(var.get() or 'None')

My understanding is that the above code will print "None", because "var.new()" makes 42 visible only to callees of __enter__.

But if I use "set()" in "CM.__enter__", presumably, it will traverse the stack of LCs to the very bottom and set "var=42" in it. Right? If so, how can we fix the example in the PEP 550 Rationale: https://www.python.org/dev/peps/pep-0550/#rationale where we zip() the "fractions()" generator? With the current PEP 550 semantics that's trivial: https://www.python.org/dev/peps/pep-0550/#generators

Yury

Yury Selivanov wrote:
If you tie the introduction of a new scope for context vars to generators, as PEP 550 currently does, then this isn't a problem. But I'm trying to avoid doing that.

The basic issue is that, ever since yield-from, "generator" and "task" are not synonymous. When you use a generator to implement an iterator, you probably want it to behave as a distinct task with its own local context. But a generator used with yield-from isn't a task of its own, it's just part of another task, and there is nothing built into Python that lets you tell the difference automatically.

So I'm now thinking that the introduction of a new local context should also be explicit. Suppose we have these primitives:

    push_local_context()
    pop_local_context()

Now introducing a temporary decimal context looks like:

    push_local_context()
    decimal.localcontextvar.new(decimal.getcontext().copy())
    decimal.localcontextvar.prec = 5
    do_some_calculations()
    pop_local_context()

Since calls (either normal or generator) no longer automatically result in a new local context, we can easily factor this out into a context manager:

    class LocalDecimalContext():

        def __enter__(self):
            push_local_context()
            ctx = decimal.getcontext().copy()
            decimal.localcontextvar.new(ctx)
            return ctx

        def __exit__(self, *exc):
            pop_local_context()

Usage:

    with LocalDecimalContext() as ctx:
        ctx.prec = 5
        do_some_calculations()

-- Greg

On Tue, Sep 5, 2017 at 4:59 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Greg, have you seen this new section: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-cha... ?

It has a couple of examples that illustrate some issues with the "But a generator used with yield-from isn't a task of its own, it's just part of another task" reasoning.

In principle, we can modify PEP 550 to make 'yield from' transparent to context changes. The interpreter can just reset g.__logical_context__ to None whenever 'g' is being 'yield-frommed'. The key issue is that there are a couple of edge cases when having this semantics is problematic. The bottom line is that it's easier to reason about context when it's guaranteed that context changes are always isolated in generators, no matter what. I think this semantics actually makes the refactoring easier. Please take a look at the linked section.
This will have some performance implications and make the API way more complex. But I'm not convinced yet that real-life code needs the semantics you want.

This will work with the current PEP 550 design:

    def g():
        with DecimalContext() as ctx:
            ctx.prec = 5
            yield from do_some_calculations()  # will run with the correct ctx

The only thing that won't work is this:

    def do_some_calculations():
        ctx = DecimalContext()
        ctx.prec = 10
        decimal.setcontext(ctx)
        yield

    def g():
        yield from do_some_calculations()
        # Context changes in do_some_calculations() will not leak to g()

In the above example, do_some_calculations() deliberately tries to leak context changes (by not using a context manager). And I consider it a feature that PEP 550 does not allow generators to leak state.

If you write code that uses 'with' statements consistently, you will never even know that context changes are isolated in generators.

Yury

Yury Selivanov wrote:
Greg, have you seen this new section: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-cha...
That section seems to be addressing the idea of a generator behaving differently depending on whether you use yield-from on it. I never suggested that, and I'm still not suggesting it.
I don't see a lot of value in trying to automagically isolate changes to global state *only* in generators. Under PEP 550, if you want to e.g. change the decimal context temporarily in a non-generator function, you're still going to have to protect those changes using a with-statement or something equivalent. I don't see why the same thing shouldn't apply to generators. It seems to me that it will be *more* confusing to give generators this magical ability to avoid with-statements.
This will have some performance implications and make the API way more complex.
I can't see how it would have any significant effect on performance. The implementation would be very similar to what's currently described in the PEP. You'll have to elaborate on how you think it would be less efficient.

As for complexity, push_local_context() and pop_local_context() would be considered low-level primitives that you wouldn't often use directly. Most of the time they would be hidden inside context managers. You could even have a context manager just for applying them:

    with new_local_context():
        # go nuts with context vars here
But I'm not convinced yet that real-life code needs the semantics you want.
And I'm not convinced that it needs as much magic as you want.
If you write code that uses 'with' statements consistently, you will never even know that context changes are isolated in generators.
But if you write code that uses context managers consistently, and those context managers know about and handle local contexts properly, generators don't *need* to isolate their context automatically. -- Greg

Another comment from a bystander's point of view: it looks like the discussions of API design and implementation are a bit entangled here. This is much better in the current version of the PEP, but there is still a _feeling_ that some design decisions are influenced by the implementation strategy.

As I currently see it, the "philosophy" at large is like this: there are different levels of coupling between concurrently executing code:

* processes: practically not coupled, designed to be long-running
* threads: more tightly coupled, designed to be less long-lived; context is managed by threading.local, which is not inherited on "forking"
* tasks: tightly coupled, designed to be short-lived; context will be managed by PEP 550 and is inherited on "forking"

This seems right to me. Normal generators fall outside this "scheme", and it looks like their behavior is determined by the fact that coroutines are implemented as generators. What I think might help is to add a few more motivational examples to the design section of the PEP. -- Ivan

On Wed, Sep 6, 2017 at 1:49 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
Literally the first motivating example at the beginning of the PEP ('def fractions ...') involves only generators, not coroutines, and only works correctly if generators get special handling. (In fact, I'd be curious to see how Greg's {push,pop}_local_storage could handle this case.) The implementation strategy changed radically between v1 and v2 because of considerations around generator (not coroutine) semantics. I'm not sure what more it can do to dispel these feelings :-). -n -- Nathaniel J. Smith -- https://vorpus.org

On Wed, Sep 6, 2017 at 12:13 PM, Nathaniel Smith <njs@pobox.com> wrote:
Just to mention that this is now closely related to the discussion on my proposal on python-ideas. BTW, that proposal is now submitted as PEP 555 on the peps repo. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 6 September 2017 at 11:13, Nathaniel Smith <njs@pobox.com> wrote:
And this is probably what confuses people. As I understand it, tasks/coroutines are among the primary motivations for the PEP, but they appear somewhere later. There are four potential ways to see the PEP:

1) Generators are broken*, and therefore coroutines are broken; we want to fix the latter, therefore we fix the former.
2) Coroutines are broken; we want to fix them, and let's also fix generators while we are at it.
3) Generators are broken; we want to fix them, and let's also fix coroutines while we are at it.
4) Generators and coroutines are broken in similar ways; let us fix them as consistently as we can.

As I understand it, the PEP is based on option (4); please correct me if I am wrong. Therefore maybe this should be said more directly, and maybe we should then show _in addition_ a task example in the rationale, show how it is broken, and explain that the two are broken in slightly different ways (since the expected semantics is a bit different). -- Ivan

* here and below by broken I mean "broken" (sometimes behaves in a non-intuitive way, and lacks some functionality we would like it to have)

On Wed, Sep 6, 2017 at 5:58 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
Ivan, generators and coroutines are fundamentally different objects (even though they share the implementation). The only common thing is that they both allow for out-of-order execution of code in the same OS thread. The PEP explains the semantic difference of the EC in the High-level Specification section in detail, literally on the 2nd page of the PEP. I don't see any benefit in reshuffling the rationale section. Yury

Nathaniel Smith wrote:
I've given a decimal-based example, but it was a bit scattered. Here's a summary and an application to the fractions example.

I'm going to assume that the decimal module has been modified to keep the current context in a context var, and that getcontext() and setcontext() access that context var. The decimal.localcontext context manager is also redefined as:

    class localcontext():
        def __enter__(self):
            push_local_context()
            ctx = getcontext().copy()
            setcontext(ctx)
            return ctx

        def __exit__(self, *exc):
            pop_local_context()

Now we can write the fractions generator as:

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield Decimal(x) / Decimal(y)
            yield Decimal(x) / Decimal(y ** 2)

You may notice that this is exactly the same as what you would write today for the same task... -- Greg

On Wed, Sep 6, 2017 at 5:00 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
1. So essentially this means that we will have one "local context" per context manager, storing one value.

2. If somebody makes a mistake and calls "push_local_context" without a corresponding "pop_local_context" -- you will have unbounded growth of LCs (happens in Koos' proposal too, btw).

3. Users will need to know way more to correctly use the mechanism.

So far, neither you nor Koos can give us a realistic example which illustrates why we should suffer the implications of (1), (2), and (3). Yury

Yury Selivanov wrote:
1. So essentially this means that we will have one "local context" per context manager storing one value.
I can't see that being a major problem. Context vars will (I hope!) be very rare things, and needing to change a bunch of them in one function ought to be rarer still. But if you do, it would be easy to provide a context manager whose sole effect is to introduce a new context:

    with new_local_context():
        cvar1.set(something)
        cvar2.set(otherthing)
        ...
2. If somebody makes a mistake and calls "push_local_context" without a corresponding "pop_local_context"
You wouldn't normally call them directly, they would be encapsulated in carefully-written context managers. If you do use them, you're taking responsibility for using them correctly. If it would make you feel happier, they could be named _push_local_context and _pop_local_context to emphasise that they're not intended for everyday use.
3. Users will need to know way more to correctly use the mechanism.
Most users will simply be using already-provided context managers, which they're *already used to doing*. So they won't have to know anything more than they already do. See my last decimal example, which required *no change* to existing correct user code.
And you haven't given a realistic example that convinces me your proposed with-statement-elimination feature would be of significant benefit. -- Greg

Nathaniel Smith wrote:
I can't say the changes have dispelled any feelings on my part. The implementation suggested in the PEP seems very complicated and messy. There are garbage collection issues, which it proposes using weak references to mitigate. There is also apparently some issue with long chains building up and having to be periodically collapsed. None of this inspires confidence that we have the basic design right. My approach wouldn't have any of those problems. The implementation would be a lot simpler. -- Greg

On Wed, Sep 6, 2017 at 5:06 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
"messy" and "complicated" doesn't sound like a valuable feedback :( There are no "garbage collection issues", sorry. The issue that we use weak references for is the same issue why threading.local() uses them: def foo(): var = ContextVar() var.set(1) for _ in range(10**6): foo() If 'var' is strongly referenced, we would have a bunch of them.
Cool. Yury

Yury Selivanov wrote:
Erk. This is not how I envisaged context vars would be used. What I thought you would do is this:

    my_context_var = ContextVar()

    def foo():
        my_context_var.set(1)

This problem would also not arise if context vars simply had names instead of being magic key objects:

    def foo():
        contextvars.set("mymodule.myvar", 1)

That's another thing I think would be an improvement, but it's orthogonal to what we're talking about here and would be best discussed separately. -- Greg

On Thu, Sep 7, 2017 at 10:54 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
There are lots of things in this discussion that I should have commented on, but here's one related to this. PEP 555 does not have the resource-management issue described above and needs no additional tricks to achieve that:

    # using PEP 555
    def foo():
        var = contextvars.Var()
        with var.assign(1):
            # do something [*]
            ...

    for _ in range(10**6):
        foo()

Every time foo is called, a new context variable is created, but that's perfectly fine, and lightweight. As soon as the context manager exits, there are no references to the Assignment object returned by var.assign(1), and as soon as foo() returns, there are no references to var, so everything should get cleaned up nicely.

And regarding string keys, they have pros and cons, and they can be added easily, so let's not go there now. -- Koos

[*] (nit-picking) without closures that would keep the var reference alive

-- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Thursday, September 7, 2017 3:54:15 AM EDT Greg Ewing wrote:
On the contrary, using simple names (PEP 550 V1 was actually doing that) is a regression. It opens up namespace clashing issues. Imagine you have a variable named "foo", and then some library you import also decides to use the name "foo", what then? That's one of the reasons why we do `local = threading.local()` instead of `threading.set_local("foo", 1)`. Elvis

On Wednesday, September 6, 2017 8:06:36 PM EDT Greg Ewing wrote:
I might have missed something, but your claim doesn't make any sense to me. All you've proposed is to replace the implicit and guaranteed push_lc()/pop_lc() around each generator with explicit LC stack management. You *still* need to retain and switch the current stack on every generator send() and throw(). Everything else written out in PEP 550 stays relevant as well.

As for the "long chains building up", your approach is actually much worse. The absence of a guaranteed context fence around generators would mean that contextvar context managers will *have* to push LCs whether really needed or not. Consider the following (naive) way of computing the N-th Fibonacci number:

    def fib(n):
        with decimal.localcontext():
            if n == 0:
                return 0
            elif n == 1:
                return 1
            else:
                return fib(n - 1) + fib(n - 2)

Your proposal can cause the LC stack to grow incessantly even in simple cases, and will affect code that doesn't even use generators.

A great deal of effort was put into PEP 550, and the matter discussed is far from trivial. What you see as "complicated and messy" is actually the result of us carefully considering the solutions to real-world problems, and then the implications of those solutions (including the worst-case scenarios.) Elvis

Ivan Levkivskyi wrote:
This is what I disagree with. Generators don't implement coroutines, they implement *parts* of coroutines. We want "task local storage" that behaves analogously to thread local storage. But PEP 550 as it stands doesn't give us that; it gives something more like "function local storage" for certain kinds of function. -- Greg

On Wed, Sep 6, 2017 at 4:27 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
The PEP gives you Task Local Storage, where a Task is:

1. your single-threaded code
2. a generator
3. an async task

If you correctly use context managers, PEP 550 works intuitively and similarly to how one would think that threading.local() should work. The only example you (and Koos) can come up with is this:

    def generator():
        set_decimal_context()
        yield

    next(generator())  # decimal context is not set

    # or

    yield from generator()  # decimal context is still not set

I consider that the above is a feature. Yury

Yury Selivanov wrote:
My version works *more* similarly to thread-local storage, IMO. Currently, if you change the decimal context without using a with-statement or something equivalent, you *don't* expect the change to be confined to the current function or sub-generator or async sub-task. All I'm asking for is one consistent rule: If you want a context change encapsulated, use a with-statement. If you don't, don't. Not only is this rule simpler than yours, it's the *same* rule that we have now, so there is less for users to learn. -- Greg

On Wed, Sep 6, 2017 at 12:07 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Greg, just to make sure that we are talking about the same thing, could you please show an example (using the current PEP 550 API/semantics) of something that in your opinion should work differently for generators? Yury

On Wed, Sep 6, 2017 at 10:07 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Regarding this, I think yield from should have the same semantics as iterating over the generator with next/send, and PEP 555 has no issues with this.
Exactly. To state it clearly: PEP 555 does not have this issue. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I think yield from should have the same semantics as iterating over the generator with next/send, and PEP 555 has no issues with this.
I think the onus is on you and Greg to show a realistic example that shows why this is necessary. So far all the argumentation about this has been of the form "if you have code that currently does this (example using foo) and you refactor it into using yield from (example using bar), and if you were relying on context propagation back out of calls, then it should still propagate out." This feels like a very abstract argument.

I have a feeling that context state propagating out of a call is used relatively rarely -- it must work for cases where you refactor something that changes context inline into a utility function (e.g. decimal.setcontext()), but I just can't think of a realistic example where coroutines (either of the yield-from variety or of the async def form) would be used for such a utility function. A utility function that sets context state but also makes a network call just sounds like asking for trouble! -- --Guido van Rossum (python.org/~guido)

On Wed, Sep 6, 2017 at 8:16 PM, Guido van Rossum <guido@python.org> wrote:
Well, regarding this part, it's just that things like

    for obj in gen:
        yield obj

often get modernized into

    yield from gen

And realistic examples of that include pretty much any normal use of yield from.

So far all the argumentation about this has been of the form "if you have
So here's a realistic example, with the semantics of PEP 550 applied to a decimal.setcontext() kind of thing, but it could be anything using var.set(value):

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            for data in buf:
                if data.tag == "NEW_PRECISION":
                    setcontext(context_based_on(data))
                else:
                    yield compute(data)

Code smells? Yes, but maybe you often see much worse things, so let's say it's fine.

But then, if you refactor it into a subgenerator like this:

    def process_data_buffer(buffer):
        for data in buffer:
            if data.tag == "NEW_PRECISION":
                setcontext(context_based_on(data))
            else:
                yield compute(data)

    def process_data_buffers(buffers):
        setcontext(default_context)
        for buf in buffers:
            yield from process_data_buffer(buf)

Now, if setcontext uses PEP 550 semantics, the refactoring broke the code, because a generator introduces a scope barrier by adding a LogicalContext on the stack, and setcontext is only local to the process_data_buffer subgenerator. But the programmer is puzzled, because with regular functions it had worked just fine in a similar situation before they learned about generators:

    def process_data_buffer(buffer, output):
        for data in buffer:
            if data.tag == "precision change":
                setcontext(context_based_on(data))
            else:
                output.append(compute(data))

    def process_data_buffers(buffers):
        output = []
        setcontext(default_context)
        for buf in buffers:
            process_data_buffer(buf, output)
        return output

In fact, this code had another problem, namely that the context state leaks out of process_data_buffers, because PEP 550 leaks context state out of functions, but not out of generators. But we can easily imagine that the unit tests for process_data_buffers *do* pass.

But let's look at a user of the functionality:

    def get_total():
        return sum(process_data_buffers(get_buffers()))

    setcontext(somecontext)
    value = get_total() * compute_factor()

Now the code is broken, because setcontext(somecontext) has no effect, because get_total() leaks out another context.
Not to mention that our data buffer source now has control over the behavior of compute_factor(). But if one is lucky, the last line was written as

    value = compute_factor() * get_total()

And hooray, the code works! (Except for perhaps the code that is run after this.)

Now this was of course a completely fictional example, and hopefully I didn't introduce any bugs or syntax errors other than the ones I described. I haven't seen code like this anywhere, but somehow we caught the problems anyway. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I know that that's the pattern, but everybody just shows the same foo/bar example.
And realistic examples of that include pretty much any normal use of yield from.
There aren't actually any "normal" uses of yield from. The vast majority of uses of yield from are in coroutines written using yield from.
Yeah, so my claim this is simply a non-problem, and you've pretty much just proved that by failing to come up with pointers to actual code that would suffer from this. Clearly you're not aware of any such code. -- --Guido van Rossum (python.org/~guido)

On Wed, Sep 6, 2017 at 11:55 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A real-code example: make it possible to implement decimal.setcontext() on top of PEP 550 semantics.

I still feel that there's some huge misunderstanding in the discussion: PEP 550 does not promote "not using context managers". It simply implements a low-level mechanism that makes it possible to implement context managers for generators/coroutines/etc. Whether this API is used to write context managers or not is completely irrelevant to the discussion.

How does threading.local() promote or discourage the use of context managers? The answer: it doesn't. The same answer applies to PEP 550, which is a similar mechanism. Yury

On Wed, Sep 6, 2017 at 1:39 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
Thank you for the example, Koos. FWIW I agree it is a "completely fictional example". There are two ways we could easily adapt PEP 550 to follow your semantics:

1. Set gen.__logical_context__ to None when it is being 'yield from'-ed.
2. Merge gen.__logical_context__ with the outer LC when the generator is iterated to the end.

But I still really dislike the examples you and Greg show us. They are not typical or real-world examples; they are showcases of ways to abuse contexts. I still think that giving Python programmers one strong rule -- "context mutation is always isolated in generators" -- makes it easier to reason about the EC and write maintainable code. Yury

Yury Selivanov wrote:
Whereas I think it makes code *harder* to reason about, because to take advantage of it you need to be acutely aware of whether the code you're working on is in a generator/coroutine or not. It seems simpler to me to have one rule for all kinds of functions: If you're making a temporary change to contextual state, always encapsulate it in a with statement. -- Greg

Guido van Rossum wrote:
Yuri has already found one himself, the __aenter__ and __aexit__ methods of an async context manager.
A utility function that sets context state but also makes a network call just sounds like asking for trouble!
I'm coming from the other direction. It seems to me that it's not very useful to allow with-statements to be skipped in certain very restricted circumstances. The only situation in which you will be able to take advantage of this is if the context change is being made in a generator or coroutine, and it is to apply to the whole body of that generator or coroutine. If you're in an ordinary function, you'll still have to use a context manager. If you only want the change to apply to part of the body, you'll still have to use a context manager. It would be simpler to just tell people to always use a context manager, wouldn't it? -- Greg

On Wed, Sep 6, 2017 at 11:26 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
__aenter__ is not a generator and there's no 'yield from' there. Coroutines (within an async task) leak state just like regular functions (within a thread). Your argument is to allow generators to leak context changes (right?). AFAIK we don't use generators to implement __enter__ or __aenter__ (generators decorated with @types.coroutine or @asyncio.coroutine are coroutines, according to PEP 492). So this is irrelevant.
Can you clarify what you mean by "with-statements to be skipped"? This language is not used in PEP 550 or in the Python documentation. I honestly don't understand what it means.
Yes, PEP 550 wants people to always use context managers! Which will work as you expect them to for coroutines, generators, and regular functions. At this point I suspect you have some wrong idea about some specification detail of PEP 550. I understand what Koos is talking about, but I really don't follow you. Using the "with-statements to be skipped" language is very confusing and doesn't help me understand you. Yury

Yury Selivanov wrote:
If I understand correctly, instead of using a context manager, your fractions example could be written like this:

    def fractions(precision, x, y):
        ctx = decimal.getcontext().copy()
        decimal.setcontext(ctx)
        ctx.prec = precision
        yield MyDecimal(x) / MyDecimal(y)
        yield MyDecimal(x) / MyDecimal(y ** 2)

and it would work without leaking changes to the decimal context, despite the fact that it doesn't use a context manager or do anything else to explicitly put back the old context. Am I right about that?

This is what I mean by "skipping context managers" -- that it's possible in some situations to get by without using a context manager, by taking advantage of the implicit local context push that happens whenever a generator is started up.

Now, there are two possibilities:

1) You take advantage of this, and don't use context managers in some or all of the places where you don't need to. You seem to agree that this would be a bad idea.

2) You ignore it and always use a context manager, in which case it's not strictly necessary for the implicit context push to occur, since the relevant context managers can take care of it.

So there doesn't seem to be any great advantage to the automatic context push, and it has some disadvantages, such as yield-from not quite working as expected in some situations. Also, it seems that every generator is going to incur the overhead of allocating a logical_context even when it doesn't actually change any context vars, which most generators won't. -- Greg

On 09/07/2017 03:37 AM, Greg Ewing wrote:
The disagreement seems to be whether a LogicalContext should be created implicitly vs explicitly (or opt-out vs opt-in). As a user trying to track down a decimal context change not propagating, I would not suspect the above code of automatically creating a LogicalContext and isolating the change, whereas Greg's context manager version is abundantly clear. The implicit vs explicit argument comes down, I think, to resource management: some resources in Python are automatically managed (memory), and some are not (files) -- which type should LCs be? -- ~Ethan~

On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote:
You are confusing resource management with the isolation mechanism. PEP 550 contextvars are analogous to threading.local(), which the PEP makes very clear from the outset. threading.local(), the isolation mechanism, is *implicit*. decimal.localcontext() is an *explicit* resource manager that relies on threading.local() magic. PEP 550 simply provides a threading.local() alternative that works in tasks and generators. That's it! Elvis

On 09/07/2017 06:41 AM, Elvis Pranskevichus wrote:
On Thursday, September 7, 2017 9:05:58 AM EDT Ethan Furman wrote:
I might be, and I wouldn't be surprised. :) On the other hand, one can look at isolation as being a resource.
threading.local(), the isolation mechanism, is *implicit*.
I don't think so. You don't get threading.local() unless you call it -- that makes it explicit.
The concern is *how* PEP 550 provides it: - explicitly, like threading.local(): has to be set up manually, preferably with a context manager - implicitly: it just happens under certain conditions -- ~Ethan~

On Thursday, September 7, 2017 10:06:14 AM EDT Ethan Furman wrote:
You literally replace threading.local() with contextvars.ContextVar():

    import threading

    _decimal_context = threading.local()

    def set_decimal_context(ctx):
        _decimal_context.context = ctx

Becomes:

    import contextvars

    _decimal_context = contextvars.ContextVar('decimal.Context')

    def set_decimal_context(ctx):
        _decimal_context.set(ctx)

Elvis
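As it happens, the replacement Elvis sketches runs as-is on the contextvars module that later shipped in Python 3.7 (PEP 567), whose ContextVar get/set API matches the calls above. A runnable sketch, with plain strings standing in for real decimal contexts, also shows the per-thread isolation that both mechanisms share:

```python
import contextvars
import threading

# The ContextVar replaces the threading.local() slot; 'decimal.Context'
# is only a debugging name, and the default applies before any set().
_decimal_context = contextvars.ContextVar('decimal.Context',
                                          default='default-ctx')

def set_decimal_context(ctx):
    _decimal_context.set(ctx)

def get_decimal_context():
    return _decimal_context.get()

set_decimal_context('main-ctx')

seen = []

def worker():
    # A new thread starts with a fresh context, so it sees the
    # variable's default, not the value set in the main thread.
    seen.append(get_decimal_context())

t = threading.Thread(target=worker)
t.start()
t.join()

print(get_decimal_context())  # 'main-ctx'
print(seen)                   # ['default-ctx']
```

So, for code that only ever gets and sets a module-level variable, the migration really is a mechanical one-for-one swap.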

I write it in a new thread, but I also want to write it here -- I need a time out in this discussion so I can think about it more. -- --Guido van Rossum (python.org/~guido)

On 7 September 2017 at 07:06, Ethan Furman <ethan@stoneleaf.us> wrote:
A recurring point of confusion with the threading.local() analogy seems to be that there are actually *two* pieces to that analogy:

* threading.local() <-> contextvars.ContextVar
* PyThreadState_GetDict() <-> LogicalContext

(See https://github.com/python/cpython/blob/a6a4dc816d68df04a7d592e0b6af8c7ecc4d4... for the definition of PyThreadState_GetDict.)

For most practical purposes as a *user* of thread locals, the involvement of PyThreadState and the state dict is a completely hidden implementation detail. However, every time you create a new thread, you're implicitly getting a new Python thread state, and hence a new thread state dict, and hence a new set of thread local values.

Similarly, as a *user* of context variables, you'll generally be able to ignore the manipulation of the execution context going on behind the scenes - you'll just get, set, and delete individual context variables without worrying too much about exactly where and how they're stored.

PEP 550 itself doesn't have that luxury, though, since in addition to defining how users will access and update these values, it *also* needs to define how the interpreter will implicitly manage the execution context for threads and generators and how event loops (including asyncio as the reference implementation) are going to be expected to manage the execution context explicitly when scheduling coroutines.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
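Nick's point that a single shared threading.local() object is backed by hidden per-thread storage can be seen directly in a short runnable demo:

```python
import threading

# One shared threading.local() object, but the *storage* behind it
# lives in per-thread state (the thread-state dict Nick mentions),
# so each thread sees its own independent attribute values.
local = threading.local()
local.value = 'main'

results = {}

def worker():
    # Fresh thread, fresh thread-state storage: no 'value' attribute yet,
    # even though 'local' itself is the same shared object.
    results['had_value'] = hasattr(local, 'value')
    local.value = 'worker'
    results['value'] = local.value

t = threading.Thread(target=worker)
t.start()
t.join()

print(local.value)  # 'main' -- unaffected by the worker thread
print(results)      # {'had_value': False, 'value': 'worker'}
```

The ContextVar/LogicalContext split in PEP 550 mirrors this: the variable object is the user-facing handle, while the storage it resolves against is managed implicitly per thread of execution.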

On Thu, Sep 07, 2017 at 09:41:10AM -0400, Elvis Pranskevichus wrote:
If only there were a name that would make it explicit, like TaskLocalStorage. ;)

Seriously, the problem with 'context' is that it is:

a) A predefined set of state values, like in the Decimal (I think also the OpenSSL) context. But such a context is put inside another context (the ExecutionContext).

b) A theoretical concept from typed lambda calculus (in the context 'gamma' the variable 'v' has type 't'). But this concept would be associated with lexical scope and would extend to functions (not only tasks and generators).

c) ``man 3 setcontext``: a replacement for setjmp/longjmp. Somewhat related, in that it could be used to implement coroutines.

d) The .NET flowery language. I did not fully understand what the .NET ExecutionContext and its 2881 implicit flow rules are.

...

Stefan Krah

On Thursday, September 7, 2017 6:37:58 AM EDT Greg Ewing wrote:
The advantage is that context managers don't need to *always* allocate and push an LC. [1]
By default, generators reference an empty LogicalContext object that is allocated once (like the None object). We can do that because LCs are immutable. Elvis

[1] https://mail.python.org/pipermail/python-dev/2017-September/149265.html

Elvis Pranskevichus wrote:
Ah, I see. That wasn't clear from the implementation, where

    gen.__logical_context__ = contextvars.LogicalContext()

looks like it's creating a new one. However, there's another thing: it looks like every time a generator is resumed/suspended, an execution context node is created/discarded. -- Greg

There is one thing I misunderstood. Since generators and coroutines are almost exactly the same underneath, I had thought that the automatic logical_context creation for generators was also going to apply to coroutines, but from reading the PEP again it seems that's not the case. Somehow I missed that the first time. Sorry about that. So, context vars do behave like "task local storage" for asyncio Tasks, which is good.

The only issue is whether a generator should be considered an "ad-hoc task" for this purpose. I can see your reasons for thinking that it should be. I can also understand your thinking that the yield-from issue is such an obscure corner case that it's not worth worrying about, especially since there is a workaround available (setting __logical_context__ to None) if needed.

I'm not sure how I feel about that now. I agree that it's an obscure case, but the workaround seems even more obscure, and is unlikely to be found by anyone who isn't closely familiar with the inner workings. I think I'd be happier if there were a higher-level way of applying this workaround, such as a decorator:

    @subgenerator
    def g():
        ...

Then the docs could say "If you want a generator to *not* have its own task-local storage, wrap it with @subgenerator."

By the way, I think "Task Local Storage" would be a much better title for this PEP. It instantly conveys the basic idea in a way that "Execution Context" totally fails to do. It might also serve as a source for some better terminology for parts of the implementation, such as TaskLocalStorage and TaskLocalStorageStack instead of logical_context and execution_context. I found the latter terms almost devoid of useful meaning when trying to understand the implementation. -- Greg
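None of the PEP 550 machinery is needed to see what @subgenerator would change. Below is a self-contained toy model of the debate: every name is made up (the real isolation would happen inside the interpreter, not in a wrapper class), and a list of dicts stands in for the logical context stack. The wrapper pushes the generator's own "LC" around each resume unless isolation is switched off, which is the hypothetical @subgenerator behavior.

```python
# Toy model -- no name here is part of PEP 550.
_stack = [{}]  # stack of "logical contexts" (plain dicts)

def set_var(name, value):
    _stack[-1][name] = value

def get_var(name, default=None):
    for lc in reversed(_stack):
        if name in lc:
            return lc[name]
    return default

class IsolatedGen:
    """Drives a generator, giving it its own LC unless isolate=False
    (the hypothetical @subgenerator opt-out)."""
    def __init__(self, gen, isolate=True):
        self._gen = gen
        self._lc = {} if isolate else None

    def __iter__(self):
        return self

    def __next__(self):
        if self._lc is None:
            return next(self._gen)   # changes land in the caller's LC
        _stack.append(self._lc)      # restore the generator's own LC
        try:
            return next(self._gen)
        finally:
            _stack.pop()             # changes stay in self._lc

def gen():
    set_var('prec', 5)
    yield get_var('prec')

set_var('prec', 28)
isolated = list(IsolatedGen(gen()))              # isolate-by-default rule
prec_after_isolated = get_var('prec')            # 28: change was contained
leaky = list(IsolatedGen(gen(), isolate=False))  # "@subgenerator" rule
prec_after_leaky = get_var('prec')               # 5: change leaked out
print(isolated, prec_after_isolated, leaky, prec_after_leaky)
```

The two runs make the trade-off concrete: the default rule keeps the caller's state intact, while the opt-out deliberately lets the subgenerator's change propagate, which is what the yield-from refactoring examples in this thread want.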

There are a couple of things in the PEP I'm confused about:

1) Under "Generators" it says:

    once set in the generator, the context variable is guaranteed not
    to change between iterations;

This suggests that you're not allowed to set() a given context variable more than once in a given generator, but some of the examples seem to contradict that. So I'm not sure what this is trying to say.

2) I don't understand why the logical_contexts have to be immutable. If every task or generator that wants its own task-local storage has its own logical_context instance, why can't it be updated in-place? -- Greg

On 09/07/2017 04:39 AM, Greg Ewing wrote:
I believe I can answer this part: the guarantee is that

- the context variable will not be changed while the yield is in effect -- or, said another way, while the generator is suspended;
- the context variable will not be changed by subgenerators;
- the context variable /may/ be changed by normal functions/class methods (since calling them would be part of the iteration).

-- ~Ethan~

On Wed, Sep 6, 2017 at 8:07 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
It would be great if you or Greg could show a couple of real-world examples showing the "issue" (with the current PEP 550 APIs/semantics).

PEP 550 treats coroutines and generators as objects that support out-of-order execution. OS threads are similar to them in some ways. I find it questionable to try to apply the context management rules we have for regular functions to generators/coroutines. I don't really understand the "refactoring" argument you and Greg keep making.

PEP 555 still doesn't clearly explain how exactly it is different from PEP 550. Because 555 was posted *after* 550, I think that it's PEP 555 that should have that comparison. Yury

On Wed, Sep 6, 2017 at 8:22 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote: [...]
PEP 550 treats coroutines and generators as objects that support out of order execution.
Out of order? More like interleaved.
555 was *posted* as a pep after 550, yes. And yes, there could be a comparison, especially now that PEP 550 semantics seem to have converged, so PEP 555 does not have to adapt the comparison to PEP 550 changes. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

Yury Selivanov wrote:
Here's one way that refactoring could trip you up. Start with this:

    async def foo():
        calculate_something()
        # in a coroutine, so we can be lazy and not use a cm
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

And factor part of it out (into an *ordinary* function!):

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()

    def calculate_something_else_with_5_digits():
        ctx = decimal.getcontext().copy()
        ctx.prec = 5
        decimal.setcontext(ctx)
        calculate_something_else()

Now we add some more calculation to the end of foo():

    async def foo():
        calculate_something()
        calculate_something_else_with_5_digits()
        calculate_more_stuff()

Here we didn't intend calculate_more_stuff() to be done with prec=5, but we forgot that calculate_something_else_with_5_digits() changes the precision and *doesn't restore it*, because we didn't add a context manager to it. If we hadn't been lazy and had used a context manager in the first place, that wouldn't have happened.

Summary: I think that skipping context managers in some circumstances is a bad habit that shouldn't be encouraged. -- Greg
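For what it's worth, the leak Greg describes is easy to demonstrate with the stdlib today. This is a minimal sketch of my own (the function name is invented for illustration), showing that decimal.setcontext() without a context manager changes the caller's precision:

```python
import decimal

def calculate_with_5_digits():
    # Lazily switch precision without a context manager,
    # as in the example above -- the change is never restored.
    ctx = decimal.getcontext().copy()
    ctx.prec = 5
    decimal.setcontext(ctx)
    return decimal.Decimal(1) / decimal.Decimal(3)

prec_before = decimal.getcontext().prec   # 28 by default
result = calculate_with_5_digits()
prec_after = decimal.getcontext().prec    # the prec=5 change leaked out

assert str(result) == '0.33333'
assert prec_after == 5
```

Wrapping the body in `with decimal.localcontext() as ctx: ...` instead would restore the caller's precision automatically.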

On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Where exactly does PEP 550 encourage users to be "lazy and not use a cm"? PEP 550 provides a mechanism for implementing context managers! What is this example supposed to show?
How is PEP 550 at fault for somebody being lazy and not using a context manager? PEP 550 has a hard requirement to make it possible for decimal/other libraries to start using its APIs and stay backwards compatible, so it allows the `decimal.setcontext(ctx)` function to be implemented. We are fixing things here. When you are designing a new library/API, you can use CMs and only CMs. It's up to you as a library author; PEP 550 does not limit you. And when you use CMs, there are no "problems" with 'yield from' or anything in PEP 550.
Summary: I think that skipping context managers in some circumstances is a bad habit that shouldn't be encouraged.
PEP 550 does not encourage coding without context managers. It does, in fact, solve the problem of reliably storing context, which makes writing context managers possible. To reiterate: it provides a mechanism to set a variable within the current logical thread, like storing the current request in an async HTTP handler, or to implement `decimal.setcontext`. But you are free to use it only to implement context managers in your library. Yury

On 09/06/2017 11:57 PM, Yury Selivanov wrote:
On Wed, Sep 6, 2017 at 11:39 PM, Greg Ewing wrote:
That using a CM is not required, and tracking down a bug caused by not using a CM can be difficult.
How is PEP 550 at fault for somebody being lazy and not using a context manager?
Because PEP 550 makes a CM unnecessary in the simple (common?) case, hiding the need for a CM in not-so-simple cases. For comparison: in Python 3 we are now warned about files that have been left open (because explicitly closing files was unnecessary in CPython due to an implementation detail) -- the solution? make files context managers whose __exit__ closes the file.
I appreciate that the scientific and number-crunching communities have been a major driver of enhancements for Python (such as rich comparisons and, more recently, matrix operators), but I don't think an enhancement for them that makes life more difficult for the rest is a net win. -- ~Ethan~

On Wed, Aug 30, 2017 at 2:36 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
FYI, I've been sketching an alternative solution that addresses these kinds of things. I've been hesitant to post about it, partly because of the PEP550-based workarounds that Nick, Nathaniel, Yury etc. have been describing, and partly because that might be a major distraction from other useful discussions, especially because I wasn't completely sure yet about whether my approach has some fatal flaw compared to PEP 550 ;). —Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Wed, Aug 30, 2017 at 9:44 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote: [..]
The only alternative design that I considered for PEP 550, and ultimately rejected, was to have the following thread-specific mapping:

    {
        var1: [stack of values for var1],
        var2: [stack of values for var2],
    }

The idea is that when we set a value for the variable in some frame, we push it onto its stack. When the frame is done, we pop it. This is a classic approach (called Shallow Binding) to implement dynamic scope.

The fatal flaw that made me reject this approach was the CM protocol (__enter__). Specifically, context managers need to be able to control values in outer frames, and this is where this approach becomes super messy. Yury
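A toy model of the rejected shallow-binding design, purely for illustration (the class and method names here are made up, not from the PEP):

```python
# Each variable owns its own stack of values; a frame that sets the
# variable pushes on entry and pops on exit -- classic shallow binding.
class StackVar:
    def __init__(self):
        self._stack = []

    def set(self, value):
        self._stack.append(value)

    def unset(self):
        self._stack.pop()

    def get(self, default=None):
        return self._stack[-1] if self._stack else default

prec = StackVar()

def inner():
    prec.set(5)            # push on frame entry
    try:
        return prec.get()
    finally:
        prec.unset()       # pop on frame exit

prec.set(28)
assert inner() == 5
assert prec.get() == 28    # the outer value is restored after the frame
```

The messiness Yury mentions shows up as soon as a context manager's __enter__ needs to push a value that must survive past __enter__'s own frame: the push/pop no longer lines up with frame entry/exit.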

Can Execution Context be implemented outside of CPython
I know I'm well late to the game and a bit dense, but where in the PEP is the justification for this assertion? I ask because we built something to solve the same problem in Twisted some time ago: https://bitbucket.org/hipchat/txlocal . We were able to leverage generator/coroutine decorators to preserve state without modifying the runtime. Given that this problem only exists in runtimes that multiplex coroutines on a single thread, and the fact that coroutine execution engines only exist in user space, why doesn't it make more sense to leave this to a library that engines like asyncio and Twisted are responsible for standardising on? On Wed, Aug 30, 2017, 09:40 Yury Selivanov <yselivanov.ml@gmail.com> wrote:

On Wed, Aug 30, 2017 at 1:39 PM, Kevin Conway <kevinjacobconway@gmail.com> wrote:
To work with coroutines we have asyncio/twisted or other frameworks. They create async tasks and manage them. Generators, OTOH, don't have a framework that runs them; they are managed by the Python interpreter. So it's not possible to implement a *complete context solution* that equally supports generators and coroutines outside of the interpreter.

Another problem is that every framework has its own local context solution. Twisted has one, gevent has another. But libraries like numpy and decimal can't use them to store their local context data, because they are non-standard. That's why we need to solve this problem once, in Python directly. Yury

On Wed, Aug 30, 2017 at 5:36 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Anyway, thanks to these efforts, your proposal has become somewhat more competitive compared to mine ;). I'll post mine as soon as I find the time to write everything down. My intention is before next week. —Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

Yury Selivanov wrote:
That's understandable, but fixing that problem shouldn't come at the expense of breaking the ability to refactor generator code or async code without changing its semantics. I'm not convinced that it has to, either. In this example, the with-statement is the thing that should be establishing a new nested context. Yielding and re-entering the generator should only be swapping around between existing contexts.
The following non-generator code is "broken" in exactly the same way:

    def foo():
        decimal.context(...)
        do_some_decimal_calculations()
        # Context has now been changed for the caller
I simply want consistency.
So do I! We just have different ideas about what consistency means here.
No, generators should *always* leak their context changes, to exactly the same extent that normal functions do. If you don't want to leak a context change, you should use a with-statement. What you seem to be suggesting is that generators shouldn't leak context changes even when you *don't* use a with-statement. If you're going to do that, you'd better make sure that the same thing applies to regular functions, otherwise you've introduced an inconsistency. -- Greg

On Tue, Aug 29, 2017 at 5:45 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: [..]
What you seem to be suggesting is that generators shouldn't leak context changes even when you *don't* use a with-statement.
Yes, generators shouldn't leak context changes, regardless of what changes the context inside them or how:

    var = new_context_var()

    def gen():
        old_val = var.get()
        try:
            var.set('blah')
            yield
            yield
            yield
        finally:
            var.set(old_val)

With the above code, when you do "next(gen())" it would leak the state without PEP 550 -- a "finally" block (or a "with" block) wouldn't help you here -- and corrupt the state of the caller. That's the problem the PEP fixes.

The EC interaction with generators is explained here in great detail: https://www.python.org/dev/peps/pep-0550/#id4 We explain the motivation behind desiring a working context-local solution for generators in the Rationale section: https://www.python.org/dev/peps/pep-0550/#rationale Basically half of the PEP is about isolating context in generators.
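The leak Yury describes can be reproduced today with a plain global standing in for thread-local storage (this sketch is mine, not from the PEP): the finally block cannot protect a caller that runs while the generator is suspended.

```python
state = {'var': 'caller'}            # stand-in for a thread-local

def gen():
    old = state['var']
    try:
        state['var'] = 'blah'
        yield
        yield
    finally:
        state['var'] = old           # only runs on exhaustion/finalization

g = gen()
next(g)                              # suspend at the first yield
leaked = state['var']                # the caller now sees 'blah'
g.close()                            # finalizing the generator runs finally
restored = state['var']

assert leaked == 'blah'              # the change leaked while suspended
assert restored == 'caller'
```

Under PEP 550 the set() inside the generator would be confined to the generator's own logical context, so the caller would never observe 'blah'.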
Regular functions cannot pause/resume their execution, so they can't leak an inconsistent context change due to out of order or partial execution. PEP 550 positions itself as a replacement for TLS, and clearly defines its semantics for regular functions in a single thread, regular functions in multithreaded code, generators, and asynchronous code (async/await). Everything is specified in the High-level Specification section. I wouldn't call slightly differently defined semantics for generators/coroutines/functions an "inconsistency" -- they just have a different EC semantics given how different they are from each other. Drawing a parallel between 'yield from' and function calls is possible, but we shouldn't forget that you can 'yield from' a half-iterated generator. Yury

On Tue, Aug 29, 2017 at 06:01:40PM -0400, Yury Selivanov wrote:
What I don't find so consistent is that the async universe is guarded with async {def, for, with, ...}, but in this proposal regular context managers and context setters implicitly adapt their behavior. So, pedantically, having a language extension like

    async set(var, value)
    x = async get(var)

and making async-safe context managers explicit

    async with decimal.localcontext():
        ...

would feel more consistent. I know generators are a problem, but even allowing something like "async set" in generators would be a step up. Stefan Krah

On Tue, Aug 29, 2017 at 7:06 PM, Stefan Krah <stefan@bytereef.org> wrote:
But regular context managers work just fine with asynchronous code. Not all of them have some local state. For example, you could have a context manager to time how long the code wrapped into it executes:

    async def foo():
        with timing():
            await ...

We use asynchronous context managers only when they need to do asynchronous operations in their __aenter__ and __aexit__ (like DB transaction begin/rollback/commit). Requiring "await" to set a value for a context variable would force us to write specialized async CMs for cases where a sync CM would do just fine. This, in turn, would make it impossible to use some sync libraries in async code. But there's nothing wrong in using numpy/numpy.errstate in a coroutine. I want to be able to copy/paste their examples into my async code and I'd expect it to just work -- that's the point of the PEP.

async/await already requires having separate APIs in libraries that involve IO. Let's not make the situation worse by asking people to use an asynchronous version of PEP 550 even though it's not really needed. Yury
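A runnable sketch of such a purely synchronous context manager used from a coroutine (the timing() helper here is invented for illustration; it holds no context-like state at all, so nothing about it is async-unsafe):

```python
import asyncio
import contextlib
import time

timings = []

@contextlib.contextmanager
def timing():
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record elapsed wall-clock time for the wrapped block.
        timings.append(time.perf_counter() - start)

async def foo():
    with timing():                 # a plain sync CM inside async code
        await asyncio.sleep(0)

asyncio.run(foo())
assert len(timings) == 1 and timings[0] >= 0
```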

On 08/28/2017 04:19 AM, Stefan Krah wrote:
If I understand correctly, ctx.prec is whatever the default is, because foo comes before bar on the stack, and after the current value for i is grabbed bar is no longer executing, and therefore no longer on the stack. I hope I'm right. ;) -- ~Ethan~

A question appeared here about a simple mental model for PEP 550. It looks much clearer now, than in the first version, but I still would like to clarify: can one say that PEP 550 just provides more fine-grained version of threading.local(), that works not only per thread, but even per coroutine within the same thread? -- Ivan On 28 August 2017 at 17:29, Yury Selivanov <yselivanov.ml@gmail.com> wrote:

On Sat, Aug 26, 2017 at 2:34 AM, Nathaniel Smith <njs@pobox.com> wrote:
That exception is why the semantics cannot be equivalent.
I'll cover the refactoring argument later in this email. [..]
I don't think it's non-trivial, though:

First, we have a cache in ContextVar which makes lookup O(1) for any tight code that uses libraries like decimal and numpy.

Second, most of the LCs in the chain will be empty, so even the uncached lookup will still be fast.

Third, you will usually have your "with my_context()" block right around your code (or within a few awaits' distance), otherwise it will be hard to reason about what the context is. And if, occasionally, you have one single "var.lookup()" call that won't be cached, the cost of it will still be measured in microseconds.

Finally, the easy-to-follow semantics is the main argument for the change (even at the cost of making "get()" a bit slower in corner cases).
Yes.
This example is very similar to the difference between:

    await sub()

and

    await create_task(sub())

So it's really about making the semantics for coroutines predictable.
(And fwiw I'm still not convinced we should give up on 'yield from' as a mechanism for refactoring generators.)
I don't get this "refactoring generators" and "refactoring coroutines" argument. Suppose you have this code:

    def gen():
        i = 0
        for _ in range(3):
            i += 1
            yield i
        for _ in range(5):
            i += 1
            yield i

You can't refactor gen() by simply copying/pasting parts of its body into a separate generator:

    def count3():
        for _ in range(3):
            i += 1
            yield

    def gen():
        i = 0
        yield from count3()
        for _ in range(5):
            i += 1
            yield i

The above won't work for some obvious reasons: 'i' is a nonlocal variable for the 'count3' block of code.

Almost exactly the same thing will happen with the current PEP 550 specification, which is a *good* thing. 'yield from' and 'await' are not about refactoring. They can be used for splitting large generators/coroutines into a set of smaller ones, sure. But there's *no* magical, always-working refactoring mechanism that allows you to do that blindly.
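To make the point concrete, the naive copy/paste refactoring above fails loudly when run (a small check of my own, not from the thread):

```python
# 'i' becomes an unbound local inside the extracted sub-generator,
# so the "refactored" version raises on the first next().
def count3():
    for _ in range(3):
        i += 1          # 'i' was never assigned in this function
        yield

def gen():
    i = 0
    yield from count3()
    yield i

try:
    next(gen())
    raised = False
except UnboundLocalError:
    raised = True

assert raised           # blind copy/paste refactoring is not semantics-preserving
```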
Right. Before we continue, let me make sure we are on the same page here:

    await asyncio.wait_for(sub(), timeout=2)

can be refactored into:

    task = asyncio.wait_for(sub(), timeout=2)
    # sub() is scheduled now, and a "loop.call_soon" call has been
    # made to advance it soon.
    await task

Now, if we look at the following example (1):

    async def foo():
        await bar()

the "bar()" coroutine will execute within "foo()". If we add some timeout logic (2):

    async def foo():
        await wait_for(bar(), 1)

the "bar()" coroutine will execute outside of "foo()", and "foo()" will only wait for the result of that execution.

Now, async Tasks capture the context when they are created -- that's the only sane option they have. If coroutines don't have their own LC, "bar()" in examples (1) and (2) would interact with the execution context differently! And this is something that we can't let happen, as it would force asyncio users to think about the EC every time they want to wrap a coroutine into a task. [..]
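A minimal runnable illustration of the two shapes being compared (my sketch, not from the thread): awaiting a coroutine directly versus handing it to asyncio.wait_for(), which drives it as a separate task.

```python
import asyncio

async def sub():
    return 42

async def foo():
    direct = await sub()                                # runs inside foo()
    wrapped = await asyncio.wait_for(sub(), timeout=2)  # driven as a task
    return direct, wrapped

result = asyncio.run(foo())
assert result == (42, 42)
```

Both forms yield the same value; the PEP's concern is that they should also see the same execution context, which is why tasks capture the context at creation time.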
Correct. Both LC and EC objects will be wrapped into "shell" objects before being exposed to the end user. run_with_logical_context() will mutate the user-visible LC object (keeping the underlying LC immutable, of course).

Ideally, we would want run_with_logical_context to have the following signature:

    result, updated_lc = run_with_logical_context(lc, callable)

But because "callable" can raise an exception, this would not work.
Yeah, you're right. Thanks!
Fixed! Yury

Hi, I'm aware that the current implementation is not final, but I already adapted the coroutine changes for Cython to allow for some initial integration testing with real external (i.e. non-Python coroutine) targets. I haven't adapted the tests yet, so the changes are currently unused and mostly untested. https://github.com/scoder/cython/tree/pep550_exec_context I also left some comments in the github commits along the way. Stefan

On Sat, Aug 26, 2017 at 6:22 AM, Stefan Behnel <stefan_ml@behnel.de> wrote:
Huge thanks for thinking about how this proposal will work for Cython and trying it out. Although I must warn you that the last reference implementation is very outdated, and the implementation we will end up with will be very different (think a total rewrite from scratch). Yury

Hi, thanks, on the whole this is *much* easier to understand. I'll add some comments on the decimal examples. The thing is, decimal is already quite tricky and people do read PEPs long after they have been accepted, so they should probably reflect best practices. On Fri, Aug 25, 2017 at 06:32:22PM -0400, Yury Selivanov wrote:
"Many people (wrongly) expect the values of ``items`` to be::" ;)
[(Decimal('0.33'), Decimal('0.666667')), (Decimal('0.11'), Decimal('0.222222'))]
I'm not sure why this approach has limited use for decimal:

    from decimal import *

    def fractions(precision, x, y):
        ctx = Context(prec=precision)
        yield ctx.divide(Decimal(x), Decimal(y))
        yield ctx.divide(Decimal(x), Decimal(y**2))

    g1 = fractions(precision=2, x=1, y=3)
    g2 = fractions(precision=6, x=2, y=3)

    print(list(zip(g1, g2)))

This is the first thing I'd do when writing async-safe code. Again, people do read PEPs. So if an asyncio programmer without any special knowledge of decimal reads the PEP, he probably assumes that localcontext() is currently the only option, while the safer and easy-to-reason-about context methods exist.
As I understand it, the example creates a context with a custom precision and attempts to use that context to create a Decimal. This doesn't switch the actual decimal context. Secondly, the precision in the context argument to the Decimal() constructor has no effect --- the context there is only used for error handling. Lastly, if the constructor *did* use the precision, one would have to be careful about double rounding when using MyDecimal(). I get that this is supposed to be for illustration only, but please let's be careful about what people might take away from that code.
I think it'll work, but can we agree on hard numbers like max 2% slowdown for the non-threaded case and 4% for applications that only use threads? I'm a bit cautious because other C-extension state-managing PEPs didn't come close to these figures. Stefan Krah

On Sat, Aug 26, 2017 at 7:45 AM, Stefan Krah <stefan@bytereef.org> wrote:
Hi,
thanks, on the whole this is *much* easier to understand.
Thanks!
Agree. [..]
Because you have to know the limitations of implicit decimal context to make this choice. Most people don't (at least from my experience).
This is the first thing I'd do when writing async-safe code.
Because you know the decimal module very well :)
I agree.
In the next iteration of the PEP we'll remove decimal examples and replace them with something with simpler semantics. This is clearly the best choice now.
I'd be *very* surprised if we see any noticeable slowdown at all. The way ContextVars will implement caching is very similar to the trick you use now. Yury

On Sat, Aug 26, 2017 at 12:21:44PM -0400, Yury Selivanov wrote:
I'd also be surprised, but what do we do if the PEP is accepted and for some yet unknown reason the implementation turns out to be 12-15% slower? The slowdown related to the module-state/heap-type PEPs wasn't immediately obvious either; it would be nice to have actual figures before the PEP is accepted. Stefan Krah

Thanks for the update. Comments in-line below. -eric On Fri, Aug 25, 2017 at 4:32 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
With threads we have a directed graph of execution, rooted at the root thread, branching with each new thread and merging with each .join(). Each thread gets its own copy of each threading.local, regardless of the relationship between branches (threads) in the execution graph.

With async (and generators) we also have a directed graph of execution, rooted in the calling thread, branching with each new async call. Currently there is no equivalent to threading.local for the async execution graph. This proposal involves adding such an equivalent.

However, the proposed solution isn't quite equivalent, right? It adds a concept of lookup on the chain of namespaces, traversing up the execution graph back to the root. threading.local does not do this. Furthermore, you can have more than one threading.local per thread.
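For reference, the threading.local() behavior Eric describes -- independent per-thread state with no chained lookup -- can be demonstrated directly with the stdlib:

```python
import threading

local = threading.local()
seen = {}

def worker(name):
    local.value = name        # visible only within this thread
    seen[name] = local.value

local.value = 'main'
threads = [threading.Thread(target=worker, args=(n,)) for n in ('a', 'b')]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert seen == {'a': 'a', 'b': 'b'}
assert local.value == 'main'  # the workers' writes never reached the main thread
```

Note that a new thread does not see the main thread's value at all -- there is no fallback chain, which is exactly the difference Eric is pointing at.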
From what I read in the PEP, each node in the execution graph has (at most) one Execution Context.
The PEP doesn't really say much about these differences from threadlocals, including a rationale. FWIW, I think such a COW mechanism could be useful. However, it does add complexity to the feature. So a clear explanation in the PEP of why it's worth it would be valuable.
#1-4 are consistent with a single EC per Python thread. However, #5-7 imply that more than one EC per thread is supported, but only one is active in the current execution stack (notably the EC is rooted at the calling frame). threading.local provides a much simpler mechanism but does not support the chained context (COW) semantics...

Hi Eric, On Sat, Aug 26, 2017 at 1:25 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
Correct.
Currently, the PEP covers the proposed mechanism in-depth, explaining why every detail of the spec is the way it is. But I think it'd be valuable to highlight differences from theading.local() in a separate section. We'll think about adding one. Yury

On Sat, Aug 26, 2017 at 10:25 AM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
You might be interested in these notes I wrote to motivate why we need a chain of namespaces, and why simple "async task locals" aren't sufficient: https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb They might be a bit verbose to include directly in the PEP, but Yury/Elvis, feel free to steal whatever if you think it'd be useful. -n -- Nathaniel J. Smith -- https://vorpus.org

On Sat, Aug 26, 2017 at 3:09 PM, Nathaniel Smith <njs@pobox.com> wrote:
Thanks, Nathaniel! That helped me understand the rationale, though I'm still unconvinced chained lookup is necessary for the stated goal of the PEP. (The rest of my reply is not specific to Nathaniel.)

tl;dr Please:

* make the chained-lookup aspect of the proposal more explicit (and distinct) in the beginning sections of the PEP (or drop chained lookup).
* explain why normal frames do not get to take advantage of chained lookup (or allow them to).

--------------------

If I understood right, the problem is that we always want context vars resolved relative to the current frame and then to the caller's frame (and on up the call stack). For generators, "caller" means the frame that resumed the generator. Since we don't know what frame will resume the generator beforehand, we can't simply copy the current LC when a generator is created and bind it to the generator's frame.

However, I'm still not convinced that's the semantics we need. The key statement is "and then to the caller's frame (and on up the call stack)", i.e. chained lookup. On the linked page Nathaniel explained the position (quite clearly, thank you) using sys.exc_info() as an example of async-local state. I posit that that example isn't particularly representative of what we actually need.

Isn't the point of the PEP to provide an async-safe alternative to threading.local()? Any existing code using threading.local() would not expect any kind of chained lookup since threads don't have any. So introducing chained lookup in the PEP is unnecessary, and consequently not ideal, since it introduces significant complexity. As the PEP is currently written, chained lookup is a key part of the proposal, though it does not explicitly express this. I suppose this is where my confusion has been.

At this point I think I understand one rationale for the chained-lookup functionality: it takes advantage of the cooperative scheduling characteristics of generators, et al.
Unlike with threads, a programmer can know the context under which a generator will be resumed. Thus it may be useful to the programmer to allow (or expect) the resumed generator to fall back to the calling context. However, given the extra complexity involved, is there enough evidence that such capability is sufficiently useful? Could chained lookup be addressed separately (in another PEP)? Also, wouldn't it be equally useful to support chained lookup for function calls? Programmers have the same level of knowledge about the context stack with function calls as with generators. I would expect evidence in favor of chained lookups for generators to also favor the same for normal function calls. -eric
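A toy model of the chained lookup being debated, built on the stdlib (my illustration, not the PEP's implementation): an execution context behaves like a collections.ChainMap of logical contexts, where reads fall through to the caller's LC but writes touch only the topmost one.

```python
from collections import ChainMap

outer = {'prec': 28}          # the caller's logical context
inner = {}                    # the generator's own, initially empty, LC
ec = ChainMap(inner, outer)   # lookups search inner first, then outer

assert ec['prec'] == 28       # the empty inner LC falls through to the caller
inner['prec'] = 5             # a set() touches only the topmost LC
assert ec['prec'] == 5
assert outer['prec'] == 28    # the caller's context is untouched
```

This captures both properties under discussion: the fallback that threading.local() lacks, and the isolation of writes that makes generator context changes invisible to the caller.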

On Mon, Aug 28, 2017 at 3:14 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
There's a lot of Python code out there, and it's hard to know what it all wants :-). But I don't think we should get hung up on matching threading.local() -- no-one sits down and says "okay, what my users want is for me to write some code that uses a thread-local", i.e., threading.local() is a mechanism, not an end-goal. My hypothesis is that in most cases, when people reach for threading.local(), it's because they have some "contextual" variable, and they want to be able to do things like set it to a value that affects all and only the code that runs inside a 'with' block. So far the only way to approximate this in Python has been to use threading.local(), but chained lookup would work even better.

As evidence for this hypothesis: something like chained lookup is important for exc_info() [1] and for Trio's cancellation semantics, and I'm pretty confident that it's what users naturally expect for use cases like 'with decimal.localcontext(): ...' or 'with numpy.errstate(...): ...'. And it works fine for cases like Flask's request-locals that get set once near the top of a callstack and then treated as read-only by most of the code. I'm not aware of any alternative to chained lookup that fulfills all of these use cases -- are you? And I'm not aware of any use cases that require something more than threading.local() but less than chained lookup -- are you?

[1] I guess I should say something about including sys.exc_info() as evidence that chained lookup is useful, given that CPython probably won't share code between its PEP 550 implementation and its sys.exc_info() implementation. I'm mostly citing it as evidence that this is a real kind of need that can arise when writing programs -- if it happens once, it'll probably happen again.
But I can also imagine that other implementations might want to share code here, and it's certainly nice if the Python-the-language spec can just say "exc_info() has semantics 'as if' it were implemented using PEP 550 storage" and leave it at that. Plus it's kind of rude for the interpreter to claim semantics for itself that it won't let anyone else implement :-).
The important difference between generators/coroutines and normal function calls is that with normal function calls, the link between the caller and callee is fixed for the entire lifetime of the inner frame, so there's no way for the context to shift under your feet. If all we had were normal function calls, then (green-) thread locals using the save/restore trick would be enough to handle all the use cases above -- it's only for generators/coroutines where the save/restore trick breaks down. This means that pushing/popping LCs when crossing into/out of a generator frame is the minimum needed to get the desired semantics, and it keeps the LC stack small (important since lookups can be O(n) in the worst case), and it minimizes the backcompat breakage for operations like decimal.setcontext() where people *do* expect to call it in a subroutine and have the effects be visible in the caller. -n -- Nathaniel J. Smith -- https://vorpus.org

On Mon, Aug 28, 2017 at 6:07 PM, Nathaniel Smith <njs@pobox.com> wrote:
I like this way of looking at things. Does this have any bearing on asyncio.Task? To me those look more like threads than like generators. Or possibly they should inherit the lookup chain from the point when the Task was created, but not be affected at all by the lookup chain in place when they are executed. FWIW we *could* have a policy that OS threads also inherit the lookup chain from their creator, but I doubt that's going to fly with backwards compatibility. I guess my general (hurried, sorry) view is that we're at a good point where we have a small number of mechanisms but are still debating policies on how those mechanisms should be used. (The basic mechanism is chained lookup and the policies are about how the chains are fit together for various language/library constructs.) -- --Guido van Rossum (python.org/~guido)

On 8/28/2017 6:50 PM, Guido van Rossum wrote:
Since LC is new, how could such a policy affect backwards compatibility? The obvious answer would be that some use cases that presently use other mechanisms that "should" be ported to using LC would have to be careful in how they do the port, but discussion seems to indicate that they would have to be careful in how they do the port anyway. One of the most common examples is the decimal context. IIUC, each thread gets its initial decimal context from a global template, rather than inheriting from its parent thread. Porting decimal context to LC then, in the event of OS threads inheriting the lookup chain from their creator, would take extra work for compatibility: setting the decimal context from the global template (a step it must already take) rather than accepting the inheritance. It might be appropriate that an updated version of decimal that uses LC would offer the option of inheriting the decimal context from the parent thread, or using the global template, as an enhancement.

On Mon, Aug 28, 2017 at 9:50 PM, Guido van Rossum <guido@python.org> wrote:
We explain why tasks have to inherit the lookup chain from the point where they are created in the PEP (in the new High-level Specification section): https://www.python.org/dev/peps/pep-0550/#coroutines-and-asynchronous-tasks In short, without inheriting the chain we can't wrap coroutines into tasks (e.g. wrapping an awaited coroutine in wait_for() would break the code if we didn't inherit the chain). In the latest version (v4) we made all coroutines have their own Logical Context, which, as we discovered today, makes us unable to set context variables in __aenter__ coroutines. This will be fixed in the next version.
Backwards compatibility is indeed an issue. Inheriting the chain for threads would mean another difference between PEP 550 and 'threading.local()', which could cause backwards-incompatible behaviour for decimal/numpy when they are updated to the new APIs. For decimal, for example, we could use the following pattern to fall back to the default decimal context for ECs (threads) that don't have it set:

    ctx = decimal_var.get(default=default_decimal_ctx)

We can also add an 'initializer' keyword argument to 'new_context_var' to specify a callable that will be used to give a default value to the var.

Another issue is that with the current C API, we can only inherit the EC for threads started with 'threading.Thread'. There's no reliable way to inherit the chain if a thread was initialized by a C extension.

IMO, inheriting the lookup chain in threads makes sense when we use them for pools, like concurrent.futures.ThreadPoolExecutor. When threads are used as long-running subprograms, inheriting the chain should be an opt-in. Yury

Hi,
Is it by design that the execution context for new threads is empty, or should it be possible to set it to some initial value? E.g.:

    var = new_context_var('init')

    def sub():
        assert var.lookup() == 'init'
        var.set('sub')

    def main():
        var.set('main')
        thread = threading.Thread(target=sub)
        thread.start()
        thread.join()
        assert var.lookup() == 'main'

Thanks, --francis
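As a historical note one can verify today: PEP 550's eventual successor, PEP 567 (the contextvars module, Python 3.7+), settled this question the way the example expects -- new threads start from an empty context, so a variable read in a fresh thread falls back to its default:

```python
import threading
from contextvars import ContextVar

var = ContextVar('var', default='init')
seen = {}

def sub():
    seen['sub'] = var.get()   # new thread: empty context -> default value
    var.set('sub')            # invisible outside this thread's context

def main():
    var.set('main')
    t = threading.Thread(target=sub)
    t.start()
    t.join()
    seen['main'] = var.get()  # unaffected by the thread's set()

main()
assert seen == {'sub': 'init', 'main': 'main'}
```

Opting in to inheritance is done explicitly, by capturing `contextvars.copy_context()` in the parent and calling `ctx.run(...)` in the child thread.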
participants (19)

- Antoine Pitrou
- Barry Warsaw
- David Mertz
- Elvis Pranskevichus
- Eric Snow
- Ethan Furman
- francismb
- Glenn Linderman
- Greg Ewing
- Guido van Rossum
- Ivan Levkivskyi
- Kevin Conway
- Koos Zevenhoven
- Nathaniel Smith
- Nick Coghlan
- Stefan Behnel
- Stefan Krah
- Sven R. Kunze
- Yury Selivanov