PEP draft: context variables

Hi all, as promised, here is a draft PEP for context variable semantics and implementation. Apologies for the slight delay; I had a not-so-minor autosave accident and had to retype the majority of this first draft.

During the past years, there has been growing interest in something like task-local storage or async-local storage. This PEP proposes an alternative approach to solving the problems that are typically stated as motivation for such concepts.

This proposal is based on sketches of solutions since spring 2015, with some minor influences from the recent discussion related to PEP 550. I can also see some potential implementation synergy between this PEP and PEP 550, even if the proposed semantics are quite different.

So, here it is. This is the first draft and some things are still missing, but the essential things should be there.

-- Koos

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

PEP: 999
Title: Context-local variables (contextvars)
Version: $Revision$
Last-Modified: $Date$
Author: Koos Zevenhoven
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: DD-Mmm-YYYY
Post-History: DD-Mmm-YYYY


Abstract
========

Sometimes, in special cases, it is desired that code can pass information down the function call chain to the callees without having to explicitly pass the information as arguments to each function in the call chain. This proposal describes a construct which allows code to explicitly switch in and out of a context where a certain context variable has a given value assigned to it. This is a modern alternative to some uses of things like global variables in traditional single-threaded (or thread-unsafe) code and of thread-local storage in traditional *concurrency-unsafe* code (single- or multi-threaded). In particular, the proposed mechanism can also be used with more modern concurrent execution mechanisms such as asynchronously executed coroutines, without the concurrently executed call chains interfering with each other's contexts.

The "call chain" can consist of normal functions, awaited coroutines, or generators. The semantics of context variable scope are equivalent in all cases, allowing code to be refactored freely into *subroutines* (which here refers to functions, sub-generators or sub-coroutines) without affecting the semantics of context variables. Regarding implementation, this proposal aims at simplicity and minimum changes to the CPython interpreter and to other Python interpreters.


Rationale
=========

Consider a modern Python *call chain* (or call tree), which in this proposal refers to any chained (nested) execution of *subroutines*, using any possible combinations of normal function calls, or expressions using ``await`` or ``yield from``. In some cases, passing necessary *information* down the call chain as arguments can substantially complicate the required function signatures, or it can even be impossible to achieve in practice. In these cases, one may search for another place to store this information. Let us look at some historical examples.

The most naive option is to assign the value to a global variable or similar, where the code down the call chain can access it. However, this immediately makes the code thread-unsafe, because with multiple threads, all threads assign to the same global variable, and another thread can interfere at any point in the call chain.
A somewhat less naive option is to store the information as per-thread information in thread-local storage, where each thread has its own "copy" of the variable which other threads cannot interfere with. Although non-ideal, this has been the best solution in many cases. However, thanks to generators and coroutines, the execution of the call chain can be suspended and resumed, allowing code in other contexts to run concurrently. Therefore, using thread-local storage is *concurrency-unsafe*, because other call chains in other contexts may interfere with the thread-local variable.

Note that in the above two historical approaches, the stored information has the *widest* available scope without causing problems. For a third solution along the same path, one would first define an equivalent of a "thread" for asynchronous execution and concurrency. This could be seen as the largest amount of code and nested calls that is guaranteed to be executed sequentially without ambiguity in execution order. This might be referred to as concurrency-local or task-local storage. In this meaning of "task", there is no ambiguity in the order of execution of the code within one task. (This concept of a task is close to equivalent to a ``Task`` in ``asyncio``, but not exactly.) In such concurrency-locals, it is possible to pass information down the call chain to callees without another code path interfering with the value in the background.

Common to the above approaches is that they indeed use variables with a wide but just-narrow-enough scope. Thread-locals could also be called thread-wide globals---in single-threaded code, they are indeed truly global. And task-locals could be called task-wide globals, because tasks can be very big.

The issue here is that neither global variables, thread-locals nor task-locals are really meant to be used for this purpose of passing information of the execution context down the call chain. Instead of the widest possible variable scope, the scope of the variables should be controlled by the programmer, typically of a library, to have the desired scope---not wider. In other words, task-local variables (and globals and thread-locals) have nothing to do with the kind of context-bound information passing that this proposal intends to enable, even if task-locals can be used to emulate the desired semantics. Therefore, in the following, this proposal describes the semantics and the outlines of an implementation for *context-local variables* (or context variables, contextvars). In fact, as a side effect of this PEP, an async framework can use the proposed feature to implement task-local variables.


Proposal
========

Because the proposed semantics are not a direct extension to anything already available in Python, this proposal is first described in terms of semantics and API at a fairly high level. In particular, Python ``with`` statements are heavily used in the description, as they are a good match with the proposed semantics. However, the underlying ``__enter__`` and ``__exit__`` methods correspond to functions in the lower-level speed-optimized (C) API. For clarity of this document, the lower-level functions are not explicitly named in the definition of the semantics. After describing the semantics and high-level API, the implementation is described, going to a lower level.


Semantics and higher-level API
------------------------------

Core concept
''''''''''''

A context-local variable is represented by a single instance of ``contextvars.Var``, say ``cvar``.
Any code that has access to the ``cvar`` object can ask for its value with respect to the current context. In the high-level API, this value is given by the ``cvar.value`` property::

    cvar = contextvars.Var(default="the default value",
                           description="example context variable")

    assert cvar.value == "the default value"  # default still applies

    # In code examples, all ``assert`` statements should
    # succeed according to the proposed semantics.

No assignments to ``cvar`` have been applied for this context, so ``cvar.value`` gives the default value. Assigning new values to contextvars is done in a highly scope-aware manner::

    with cvar.assign(new_value):
        assert cvar.value is new_value
        # Any code here, or down the call chain from here, sees:
        #     cvar.value is new_value
        # unless another value has been assigned in a
        # nested context
        assert cvar.value is new_value

    # the assignment of ``cvar`` to ``new_value`` is no longer visible
    assert cvar.value == "the default value"

Here, ``cvar.assign(value)`` returns another object, namely ``contextvars.Assignment(cvar, new_value)``. The essential part here is that applying a context variable assignment (``Assignment.__enter__``) is paired with a de-assignment (``Assignment.__exit__``). These operations set the bounds for the scope of the assigned value.

Assignments to the same context variable can be nested to override the outer assignment in a narrower context::

    assert cvar.value == "the default value"
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            assert cvar.value == "inner"
        assert cvar.value == "outer"
    assert cvar.value == "the default value"

Also multiple variables can be assigned to in a nested manner without affecting each other::

    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()

    assert cvar1.value is None  # default is None by default
    assert cvar2.value is None

    with cvar1.assign(value1):
        assert cvar1.value is value1
        assert cvar2.value is None
        with cvar2.assign(value2):
            assert cvar1.value is value1
            assert cvar2.value is value2
        assert cvar1.value is value1
        assert cvar2.value is None

    assert cvar1.value is None
    assert cvar2.value is None

Or with more convenient Python syntax::

    with cvar1.assign(value1), cvar2.assign(value2):
        assert cvar1.value is value1
        assert cvar2.value is value2

In another *context*, in another thread or otherwise concurrently executed task or code path, the context variables can have a completely different state. The programmer thus only needs to worry about the context at hand.


Refactoring into subroutines
''''''''''''''''''''''''''''

Code using contextvars can be refactored into subroutines without affecting the semantics. For instance::

    assi = cvar.assign(new_value)

    def apply():
        assi.__enter__()

    assert cvar.value == "the default value"
    apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"

Or similarly in an asynchronous context where ``await`` expressions are used. The subroutine can now be a coroutine::

    assi = cvar.assign(new_value)

    async def apply():
        assi.__enter__()

    assert cvar.value == "the default value"
    await apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"

Or when the subroutine is a generator::

    def apply():
        yield
        assi.__enter__()

which is called using ``yield from apply()`` or with calls to ``next`` or ``.send``. This is discussed further in later sections.
Semantics for generators and generator-based coroutines
''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Generators, coroutines and async generators act as subroutines in much the same way that normal functions do. However, they have the additional possibility of being suspended by ``yield`` expressions. Assignment contexts entered inside a generator are normally preserved across yields::

    def genfunc():
        with cvar.assign(new_value):
            assert cvar.value is new_value
            yield
            assert cvar.value is new_value

    g = genfunc()
    next(g)
    assert cvar.value == "the default value"
    with cvar.assign(another_value):
        next(g, None)  # the generator finishes here

However, the outer context visible to the generator may change state across yields::

    def genfunc():
        assert cvar.value is value2
        yield
        assert cvar.value is value1
        yield
        with cvar.assign(value3):
            assert cvar.value is value3

    with cvar.assign(value1):
        g = genfunc()
        with cvar.assign(value2):
            next(g)
        next(g)
        next(g, None)  # the generator finishes here
        assert cvar.value is value1

Similar semantics apply to async generators defined by ``async def ... yield ...``.

By default, values assigned inside a generator do not leak through yields to the code that drives the generator. However, the assignment contexts entered and left open inside the generator *do* become visible outside the generator after the generator has finished with a ``StopIteration`` or another exception::

    assi = cvar.assign(new_value)

    def genfunc():
        yield
        assi.__enter__()
        yield

    g = genfunc()
    assert cvar.value == "the default value"
    next(g)
    assert cvar.value == "the default value"
    next(g)  # assi.__enter__() is called here
    assert cvar.value == "the default value"
    next(g, None)  # the generator finishes (StopIteration)
    assert cvar.value is new_value
    assi.__exit__()


Special functionality for framework authors
-------------------------------------------

Frameworks, such as ``asyncio`` or third-party libraries, can use additional functionality in ``contextvars`` to achieve the desired semantics in cases which are not determined by the Python interpreter. Some of the semantics described in this section are also used afterwards to describe the internal implementation.

Leaking yields
''''''''''''''

Using the ``contextvars.leaking_yields`` decorator, one can choose to leak the context through ``yield`` expressions into the outer context that drives the generator::

    @contextvars.leaking_yields
    def genfunc():
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            yield
            assert cvar.value == "inner"
        assert cvar.value == "outer"

    g = genfunc()
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        next(g)
        assert cvar.value == "inner"
        next(g, None)  # the generator finishes here
        assert cvar.value == "outer"

Capturing contextvar assignments
''''''''''''''''''''''''''''''''

Using ``contextvars.capture()``, one can capture the assignment contexts that are entered by a block of code. The changes applied by the block of code can then be reverted and subsequently reapplied, even in another context::

    assert cvar1.value is None  # default
    assert cvar2.value is None  # default

    assi1 = cvar1.assign(value1)
    assi2 = cvar1.assign(value2)

    with contextvars.capture() as delta:
        assi1.__enter__()
        with cvar2.assign("not captured"):
            assert cvar2.value == "not captured"
        assi2.__enter__()

    assert cvar1.value is value2
    delta.revert()
    assert cvar1.value is None
    assert cvar2.value is None

    ...

    with cvar1.assign(1), cvar2.assign(2):
        delta.reapply()
        assert cvar1.value is value2
        assert cvar2.value == 2

However, reapplying the "delta" if its net contents include deassignments may not be possible (see also Implementation and Open Issues).
Getting a snapshot of context state
'''''''''''''''''''''''''''''''''''

The function ``contextvars.get_local_state()`` returns an object representing the applied assignments to all context-local variables in the context where the function is called. This can be seen as equivalent to using ``contextvars.capture()`` to capture all context changes from the beginning of execution. The returned object supports methods ``.revert()`` and ``.reapply()`` as above.


Running code in a clean state
'''''''''''''''''''''''''''''

Although it is possible to revert all applied context changes using the above primitives, a more convenient way to run a block of code in a clean context is provided::

    with contextvars.clean_context():
        # here, all context vars start off with their default values
        ...

    # here, the state is back to what it was before the with block


Implementation
--------------

This section describes to a variable level of detail how the described semantics can be implemented. At present, an implementation aimed at simplicity but sufficient features is described. More details will be added later. Alternatively, a somewhat more complicated implementation offers minor additional features while adding some performance overhead and requiring more code in the implementation.

Data structures and implementation of the core concept
'''''''''''''''''''''''''''''''''''''''''''''''''''''''

Each thread of the Python interpreter keeps its own stack of ``contextvars.Assignment`` objects, each having a pointer to the previous (outer) assignment like in a linked list. The local state (also returned by ``contextvars.get_local_state()``) then consists of a reference to the top of the stack and a pointer/weak reference to the bottom of the stack. This allows efficient stack manipulations. An object produced by ``contextvars.capture()`` is similar, but refers to only a part of the stack, with the bottom reference pointing to the top of the stack as it was in the beginning of the capture block.

Now, the stack evolves according to the assignment ``__enter__`` and ``__exit__`` methods. For example::

    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()

    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None

    with cvar1.assign("outer"):
        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"

        with cvar1.assign("inner"):
            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"

            with cvar2.assign("hello"):
                # stack: [Assignment(cvar1, "outer"),
                #         Assignment(cvar1, "inner"),
                #         Assignment(cvar2, "hello")]
                assert cvar2.value == "hello"

            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"
            assert cvar2.value is None

        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"

    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None

Getting a value from the context using ``cvar1.value`` can be implemented as finding the topmost occurrence of a ``cvar1`` assignment on the stack and returning the value there, or the default value if no assignment is found on the stack. However, this can be optimized to instead be an O(1) operation in most cases. Still, even searching through the stack may be reasonably fast since these stacks are not intended to grow very large.

The above description is already sufficient for implementing the core concept.
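For illustration, the lookup just described could be sketched in pure Python roughly as follows (attribute names are illustrative only; the actual implementation would be optimized and cached as discussed above)::

    def lookup(cvar, top):
        # ``top`` is the topmost Assignment on the current thread's stack
        assi = top
        while assi is not None:
            if assi.var is cvar:    # the innermost assignment to ``cvar`` wins
                return assi.value
            assi = assi.prev        # follow the link to the outer assignment
        return cvar.default         # no assignment found on the stack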
Suspendable frames require some additional attention, as explained in the following.

Implementation of generator and coroutine semantics
'''''''''''''''''''''''''''''''''''''''''''''''''''

Within generators, coroutines and async generators, assignments and deassignments are handled in exactly the same way as anywhere else. However, some changes are needed in the builtin generator methods ``send``, ``__next__``, ``throw`` and ``close``. Here is the Python equivalent of the changes needed in ``send`` for a generator (here ``_old_send`` refers to the behavior in Python 3.6)::

    def send(self, value):
        # if decorated with contextvars.leaking_yields
        if self.gi_contextvars is LEAK:
            # nothing needs to be done to leak context through yields :)
            return self._old_send(value)
        try:
            with contextvars.capture() as delta:
                if self.gi_contextvars:
                    # non-zero captured content from previous iteration
                    self.gi_contextvars.reapply()
                ret = self._old_send(value)
        except Exception:
            raise
        else:
            # suspending; revert context changes but save them for later
            delta.revert()
            self.gi_contextvars = delta
        return ret

The corresponding modifications to the other methods are essentially identical. The same applies to coroutines and async generators. For code that does not use ``contextvars``, the additions are O(1) and essentially reduce to a couple of pointer comparisons. For code that does use ``contextvars``, the additions are still O(1) in most cases.

More on implementation
''''''''''''''''''''''

The rest of the functionality, including ``contextvars.leaking_yields``, ``contextvars.capture()``, ``contextvars.get_local_state()`` and ``contextvars.clean_context()``, is in fact quite straightforward to implement, but the implementation will be discussed further in later versions of this proposal. Caching of assigned values is somewhat more complicated, and will be discussed later, but it seems that most cases should achieve O(1) complexity.


Backwards compatibility
=======================

There are no *direct* backwards-compatibility concerns, since a completely new feature is proposed. However, various traditional uses of thread-local storage may need a smooth transition to ``contextvars`` so they can be concurrency-safe. There are several approaches to this, including emulating task-local storage with a little bit of help from async frameworks. A fully general implementation cannot be provided, because the desired semantics may depend on the design of the framework.

Another way to deal with the transition is for code to first look for a context created using ``contextvars``. If that fails because a new-style context has not been set or because the code runs on an older Python version, a fallback to thread-local storage is used.


Open Issues
===========

Out-of-order de-assignments
---------------------------

In this proposal, all variable deassignments are made in the opposite order compared to the preceding assignments. This has two useful properties: it encourages using ``with`` statements to define assignment scope, and it has a tendency to catch errors early (forgetting a ``.__exit__()`` call often results in a meaningful error). Having this as a requirement is beneficial also in terms of implementation simplicity and performance. Nevertheless, allowing out-of-order context exits is not completely out of the question, and reasonable implementation strategies for that do exist.


Rejected Ideas
==============

Dynamic scoping linked to subroutine scopes
-------------------------------------------

The scope of value visibility should not be determined by the way the code is refactored into subroutines.
It is necessary to have per-variable control of the assignment scope.


Acknowledgements
================

To be added.


References
==========

To be added.

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

Hi! On Tue, Sep 05, 2017 at 12:50:35AM +0300, Koos Zevenhoven <k7hoven@gmail.com> wrote:
cvar = contextvars.Var(default="the default value", description="example context variable")
Why ``description`` and not ``doc``?
with cvar.assign(new_value):
Why ``assign`` and not ``set``?
Each thread of the Python interpreter keeps its on stack of
"its own", I think.
``contextvars.Assignment`` objects, each having a pointer to the previous (outer) assignment like in a linked list.
Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Tue, Sep 5, 2017 at 1:20 AM, Oleg Broytman <phd@phdru.name> wrote:
Cause that's a nice thing to bikeshed about? In fact, I probably should have left it out at this point. Really, it's just to get a meaningful repr for the object and better error messages, without any significance for the substance of the PEP. There are also concepts in the PEP that don't have a name yet.
with cvar.assign(new_value):
Why ``assign`` and not ``set``?
To distinguish from typical set-operations (setattr, setitem), and from sets and from settings. I would rather enter an "assignment context" than a "set context" or "setting context". One key point of this PEP is to promote defining context variable scopes on a per-variable (and per-value) basis. I combined the variable and value aspects in this concept of Assignment(variable, value) objects, which define a context that one can enter and exit.
Each thread of the Python interpreter keeps its on stack of
"its own", I think.
That's right, thanks. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

So every generator stores "captured" modifications. This is similar to PEP 550, which adds Logical Context to generators to store their EC modifications. The implementation is different, but the intent is the same. PEP 550 uses a stack of hash tables; this proposal has a linked list of Assignment objects. In the worst case, this proposal will have worse performance guarantees. It's hard to say more, because the implementation isn't described in full.

With PEP 550 it's trivial to implement a context manager to control variable assignments. If we do that, how exactly is this proposal different? Can you list all semantical differences between this proposal and PEP 550? So far, it looks like if I call "var.assign(value).__enter__()" it would be equivalent to PEP 550's "var.set(value)".

Yury

On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:

On Monday, September 4, 2017 at 6:37:44 PM UTC-4, Yury Selivanov wrote:
I think you really should add a context manager to PEP 550 since it is better than calling "set", which leaks state. Nathaniel is right that you need set to support legacy numpy methods like seterr. Had there been a way of setting context variables using a context manager, then numpy would only have had to implement the "errstate" context manager on top of it. There would have been no need for seterr, which leaks state between code blocks and is error-prone.

On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
There is nothing in current Python to prevent numpy from using a context manager for seterr; it's easy enough to write your own context manager that saves and restores thread-local state (decimal shows how). In fact, with PEP 550 it's so easy that it's really not necessary for the PEP to define this as a separate API -- whoever needs it can just write their own.

-- 
--Guido van Rossum (python.org/~guido)
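(For reference, a minimal sketch of such a save/restore context manager over thread-local state -- names are hypothetical; decimal's localcontext() is the real-world model:)

    import threading
    from contextlib import contextmanager

    _local = threading.local()  # module-level, per-thread state

    @contextmanager
    def errstate(**changes):
        # Save the current per-thread state, apply the changes, and
        # restore the saved state on exit -- even if an exception is raised.
        old = getattr(_local, 'err', {})
        _local.err = {**old, **changes}
        try:
            yield _local.err
        finally:
            _local.err = old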

You should add https://bitbucket.org/hipchat/txlocal as a reference for the pep as it largely implements this idea for Twisted. It may provide for some practical discussions of use cases and limitations of this approach. On Tue, Sep 5, 2017, 09:55 Guido van Rossum <guido@python.org> wrote:

We'll add a reference to the "Can Execution Context be implemented without modifying CPython?" section [1]. However, after skimming through the readme file, I didn't see any examples or limitations that are relevant to PEP 550. If the PEP gets accepted, Twisted can simply add direct support for it (similarly to asyncio). That would mean that users won't need to maintain the context manually (described in txlocal's "Maintaining Context" section). Yury [1] https://www.python.org/dev/peps/pep-0550/#can-execution-context-be-implement... On Tue, Sep 5, 2017 at 8:00 AM, Kevin Conway <kevinjacobconway@gmail.com> wrote:

On Tue, Sep 5, 2017 at 10:54 AM Guido van Rossum <guido@python.org> wrote:
Don't you want to encourage people to use the context manager form and discourage calls to set/discard? I recognize that seterr has to be supported and has to sit on top of some method in the execution context. However, if we were starting from scratch, I don't see why we would have seterr at all. We should just have errstate. seterr can leak state, which might not seem like a big deal in a small program, but in a large program, it can mean that a minor change in one module can cause bugs in a totally different part of the program. These kinds of bugs can be very hard to debug.
-- --Guido van Rossum (python.org/~guido)

On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
From a quick skim, my impression is:
All the high-level semantics you suggest make sense... in fact, AFAICT they're exactly the same semantics we've been using as a litmus test for PEP 550. I think PEP 550 is sufficient to allow implementing all your proposed APIs (and that if it isn't, that's a bug in PEP 550). OTOH, your proposal doesn't provide any way to implement functions like decimal.setcontext or numpy.seterr, except by pushing a new state and never popping it, which leaks memory and permanently increases the N in the O(N) lookups. I didn't see any direct comparison with PEP 550 in your text (maybe I missed it). Why do you think this approach would be better than what's in PEP 550? -n -- Nathaniel J. Smith -- https://vorpus.org

On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith <njs@pobox.com> wrote:
Well, I'm happy to hear that a quick skim can already give you an impression ;). But let's see how correct...
Well, if "exactly the same semantics" is even nearly true, you are only testing a small subset of PEP 550 which resembles a subset of this proposal.
I think PEP 550 is sufficient to allow implementing all your proposed APIs (and that if it isn't, that's a bug in PEP 550).
That's not true either. The LocalContext-based semantics introduces scope barriers that affect *all* variables. You might get close by putting just one variable in a LogicalContext and then nesting them, but PEP 550 does not allow this in all cases. With the addition of PEP 521 and some trickery, it might. See also this section in PEP 550, where one of the related issues is described: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes
Well, there are different approaches for this. Let's take the example of numpy:

    import numpy as np

I believe the relevant functions are:

    np.seterr   -- set a new state (and return the old one)
    np.geterr   -- get the current state
    np.errstate -- gives you a context manager to handle this

(Well, errstate sets more state than np.seterr, but that's irrelevant here.)

First of all, the np.seterr API is something that I want to discourage in this proposal, because if the state is not reset back to what it was, a completely different piece of code may be affected.

BUT

To preserve the current semantics of these functions in non-async code, you could do this:

- numpy reimplements the errstate context manager using contextvars based on this proposal.
- geterr gets the state using contextvars.
- seterr gets the state using contextvars and mutates it the way it wants. (If contextvars is not available, it uses the old way.)

A sketch of this layering follows below. Also, the idea is to also provide frameworks the means for implementing concurrency-local storage, if that is what people really want, although I'm not sure it is.
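(To make that concrete, here is a rough sketch of such a layering on top of this proposal's API -- names are hypothetical, and the state dict is greatly simplified compared to numpy's real one:)

    import contextvars  # the module proposed in this PEP

    # One dict holds the error-handling state visible in the current context.
    _errstate = contextvars.Var(default={'divide': 'warn', 'over': 'warn'},
                                description="error-handling state")

    def geterr():
        return dict(_errstate.value)        # a copy of the current state

    def seterr(**kwargs):
        old = geterr()
        _errstate.value.update(kwargs)      # mutate the current state in place
        return old

    class errstate:
        # Reimplemented on top of an assignment context: enter a fresh
        # state dict, and the old one becomes visible again on exit.
        def __init__(self, **kwargs):
            self._assi = _errstate.assign({**_errstate.value, **kwargs})
        def __enter__(self):
            return self._assi.__enter__()
        def __exit__(self, *exc_info):
            return self._assi.__exit__(*exc_info)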
It was not my intention to leave out the comparison altogether, but I did avoid the comparisons in some cases in this first draft, because thinking about PEP 550 concepts while trying to understand this proposal might give you the wrong idea. One of the benefits of this proposal is simplicity, and I'm guessing performance as well, but that would need evidence. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" them, I meant stacking them. It is in fact nesting in terms of value scopes. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 9:12 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I don't actually care if you use the latest terminology. You seem to have a wrong idea about how PEP 550 really works (and its full semantics), because things you say here about it don't make any sense. Yury

On Tue, Sep 5, 2017 at 8:24 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
In PEP 550, introducing a new LogicalContext on the ExecutionContext affects the scope of any_var.set(value) for *any* any_var. Does that not make sense?

––Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 8:43 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
So you claim that PEP 550 does allow that in all cases? Or you don't think that that would get close? ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 9/4/17, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I feel that the use of "is" and "==" in assert statements in this PEP has to be used (or described) more precisely. What if new_value above is 123456789? Maybe using something like this could be better? ->

    def equals(a, b):
        return a is b or a == b

Doesn't the PEP need to think about something like "context level overflow"? Or members like: cvar.level?

On Tue, Sep 5, 2017 at 10:43 AM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
The use is quite precise as it is now. I can't use `is` for the string values, because the result would depend on whether Python gives you the same str instance as before, or a new one with the same content. Maybe I'll get rid of literal string values in the description, since it seems to only cause distraction.
What if new_value above is 123456789?
Any value is fine.
I don't see any need for this at this point, or possibly ever. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

I am a relative nobody in Python, however a few weeks ago, I suggested more harmonization with JavaScript. Admittedly I've been doing more JS lately, so I might have JS-colored glasses on, but it looks like you're trying to add lexical scoping to Python, and there's a whole lot of manual scope work going on. (This may be a result of showing what can be done, rather than what can typically be done.)

I am probably entirely mistaken, however when I saw the subject and started reading, I expected to see something like:

    v = context.Var({'some': value, 'other': value})
    # (wherein Var() would make deep copies of the values)

(Why repeat `var` after `context`, if Var is the only point of the context module?) But that's not what I saw. I didn't immediately grasp that `value` and `description` (aka. `doc`) were special properties for an individual context var. This is likely an issue with me, or it could be documentation.

Then it went on to talk about using `with` for managing context. So it looks like the `with` is just providing a value stack (list) for the variable? Can this be done automatically with a _setattr_ and append()?

Having done a lot of HTML5 Canvas drawing (https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/sa...), this reminds me of `save()` and `restore()`, which would likely be more familiar to a larger audience. Additionally what about:

    with context.save() as derived_context:
        # or just `with context as derived_context:`, calling save automatically
        ...  # whatever
    # automatically call restore on __exit__

I also wonder about using multiple context Vars/managers with `with`, as that statement would get quite long.

Finally, would it be possible to pass a dict and get a membered object out? Using v from the example above, i.e.: v.value, v.other, where gets/sets automatically use the most recent context?

Hi!
[...]
Why not call the section 'Running code in the default state' and the method just `.default_context()`?

    with contextvars.default_context():
        # here, all context vars start off with their default values

    # here, the state is back to what it was before the with block.

Does `clean` here mean `default` (the variable is constructed as ``cvar = contextvars.Var(default="the default value", ...)``)?

Thanks in advance,
--francis

Hi all, Thank you for the feedback so far. FYI, or as a reminder, this is now PEP 555, but the web version is still the same draft that I posted here. The discussion of this was paused as there was a lot going on at that moment, but I'm now getting ready to make a next version of the draft. Below, I'll draft some changes I intend to make so they can already be discussed. First of all, I'm considering calling the concept "context arguments" instead of "context variables", because that describes the concept better. But see below for some more. On Tue, Sep 5, 2017 at 12:50 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Some points related to the arguments and naming: Indeed, this might change to contextvars.Arg. After all, these are more like arguments than variables. But just like with function arguments, you can use a mutable value, which allows more variable-like semantics. That is, however, not the primarily intended use. It may also cause more problems at inter-process or inter-interpreter boundaries etc., where direct mutation of objects may not be possible. I might have to remove the ``default`` argument, at least in this form. If there is a default, it should be more explicit what the scope of the default is. There could be thread-wide defaults or interpreter-wide defaults and so on. It is not completely clear what a truly global default would mean. One way to deal with this would be to always pass the context on to other threads and processes etc when they are created. But there are some ambiguities here too, so the safest way might be to let the user implement the desired semantics regarding defaults and thread boundaries etc.
Unfortunately, we actually need a third kind of generator semantics, something like this:

    @contextvars.caller_context
    def genfunc():
        assert cvar.value is the_value
        yield
        assert cvar.value is the_value

    with cvar.assign(the_value):
        gen = genfunc()

    next(gen)

    with cvar.assign(1234567890):
        try:
            next(gen)
        except StopIteration:
            pass

Nick, Yury and I (and Nathaniel, Guido, Jim, ...?) somehow just narrowly missed the reasons for this in discussions related to PEP 550. Perhaps because we had mostly been looking at it from an async angle.

[In addition to this, all context changes (Assignment __enter__ or __exit__) would be leaked out when the generator finishes iff there are no outer context changes. If there are outer context changes, an attempt to leak changes will fail. (I will probably need to explain this better).]
We will probably also need a ``use()`` method (or another name) here. That would return a context manager that applies the full context on __enter__ and reapplies the previous one on __exit__.
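(A rough sketch of what ``use()`` could mean in terms of the snapshot primitives already described -- assuming the snapshots support .revert() and .reapply() as above; this is a sketch, not a spec:)

    class _UseContext:
        # Hypothetically returned by snapshot.use(). On enter, swap the
        # current context state for the saved snapshot; on exit, swap back.
        def __init__(self, snapshot):
            self._snapshot = snapshot
        def __enter__(self):
            self._previous = contextvars.get_local_state()
            self._previous.revert()     # clear the current assignments
            self._snapshot.reapply()    # apply the saved ones
        def __exit__(self, *exc_info):
            self._snapshot.revert()
            self._previous.reapply()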
As an additional tool, there could be contextvars.callback:

    @contextvars.callback
    def some_callback():
        # do stuff
        ...

This would provide some of the functionality of this PEP if callbacks are used, so that the callback would be run with the same context as the code that creates the callback. The implementation of this would be essentially:

    def callback(func):
        context = contextvars.get_local_context()
        def wrapped(*args, **kwargs):
            with context.use():
                func(*args, **kwargs)
        return wrapped

With some trickery this might allow an async framework based on callbacks instead of coroutines to use context arguments. But using this might be a bit awkward sometimes. A contextlib.ExitStack might help here.
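(A hypothetical usage example, assuming an asyncio-style event loop named ``loop``; the point is that the callback later sees the context that existed where it was defined:)

    with cvar.assign("request-42"):
        @contextvars.callback
        def log_done():
            # runs later, but still sees the creating context
            print("finished in context:", cvar.value)

    loop.call_soon(log_done)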
Implementation
I will still need to explain the O(1) algorithm, but one nice thing is that an implementation like micropython does not necessarily need to include that optimization.
I have a preliminary design for this, but probably doesn't need to be in this PEP.
If context variables are renamed context arguments, then there could be a settable variant called a context variable (could also be a third-party thing on top of context arguments, depending on what is done with decimal contexts).
Rejected Ideas
In fact, in early sketches, my approach was closer to this. The context variables (or async variables) were stored in frame locals in a namespace called `__async__` and they were propagated through subroutine calls to callees. But this introduces problems when new scope layers are added, and ended up being more complicated (and slightly similar to PEP 550). Anyway, for starters, this was a glimpse of the changes I have planned, and open for discussion. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Oct 7, 2017 12:20, "Koos Zevenhoven" <k7hoven@gmail.com> wrote:

    Unfortunately, we actually need a third kind of generator semantics, something like this:

        @contextvars.caller_context
        def genfunc():
            assert cvar.value is the_value
            yield
            assert cvar.value is the_value

        with cvar.assign(the_value):
            gen = genfunc()

        next(gen)

        with cvar.assign(1234567890):
            try:
                next(gen)
            except StopIteration:
                pass

    Nick, Yury and I (and Nathaniel, Guido, Jim, ...?) somehow just narrowly missed the reasons for this in discussions related to PEP 550. Perhaps because we had mostly been looking at it from an async angle.

That's certainly a semantics that one can write down (and it's what the very first version of PEP 550 did), but why do you say it's needed? What are these reasons that were missed? Do you have a use case?

-n

On Sun, Oct 8, 2017 at 12:16 AM, Nathaniel Smith <njs@pobox.com> wrote:
I do remember Yury mentioning that the first draft of PEP 550 captured something when the generator function was called. I think I started reading the discussions after that had already been removed, so I don't know exactly what it was. But I doubt that it was *exactly* the above, because PEP 550 uses set and get operations instead of "assignment contexts" like PEP 555 (this one) does.
but why do you say it's needed? What are these reasons that were missed? Do you have a use case?
Yes, there's a type of use case. When you think of a generator function as a function that returns an iterable of values, you don't necessarily care about whether the values are computed lazily or not. In that case, you don't want next() or .send() to affect the context inside the generator. In terms of code, we might want this:

    def values():
        # compute some values using cvar.value
        # and return a list of values

    with cvar.assign(something):
        data = values()

    datalist = list(data)

...to be equivalent to:

    def values():
        # compute some values using cvar.value
        # and yield them one by one

    with cvar.assign(something):
        data = values()

    datalist = list(data)

So we don't want the "lazy evaluation" of generators to affect the values "in" the iterable. But I think we had our minds too deep in event loops and chains of coroutines and async generators to realize this.

Initially, this seems to do the wrong thing in many other cases, but in fact, with the right extension to this behavior, we get the right thing in almost all situations. We still do need the other generator behaviors described in PEP 555, for async and other uses, but I would probably go as far as making this new one the default. But I kept the decorator syntax for now.

-- Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On 8 October 2017 at 08:40, Koos Zevenhoven <k7hoven@gmail.com> wrote:
We didn't forget it, we just don't think it's very useful. However, if you really want those semantics under PEP 550, you can do something like this:

    def use_creation_context(g):
        @functools.wraps(g)
        def make_generator_wrapper(*args, **kwds):
            gi = g(*args, **kwds)
            return _GeneratorWithCapturedEC(gi)
        return make_generator_wrapper

    class _GeneratorWithCapturedEC:
        def __init__(self, gi):
            self._gi = gi
            self._ec = contextvars.get_execution_context()
        def __next__(self):
            return self.send(None)
        def send(self, value):
            return contextvars.run_with_execution_context(self._ec, self._gi.send, value)
        def throw(self, *exc_details):
            return contextvars.run_with_execution_context(self._ec, self._gi.throw, *exc_details)
        def close(self):
            return self.throw(GeneratorExit)

Cheers, Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Oct 8, 2017 at 11:46 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm not sure I agree on the usefulness. Certainly a lot of the complexity of PEP 550 exists just to cater to Nathaniel's desire to influence what a generator sees via the context of the send()/next() call. I'm still not sure that's worth it. In 550 v1 there's no need for chained lookups. -- --Guido van Rossum (python.org/~guido)

On Mon, Oct 9, 2017 at 6:24 PM, Guido van Rossum <guido@python.org> wrote:
We do need some sort of chained lookups, though, at least in terms of semantics. But it is possible to optimize that away in PEP 555. Some kind of chained-lookup-like thing is inevitable if you want the state not to leak through yields out of the generator:

    with cvar.assign(a_value):
        # don't leak `a_value` to the outer context
        yield some_stuff()

––Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On Mon, Oct 9, 2017 at 4:39 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
You keep using the "optimize away" terminology. I assume that you mean that ContextVar.get() will have a cache (so it does in PEP 550 btw). What else do you plan to "optimize away"? Where's a detailed implementation spec? What you have in the PEP is still vague and leaves many important implementation details to the imagination of the reader.

The fact is that the datastructure choice in PEP 555 is plain weird. You want to use a sequence of values to represent a mapping. And then you hand-waved all questions about what will happen in pathological cases, saying that "we'll have a cache and applications won't have too many context values anyways". But your design means that in the worst case, the uncached path requires you to potentially traverse all values in the context.

Another thing: suppose someone calls 'context_var.assign().__enter__()' manually, without calling '__exit__()'. You will have unbound growth of the context values stack. You'll say that it's not how the API is supposed to be used, and we say that we want to convert things like decimal and numpy to use the new mechanism. That question was also hand-waved by you: numpy and decimal will have to come up with new/better APIs to use PEP 555. Well, that's just not good enough.

And the key problem is that you still haven't directly highlighted differences in semantics between PEP 550 and PEP 555. This is the most annoying part, because almost no one (including me) knows the complete answer here. Maybe you know, but you refuse to include that in the PEP for some reason.
No, it's not "inevitable". In PEP 550 v1, generators captured the context when they are created and there was always only one level of context. This means that: 1. Context changes in generators aren't visible to the outside world. 2. Changes to the context in the outside world are not visible to running generators. PEP 550 v1 was the simplest thing possible with a very efficient implementation. It had the following "issues" that led us to v2+ semantics of chained lookup: 1. Refactoring. with some_context(): for i in gen(): pass would not be equivalent to: g = gen() with some_context(): for i in g: pass 2. Restricting generators to only see context at the point of their creation feels artificial. We know there are better solutions here (albeit more complex) and we try to see if they are worth it. 3. Nathaniel't use case in Trio. Yury

On Tue, Oct 10, 2017 at 1:55 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I'm hesitant to call it a cache, because the "cache" sort of automatically builds itself. I think I'll need to draw a diagram to explain it. The implementation is somewhat simpler than its explanation. I can go more into detail regarding the implementation, but I feel that semantics is more important at this point.
I don't think I've heard of any pathological cases... What do you mean?

> But your design means that in the worst case, the uncached path requires you to potentially traverse all values in the context.
It is in fact possible to implement it in a way that this never happens, but the best thing might actually be an implementation, where this *almost* never happens. (Off-topic: It would be kind of cool, if you could do the same thing with MROs, so OOP method access will speed up. But that might be somewhat more difficult to implement, because there are more moving parts there.)
You can cause unbound growth in PEP 550 too. All you have to do is nest an unbounded number of generators. In PEP 555, nesting generators doesn't do anything really, unless you actually assign to context arguments in the generators. Only those who use it will pay. But seriously, you will always end up in a weird situation if you call an unbounded number of contextmanager.__enter__() methods without calling __exit__(). Nothing new about that. But entering a handful of assignment contexts and leaving them open until a script ends is not the end of the world. I don't think anyone should do that though.
What part of my explanation of this are you unhappy with? For instance, the 12th (I think) email in this thread, which is my response to Nathaniel. Could you reply to that and tell us your concern?
I don't refuse to. I just haven't prioritized it. But I've probably made the mistake of mentioning *similarities* between 550 and 555. One major difference is that there is no .set(value) in PEP 555, so one shouldn't try to map PEP 550 uses directly to PEP 555.
Sure, if you make generators completely isolated from the outside world, then you can avoid chaining-like things too. But that would just sweep it under the carpet.
What's the point of this? Moving stuff out of a with statement should not matter? The whole point of with statements is that it matters whether you do something inside it or outside it. -- Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Mon, Oct 9, 2017 at 8:37 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
You can only nest up to 'sys.getrecursionlimit()' number of generators.

With PEP 555 you can do:

    while True:
        context_var.assign(42).__enter__()
Same for 550. If a generator doesn't set context variables, its LC will be an empty mapping (or NULL if you want to micro-optimize things). Nodes for the chain will come from a freelist. The effective overhead for generators is a couple operations on pointers, and thus visible only in microbenchmarks.
I'm sorry, I'm not going to find some 12th email in some thread. I stated in this thread the following: not being able to use PEP 555 to fix *existing* decimal & numpy APIs is not good enough. And decimal & numpy is only one example, there's tons of code out there that can benefit from its APIs to be fixed to support for async code in Python 3.7.
This is not a "major difference". You might feel that it is, but it is a simple API design choice. As I illustrated in a few emails before, as long as users can call 'context_var.assign(..).__enter__()' manually, your PEP *does* allow to effectively do ".set(value)". If using a context manager instead of 'set' method is the only difference you can highlight, then why bother writing a PEP? The idea of using context managers to set values is very straightforward and is easy to be incorporated to PEP 550. In fact, it will be added to the next version of the PEP (after discussing it with Guido on the lang summit).
What do you mean by "just sweep it under the carpet"? Capturing the context at the moment of generators creation is a design choice with some consequences (that I illustrated in my previous email). There are cons and pros of doing that. Yury

On Tue, Oct 10, 2017 at 4:22 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Well, in PEP 550, you can explicitly stack an unbounded number of LogicalContexts in a while True loop. Or you can run out of memory using plain lists even faster:

    l = [42]
    while True:
        l *= 2  # ensure exponential blow-up

I don't see why your example with context_var.assign(42).__enter__() would be any more likely. Sure, we could limit the number of allowed nested contexts in PEP 555. I don't really care. Just don't enter an unbounded number of context managers without exiting them.

Really, it was my mistake to ever make you think that context_var.assign(42).__enter__() can be compared to .set(42) in PEP 550. I'll say it once more: PEP 555 context arguments have no equivalent of the PEP-550 .set(..).
Sure, you can implement push and pop and maintain a freelist by just doing operations on pointers. But that would be a handful of operations. Maybe you'd even manage to avoid INCREFs and DECREFs by not exposing things as Python objects. But I guarantee you, PEP 555 is simpler in this regard. In (pseudo?) C, the per-generator and per-send overhead would come from something like:

    /* On generator creation */

    stack = PyThreadState_Get()->carg_stack;
    Py_INCREF(stack);
    self->carg_stack = stack;

    ----------

    /* On each next / send */

    stack_ptr = &PyThreadState_Get()->carg_stack;
    if (*stack_ptr == self->carg_stack) {
        /* no assignments made => do nothing */
    }

    /* ... then after next yield */

    if (*stack_ptr == self->carg_stack) {
        /* once more, do nothing */
    }

And there will of course be a Py_DECREF after the generator has finished or when it is deallocated. If the generators *do* use context argument assignments, then some stuff would happen in the else clauses of the if statements above. (Or actually, using != instead of ==.)
Well, anyone interested can read that 12th email in this thread. In short, my recommendation for libraries would be as follows:

* If the library does not provide a context manager yet, they should add one, using PEP 555. That will then work nicely in coroutines and generators.

* If the library does have a context manager, implement it using PEP 555. Or to be safe, add a new API function, so behavior in existing async code won't change.

* If the library needs to support some kind of set_state(..) operation, implement it by getting the state using a PEP 555 context argument and mutating its contents.

* Fall back to thread-local storage if no context argument is present or if the Python version does not support context arguments (a sketch of this pattern follows below).

[...]
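(Concretely, the fallback in the last bullet point above might look roughly like this -- names are hypothetical, with the library state kept in a mutable dict:)

    import threading

    try:
        import contextvars  # as proposed in PEP 555
        _state_arg = contextvars.Var(description="library state")
    except ImportError:
        _state_arg = None

    _tls = threading.local()  # old-style fallback storage

    def _current_state():
        # Prefer a context argument if one is present; otherwise
        # fall back to thread-local storage.
        if _state_arg is not None and _state_arg.value is not None:
            return _state_arg.value
        if not hasattr(_tls, 'state'):
            _tls.state = {'setting': 'default'}
        return _tls.state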
"Capturing the context at generator creation" and "isolating generators completely" are two different things. I've described pros of the former. The latter has no pros that I'm aware of, except if sweeping things under the carpet is considered as one. Yes, the latter works in some use cases, but in others it does not. For instance, if an async framework wants to make some information available throughout the async task. If you isolate generators, then async programmers will have to avoid generators, because they don't have access to the information the framework is trying to provide. Also, if you refactor your generator into subgenerators using `yield from`, the subgenerators will not see the context set by the outer generator. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 10 October 2017 at 22:34, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Then your alternate PEP can't work, since it won't be useful to extension modules. Context managers are merely syntactic sugar for try/finally statements, so you can't wave your hands and say a context manager is the only supported API: you *have* to break the semantics down and explain what the try/finally equivalent looks like. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Oct 10, 2017 at 3:42 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Maybe this helps:

* PEP 550 is based on var.set(..), but you will then implement context managers on top of that (a rough sketch of this follows below).

* PEP 555 is based on context managers, but you can implement a var.set(..) on top of that if you really need it.
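(To illustrate the first bullet with a rough sketch -- not PEP 550's actual final API, and glossing over its delete/restore details:)

    from contextlib import contextmanager

    @contextmanager
    def assigned(var, value):
        # An assignment-style context manager built on top of a
        # set/get style API (PEP 550 flavored, simplified).
        old = var.get()
        var.set(value)
        try:
            yield
        finally:
            var.set(old)   # naive restore; the real thing needs more care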
Is this what you're asking?

    assi = cvar.assign(value)
    assi.__enter__()
    try:
        # do stuff involving cvar.value
        ...
    finally:
        assi.__exit__()

As written in the PEP, these functions would have C equivalents. But most C extensions will probably only need cvar.value, and the assignment contexts will be entered from Python.

––Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Oct 10, 2017 at 12:34 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
But then you *are* allowing users to use "__enter__()" and "__exit__()" directly. Which means that some users *can* experience an unbound growth of context values stack that will make their code run out of memory. This is not similar to appending something to a list -- people are aware that lists can't grow infinitely. But it's not obvious that you can't call "cvar.assign(value).__enter__()" many times. The problem with memory leaks like this is that you can easily write some code and ship it. And only after a while you start experiencing problems in production that are extremely hard to track. Yury

On Tue, Oct 10, 2017 at 8:34 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
No, you can't. PEP 550 doesn't have APIs to "stack ... LogicalContexts".
Of course you can write broken code. The point is that contexts work like scopes/mappings, and it's counter-intuitive that setting a variable with 'cv.assign(..).__enter__()' will break the world. If a naive user tries to convert their existing decimal-like API to use your PEP, everything would work initially, but then blow up in production. [..]
Any API exposing a context manager should have an alternative try..finally API. In your case it's 'context_var.assign(42).__enter__()'. 'With' statements are sugar in Python. It's unprecedented to design API solely around them.
[..] I wrote several implementations of PEP 550 so far. No matter what you put in genobject.send(): one pointer op or two, the results are the same: in microbenchmarks generators become 1-2% slower. In macrobenchmarks of generators you can't observe any slowdown. And if we want the fastest possible context implementation, we can choose PEP 550 v1, which is the simplest solution. In any case, the performance argument is invalid, please stop using it.
The last bullet point is the problem. Everybody is saying to you that it's not acceptable. It's your choice to ignore that. [..]
This is plain incorrect. Please read PEP 550v1 before continuing the discussion about it.
Subgenerators see the context changes in the outer generator in all versions of PEP 550. The point you didn't like is that in all versions of PEP 550 subgenerators could not leak any context to the outer generator. Please don't confuse these two. Yury

On Tue, Oct 10, 2017 at 5:40 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote: > On Tue, Oct 10, 2017 at 8:34 AM, Koos Zevenhoven <k7hoven@gmail.com> > wrote: > > On Tue, Oct 10, 2017 at 4:22 AM, Yury Selivanov <yselivanov.ml@gmail.com > > > > wrote: > >> > >> On Mon, Oct 9, 2017 at 8:37 PM, Koos Zevenhoven <k7hoven@gmail.com> > wrote: > >> > You can cause unbound growth in PEP 550 too. All you have to do is > nest > >> > an > >> > unbounded number of generators. > >> > >> You can only nest up to 'sys.get_recursion_limit()' number of > generators. > >> > >> With PEP 555 you can do: > >> > >> while True: > >> context_var.assign(42).__enter__() > >> > > > > Well, in PEP 550, you can explicitly stack an unbounded number of > > LogicalContexts in a while True loop. > > No, you can't. PEP 550 doesn't have APIs to "stack ... LogicalContexts". > > That's ridiculous. Quoting PEP 550: " The contextvars.run_with_logical_context(lc: LogicalContext, func, *args, **kwargs) function, which runs func with the provided logical context on top of the current execution context. " > > Or you can run out of memory using > > plain lists even faster: > > > > l = [42] > > > > while True: > > l *= 2 # ensure exponential blow-up > > > > I don't see why your example with context_var.assign(42).__enter__() > would > > be any more likely. > > Of course you can write broken code. The point is that contexts work > like scopes/mappings, and it's counter-intuitive that setting a > variable with 'cv.assign(..).__enter__()' will break the world. If a > naive user tries to convert their existing decimal-like API to use > your PEP, everything would work initially, but then blow up in > production. > > The docs will tell them what to do. You can pass a context argument down the call chain. You don't "set" context arguments! That's why I'm changing to "context argument", and I've said this many times now. > [..] > > Really, it was my mistake to ever make you think that > > context_var.assign(42).__enter__() can be compared to .set(42) in PEP > 550. > > I'll say it once more: PEP 555 context arguments have no equivalent of > the > > PEP-550 .set(..). > > Any API exposing a context manager should have an alternative > try..finally API. In your case it's > 'context_var.assign(42).__enter__()'. 'With' statements are sugar in > Python. It's unprecedented to design API solely around them. > > [..] > Yury writes: >> >> That question was also hand-waved by you: > >> >> numpy and decimal will have to come up with new/better APIs to use > PEP > >> >> 555. Well, that's just not good enough. > >> > > >> > > Koos writes: >> > What part of my explanation of this are you unhappy with? For instance, > >> > the > >> > 12th (I think) email in this thread, which is my response to > Nathaniel. > >> > Could you reply to that and tell us your concern? > >> > >> I'm sorry, I'm not going to find some 12th email in some thread. I > >> stated in this thread the following: not being able to use PEP 555 to > >> fix *existing* decimal & numpy APIs is not good enough. And decimal & > >> numpy is only one example, there's tons of code out there that can > >> benefit from its APIs to be fixed to support for async code in Python > >> 3.7. > >> > > > > Well, anyone interested can read that 12th email in this thread. In > short, > > my recommendation for libraries would be as follows: > > > > * If the library does not provide a context manager yet, they should add > > one, using PEP 555. That will then work nicely in coroutines and > generators. 
> > * If the library does have a context manager, implement it using PEP 555.
> >   Or to be safe, add a new API function, so behavior in existing async
> >   code won't change.
> >
> > * If the library needs to support some kind of set_state(..) operation,
> >   implement it by getting the state using a PEP 555 context argument and
> >   mutating its contents.
> >
> > * Fall back to thread-local storage if no context argument is present or
> >   if the Python version does not support context arguments.
>
> The last bullet point is the problem. Everybody is saying to you that
> it's not acceptable. It's your choice to ignore that.

Never has anyone told me that that is not acceptable. Please stop that.

[..]
> >> What do you mean by "just sweep it under the carpet"? Capturing the
> >> context at the moment of generators creation is a design choice with
> >> some consequences (that I illustrated in my previous email). There
> >> are cons and pros of doing that.
> >
> > "Capturing the context at generator creation" and "isolating generators
> > completely" are two different things.
> >
> > I've described pros of the former. The latter has no pros that I'm aware
> > of, except if sweeping things under the carpet is considered as one.
> >
> > Yes, the latter works in some use cases, but in others it does not. For
> > instance, if an async framework wants to make some information available
> > throughout the async task. If you isolate generators, then async
> > programmers will have to avoid generators, because they don't have access
> > to the information the framework is trying to provide.
>
> This is plain incorrect. Please read PEP 550v1 before continuing the
> discussion about it.

I thought you wrote that they are isolated both ways. Maybe there's a
misunderstanding. I found your "New PEP 550" email in the archives in some
thread. That might be v1, but the figure supposedly explaining this part is
missing. Whatever. This is not about PEP 550v1 anyway.

> > Also, if you refactor your generator into subgenerators using
> > `yield from`, the subgenerators will not see the context set by the
> > outer generator.
>
> Subgenerators see the context changes in the outer generator in all
> versions of PEP 550.
>
> The point you didn't like is that in all versions of PEP 550
> subgenerators could not leak any context to the outer generator.
> Please don't confuse these two.

That's a different thing. But it's not exactly right: I didn't like the
fact that some subroutines (functions, coroutines, (async) generators) leak
context and some don't.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
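For concreteness, a sketch of the migration pattern in the bullets above:
prefer a PEP 555 context argument when one is available, and fall back to
thread-local storage otherwise. The PEP 555 names used here
(contextvars.Var, its constructor arguments, and a .value accessor) are
assumptions based on this thread's draft, not a released API:

    import threading

    try:
        import contextvars                   # draft PEP 555 module (assumed)
        _config_var = contextvars.Var(name='config')
    except ImportError:
        _config_var = None                   # older Python: no context arguments

    _tls = threading.local()

    def get_config():
        # Use the context argument if the draft API is available and a value
        # has been assigned somewhere up the call chain.
        if _config_var is not None and _config_var.value is not None:
            return _config_var.value
        if not hasattr(_tls, 'config'):      # thread-local fallback
            _tls.config = {}
        return _tls.config

    def set_config(**updates):
        # A set_state(..)-style operation: mutate the contents of whatever
        # state object the context argument (or fallback) provides.
        get_config().update(updates)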

On Tue, Oct 10, 2017 at 11:26 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Note that 'run_with_logical_context()' doesn't accept the EC. It gets it using the 'get_execution_context()' function, which will squash LCs if needed. I say it again: *by design*, PEP 550 APIs do not allow to manually stack LCs in such a way that an unbound growth of the stack is possible.
I'm saying this the last time: In Python, any context manager should have an
equivalent try..finally form. Please give us an example, how we can use PEP
555 APIs with a try..finally block.

By the way, PEP 555 has this, quote:

"""
By default, values assigned inside a generator do not leak through yields to
the code that drives the generator. However, the assignment contexts entered
and left open inside the generator do become visible outside the generator
after the generator has finished with a StopIteration or another exception:

    assi = cvar.assign(new_value)

    def genfunc():
        yield assi.__enter__()
        yield
"""

Why do you call __enter__() manually in this example? I thought it's a
strictly prohibited thing in your PEP -- it's unsafe to use it this way. Is
it only for illustration purposes? If so, then how "the assignment contexts
entered and left open inside the generator" can even be a thing in your
design? [..]
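For concreteness, here is what the requested try..finally form could look
like under the draft PEP 555 API. Everything here is an assumption based on
this thread (contextvars.Var, .assign(), and calling __enter__/__exit__
directly); none of these names exist in a released Python, and whether
manual __enter__ calls are even permitted is exactly the point under debate:

    import contextvars            # draft PEP 555 module (assumed)

    cvar = contextvars.Var(name='cvar')

    assignment = cvar.assign(42)  # the assignment object from the draft API
    assignment.__enter__()
    try:
        pass                      # code that reads cvar down the call chain
    finally:
        assignment.__exit__(None, None, None)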
The whole idea of PEP 550 was to provide a working alternative to TLS. So this is clearly not acceptable for PEP 550. PEP 555 may hand-wave this requirement, but it simply limits the scope of where it can be useful. Which in my opinion means that it provides strictly *less* functionality than PEP 550. [..]
PEP 550 has links to all versions of it. You can simply read it there.
That might be v1, but the figure supposedly explaining this part is missing. Whatever. This is not about PEP 550v1 anyway.
This is about you spreading wrong information about PEP 550 (all of its versions in this case). Again, in PEP 550:

1. Changes to contexts made in async generators and sync generators do not leak to the caller. Changes made in a caller are visible to the generator.

2. Changes to contexts made in async tasks do not leak to the outer code or other tasks. That's assuming the async tasks implementation is tweaked to use 'run_with_execution_context'. Otherwise, coroutines work with the EC just like functions.

3. Changes to contexts made in OS threads do not leak to other threads.

How's PEP 555 different, besides requiring the use of a context manager?
So your PEP is "solving" this by disallowing simply "setting" a variable without a context manager. Is this the only difference? Look, Koos, until you give me a full list of *semantical* differences between PEP 555 and PEP 550, I'm not going to waste my time on discussions here. And I encourage Guido, Nick, and Nathaniel to do the same. Yury

On 10 October 2017 at 01:24, Guido van Rossum <guido@python.org> wrote:
The compatibility concern is that we want developers of existing libraries to be able to transparently switch from using thread local storage to context local storage, and the way thread locals interact with generators means that decimal (et al) currently use the thread local state at the time when next() is called, *not* when the generator is created. I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass

The easiest way to maintain that equivalence is to say that even though preventing state changes leaking *out* of generators is considered a desirable change, we see preventing them leaking *in* as a gratuitous backwards compatibility break. This does mean that *neither* form is semantically equivalent to eager extraction of the generator values before the decimal context is changed, but that's the status quo, and we don't have a compelling justification for changing it.

If folks subsequently decide that they *do* want "capture on creation" or "capture on first iteration" semantics for their generators, those are easy enough to add as wrappers on top of the initial thread-local-compatible base by using the same building blocks as are being added to help event loops manage context snapshots for coroutine execution.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Oct 10, 2017 at 3:34 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
If you want to keep those semantics in decimal, then you're already done.
Generator functions aren't usually called `gen`. Change that to:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for val in values():
            do_stuff_with(val)

    # and

    vals = values()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for val in vals:
            do_stuff_with(val)

I see no reason why these two should be equivalent.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On 10 October 2017 at 22:51, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I see no reason why these two should be equivalent.
There is no "should" about it: it's a brute fact that the two forms *are* currently equivalent for lazy iterators (including generators), and both different from the form that uses eager evaluation of the values before the context change. Where should enters into the picture is by way of PEP 550 saying that they should *remain* equivalent because we don't have an adequately compelling justification for changing the runtime semantics. That is, given the following code: itr = make_iter() with decimal.localcontext() as ctx: ctc.prex = 30 for i in itr: pass Right now, today, in 3.6. the calculations in the iterator will use the modified decimal context, *not* the context that applied when the iterator was created. If you want to ensure that isn't the case, you have to force eager evaluation before the context change. What PEP 550 is proposing is that, by default, *nothing changes*: the lazy iteration in the above will continue to use the updated decimal context by default. However, people *will* gain a new option for avoiding that: instead of forcing eager evaluation, they'll be able to capture the creation context instead, and switching back to that each time the iterator needs to calculate a new value. If PEP 555 proposes that we should instead make lazy iteration match eager evaluation semantics by *default*, then that's going to be a much harder case to make because it's a gratuitous compatibility break - code that currently works one way will suddenly start doing something different, and end users will have difficulty getting it to behave the same way on 3.7 as it does on earlier versions. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Oct 10, 2017 at 5:01 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's just an arbitrary example. There are many things that *would* change if decimal contexts simply switched from using thread-local storage to using PEP 550. It's not at all obvious which of the changes would be most likely to cause problems. If I were to choose, I would probably introduce a new context manager which works with PEP 555 semantics, because that's the only way to ensure full backwards compatibility, regardless of whether PEP 555 or PEP 550 is used. But I'm sure one could decide otherwise. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Oct 10, 2017 at 10:22 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Please stop using "many things .. would", "most likely" etc. We have a very focused discussion here. If you know of any particular issue, please demonstrate it with a realistic example. Otherwise, we only increase the number of emails and make things harder to track for everybody.

If decimal switches to use PEP 550, there will be no "many things that *would* change". The only thing that will change is this:

    def g():
        with decimal_context(...):
            yield

    next(g())  # this will no longer leak decimal context to the outer world

I consider the above a bug fix, because nobody in their right mind relies on partial iteration of a generator expecting that some of its internal code would affect your code indirectly. The only such case is contextlib.contextmanager, and PEP 550 provides mechanisms to make generators "leaky" explicitly.

Yury
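Yury's example is runnable today, with decimal.localcontext() standing in
for the hypothetical decimal_context(...); partial iteration really does
leave the generator's context active in the caller, because the with
block's __exit__ has not run yet:

    import decimal

    def g():
        with decimal.localcontext() as ctx:
            ctx.prec = 5
            yield

    gen_obj = g()
    next(gen_obj)                      # suspends inside the with block
    print(decimal.getcontext().prec)   # 5 -- the generator's context leaked out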

On Tue, Oct 10, 2017 at 5:46 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
It is not obvious to me if changing the semantics of this is breakage or a bug fix (as you put it below).
I can't explain everything, especially not in a single email. I will use whatever English words I need. You can also think for yourself––or ask a question.
I'm not going to (and won't be able to) list all those many use cases. I'd like to keep this more focused too. I'm sure you are well aware of those differences. It's not up to me to decide what `decimal` should do. I'll give you some examples below, if that helps.
You forgot `yield from g()`. See also below.
People use generators for all kinds of things. See below.
That's not the only one. Here's another example:

    def context_switcher():
        for c in contexts:
            decimal.setcontext(c)
            yield

    ctx_switcher = context_switcher()

    def next_context():
        next(ctx_switcher)

And one more example:

    def make_things():
        old_ctx = None

        def first_things_first():
            first = compute_first_value()
            yield first

            ctx = figure_out_context(first)
            nonlocal old_ctx
            old_ctx = decimal.getcontext()
            decimal.setcontext(ctx)

            yield get_second_value()

        def the_bulk_of_things():
            return get_bulk()

        def last_but_not_least():
            decimal.setcontext(old_ctx)
            yield "LAST"

        yield from first_things_first()
        yield from the_bulk_of_things()
        yield from last_but_not_least()

    all_things = list(make_things())

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Oct 10, 2017 at 12:21 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
I can't assign meaning to your examples formulated in "many things" and "most likely". I can reason about concrete words and code examples. You're essentially asking us to *trust you* that you know of some examples and that they exist. It's not going to happen.
Then why are you working on a PEP? :) [..]
In 10 years of me professionally writing Python code, I've never seen this pattern in any code base. But even if such a pattern exists, you can simply decorate the "context_switcher" generator to set its __logical_context__ to None. And it will start to leak things. BTW, how does PEP 555 handle your own example? I thought it's not possible to implement "decimal.setcontext" with PEP 555 at all!
I can only say that this one wouldn't pass my code review :) This isn't a real example; it's clearly just a piece of tangled, convoluted code that you invented. Yury

On Tue, Oct 10, 2017 at 5:34 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Apart from the example in PEP 550, is that really a known idiom?
Do we really want that equivalence? It goes against the equivalence from Koos' example.
I dunno, I think them leaking in in the first place is a dubious feature, and I'm not too excited that the design of the way forward should bend over backwards to be compatible here. The only real use case I've seen so far (not counting examples that just show how it works) is Nathaniel's timeout example (see point 9 in Nathaniel’s message <https://mail.python.org/pipermail/python-ideas/2017-August/046736.html>), and I'm still not convinced that that example is important enough to support either. It would all be easier to decide if there were use cases that were less far-fetched, or if the far-fetched use cases would be supportable with a small tweak. As it is, it seems that we could live in a simpler, happier world if we gave up on context values leaking in via next() etc. (I still claim that in that case we wouldn't need chained lookup in the exposed semantics, just fast copying of contexts.)
I think the justification is that we could have a *significantly* simpler semantics and implementation.
(BTW Capture on first iteration sounds just awful.) I think we really need to do more soul-searching before we decide that a much more complex semantics and implementation is worth it to maintain backwards compatibility for leaking in via next(). -- --Guido van Rossum (python.org/~guido)
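For concreteness, a minimal sketch of the "fast copying of contexts" idea
Guido mentions above. All names are invented for illustration, and a plain
dict stands in for whatever mapping would really be used (a persistent
structure such as a HAMT would make the copy cheap):

    from types import MappingProxyType

    _current_context = {}

    def copy_context():
        # Snapshot the current context: one flat mapping, no chained lookup.
        return MappingProxyType(dict(_current_context))

    def run_with_context(snapshot, func, *args):
        # Run func under a previously taken snapshot, then restore.
        global _current_context
        saved = _current_context
        _current_context = dict(snapshot)
        try:
            return func(*args)
        finally:
            _current_context = saved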

On 11 October 2017 at 02:52, Guido van Rossum <guido@python.org> wrote:
As a less-contrived example, consider context managers implemented as generators. We want those to run with the execution context that's active when they're used in a with statement, not the one that's active when they're created (the fact that generator-based context managers can only be used once mitigates the risk of creation time context capture causing problems, but the implications would still be weird enough to be worth avoiding).

For native coroutines, we want them to run with the execution context that's active when they're awaited or when they're prepared for submission to an event loop, not the one that's active when they're created.

For generators-as-coroutines, we want them to be like their native coroutine counterparts: run with the execution context that's active when they're passed to "yield from" or prepared for submission to an event loop.

It's only for generators-as-iterators that the question of what behaviour we want even really arises, as it's less clear cut whether we'd be better off overall if they behaved more like an eagerly populated container (and hence always ran with the execution context that's active when they're created), or more like the way they do now (where retrieval of the next value from a generator is treated like any other method call).

That combination of use cases across context managers, native coroutines, top level event loop tasks, and generator-based coroutines means we already need to support both execution models regardless, so the choice of default behaviour for generator-iterators won't make much difference to the overall complexity of the PEP. However, having generator-iterators default to *not* capturing their creation context makes them more consistent with the other lazy evaluation constructs, and also makes the default ContextVar semantics more consistent with thread local storage semantics.

The flipside of that argument would be:

* the choice doesn't matter if there aren't any context changes between creation & use
* having generators capture their context by default may ease future migrations from eager container creation to generators in cases that involve context-dependent calculations
* decorators clearing the implicitly captured context from the generator-iterator when appropriate is simpler than writing a custom iterator wrapper to handle the capturing

I just don't find that counterargument compelling when we have specific use cases that definitely benefit from the proposed default behaviour (contextlib.contextmanager, asyncio.coroutine), but no concrete use cases for the proposed alternative that couldn't be addressed by a combination of map(), functools.partial(), and contextvars.run_in_execution_context().

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
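A runnable sketch of the generator-based context manager case Nick leads
with: the generator body below must see and modify the state active at the
with statement, not state captured when the generator object was created.
The module-level _setting is an illustrative stand-in for a context variable:

    from contextlib import contextmanager

    _setting = 'default'    # stand-in for some context-dependent state

    @contextmanager
    def temporary_setting(value):
        global _setting
        old = _setting
        _setting = value    # must affect the code inside the with block
        try:
            yield
        finally:
            _setting = old  # restore on the way out

    cm = temporary_setting('debug')  # creation time: nothing is captured here
    with cm:                         # use time: this is the context that matters
        print(_setting)              # prints 'debug'
    print(_setting)                  # prints 'default' again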

I'm out of energy to debate every point (Steve said it well -- that decimal/generator example is too contrived), but I found one nit in Nick's email that I wish to correct. On Wed, Oct 11, 2017 at 1:28 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Here I think we're in agreement about the desired semantics, but IMO all this requires is some special casing for @contextlib.contextmanager. To me this is the exception, not the rule -- in most *other* places I would want the yield to switch away from the caller's context.
This caught my eye as wrong. Considering that asyncio's tasks (as well as curio's and trio's) *are* native coroutines, we want complete isolation between the context active when `await` is called and the context active inside the `async def` function. -- --Guido van Rossum (python.org/~guido)

On 13 October 2017 at 10:56, Guido van Rossum <guido@python.org> wrote:
The rationale for this behaviour *does* arise from a refactoring argument:

    async def original_async_function():
        with some_context():
            do_some_setup()
            raw_data = await some_operation()
            data = do_some_postprocessing(raw_data)

Refactored:

    async def async_helper_function():
        do_some_setup()
        raw_data = await some_operation()
        return do_some_postprocessing(raw_data)

    async def refactored_async_function():
        with some_context():
            data = await async_helper_function()

However, considering that coroutines are almost always instantiated at the point where they're awaited, I do concede that creation time context capture would likely also work out OK for the coroutine case, which would leave contextlib.contextmanager as the only special case (and it would turn off both creation-time context capture *and* context isolation).

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: [..]
"almost always" is an incorrect assumption. "usually" would be the correct one.
I still believe that both versions of PEP 550 (v1 & latest) got this right: * Coroutines on their own don't capture context; * Tasks manage context for coroutines they wrap. Yury

On Fri, Oct 13, 2017 at 3:25 AM, Nick Coghlan <ncoghlan@gmail.com> wrote: [..]
Actually, capturing context at the moment of coroutine creation (in PEP 550 v1 semantics) will not work at all. Async context managers will break.

    class AC:
        async def __aenter__(self):
            pass

^ If the context is captured when coroutines are instantiated, __aenter__ won't be able to set context variables and thus affect the code it wraps. That's why coroutines shouldn't capture context when created, nor should they isolate context. It's the job of the async Task. Yury

On 13Oct2017 0941, Yury Selivanov wrote:
Then make __aenter__/__aexit__ when called by "async with" an exception to the normal semantics? It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions. Cheers, Steve

On Fri, Oct 13, 2017 at 8:45 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
The semantics is not really dependent on __aenter__ and __aexit__. They can be used together with both semantic variants that I'm describing for PEP 555, and without any special casing. IOW, this is independent of any remaining concerns in PEP 555. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Fri, Oct 13, 2017 at 1:45 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
It's not possible to special case __aenter__ and __aexit__ reliably (supporting wrappers, decorators, and possible side effects).
+1. I think that would make it much more usable by those of us who are not experts.
I still don't understand what Steve means by "more usable", to be honest. Yury

On 13 October 2017 at 19:32, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I'd consider myself a "non-expert" in async. Essentially, I ignore it - I don't write the sort of applications that would benefit significantly from it, and I don't see any way to just do "a little bit" of async, so I never use it. But I *do* see value in the context variable proposals here - if only in terms of them being a way to write my code to respond to external settings in an async-friendly way. I don't follow the underlying justification (which is based in "we need this to let things work with async/coroutines") at all, but I'm completely OK with the basic idea (if I want to have a setting that behaves "naturally", like I'd expect decimal contexts to do, it needs a certain amount of language support, so the proposal is to add that).

I'd expect to be able to write context variables that my code could respond to using a relatively simple pattern, and have things "just work". Much like I can write a context manager using @contextmanager and yield, and not need to understand all the intricacies of __enter__ and __exit__. (BTW, apologies if I'm mangling the terminology here - write it off as part of me being "not an expert" :-))

What I'm getting from this discussion is that even if I *do* have a simple way of writing context variables, they'll still behave in ways that seem mildly weird to me (as a non-async user). Specifically, my head hurts when I try to understand what that decimal context example "should do". My instincts say that the current behaviour is wrong - but I'm not sure I can explain why. So on that example, I'd ask the following of any proposal:

1. Users trying to write a context variable[1] shouldn't have to jump through hoops to get "natural" behaviour. That means that suggestions that the complexity be pushed onto decimal.context aren't OK unless it's also accepted that the current behaviour is wrong, and the only reason decimal.context needs to be replicated is for backward compatibility (and new code can ignore the problem).

2. The proposal should clearly establish what it views as "natural" behaviour, and why. I'm not happy with "it's how decimal.context has always behaved" as an explanation. Sure, people asking to break backward compatibility should have a good justification, but equally, people arguing to *preserve* an unintuitive current behaviour in new code should be prepared to explain why it's not a bug. To put it another way, context variables aren't required to be bug-compatible with thread local storage.

[1] I'm assuming here that "settings that affect how a library behaves" is a common requirement, and the PEP is intended as the "one obvious way" to implement them.

Nick's other async refactoring example is different. If the two forms he showed don't behave identically in all contexts, then I'd consider that to be a major problem. Saying that "coroutines are special" just reads to me as "coroutines/async are sufficiently weird that I can't expect my normal patterns of reasoning to work with them". (Apologies if I'm conflating coroutines and async incorrectly - as a non-expert, they are essentially indistinguishable to me.) I sincerely hope that isn't the message I should be getting - async is already more inaccessible than I'd like for the average user.

The fact that Nick's async example immediately devolved into a discussion that I can't follow at all is fine - to an extent. I don't mind the experts debating implementation details that I don't need to know about. But if you make writing context variables harder, just to fix Nick's example, or if you make *using* async code like (either of) Nick's forms harder, then I do object, because that's affecting the end user experience.

In that context, I take Steve's comment as meaning "fiddling about with how __aenter__ and __aexit__ work is fine, as that's internals that non-experts like me don't care about - but making context variables behave oddly because of this is *not* fine".

Apologies if the above is unhelpful. I've been lurking but not commenting here, precisely because I *am* a non-expert, and I trust the experts to build something that works. But when non-experts were explicitly mentioned, I thought my input might be useful. The following quote from the Zen seems particularly relevant here:

    If the implementation is hard to explain, it's a bad idea.

(although the one about needing to be Dutch to understand why something is obvious might well trump it ;-))

Paul

On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore@gmail.com> wrote: [..]
Nick's idea that coroutines can isolate context was actually explored before in PEP 550 v3, and then, rather quickly, it became apparent that it wouldn't work. Steve's comments were about a specific example about generators, not coroutines. We can't special-case __aenter__, we simply cannot. __aenter__ can be a chain of coroutines -- its own separate call stack -- and we can't say that this whole call stack behaves differently from all other code with respect to execution context. At this point, we have so many conflicting examples and tangled discussions on these topics that I myself have lost track of what everybody is implying by "this semantics isn't obvious to *me*". Which semantics? It's hard to tell. Right now, there's just one place which describes one well-defined semantics: the latest version of PEP 550. Paul, if you have time/interest, please take a look at it, and say what's confusing there. Yury

On 13 October 2017 at 23:30, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi Yury,

The following is my impressions from a read-through of the initial part of the PEP. tl;dr - you say "concurrent" too much and it makes my head hurt :-)

1. The abstract feels like it's talking about async. The phrase "consistent access to non-local state in the context of out-of-order execution, such as in Python generators and coroutines" said async to me, even though it mentioned generators. Probably because any time I see generators mentioned alongside coroutines (a term I really don't grasp yet in the context of Python) I immediately assume the reference is to the weird extensions of generators when send() and yield expressions are used. It quite genuinely took me two or three attempts to get past the abstract and actually read the next section, because the "this is async" idea came across so strongly.

2. The rationale says that "Prior to the advent of asynchronous programming in Python" threads and TLS were used - and it implies this was fine. But the section goes on to say "TLS does not work well for programs which execute concurrently in a single thread". But it uses a *generator* as the example. I'm sorry, but to me a generator is pure and simple standard Python, and definitely not "executing concurrently in a single thread" (see below). So again, the clash between what the description said and the actual example left me confused (and confused enough to equate all of this in my mind with "all that async stuff I don't follow").

3. "This is because implicit Decimal context is stored as a thread-local, so concurrent iteration of the fractions() generator would corrupt the state." This makes no sense to me. The example isn't concurrent. There's only one thread, and no async. So no concurrency. It's interleaved iteration through two generators, which I understand is *technically* considered concurrency in the async sense, but doesn't *feel* like concurrency (a runnable version of this corruption is sketched after this message). At its core, this is the problem I'm hitting throughout the whole document - the conceptual clash between examples that don't feel concurrent, and discussions that talk almost totally in terms of concurrency, means that understanding every section is a significant mental effort.

4. By the end of the rationale, what I'd got from the document was: "There's a bug in decimal.context, caused by the fact that it uses TLS. It's basically a limitation of TLS. To fix it they need a new mechanism, which this PEP provides." So unless I'm using (or would expect to use) TLS in my own code, this doesn't affect me. Which really isn't the point (if I now understand correctly) - the PEP is actually providing a safe (and hopefully easy to use/understand!) mechanism for handling a specific class of programming problem: maintaining dynamic state that follows the execution order of the code, rather than the lexical structure. (I didn't state that well - but I hope I got the idea across.) Basically, the problem that Lisp dynamic variables are designed to solve (although I don't think that describing the feature in terms of Lisp is a good idea either).

4a. I'd much prefer this part of the PEP to be structured as follows:

* There's a class of programming problems that need to allow code to access "state" in a way that follows the runtime path the code takes. Prior art in this area includes Lisp's dynamic scope, ... (more examples would be good - IIRC, Perl has this type of variable too).

* Normal local variables can't do this as they are lexically scoped. Global variables can be used, but they don't work in the presence of threads.
* TLS works for threads, but hits problems when code execution paths aren't nested subroutine-style. Examples where this happens are generators (which suspend execution and yield back to their parent), and async (which simulates multiple threads by interleaving execution of generators). [Note - I know this explanation is probably inaccurate]

* This PEP proposes a general mechanism that will allow programmers to simply write code that manages state like this, which will work in all of the above cases.

That's it. Barely any mention of async, no need to focus on the Decimal bug except as a motivating example of why TLS isn't sufficient, and so no risk that people think "why not just fix decimal.context" - so no need to go into detail as to why you can't "just fix it". And it frames the PEP as providing a new piece of functionality that *anyone* might find a use for, rather than as a fix for a corner case of async/TLS interaction.

5. The "Goals" section says "provide a more reliable threading.local() alternative", which is fine. But the bullet points do exactly the same as before, using terms that I associate with async to describe the benefits, and so they aren't compelling to me. I'd say something like:

* Is a reliable replacement for TLS that doesn't have the issue that was described in the rationale
* Is closely modeled on the TLS API, to minimise the impact of switching on code that currently uses TLS
* Performance yada yada yada - I don't think this is important, there's been no mention yet that any of this is performance critical

(but see 4a above, this could probably be improved further if the rationale were structured the way I propose there).

6. The high level spec, under generators, says:

"""
Unlike regular function calls, generators can cooperatively yield their control of execution to the caller. Furthermore, a generator does not control where the execution would continue after it yields. It may be resumed from an arbitrary code location.
"""

That's not how I understand generators. To me, a generator can *suspend its execution* to be resumed later. On suspension, control *always* returns to the caller. Generators can be resumed from anywhere, although the most common pattern is to resume them repeatedly in a loop. To me, this implies that context variables should follow that execution path. If the caller sets a value, the generator sees it. If the generator sets a value then yields, the caller will see that. If code changes the value between two resumptions of the generator, the generator will see those changes.

The PEP at this point, though, states the behaviour of context variables in a way that I simply don't follow - it's using the idea of an "outer context", which as far as I can see has never been defined at this point (and doesn't have any obvious meaning in terms of the execution flow, which is not nested in any obvious sense - that's the *point* of generators, to not have a purely nested execution path).

The problem with the decimal context isn't about any of that - it's about how "yield" interacts with "with", and specifically that yielding out of the with *doesn't* run the exit part of the context manager, as the code inside the with statement hasn't finished running yet. Having stated the problem like that, I'm wondering why the solution isn't to add some sort of "suspend/resume" mechanism to the context manager protocol, rather than introducing context variables? That may be worth adding to the "Rejected ideas" section if it's not a viable solution.
The next section of the high level spec is coroutines and async, which I'll skip, as I firmly believe that, since I don't use them, if there's anything of relevance to me in that section, it should be moved to somewhere that isn't about async.

I'm not going to comment on anything further. At this point, I'm far too overwhelmed with concepts and ideas that are at odds with my understanding of the problem to really take in detail-level information. I'd assume that the detail is about how the overall functionality as described is implemented, but as I don't really have a working mental model of the high-level functionality, I doubt I'd get much from the detail.

I hope this is of some use. I appreciate I'm talking about a pretty wholesale rewrite, and it's quite far down the line to be suggesting such a thing. I'll understand if you don't feel it's worthwhile to take that route.

Paul
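The corruption Paul asks about in point 3 can be made concrete. This is
essentially the fractions() example from the PEP 550 rationale, runnable on
current Python: two generators interleaved in a single thread, each trying
to use its own precision via the thread-local decimal context:

    import decimal

    def fractions(precision, x, y):
        with decimal.localcontext() as ctx:
            ctx.prec = precision
            yield decimal.Decimal(x) / decimal.Decimal(y)
            yield decimal.Decimal(x) / decimal.Decimal(y ** 2)

    g1 = fractions(precision=2, x=1, y=3)   # wants 2 significant digits
    g2 = fractions(precision=6, x=2, y=3)   # wants 6 significant digits

    print(next(g1))   # 0.33     -- correct
    print(next(g2))   # 0.666667 -- correct
    print(next(g1))   # 0.111111 -- wrong: computed under g2's prec=6 context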

Once again, I think Paul Moore gets to the heart of the issue. Generators are simply confusing & async even more so. Per my earlier email, the fact that generators look like functions, but are not functions, is at the root of the confusion. This next part really gets to the heart of the matter: On Sun, Oct 15, 2017 at 8:15 AM, Paul Moore <p.f.moore@gmail.com> wrote:
This is the totally natural way to think of generators -- and exactly how I thought about them when I started -- and how I suspect 99% of beginners think of them. And it's exactly what you expect, since generators appear to be functions (since they use 'def' to create them).

Now, as I understand it, that's not what really happens; in fact, they can have their own context, which the major discussion here is all about:

1. Do we "bind" the context at the time you create the generator (which appears to call the generator, but really doesn't)?; OR

2. Do we "bind" the context at the time the first .__next__ method is called?

And, as far as I can figure out, people are strongly arguing for #2 so it doesn't break backwards compatibility - and then discussing wrappers to request #1 instead of #2.

My answer is: you can't do #1 or #2 -- you need to do #3, as the default -- what Paul wrote above -- anything else is "unnatural" and "non-intuitive". (The three options are sketched in code after this message.)

Now, I fully understand we *actually* want the unnatural behavior of #1 & #2 in real code (and #3 is not really that useful in real code). However, #3 is the natural understanding of what it does ... so that is what I am arguing needs to be implemented (as the default). Then when we want either #1 or #2, when we are writing real code, there should be special syntax to indicate that is what we want (and yes, we'll have to use the special syntax 95%+ of the time, since we really want #1 or #2 95%+ of the time, and don't want #3). But that is the whole point: we *should* use special syntax to indicate we are doing something that is non-intuitive. This special syntax helps beginners understand better & will help them think about the concepts more clearly (see my previous post on adding first class language support for defining generators, so it's a lot clearer what is going on with them).

My argument is that a major strength of Python is how the syntax helps teach you concepts & how easy the language is to pick up. I've talked to so many people who have said the same thing about the language when they started. Generators (and this whole discussion of context variables) are not properly following that paradigm; and I believe they should. It would make Python even stronger as a programming language of choice, one that not only is easy to use, but easy to learn from as you start programming.
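A sketch of the three binding options Amit describes, emulated with a plain
module-level variable standing in for a context variable (no proposed API
is assumed here):

    current = "A"

    def emit():
        while True:
            yield current   # option #3: always read the caller's live value

    g = emit()         # option #1 would snapshot "A" here (creation time)
    current = "B"
    print(next(g))     # option #2 would snapshot "B" here (first __next__);
                       # plain Python today also prints "B"
    current = "C"
    print(next(g))     # plain Python behaves like option #3 and prints "C";
                       # option #2 would have kept printing "B"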

On 15 October 2017 at 13:51, Amit Green <amit.mixie@gmail.com> wrote:
I don't agree. I don't find generators *at all* confusing. They are a very natural way of expressing things, as has been proven by how popular they are in the Python community. I don't *personally* understand async, but I'm currently willing to reserve judgement until I've been in a situation where it would be useful, and therefore needed to learn it. Paul

Generators are a wonderful feature of the Python language, and one of its best ideas. They are initially very intuitive to understand & easy to use. However, once you get beyond that, they are actually quite confusing, because their behavior is not natural. Thus they have an initial easy learning & acceptance curve, and then as you go from initial use to more advanced use, there is a sudden "bump" in the learning curve, which is not as smooth as it could be. Specifically, the fact that when you call them, they do not actually call your code (but instead call a wrapper) is quite confusing. Example:

    import __main__

    def read_and_prefix_each_line(path):
        with open(path) as f:
            data = f.read()

        for s in data.splitlines():
            yield '!' + s

    def print_prefixed_file(path):
        reader = read_and_prefix_each_line(path)  # LINE 12

        print('Here is how %r looks prefixed' % path)

        for s in reader:  # LINE 16
            print(s)

    print_prefixed_file(__main__.__file__)
    print_prefixed_file('nonexistent')

Will produce the following:

    Traceback (most recent call last):
      File "x.py", line 20, in <module>
        print_prefixed_file('nonexistent')
      File "x.py", line 16, in print_prefixed_file
        for s in reader:
      File "x.py", line 5, in read_and_prefix_each_line
        with open(path) as f:
    IOError: [Errno 2] No such file or directory: 'nonexistent'

This is quite confusing to a person who has been using generators for a month, and thinks they understand them. WHY is the traceback happening at line 16 instead of at line 12, when the function is called? It is much more intuitive, and natural [to a beginner], to expect the failure to open the file "nonexistent" to happen at line 12, not line 16.

So, now, the user, while trying to figure out a bug, has to learn that:

- NO, calling a generator (which looks like a function) does not actually call the body of the function (as the user defined it)
- Instead it calls some generator wrapper.
- And finally, the first time the __next__ method of the wrapper is called, the body of the function (as the user defined it) gets called.

And this learning curve is quite steep. It is made even harder by the following:

    >>> def f(): yield 1
    ...
    >>> f
    <function f at 0x7f748507c5f0>

So the user is now convinced that 'f' really is a function. Further investigation makes it even more confusing:

    >>> f()
    <generator object f at 0x7f74850716e0>

At this point, the user starts to suspect that something is kind of unusual about the 'yield' keyword. Eventually, after a while, the user starts to figure this out:
And eventually, after reading https://docs.python.org/3/reference/datamodel.html, the following sentence:

"The following flag bits are defined for co_flags: bit 0x04 is set if the function uses the *arguments syntax to accept an arbitrary number of positional arguments; bit 0x08 is set if the function uses the **keywords syntax to accept arbitrary keyword arguments; bit 0x20 is set if the function is a generator."

Finally figures it out:

    >>> def f(): yield 1
    ...
    >>> f
    <function f at 0x7f73f38089d8>
    >>> f.__code__.co_flags & 0x20
    32

My point is ... this learning process is overly confusing to a new user & not a smooth learning curve. Here is just a quick initial proposal on how to fix it:

    >>> def f(): yield 1
    ...
    Syntax Error: the 'yield' keyword can only be used in a generator;
    please be sure to use @generator before the definition of the generator

    >>> @generator
    ... def f(): yield 1
    ...
    >>> f
    <generator f at 0x7f73f38089d8>

Just the fact that it says '<generator f at ...>' instead of '<function f at ...>' would be a *BIG* help to starting users. This would also really drive home, from the start, the idea to the user that:

- A generator is special, and is not a function
- Calling the generator does not call the generator, but its wrapper (the generator is not called until its __next__ method is called)

Next:

1. Don't make this the default; but require:

       from __future__ import generator

   to activate this feature (for the next few releases of Python).

2. Regarding contexts, *REQUIRE* an argument to generator that tells it how to have the generator interact with contexts. I.e., something like:

       @generator(capture_context_at_start=True)
       def f(): yield 1

   With three options: (a) capture context at start; (b) capture context on first call to __next__; (c) don't capture context at all, but have it work the natural way you expressed two emails ago (i.e., each time use the context of the caller, not a special context for the generator).

Finally, if a user attempts to use a generator with contexts but without one of these three parameters, throw a syntax error & tell the user the context usage must be specified.

The major point of all this is to make the learning curve easier for users, so generators are:

- Intuitive, easy to pick up & quick to use (as they currently are)
- Intuitive, easy to pick up & quick to use, as you go from a beginner to a medium level user (making it easier to learn their specific ins & outs)
- Intuitive, easy to pick up & quick to use, as you go from medium user to advanced user (and need to make them interact in different ways with contexts, etc).

On Sun, Oct 15, 2017 at 9:33 AM, Paul Moore <p.f.moore@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:15 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Hi Paul, Thanks *a lot* for this detailed analysis. Even though PEP 550 isn't going to make it to 3.7 and I'm not going to edit/rewrite it anymore, I'll try to incorporate some of your feedback into the new PEP. Thanks, Yury

I really like what Paul Moore wrote here, as it matches a *LOT* of what I have been feeling as I have been reading this whole discussion; specifically:

- I find the example, and discussion, really hard to follow.
- I also don't understand async, but I do understand generators very well (like Paul Moore).
- A lot of this doesn't seem natural (generators & context variable syntax).
- And in particular: "If the implementation is hard to explain, it's a bad idea."

I've spent a lot of time thinking about this, and what the issues are. I think they are multi-fold:

- I really use generators a lot -- and find them wonderful & one of the joys of Python. They are super useful. However, as I am going to, hopefully, demonstrate here, they are not initially intuitive (to a beginner).

- Generators are not really functions, but they appear to be functions. This was very confusing to me when I started working with generators. Now, I'm used to it -- BUT, we really need to consider new people -- and I suggest making this easier.

- I find the proposed context syntax very confusing (and slow). I think contexts are super-important & instead need to be better integrated into the language (like nonlocal is).

- People keep writing that they want a real example -- so this is a very real example from real code I am writing (a Python parser): how I use contexts (obviously they are not part of the language yet, so I have emulated them) & how they interact with generators.

The full example, which took me a few hours to write, is available here (it's a very very reduced example from a real parser of the Python language written in Python):

- https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py

Here is the result of running the code -- which reads & executes demo1.py (importing & executing demo2.py twice). (Note: by "executing", I mean the code is running its own parser to execute it & its own code to emulate an 'import' -- thus showing nested contexts.)

It creates two input files for testing -- demo1.py:

    print 1
    print 8 - 2 * 3
    import demo2
    print 9 - sqrt(16)
    print 10 / (8 - 2 * 3)
    import demo2
    print 2 * 2 * 2 + 3 - 4

And it also creates demo2.py:

    print 3 * (2 - 1)
    error
    print 4

There are two syntax errors (on purpose) in the files, but since demo2.py is imported twice, this will show three syntax errors. Running the code produces the following:

    demo1.py#1: expression '1' evaluates to 1
    demo1.py#2: expression '8 - 2 * 3' evaluates to 2
    demo1.py#3: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
    demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
    demo1.py#6: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7

This code demonstrates all of the following:

- Nested contexts
- Using contexts 'naturally' -- i.e., directly as variables, without a 'context.' prefix -- which I would otherwise find harder to read & also slower
- Using a generator that is deliberately broken up into three parts: start, next & stop
- Handling errors & how it interacts with both the generator & 'context'
- Actually parsing the input -- which creates a deeply nested stack (due to recursive calls during expression parsing) -- thus a perfect example for contexts
So given all of the above, I'd first like to focus on the generator:

- Currently we can write generators as either: (1) functions; or (2) classes with a __next__ method. However, this is very confusing to a beginner.

- Given a generator like the following (actually in the code):

      def __iter__(self):
          while not self.finished:
              self.loop += 1
              yield self

- What I found so surprising when I started working with generators is that calling the generator does *NOT* actually start the function. Instead, the actual code does not get called until the first __next__ method is called. This is quite counter-intuitive.

I therefore suggest the following:

- Give generators their own first class language syntax.
- This syntax would have good entry points to allow their interaction with context variables.

Here is the generator in my code sample:

    #
    # Here is our generator to walk over a file.
    #
    # This generator has three sections:
    #
    # generator_start - Always run when the generator is started.
    #                   This opens the file & reads it.
    #
    # generator_next  - Run each time the generator needs to retrieve
    #                   the next value.
    #
    # generator_stop  - Called when the generator is going to stop.
    #
    def iterate_lines(path):
        data_lines = None

        def generator_startup(path):
            nonlocal current_path, data_lines

            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        def generator_next():
            nonlocal current_line, line_number

            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

            generator_stop()

        def generator_stop():
            current_path = None
            line_number = 0
            line_position = 0

        generator_startup(path)

        return generator_next()

This generator demonstrates the following:

- It immediately starts up when called (and in fact opens the file when called -- so if the file doesn't exist, an exception is thrown then, not later when the __next__ method is first called).
- It's half way between a function generator & a class generator; thus (1) efficient; and (2) more understandable than a class generator.

Here is a (first draft) proposal of how I would like to re-write the above generator, so it would have its own first class syntax:

    generator iterate_lines(path):
        local data_lines = None

        context current_path, current_line, line_number, line_position

        start:
            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        next:
            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

        stop:
            current_path = None
            line_number = 0
            line_position = 0

This:

1. Adds a keyword 'generator', so it's obvious this is a generator, not a function.
2. Declares its variables (data_lines).
3. Declares which context variables it wants to use (current_path, current_line, line_number & line_position).
4. Has a start section that immediately gets executed.
5. Has a next section that executes on each call to __next__ (and this is where the yield keyword must appear).
6. Has a stop section that executes when the generator receives a StopIteration.
7. The compiler could generate equally efficient code for generators as it does for current generators, while making the syntax clearer to the user.
8. The syntax is chosen so the user can edit it & convert it to a class generator.

Given the above:

- I could now add special code to either the 'start' or 'next' section, saying which context I wanted to use (once we have that syntax implemented).
The reason for its own syntax is to allow us to think more clearly about the different parts of a generator & then make it easier for the user to choose which part of the generator interacts with contexts & which context. In particular, the user could interact with multiple contexts (one in the start section & a different one in the next section).

Also, for other generators I think the syntax needs to be extended to something like:

    next(context):
        use context:
            ....

Allowing two new features -- requesting that the __next__ receive the context of the caller & secondly being able to use that context itself.

Next, moving on to contexts:

- I love how nonlocal works & how you can access variables declared in your surrounding function.
- I really think that contexts should work the same way: you would simply declare 'context' (like nonlocal) & just be able to use the variables directly. Way easier to understand & use.

The sample code I have actually emulates contexts using nonlocal, so as to demonstrate the idea I am explaining.

Thanks,

Amit

P.S.: As I'm very new to python-ideas, I'm not sure if I should start a separate thread to discuss this or use the current thread. Also, I'm not sure if I should attach the sample code here or not ... so I just provided the link above.

On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore@gmail.com> wrote:

On 13Oct2017 1132, Yury Selivanov wrote:
Why not? Can you not add a decorator that sets a flag on the code object that means "do not create a new context when called", and then it doesn't matter where the call comes from - these functions will always read and write to the caller's context. That seems generally useful anyway, and then you just say that __aenter__ and __aexit__ are special and always have that flag set.
I don't know that I said "more usable", but it would certainly be easier to explain. The Zen has something to say about that... Cheers, Steve
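For concreteness, a sketch of the decorator Steve suggests, emulated at the
Python level with an invented attribute name; in a real implementation this
would presumably be a flag on the code object that the interpreter honors
when setting up the frame's context:

    def runs_in_callers_context(func):
        # Invented marker: "do not create a new context when called".
        func.__no_new_context__ = True
        return func

    class Transaction:
        @runs_in_callers_context
        async def __aenter__(self):
            # If the interpreter honored the flag, context writes made here
            # would land in the caller's context instead of a fresh one.
            return self

        @runs_in_callers_context
        async def __aexit__(self, *exc_details):
            return None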

On 14 October 2017 at 08:44, Steve Dower <steve.dower@python.org> wrote:
One example where giving function names implicit semantic significance becomes problematic:

    async def start_transaction(self):
        ...

    async def end_transaction(self, *exc_details):
        ...

    __aenter__ = start_transaction
    __aexit__ = end_transaction

There *are* ways around that (e.g. type.__new__ implicitly wraps __init_subclass__ with classmethod since it makes no sense as a regular instance method), but then you still run into problems like this:

    async def __aenter__(self):
        return await self.start_transaction()

    async def __aexit__(self, *exc_details):
        return await self.end_transaction(*exc_details)

If coroutines were isolated from their parents by default, then the above method implementations would be broken, even though the exact same invocation pattern works fine for synchronous function calls.

To try and bring this back to synchronous examples that folks may find more intuitive, I figure it's worth framing the question this way: do we want people to reason about context variables like the active context is implicitly linked to the synchronous call stack, or do we want to encourage them to learn to reason about them more like they're a new kind of closure?

The reason I ask that is because there are three "interesting" times in the life of a coroutine or generator:

- definition time (when the def statement runs - this determines the lexical closure)
- instance creation time (when the generator-iterator or coroutine is instantiated)
- execution time (when the frame actually starts running - this determines the runtime call stack)

For synchronous functions, instance creation time and execution time are intrinsically linked, since the execution frame is allocated and executed directly as part of calling the function.

For asynchronous operations, there's more of a question, since actual execution is deferred until you call await or next() - the original synchronous call to the factory function instantiates an object, it doesn't actually *do* anything.

The current position of PEP 550 (which I agree with) is that context variables should default to being closely associated with the active call stack (regardless of whether those calls are regular synchronous ones, or asynchronous ones with await), as this keeps the synchronous and asynchronous semantics of context variables as close to each other as we can feasibly make them.

When implicit isolation takes place, it's either to keep concurrently active logical call stacks isolated from each other (the event loop case), or else to keep context changes from implicitly leaking *up* a stack (the generator case), not to keep context changes from propagating *down* a call stack.

When we do want to prevent downward propagation for some reason, then that's what "run_in_execution_context" is for: deliberate creation of a new concurrently active call stack (similar to running something in another thread to isolate the synchronous call stack).

Don't get me wrong, I'm not opposed to the idea of making it trivial to define "micro tasks" (iterables that perform a context switch to a specified execution context every time they retrieve a new value) that can provide easy execution context isolation without an event loop to manage it, I just think that would be more appropriate as a wrapper API that can be placed around any iterable, rather than being baked in as an intrinsic property of generators.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 14 October 2017 at 08:09, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm really struggling to keep up here. I need to go and fully read the PEP as Yury suggested, and focus on what's in there. But I'll try to answer this comment. I will ask one question, though, based on Yury's point "the PEP is where you should look for the actual semantics" - can you state where in the PEP is affected by the answer to this question? I want to make sure that when I read the PEP, I don't miss the place that this whole discussion thread is about...

I don't think of contexts in terms of *either* the "synchronous call stack" (which, by the way, is much too technical a term to make sense to the "non-expert" people around here like me - I know what the term means, but only in a way that's far too low-level to give me an intuitive sense of what contexts are) or closures. At the risk of using another analogy that's unfamiliar to a lot of people, I think of them in terms of Lisp's dynamic variables. Code that needs a context variable gets the value that's current *at that time*. I don't want to have to think lower level than that - if I have to, then in my view there's a problem with a *different* abstraction (specifically async ;-))

To give an example:

    async def get_webpage(id):
        url = f"https://{server}/{app}/items?id={id}"  # 1
        encoding, content = await url_get(url)         # 2
        return content.decode(encoding)

I would expect that, if I set a context variable at #1, and read it at #2, then:

1. code run as part of url_get would see the value set at 1
2. code run as part of url_get could set the value, and I'd see the new value at 2

It doesn't matter what form the lines in the function take (loops, with statements, conditionals, ...) as long as they are run immediately (class and function definitions should be ignored - there's no lexical capture of context variables). That probably means "synchronous call stack" in your terms, but please don't assume that any implications of that term which aren't covered by the above example are obvious to me.

To use the decimal context example:
There's only one setting of a context here, so it's obvious - values returned from gen have precision 30.
"for i in g" is getting values from the generator, at a time when the precision is 30, so those values should have precision 30. There's no confusion here to me. If that's not what decimal currently does, I'd happily report that as a bug. The refactoring case is similarly obvious to me:
All we've done here is take some code out of the with block and write it as a helper. There should be no change of semantics when doing so. That's a fundamental principle to me, and honestly I don't see it as credible for anyone to say otherwise. (Anyone who suggests that is basically saying "if you use async, common sense goes out of the window" as far as I'm concerned).
OK. They aren't *really* interesting to me (they are a low-level detail, but they should work to support intuitive semantics, not to define what my intuition should be) but I'd say that my expectation is that the *execution time* value of the context variable is what I'd expect to get and set.
This isn't particularly a question for me: g = gen() creates an object. next(g) - or more likely "for o in g" - runs it, and that's when the context matters. I struggle to understand why anyone would think otherwise.
At the high level we're talking here, I agree with this.
I don't understand this. If it matters, in terms of explaining corner cases of the semantics, then it needs to be explained in more intuitive terms. If it's an implementation detail of *how* the PEP ensures it acts intuitively, then I'm fine with not needing to care.
I read that as "run_in_execution_context is a specialised thing that you'll never need to use, because you don't understand its purpose - so just hope that in your code, everything will just work as you expect without it". The obvious omission here is an explanation of precisely who my interpretation *doesn't* apply for. Who are the audience for run_in_execution_context? If it's "people who write context managers that use context variables" then I'd say that's a problem, because I'd hope a lot of people would find use for this, and I wouldn't want them to have to understand the internals to this level. If it's something like "people who write async context managers using raw __aenter__ and __aexit__ functions, as opposed to the async version of @contextmanager", then that's probably fine.
I don't think it matters whether it's trivial to write "micro tasks" if non-experts don't know what they are ;-) I *do* think it matters if "micro tasks" are something non-experts might need to write, but not realise they are straying into deep waters. But I've no way of knowing how likely that is.

One final point: this is all pretty deeply intertwined with the comprehensibility of async as a whole. At the moment, as I said before, async is a specialised area that's largely only used in projects that centre around it. In the same way that Twisted is its own realm - people write network applications without Twisted, or they write them using Twisted. Nobody uses Twisted in the middle of some normal non-async application like pip to handle grabbing a webpage.

I'm uncertain whether the intent is for the core async features to follow this model, or whether we'd expect in the longer term for "utility adoption" of async to happen (tactical use of async for something like web crawling or collecting subprocess output in a largely non-async app). If that *does* happen, then async needs to be much more widely understandable - maintenance programmers who have never used async will start encountering it in corners of their non-async applications, or find it used under the hood in libraries that they use. This discussion is a good example of the implications of that - async quirks leaking out into the "normal" world (decimal contexts) and as a result the async experts needing to be able to communicate their concerns and issues to non-experts.

Hopefully some of this helps,
Paul
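Paul's two numbered expectations in the get_webpage example can be sketched concretely. This assumes the get()/set() style ContextVar API discussed in this thread, with asyncio used purely as a driver; the asserted behaviour is what PEP 550 currently proposes, not anything that exists today:

    import asyncio
    import contextvars

    cvar = contextvars.ContextVar('cvar')

    async def url_get(url):
        # expectation 1: code run as part of url_get sees the value set at #1
        assert cvar.get() == 'set at 1'
        cvar.set('set by url_get')          # ...and it may set a new value
        return 'utf-8', b'page content'

    async def get_webpage(id):
        cvar.set('set at 1')                       # 1
        encoding, content = await url_get('...')   # 2
        # expectation 2: the caller sees the value set inside url_get
        assert cvar.get() == 'set by url_get'
        return content.decode(encoding)

    asyncio.run(get_webpage(42))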

On 14 October 2017 at 21:56, Paul Moore <p.f.moore@gmail.com> wrote: TL;DR of below: PEP 550 currently gives you what you're after, so your perspective counts as a preference for "please don't eagerly capture the creation time context in generators or coroutines". To give an example:
This is consistent with what PEP 550 currently proposes, because you're creating the coroutine and calling it in the same expression: "await url_get(url)". That's the same as what happens for synchronous function calls, which is why we think it's also the right way for coroutines to behave.

The slightly more open-to-challenge case is this one:

    # Point 1 (pre-create)
    cr = url_get(url)
    # Point 2 (post-create, pre-call)
    encoding, content = await cr
    # Point 3 (post-call)

PEP 550 currently says that it doesn't matter whether you change the context at point 1 or point 2, as "url_get" will see the context as it is at the await call (i.e. when it actually gets executed), *not* as it is when the coroutine is created.

The suggestion has been made that we should instead be capturing the active context when "url_get(url)" is called, and implicitly switching back to that at the point where await is called. It doesn't seem like a good idea to me, as it breaks the "top to bottom" mental model of code execution (since the "await cr" expression would briefly switch the context back to the one that was in effect on the "cr = url_get(url)" line without even a nested suite to indicate that we may be adjusting the order of code execution).

It would also cause problems with allowing context changes to propagate out of the "await cr" call, since capturing a context implies forking it, and hence any changes would somehow need to be transplanted back to a potentially divergent context history (if a context change *did* happen at point 2 in the split example).
I think you got everything, as I really do just mean the stack of frames in the current thread that will show up in a traceback. We normally just call it "the call stack", but that's ambiguous whenever we're also talking about coroutines (since each await chain creates its own distinct asynchronous call stack).
This is the existing behaviour that PEP 550 is recommending we preserve as the default generator semantics, even if decimal (or a comparable context manager) switches to using context vars instead of thread locals. As with coroutines, the question has been asked whether or not the "g = gen()" line should be implicitly capturing the active execution context at that point, and then switching back to it for each iteration of "for i in g:".
That's the view PEP 550 currently takes as well.
If you capture the context eagerly, then there are fewer opportunities to get materially different values from "data = list(iterable)" and "data = iter(context_capturing_iterable)". While that's a valid intent for folks to want to be able to express, I personally think it would be more clearly requested via an expression like "data = iter_in_context(iterable)" rather than having it be implicit in the way generators work (especially since having eager context capture be generator-only behaviour would create an odd discrepancy between generators and other iterators like those in itertools).
Cases where we expect context changes to be able to propagate into or out of a frame:

- when you call something, it can see your context
- when something you called returns, you can see changes it made to your context
- when a generator-based context manager is suspended

"Call" in the above deliberately covers both synchronous calls (with regular call syntax) and asynchronous calls (with await or yield from).

Cases where we *don't* expect context changes to propagate out of a frame:

- when you spin up a separate logical thread of execution (either an actual OS thread, or an event loop task)
- when a generator-based iterator is suspended
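A sketch contrasting the two groups of cases, again assuming the get()/set() style API and an event loop that applies these rules to its tasks (the helper names are illustrative):

    import asyncio
    import contextvars

    cvar = contextvars.ContextVar('cvar')

    async def child():
        cvar.set('from child')

    async def main():
        cvar.set('from main')
        # an awaited call: the change made inside child propagates back out
        await child()
        assert cvar.get() == 'from child'

        cvar.set('from main')
        # a separate logical thread of execution: the change does not leak out
        await asyncio.create_task(child())
        assert cvar.get() == 'from main'

    asyncio.run(main())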
Context managers would be fine (the defaults are deliberately set up to make those "just work", either globally, or in the relevant decorators). However, people who write event loops will need to care about it, as would anyone writing an "iter_in_context" helper function. Folks trying to strictly emulate generator semantics in their own iterators would also need to worry about it, but "revert any context changes before returning from __next__" is a simpler alternative to actually doing that.
A micro-task is just a fancier name for the "iter_in_context" idea above (save the current context when the iterator is created, switch back to that context every time you're asked for a new value).
Aye, this is why I'd like the semantics of context variables to be almost indistinguishable from those of thread local variables for synchronous code (aside from avoiding context changes leaking out of generator-iterators when they yield from inside a with statement). PEP 550 currently does a good job of ensuring that, but we'd break that near equivalence if generators were to implicitly capture their creation context. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 14 October 2017 at 17:50, Nick Coghlan <ncoghlan@gmail.com> wrote:
Thank you. That satisfies my concerns pretty well.
OK. Then I think that's a bad idea - and anyone proposing it probably needs to explain much more clearly why it might be a good idea to jump around in the timeline like that.
OK. I understand the point here - but I'm not sure I see the practical use case for iter_in_context. When would something like that be used? Paul

On 15 October 2017 at 05:47, Paul Moore <p.f.moore@gmail.com> wrote:
Suppose you have some existing code that looks like this:

    results = [calculate_result(a, b) for a, b in data]

If calculate_result is context dependent in some way (e.g. a & b might be decimal values), then eager evaluation of "calculate_result(a, b)" will use the context that's in effect on this line for every result.

Now, suppose you want to change the code to use lazy evaluation, so that you don't need to bother calculating any results you don't actually use:

    results = (calculate_result(a, b) for a, b in data)

In a PEP 550 world, this refactoring now has a side-effect that goes beyond simply delaying the calculation: since "calculate_result(a, b)" is no longer executed immediately, it will default to using whatever execution context is in effect when it actually does get executed, *not* the one that's in effect on this line.

A context capturing helper for iterators would let you decide whether or not that's what you actually wanted by instead writing:

    results = iter_in_context(calculate_result(a, b) for a, b in data)

Here, "iter_in_context" would indicate explicitly to the reader that whenever another item is taken from this iterator, the execution context is going to be temporarily reset back to the way it was on this line. And since it would be a protocol based iterator-in-iterator-out function, you could wrap it around *any* iterator, not just generator-iterator objects.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
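One possible shape for such a helper, sketched with an assumed copy_context()/ctx.run() style of low-level API (the exact names are illustrative, not settled):

    import contextvars

    def iter_in_context(iterable):
        # capture the execution context that is active on this line...
        ctx = contextvars.copy_context()
        it = iter(iterable)
        while True:
            try:
                # ...and switch back to it each time an item is requested
                yield ctx.run(next, it)
            except StopIteration:
                return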

On 15.10.2017 06:39, Nick Coghlan wrote:
I have a hard time seeing the advantage of having a default where the context at the time of execution is dependent on where execution happens rather than where the code is defined.

IMO, the default should be to use the context where the line was defined in the code, since that matches the intuitive way of writing and defining code. The behavior of deferring the context to the time of execution should be the non-standard form, so as not to break this intuition; otherwise debugging will be a pain and writing fully working code will be really hard in the face of changing contexts (e.g. say decimal rounding changes in different parts of the code).

-- Marc-Andre Lemburg
eGenix.com Professional Python Services directly from the Experts (#1, Oct 15 2017)

On 15 October 2017 at 14:53, M.-A. Lemburg <mal@egenix.com> wrote:
The underlying rationale is that the generator form should continue to be as close as we can reasonably make it to being pure syntactic sugar for the iterator form:

    class ResultsIterator:
        def __init__(self, data):
            self._itr = iter(data)
        def __next__(self):
            return calculate_result(next(self._itr))

    results = ResultsIterator(data)

The logical context adjustments in PEP 550 then serve to make using a with statement around a yield expression in a generator closer in meaning to using one around a return statement in a __next__ method implementation.
This would introduce a major behavioural discrepancy between generators and iterators.
No, it really wouldn't, since "the execution context is the context that's active when the code is executed" is relatively easy to understand based entirely on the way functions, methods, and other forms of delayed execution work (including iterators). "The execution context is the context that's active when the code is executed, *unless* the code is in a generator, in which case, it's the context that was active when the generator-iterator was instantiated" is harder to follow. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 15.10.2017 07:13, Nick Coghlan wrote:
I think you're mixing two concepts here: the context defines the way code is supposed to be interpreted at runtime. This doesn't have anything to do with when the code is actually run.

Just think what happens if you write code using a specific context (let's say rounding to two decimal places), which then gets executed deferred within another context (say rounding to three decimal places) for part of the generator run, and yet another context (say rounding to whole integers) for the remainder of the generator. I can't think of a good use case where such behavior would be intuitive, expected or even reasonable ;-)

The context should be inherited by the generator when instantiated and not change after that, so that the context defining the generator takes precedence over any later context in which the generator is run.

Note that the above is not the same as raising an exception and catching it somewhere else (as Nathaniel brought up). The context actually changes the semantics of code, whereas exceptions only flag a special state and let other code decide what to do with it (defined where the exception handling is happening, not where the raise is caused).

Just for clarification: I haven't followed the thread, just saw your posting and found the argument you put forward a bit hard to follow. I may well be missing some context, or evaluating the argument in a different one than the one where it was defined ;-)

-- Marc-Andre Lemburg
eGenix.com Professional Python Services directly from the Experts

On Sat, Oct 14, 2017 at 9:53 PM, M.-A. Lemburg <mal@egenix.com> wrote:
Of course, that's already the default: it's how regular variables and function arguments work. The reason we have forms like 'with decimal.localcontext' and 'with numpy.errstate' is to handle the case where you want the context value to be determined by the runtime context in which it's accessed, rather than the static context in which it was written. That's literally the whole point.

It's not like this is a new and weird concept in Python either -- e.g. when you raise an exception, the relevant 'except' block is determined based on where the 'raise' happens (the runtime stack), not where the 'raise' was written:

    try:
        def foo():
            raise RuntimeError
    except RuntimeError:
        print("this is not going to execute, because Python doesn't work that way")

    foo()

-n

-- Nathaniel J. Smith -- https://vorpus.org

On 15 October 2017 at 15:49, Nathaniel Smith <njs@pobox.com> wrote:
Exactly - this is a better formulation of what I was trying to get at when I said that we want the semantics of context variables in synchronous code to reliably align with the semantics of the synchronous call stack as it appears in an exception traceback.

Attempting a pithy summary of PEP 550's related semantics for use in explanations to folks that don't care about all the fine details:

    The currently active execution context aligns with the expected flow of exception handling for any exceptions raised in the code being executed.

And with a bit more detail:

* If the code in question will see the exceptions your code raises, then your code will also be able to see the context variables that it defined or set
* By default, this relationship is symmetrical, such that if your code will see the exceptions that other code raises as a regular Python exception, then you will also see the context changes that that code makes.
* However, APIs and language features that enable concurrent code execution within a single operating system level thread (like event loops, coroutines and generators) may break that symmetry to avoid context variable management conflicts between concurrently executing code. This is the key behavioural difference between context variables (which enable this by design) and thread local variables (which don't).
* Pretty much everything else in the PEP 550 API design is a lower level performance optimisation detail to make management of this dynamic state sharing efficient in event-driven code

Even PEP 550's proposal for how yield would work aligns with that "the currently active execution context is the inverse of how exceptions will flow" notion: the idea there is that if a context manager's __exit__ method wouldn't see an exception raised by a piece of code, then that piece of code also shouldn't be able to see any context variable changes made by that context manager's __enter__ method (since the changes may not get reverted correctly on failure in that case).

Exceptions raised in a for loop body *don't* typically get thrown back into the body of the generator-iterator, so generator-iterators' context variable changes should be reverted at their yield points. By contrast, exceptions raised in a with statement body *do* get thrown back into the body of a generator decorated with contextlib.contextmanager, so those context variable changes should *not* be reverted at yield points, and instead left for __exit__ to handle.

Similarly, coroutines are in the exception handling path for the other coroutines they call (just like regular functions), so those coroutines should share an execution context rather than each having their own.

All of that leads to it being specifically APIs that already need to do special things to account for exception handling flows within a single thread (e.g. asyncio.gather, asyncio.ensure_future, contextlib.contextmanager) that are likely to have to put some thought into how they will impact the active execution context. Code for which the existing language level exception handling semantics already work just fine should then also be able to rely on the default execution context management semantics.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Oct 15, 2017 at 06:53:58AM +0200, M.-A. Lemburg wrote:
It would be a major change, but I also think lexical scoping would work best for the decimal context. Isolation of modules and libraries (which IMO is a bigger problem than the often cited generator issues) would be solved. It would probably not work best (or even at all) for the async call chain use case. Stefan Krah

On 15 October 2017 at 05:39, Nick Coghlan <ncoghlan@gmail.com> wrote:
OK, got it. That sounds to me like a candidate for a stdlib function (either because it's seen as a common requirement, or because it's tricky to get right - or both). The PEP doesn't include it, as far as I can see, though. But I do agree with MAL, it seems wrong to need a helper for this, even though it's a logical consequence of the other semantics I described as intuitive :-( Paul

I would like to reboot this discussion (again). It feels to me we're getting farther and farther from solving any of the problems we might solve.

I think we need to give up on doing anything about generators; the use cases point in too many conflicting directions. So we should keep the semantics there, and if you don't want your numeric or decimal context to leak out of a generator, don't put `yield` inside `with`. (Yury and Stefan have both remarked that this is not a problem in practice, given that there are no bug reports or StackOverflow questions about this topic.) Nobody understands async generators, so let's not worry about them.

That leaves coroutines (`async def` and `await`). It looks like we don't want to change the original semantics here either, *except* when a framework like asyncio or Twisted has some kind of abstraction for a "task". (I intentionally don't define tasks, but task switches should be explicit, e.g. via `await` or some API -- note that even gevent qualifies, since it only switches when you make a blocking call.)

The key things we want then are (a) an interface to get and set context variables whose API is independent from the framework in use (if any), and (b) a way for a framework to decide when context variables are copied, shared or reinitialized.

For (a) I like the API from PEP 550:

    var = contextvars.ContextVar('description')
    value = var.get()
    var.set(value)

It should be easy to adopt this e.g. in the decimal module instead of the current approach based on thread-local state.

For (b) I am leaning towards something simple that emulates thread-local state. Let's define "context" as a mutable mapping whose keys are ContextVar objects, tied to the current thread (each Python thread knows about exactly one context, which is deemed the current context). A framework can decide to clone the current context and assign it to a new task, or initialize a fresh context, etc. The one key feature we want here is that the right thing happens when we switch tasks via `await`, just as the right thing happens when we switch threads. (When a framework uses some other API to switch tasks, the framework can do what it pleases.)

I don't have a complete design, but I don't want chained lookups, and I don't want to obsess over performance. (I would be fine with some kind of copy-on-write implementation, and switching out the current context should be fast.) I also don't want to obsess over API abstraction. Finally I don't want the design to be closely tied to `with`.

Maybe I need to write my own PEP?

-- --Guido van Rossum (python.org/~guido)
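A toy model of the (b) design sketched above - a plain dict keyed by ContextVar objects, one current context per thread, and a clone-around-a-task hook for frameworks. All names are illustrative assumptions, and a real design would optimise this heavily:

    import threading

    class ContextVar:
        def __init__(self, description, default=None):
            self.description = description
            self.default = default

        def get(self):
            return _state.context.get(self, self.default)

        def set(self, value):
            _state.context[self] = value

    class _State(threading.local):
        def __init__(self):
            self.context = {}   # each thread knows exactly one current context

    _state = _State()

    def run_in_cloned_context(func, *args):
        # what a framework might do around a task switch: clone the current
        # context, run the task in the clone, then restore the original
        saved = _state.context
        _state.context = dict(saved)
        try:
            return func(*args)
        finally:
            _state.context = saved

    var = ContextVar('description')
    var.set(1)
    run_in_cloned_context(var.set, 2)
    assert var.get() == 1   # the "task's" change stayed in its cloned context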

On 15 October 2017 at 15:05, Guido van Rossum <guido@python.org> wrote:
Let me have another go at building up the PEP 550 generator argument from first principles.

The behaviour that PEP 550 says *shouldn't* change is the semantic equivalence of the following code:

    # Iterator form
    class ResultsIterator:
        def __init__(self, data):
            self._itr = iter(data)
        def __next__(self):
            return calculate_result(next(self._itr))

    results = ResultsIterator(data)

    # Generator form
    def _results_gen(data):
        for item in data:
            yield calculate_result(item)

    results = _results_gen(data)

This *had* been non-controversial until recently, and I still don't understand why folks suddenly decided we should bring it into question by proposing that generators should start implicitly capturing state at creation time just because it's technically possible for them to do so (yes, we can implicitly change the way all generators work, but no, we can't implicitly change the way all *iterators* work).

The behaviour that PEP 550 thinks *should* change is for the following code to become roughly semantically equivalent, given the constraint that the context manager involved either doesn't manipulate any shared state at all (already supported), or else only manipulates context variables (the new part that PEP 550 adds):

    # Iterator form
    class ResultsIterator:
        def __init__(self, data):
            self._itr = iter(data)
        def __next__(self):
            with adjusted_context():
                return calculate_result(next(self._itr))

    results = ResultsIterator(data)

    # Generator form
    def _results_gen(data):
        for item in data:
            with adjusted_context():
                yield calculate_result(item)

    results = _results_gen(data)

Today, while these two forms look like they *should* be comparable, they're not especially close to being semantically equivalent, as there's no mechanism that allows for implicit context reversion at the yield point in the generator form.

While I think PEP 550 would still be usable without fixing this discrepancy, I'd be thoroughly disappointed if the only reason we decided not to do it was because we couldn't clearly articulate the difference in reasoning between:

* "Generators currently have no way to reasonably express the equivalent of having a context-dependent return statement inside a with statement in a __next__ method implementation, so let's define one" (aka "context variable changes shouldn't leak out of generators, as that will make them *more* like explicit iterator __next__ methods"); and

* "Generator functions should otherwise continue to be unsurprising syntactic sugar for objects that implement the regular iterator protocol" (aka "generators shouldn't implicitly capture their creation context, as that would make them *less* like explicit iterator __init__ methods").

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 15 October 2017 at 06:43, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is non-controversial to me.
I'll have to take your word for this, as I can't think of an actual example that follows the pattern of your abstract description, for which I can immediately see the difference. In the absence of being able to understand why the difference matters in current code, I have no view on whether PEP 550 needs to "fix" this issue.
I think that if we can't describe the problem that makes it obvious to the average Python user, then that implies it's a corner case that's irrelevant to said average Python user - and so I'd consider fixing it to be low priority. Specifically, a lot lower priority than providing a context variable facility - which while still not a *common* need, at least resonates with the average user in the sense of "I can imagine writing code that needed context like Decimal does". (And apologies for presenting an imagined viewpoint as what "the average user" might think...) Paul

On 15 October 2017 at 20:45, Paul Moore <p.f.moore@gmail.com> wrote:
On 15 October 2017 at 06:43, Nick Coghlan <ncoghlan@gmail.com> wrote:
Interestingly, thinking about the problem in terms of exception handling flow reminded me of the fact that having a generator-iterator yield while inside a with statement or try/except block is already considered an anti-pattern in many situations, precisely because it means that any exceptions that get thrown in (including GeneratorExit) will be intercepted when that may not be what the author really intended.

Accordingly, the canonical guaranteed-to-be-consistent-with-the-previous-behaviour iterator -> generator transformation already involves the use of a temporary variable to move the yield outside any exception handling constructs and ensure that the exception handling only covers the code that it was originally written to cover:

    def _results_gen(data):
        for item in data:
            with adjusted_context():
                result_for_item = calculate_result(item)
            yield result_for_item

    results = _results_gen(data)

The exception handling expectations with coroutines are different, since an "await cr" expression explicitly indicates that any exceptions "cr" fails to handle *will* propagate back through to where the await appears, just as "f()" indicates that unhandled exceptions in "f" will be seen by the current frame.

And even if context variables were defined as a new yield-tolerant API, a lot of existing context managers still wouldn't be "yield safe", since they may be manipulating thread local or process global state, rather than context variables or a particular object instance.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

It all works fine now: https://github.com/neogeny/TatSu/blob/master/tatsu/contexts.py So, I have a strong requirement: whatever is decided on this PEP... Please don't break it? (or make it illegal) -- Juancarlo *Añez*

On 16 October 2017 at 21:08, Juancarlo Añez <apalala@gmail.com> wrote:
The "anti-pattern in many situations" qualifier was there because there are cases where it's explicitly expected to work, and isn't an anti-pattern at all (e.g. when the generator is decorated with contextlib.contextmanager, or when you're using a context manager to hold open an external resource like a file until the generator is closed). So this wasn't intended as an argument for changing anything - rather, it's about my changing my perspective on how beneficial it would be to have generators default to maintaining their own distinct logical context (which then becomes an argument for *not* changing anything). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

All this arguing based on "equivalence" between different code fragments is nuts. The equivalences were never meant to be exact, and people don't typically understand code using generators via these equivalences.

The key problem we're trying to address is creating a "context" abstraction that can be controlled by async task frameworks without making the *access* API specific to the framework. Because coroutines and generators are similar under the covers, Yury demonstrated the issue with generators instead of coroutines (which are unfamiliar to many people). And then somehow we got hung up on fixing the problem in the example.

I want to back out of this entirely, because the problem (while it can be demonstrated) is entirely theoretical, and the proposed solution is made too complicated by attempting to solve the problem for generators as well as for tasks.

--Guido
-- --Guido van Rossum (python.org/~guido)

Nick: “I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass”

I’m following this discussion from a distance, but cared enough about this point to chime in without even reading what comes later in the thread. (Hopefully it’s not twenty people making the same point…)

I HATE this example! Looking solely at the code we can see, you are refactoring a function call from inside an *explicit* context manager to outside of it, and assuming the behavior will not change. There’s *absolutely no* logical or semantic reason that these should be equivalent, especially given the obvious alternative of leaving the call within the explicit context. Even moving the function call before the setattr can’t be assumed not to change its behavior – how is moving it outside a with block ever supposed to be safe?

I appreciate the desire to be able to take currently working code using one construct and have it continue working with a different construct, but the burden should be on that library and not the runtime. By that I mean that the parts of decimal that set and read the context should do the extra work to maintain compatibility (e.g. through a globally mutable structure using context variables as a slightly more fine-grained key than thread ID) rather than forcing an otherwise straightforward core runtime feature to jump through hoops to accommodate it.

New users of this functionality very likely won’t assume that TLS is the semantic equivalent, especially when all the examples and naming make it sound like context managers are more related. (I predict people will expect this to behave more like unstated/implicit function arguments and be captured at the same time as other arguments are, but can’t really back that up except with gut-feel. It's certainly a feature that I want for myself more than I want another spelling for TLS…)

Top-posted from my Windows phone

From: Nick Coghlan
Sent: Tuesday, October 10, 2017 5:35
To: Guido van Rossum
Cc: Python-Ideas
Subject: Re: [Python-ideas] PEP draft: context variables

On 10 October 2017 at 01:24, Guido van Rossum <guido@python.org> wrote:
On Sun, Oct 8, 2017 at 11:46 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 8 October 2017 at 08:40, Koos Zevenhoven <k7hoven@gmail.com> wrote:

I do remember Yury mentioning that the first draft of PEP 550 captured something when the generator function was called. I think I started reading the discussions after that had already been removed, so I don't know exactly what it was. But I doubt that it was *exactly* the above, because PEP 550 uses set and get operations instead of "assignment contexts" like PEP 555 (this one) does.

We didn't forget it, we just don't think it's very useful.

I'm not sure I agree on the usefulness. Certainly a lot of the complexity of PEP 550 exists just to cater to Nathaniel's desire to influence what a generator sees via the context of the send()/next() call. I'm still not sure that's worth it. In 550 v1 there's no need for chained lookups.
The compatibility concern is that we want developers of existing libraries to be able to transparently switch from using thread local storage to context local storage, and the way thread locals interact with generators means that decimal (et al) currently use the thread local state at the time when next() is called, *not* when the generator is created. I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass

The easiest way to maintain that equivalence is to say that even though preventing state changes leaking *out* of generators is considered a desirable change, we see preventing them leaking *in* as a gratuitous backwards compatibility break. This does mean that *neither* form is semantically equivalent to eager extraction of the generator values before the decimal context is changed, but that's the status quo, and we don't have a compelling justification for changing it.

If folks subsequently decide that they *do* want "capture on creation" or "capture on first iteration" semantics for their generators, those are easy enough to add as wrappers on top of the initial thread-local-compatible base by using the same building blocks as are being added to help event loops manage context snapshots for coroutine execution.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Oct 11, 2017 at 7:46 AM, Steve Dower <steve.dower@python.org> wrote:
Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time.
In fact, one might then use the kind of enhanced context-local storage that I've been planning on top of PEP 555, as also mentioned in the PEP. It would not be the recommended way, but people might benefit from it in some cases, such as for a more backwards-compatible PEP 555 "feature enablement" or other special purposes like `trio`'s timeouts. (However, these needs can be satisfied with an even simpler approach in PEP 555, if that's where we want to go.) I want PEP 555 to be how things *should be*, not how things are. After all, context arguments are a new feature. But plenty of effort in the design still goes into giving people ways to tweak things to their special needs and for compatibility issues.
I assume you like my decision to rename the concept to "context arguments" :). And indeed, new use cases would be more interesting than existing ones. Surely we don't want new use cases to copy the semantics from the old ones which currently have issues (because they were originally designed to work with traditional function and method calls, and using then-available techniques). ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 11Oct2017 0458, Koos Zevenhoven wrote:
Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time.
Yes, perhaps a little harsh. However, if I released a refactoring tool that moved function calls that far, people would file bugs against it for breaking their code (and in my experience of people filing bugs against tools that break their code, they can also be a little harsh).
I don't really care about names, as long as it's easy to use them to research the underlying concept or intended functionality. And I'm not particularly supportive of this concept as a whole anyway - EIBTI and all. But since it does address a fairly significant shortcoming in existing code, we're going to end up with something. If it's a new runtime feature then I'd like it to be an easy concept to grasp with clever hacks for the compatibility cases (and I do believe there are clever hacks available for getting "inject into my deferred function call" semantics), rather than the whole thing being a complicated edge-case. Cheers, Steve

On 11 October 2017 at 21:58, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Refactoring isn't why I like the example, as I agree there's no logical reason why the two forms should be semantically equivalent in a greenfield context management design.

The reason I like the example is because, in current Python, with the way generators and decimal contexts currently work, it *doesn't matter* which of these two forms you use - they'll both behave the same way, since no actual code execution takes place in the generator iterator at the time the generator is created.

That means we have a choice to make, and that choice will affect how risky it is for a library like decimal to switch from using thread local storage to context local storage: is switching from thread locals to context variables in a synchronous context manager going to be a compatibility break for end user code that uses the second form, where generator creation happens outside a with statement, but use happens inside it?

Personally, I want folks maintaining context managers to feel comfortable switching from thread local storage to context variables (when the latter are available), and in particular, I want the decimal module to be able to make such a switch and have it be an entirely backwards compatible change for synchronous single-threaded code.

That means it doesn't matter to me whether we see separating generator (or context manager) creation from subsequent use as good style or not; what matters is that decimal contexts work a certain way today and hence we're faced with a choice between:

1. Preserve the current behaviour, since we don't have a compelling reason to change its semantics
2. Change the behaviour, in order to gain <end user benefit>

"I think it's more correct, but don't have any specific examples where the status quo subtly does the wrong thing" isn't an end user benefit, as:

- of necessity, any existing tested code won't be written that way (since it would be doing the wrong thing, and will hence have been changed)
- future code that does want creation time context capture can be handled via an explicit wrapper (as is proposed for coroutines, with event loops supplying the wrapper in that case)

"It will be easier to implement & maintain" isn't an end user benefit either, but still a consideration that carries weight when true. In this case though, it's pretty much a wash - whichever form we make the default, we'll need to provide some way of switching to the other behaviour, since we need both behavioural variants ourselves to handle different use cases.

That puts the burden squarely on the folks arguing for a semantic change: "We should break currently working code because ...". PEP 479 (the change to StopIteration semantics) is an example of doing that well, and so is the proposal in PEP 550 to keep context changes from implicitly leaking *out* of generators when yield or await is used in a with statement body.

The challenge for folks arguing for generators capturing their creation context is to explain the pay-off that end users will gain from our implicitly changing the behaviour of code like the following:

    >>> data = [sum(Decimal(10)**-r for r in range(max_r+1)) for max_r in range(5)]
    >>> data
    [Decimal('1'), Decimal('1.1'), Decimal('1.11'), Decimal('1.111'), Decimal('1.1111')]
    >>> def lazily_round_to_current_context(data):
    ...     for d in data:
    ...         yield +d
    ...
    >>> g = lazily_round_to_current_context(data)
    >>> with decimal.localcontext() as ctx:
    ...     ctx.prec = 2
    ...     rounded_data = list(g)
    ...
    >>> rounded_data
    [Decimal('1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1')]

Yes, it's a contrived example, but it's also code that will work all the way back to when the decimal module was first introduced. Because of the way I've named the rounding generator, it's also clear to readers that the code is aware of the existing semantics, and is intentionally relying on them.

The current version of PEP 550 means that the decimal module can switch to using context variables instead of thread local storage, and the above code won't even notice the difference. However, if generators were to start implicitly capturing their creation context, then the above code would break, since the rounding would start using a decimal context other than the one that's in effect in the current thread when the rounding takes place - the generator would implicitly reset it back to an earlier state.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Thu, Oct 12, 2017 at 6:54 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The latter version does not feel like a good way to write the code. People will hate it, because they can't tell what happens by looking at the code locally. What I think is that the current behavior of decimal contexts only satisfies some contrived examples. IMO, everything about decimal contexts together with generators is currently a misfeature. Of course, you can also make use of a misfeature, like in the above example, where the subtleties of decimal rounding are hidden underneath the iterator protocol and a `for` statement. That means we have a choice to make, and that choice will affect how risky
AFAICT, the number of users of `decimal` could be anywhere between 3 and 3**19. Anything you do might break someone's code. Personally, I think the current behavior, which you explain using the example above, is counter-intuitive. But I can't tell you how much code would break by fixing it with direct PEP 555 semantics. I also can't tell how much would break when using PEP 550 to "fix" it, but I don't even like the semantics that that would give. I strongly believe that the most "normal" use case for a generator function is that it's a function that returns an iterable of values. Sadly, decimal contexts don't currently work correctly for this use case. Indeed, I would introduce a new context manager that behaves intuitively and then slowly deprecate the old one. [*] Personally, I want folks maintaining context managers to feel comfortable
Sure. But PEP 550 won't give you that, though. Being inside a generator affects the scope of changing the decimal context. Yes, as a side effect, the behavior of a decimal context manager inside a generator becomes more intuitive. But it's still a breaking change, even for synchronous code. That means it doesn't matter to me whether we see separating generator (or
3. Introduce a new context manager that behaves intuitively. My guess is that the two context managers could even be made to interact with each other in a fairly reasonable manner, even if you nest them in different orders. I'm not sure how necessary that is.
Then it's actually better to not change the semantics of the existing functionality, but add new ones instead.
Handling the normal case with wrappers (that might even harm performance) just because decimal does not handle the normal case?
True, in PEP 555 there is not really much difference in complexity regarding leaking in from the side (next/send) and leaking in from the top (genfunc() call). Just a matter of some if statements.
And on the folks that end up having to argue against it, or to come up with a better solution. And those that feel that it's a distraction from the discussion.
The former is a great example. The latter has good parts but is complicated and didn't end up getting all the way there. The challenge for folks arguing for generators capturing their creation
The way you've named the function (lazily_round_to_current_context) does not correspond to the behavior in the code example. "Current" means "current", not "the context of the caller of next at lazy evaluation time". Maybe you could make it:

    g = rounded_according_to_decimal_context_of_whoever_calls_next(data)

Really, I think that, to get this behavior, the function should be defined with a decorator to mark that context should leak in through next(). But probably the programmer will realize - there must be a better way:

    with decimal.localcontext() as ctx:
        ctx.prec = 2
        rounded_data = [round_in_context(d) for d in data]

That one would already work and be equivalent in any of the proposed semantics. But there could be more improvements, perhaps:

    with decimal.context(prec=2):
        rounded_data = [round_in_context(d) for d in data]

––Koos

[*] Maybe somehow make the existing functionality a phantom easter egg - a blast from the past which you can import and use, but which is otherwise invisible :-). Then later give warnings and finally remove it completely. But we need better smooth upgrade paths anyway, maybe something like:

    from __compat__ import unintuitive_decimal_contexts

    with unintuitive_decimal_contexts:
        do_stuff()

Now code bases can more quickly switch to new python versions and make the occasional compatibility adjustments more lazily, while already benefiting from other new language features.

––Koos

-- + Koos Zevenhoven + http://twitter.com/k7hoven +
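For concreteness, the hypothetical round_in_context helper used above could plausibly be as small as this (an assumption about the intent, not an API from either PEP):

    import decimal

    def round_in_context(d):
        # unary plus applies the decimal context that is current at call
        # time (precision and rounding mode)
        return +d

    with decimal.localcontext() as ctx:
        ctx.prec = 2
        assert round_in_context(decimal.Decimal('1.1111')) == decimal.Decimal('1.1')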

On Thu, Oct 12, 2017 at 8:59 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Note that this is an independent argument w.r.t. both PEPs. PEP 550 does not propose to change existing decimal APIs. It merely uses decimal to illustrate the problem, and suggests a fix using the new APIs. Although it is true that I plan to propose to use PEP 550 to reimplement decimal APIs on top of it, and so far I haven't seen any real-world examples of code that will be broken because of that. As far as I know—and I've done some research—nobody uses decimal contexts and generators because of the associated problems. It's a chicken and egg problem. Yury

On Oct 12, 2017 9:03 PM, "Yury Selivanov" <yselivanov.ml@gmail.com> wrote: On Thu, Oct 12, 2017 at 8:59 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Note that this is an independent argument w.r.t. both PEPs. PEP 550 does not propose to change existing decimal APIs. It merely uses decimal to illustrate the problem, and suggests a fix using the new APIs. Of course this particular point is independent. But not all the other points are. Although it is true that I plan to propose to use PEP 550 to reimplement decimal APIs on top of it, and so far I haven't seen any real-world examples of code that will be broken because of that. As far as I know—and I've done some research—nobody uses decimal contexts and generators because of the associated problems. It's a chicken and egg problem. I've been inclined to think so too. But that kind of research would be useful for decimal if—and only if—you share your methodology. It's not at all clear how one would do research to arrive at such a conclusion. —Koos (mobile)

On Mon, Oct 9, 2017 at 9:46 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Yeah, I'm not surprised you remember that :). But while none of us saw a good enough reason for it at that moment, I have come to think we absolutely need it. We need both the forest and the trees.

Sure, if you think of next() as being a simple function call that does something that involves state, then you might want the other semantics (with PEP 555, that situation would look like):

    def do_stuff_with_side_effects():
        with cvar.assign(value):
            return next(global_gen_containing_state)

Now stuff happens within next(..), and whatever happens in next(..) is expected to see the cvar assignment.

However, probably much more often, one just thinks of next(..) as "getting the next value", although some computations happen under the hood that one doesn't need to care about. As we all know, in the real world, the use case is usually just to generate the Fibonacci sequence ;). And when you call fibonacci(), the whole sequence should already be determined. You just evaluate the sequence lazily by calling next() each time you want a new number. It may not even be obvious when the computations are made:

    fib_cache = [0, 1]

    def fibonacci():
        for i in itertools.count():
            if i < len(fib_cache):
                yield fib_cache[i]
            else:
                # not calculated before
                new = sum(fib_cache[-2:])
                fib_cache.append(new)
                yield new

    # (function above is thread-unsafe, for clarity)

(Important:) So in *any* situation where you want the outer context to affect the stuff inside the generator through next(), like in the `do_stuff_with_side_effects` example, *the author of the generator function needs to know about it*. And then it is ok to require that the author uses a decorator on the generator function. But when you just generate a pre-determined set of numbers (like fibonacci), the implementation of the generator function should not matter; but if the outer context leaks in through next(..), the internal implementation does matter, and the abstraction is leaky. I don't want to give the leaky things by default.

––Koos

-- + Koos Zevenhoven + http://twitter.com/k7hoven +

Hi! On Tue, Sep 05, 2017 at 12:50:35AM +0300, Koos Zevenhoven <k7hoven@gmail.com> wrote:
cvar = contextvars.Var(default="the default value", description="example context variable")
Why ``description`` and not ``doc``?
with cvar.assign(new_value):
Why ``assign`` and not ``set``?
Each thread of the Python interpreter keeps its on stack of
"its own", I think.
``contextvars.Assignment`` objects, each having a pointer to the previous (outer) assignment like in a linked list.
Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On Tue, Sep 5, 2017 at 1:20 AM, Oleg Broytman <phd@phdru.name> wrote:
Cause that's a nice thing to bikeshed about? In fact, I probably should have left it out at this point. Really, it's just to get a meaningful repr for the object and better error messages, without any significance for the substance of the PEP. There are also concepts in the PEP that don't have a name yet.
with cvar.assign(new_value):
Why ``assign`` and not ``set``?
To distinguish from typical set-operations (setattr, setitem), and from sets and from settings. I would rather enter an "assignment context" than a "set context" or "setting context". One key point of this PEP is to promote defining context variable scopes on a per-variable (and per-value) basis. I combined the variable and value aspects in this concept of Assignment(variable, value) objects, which define a context that one can enter and exit.
Each thread of the Python interpreter keeps its on stack of
"its own", I think.
That's right, thanks. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
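A toy, thread-only model of the Var/Assignment chain described in this exchange, just to make the linked-list lookup concrete (the proposal's real implementation is a lower-level, speed-optimized API, and the names here follow the draft PEP only loosely):

    import threading

    class _State(threading.local):
        top = None                  # innermost active Assignment per thread

    _state = _State()

    class Assignment:
        def __init__(self, var, value):
            self.var = var
            self.value = value
            self.previous = None    # pointer to the next outer assignment

        def __enter__(self):
            self.previous, _state.top = _state.top, self

        def __exit__(self, *exc_details):
            _state.top = self.previous

    class Var:
        def __init__(self, default=None, description=None):
            self.default = default
            self.description = description

        def assign(self, value):
            return Assignment(self, value)

        @property
        def value(self):
            a = _state.top
            while a is not None:    # walk the chain, innermost first
                if a.var is self:
                    return a.value
                a = a.previous
            return self.default

    cvar = Var(default="the default value")
    with cvar.assign("new value"):
        assert cvar.value == "new value"
    assert cvar.value == "the default value"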

So every generator stores "captured" modifications. This is similar to PEP 550, which adds Logical Context to generators to store their EC modifications. The implementation is different, but the intent is the same. PEP 550 uses a stack of hash tables; this proposal has a linked list of Assignment objects. In the worst case, this proposal will have worse performance guarantees. It's hard to say more, because the implementation isn't described in full.

With PEP 550 it's trivial to implement a context manager to control variable assignments. If we do that, how exactly is this proposal different? Can you list all semantical differences between this proposal and PEP 550? So far, it looks like if I call "var.assign(value).__enter__()" it would be equivalent to PEP 550's "var.set(value)".

Yury

On Monday, September 4, 2017 at 6:37:44 PM UTC-4, Yury Selivanov wrote:
I think you really should add a context manager to PEP 550 since it is better than calling "set", which leaks state. Nathaniel is right that you need set to support legacy numpy methods like seterr. Had there been a way of setting context variables using a context manager, then numpy would only have had to implement the "errstate" context manager on top of it. There would have been no need for seterr, which leaks state between code blocks and is error-prone.

On Tue, Sep 5, 2017 at 7:42 AM, Neil Girdhar <mistersheik@gmail.com> wrote:
There is nothing in current Python to prevent numpy from using a context manager for seterr; it's easy enough to write your own context manager that saves and restores thread-local state (decimal shows how). In fact with PEP 550 it's so easy that it's really not necessary for the PEP to define this as a separate API -- whoever needs it can just write their own.

-- --Guido van Rossum (python.org/~guido)
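The save/restore pattern referred to here, sketched over plain thread-local state (the seterr/geterr/errstate names mimic numpy's, but the single toy value and the signatures are simplified assumptions):

    import contextlib
    import threading

    _local = threading.local()

    def geterr():
        return getattr(_local, 'err', 'warn')   # 'warn' as a toy default state

    def seterr(new):
        old = geterr()
        _local.err = new                        # leaks state if never restored
        return old

    @contextlib.contextmanager
    def errstate(new):
        # save/restore: the change is scoped to the with block
        old = seterr(new)
        try:
            yield
        finally:
            seterr(old)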

You should add https://bitbucket.org/hipchat/txlocal as a reference for the pep as it largely implements this idea for Twisted. It may provide for some practical discussions of use cases and limitations of this approach. On Tue, Sep 5, 2017, 09:55 Guido van Rossum <guido@python.org> wrote:

We'll add a reference to the "Can Execution Context be implemented without modifying CPython?" section [1]. However, after skimming through the readme file, I didn't see any examples or limitations that are relevant to PEP 550. If the PEP gets accepted, Twisted can simply add direct support for it (similarly to asyncio). That would mean that users won't need to maintain the context manually (described in txlocal's "Maintaining Context" section). Yury [1] https://www.python.org/dev/peps/pep-0550/#can-execution-context-be-implement... On Tue, Sep 5, 2017 at 8:00 AM, Kevin Conway <kevinjacobconway@gmail.com> wrote:

On Tue, Sep 5, 2017 at 10:54 AM Guido van Rossum <guido@python.org> wrote:
Don't you want to encourage people to use the context manager form and discourage calls to set/discard? I recognize that seterr has to be supported and has to sit on top of some method in the execution context. However, if we were starting from scratch, I don't see why we would have seterr at all. We should just have errstate. seterr can leak state, which might not seem like a big deal in a small program, but in a large program, it can mean that a minor change in one module can cause bugs in a totally different part of the program. These kinds of bugs can be very hard to debug.
-- --Guido van Rossum (python.org/~guido)

On Mon, Sep 4, 2017 at 2:50 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
From a quick skim, my impression is:
All the high-level semantics you suggest make sense... in fact, AFAICT they're exactly the same semantics we've been using as a litmus test for PEP 550. I think PEP 550 is sufficient to allow implementing all your proposed APIs (and that if it isn't, that's a bug in PEP 550). OTOH, your proposal doesn't provide any way to implement functions like decimal.setcontext or numpy.seterr, except by pushing a new state and never popping it, which leaks memory and permanently increases the N in the O(N) lookups. I didn't see any direct comparison with PEP 550 in your text (maybe I missed it). Why do you think this approach would be better than what's in PEP 550? -n -- Nathaniel J. Smith -- https://vorpus.org

On Tue, Sep 5, 2017 at 3:49 AM, Nathaniel Smith <njs@pobox.com> wrote:
Well, I'm happy to hear that a quick skim can already give you an impression ;). But let's see how correct...
Well, if "exactly the same semantics" is even nearly true, you are only testing a small subset of PEP 550 which resembles a subset of this proposal.
I think PEP 550 is sufficient to allow implementing all your proposed APIs (and that if it isn't, that's a bug in PEP 550).
That's not true either. The LocalContext-based semantics introduces scope barriers that affect *all* variables. You might get close by putting just one variable in a LogicalContext and then nesting them, but PEP 550 does not allow this in all cases. With the addition of PEP 521 and some trickery, it might. See also this section in PEP 550, where one of the related issues is described: https://www.python.org/dev/peps/pep-0550/#should-yield-from-leak-context-changes
Well, there are different approaches for this. Let's take the example of numpy:

    import numpy as np

I believe the relevant functions are:

- np.seterr -- set a new state (and return the old one)
- np.geterr -- get the current state
- np.errstate -- gives you a context manager to handle the state within a block

(Well, errstate sets more state than np.seterr, but that's irrelevant here.) First of all, the np.seterr API is something that I want to discourage in this proposal, because if the state is not reset back to what it was, a completely different piece of code may be affected. BUT, to preserve the current semantics of these functions in non-async code, you could do this (a sketch of this layering follows below):

- numpy reimplements the errstate context manager using contextvars based on this proposal;
- geterr gets the state using contextvars;
- seterr gets the state using contextvars and mutates it the way it wants (if contextvars is not available, it uses the old way).

Also, the idea is to also provide frameworks the means for implementing concurrency-local storage, if that is what people really want, although I'm not sure it is.
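A hedged sketch of that layering, assuming the contextvars.Var API proposed in this PEP (an assign(...) context manager and a .value attribute); the constructor arguments and numpy-side details are illustrative only:

    from contextlib import contextmanager
    import contextvars   # the module proposed in this PEP

    # One variable holding the whole error state; constructor signature assumed.
    _errstate_var = contextvars.Var(default={'divide': 'warn', 'over': 'warn',
                                             'under': 'ignore', 'invalid': 'warn'})

    @contextmanager
    def errstate(**changes):
        # The encouraged API: the new state is visible only inside the block.
        new_state = dict(_errstate_var.value, **changes)
        with _errstate_var.assign(new_state):
            yield

    def geterr():
        # Return a copy of the currently visible state.
        return dict(_errstate_var.value)

    def seterr(**changes):
        # Legacy API: mutate the currently visible state dict in place,
        # deliberately reproducing seterr's leaky semantics.
        old = dict(_errstate_var.value)
        _errstate_var.value.update(changes)
        return old

Here ``with errstate(divide='ignore'): ...`` leaves the surrounding state untouched, while ``seterr(divide='ignore')`` changes it for the rest of the current scope.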
It was not my intention to leave out the comparison altogether, but I did avoid the comparisons in some cases in this first draft, because thinking about PEP 550 concepts while trying to understand this proposal might give you the wrong idea. One of the benefits of this proposal is simplicity, and I'm guessing performance as well, but that would need evidence. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 6:53 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I'm sorry, by LocalContext I meant LogicalContext, and by "nesting" them, I meant stacking them. It is in fact nesting in terms of value scopes. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 9:12 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I don't actually care if you use the latest terminology. You seem to have a wrong idea about how PEP 550 really works (and its full semantics), because things you say here about it don't make any sense. Yury

On Tue, Sep 5, 2017 at 8:24 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
In PEP 550, introducing a new LogicalContext on the ExecutionContext affects the scope of any_var.set(value) for *any* any_var. Does that not make sense? ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Tue, Sep 5, 2017 at 8:43 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
So do you claim that PEP 550 does allow that in all cases? Or do you not think that would get close? ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 9/4/17, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I feel that the use of "is" and "==" in assert statements in this PEP has to be described more precisely. What if new_value above is 123456789? Maybe using something like this could be better:

    def equals(a, b):
        return a is b or a == b

Doesn't the PEP need to think about something like "context level overflow"? Or members like cvar.level?

On Tue, Sep 5, 2017 at 10:43 AM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
The use is quite precise as it is now. I can't use `is` for the string values, because the result would depend on whether Python gives you the same str instance as before, or a new one with the same content. Maybe I'll get rid of literal string values in the description, since it seems to only cause distraction.
> What if new_value above is 123456789?
Any value is fine.
I don't see any need for this at this point, or possibly ever. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
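As an aside illustrating the identity point in the exchange above (CPython-specific behavior; the values are arbitrary):

    a = "con" "text"                 # adjacent literals are joined at compile time
    b = "".join(["con", "text"])     # built at runtime: a distinct str object
    assert a == b                    # equality always holds
    print(a is b)                    # False in CPython: same content, new instance

    big = 123456789
    other = int("123456789")         # parsed at runtime
    assert big == other
    print(big is other)              # False: CPython caches only small ints (-5..256)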
participants (19)
- Amit Green
- Ethan Furman
- francismb
- Guido van Rossum
- Jason H
- Juancarlo Añez
- Kevin Conway
- Koos Zevenhoven
- M.-A. Lemburg
- Nathaniel Smith
- Neil Girdhar
- Nick Coghlan
- Oleg Broytman
- Paul Moore
- Pavol Lisy
- Stefan Krah
- Stephen J. Turnbull
- Steve Dower
- Yury Selivanov