(PEP 555 subtopic) Propagation of context in async code
This is a continuation of the PEP 555 discussion in
https://mail.python.org/pipermail/python-ideas/2017-September/046916.html
and this month in
https://mail.python.org/pipermail/python-ideas/2017-October/047279.html

If you are new to the discussion, the best point to start reading this might be at my second full paragraph below ("The status quo...").

On Fri, Oct 13, 2017 at 10:25 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 13 October 2017 at 10:56, Guido van Rossum <guido@python.org> wrote:
I'm out of energy to debate every point (Steve said it well -- that decimal/generator example is too contrived), but I found one nit in Nick's email that I wish to correct.
On Wed, Oct 11, 2017 at 1:28 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
As a less-contrived example, consider context managers implemented as generators.
We want those to run with the execution context that's active when they're used in a with statement, not the one that's active when they're created (the fact that generator-based context managers can only be used once mitigates the risk of creation time context capture causing problems, but the implications would still be weird enough to be worth avoiding).
Here I think we're in agreement about the desired semantics, but IMO all this requires is some special casing for @contextlib.contextmanager. To me this is the exception, not the rule -- in most *other* places I would want the yield to switch away from the caller's context.
For native coroutines, we want them to run with the execution context that's active when they're awaited or when they're prepared for submission to an event loop, not the one that's active when they're created.
This caught my eye as wrong. Considering that asyncio's tasks (as well as curio's and trio's) *are* native coroutines, we want complete isolation between the context active when `await` is called and the context active inside the `async def` function.
The rationale for this behaviour *does* arise from a refactoring argument:
    async def original_async_function():
        with some_context():
            do_some_setup()
            raw_data = await some_operation()
            data = do_some_postprocessing(raw_data)
Refactored:
    async def async_helper_function():
        do_some_setup()
        raw_data = await some_operation()
        return do_some_postprocessing(raw_data)
    async def refactored_async_function():
        with some_context():
            data = await async_helper_function()
*This* type of refactoring argument I *do* subscribe to.
However, considering that coroutines are almost always instantiated at the point where they're awaited, I do concede that creation time context capture would likely also work out OK for the coroutine case, which would leave contextlib.contextmanager as the only special case (and it would turn off both creation-time context capture *and* context isolation).
The difference between context propagation through coroutine function calls and awaits comes up when you need help from "the" event loop, which means things like creating new tasks from coroutines. However, we cannot even assume that the loop is the only one. So far, it makes no difference where you call the coroutine function. It is only when you await it or schedule it for execution in a loop that something can actually happen.

The status quo is that there's nothing that prevents you from calling a coroutine function from within one event loop and then awaiting it in another. So if we want an event loop to be able to pass information down the call chain in such a way that the information is available *throughout the whole task that it is driving*, then the context needs to at least propagate through `await`s.

This was my starting point 2.5 years ago, when Yury was drafting this status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but that there would be a problem, where each API that uses "blocking IO" somewhere under the hood would need a duplicate version for asyncio (and one for each third-party async framework!). I felt it was necessary to think about a solution before PEP 492 was accepted, and this became a fairly short-lived thread here on python-ideas:

https://mail.python.org/pipermail/python-ideas/2015-May/033267.html

This year, the discussion on Yury's PEP 550 somehow ended up with a very similar need before I got involved, apparently for independent reasons.

A design for solving this need (and others) is also found in my first draft of PEP 555, found at

https://mail.python.org/pipermail/python-ideas/2017-September/046916.html

Essentially, it's a way of *passing information down the call chain* when it's inconvenient or impossible to pass the information as normal function arguments. I now call the concept "context arguments".

More recently, I put some focus on the direct needs of normal users (as opposed to the direct needs of async framework authors).

Those thoughts are most "easily" discussed in terms of generator functions, which are very similar to coroutine functions: a generator function is often thought of as a function that returns an iterable of lazily evaluated values. In this type of usage, the relevant "function call" happens when calling the generator function. The subsequent calls to next() (or a yield from) are thought of as merely getting the items in the iterable, even if they do actually run code in the generator's frame. The discussion on this is found starting from this email:

https://mail.python.org/pipermail/python-ideas/2017-October/047279.html

However, coroutines are also evaluated lazily. The question is, when should we consider the "subroutine call" to happen: when the coroutine function is called, or when the resulting object is awaited. Often these two are indeed on the same line of code, so it does not matter. But as I discuss above, there are definitely cases where it matters. This has mostly to do with the interactions of different tasks within one event loop, or code where multiple event loops interact.

As mentioned above, there are cases where propagating the context through next() and await is desirable. However, there are also cases where the coroutine call is important. This comes up in the case of multiple interacting tasks or multiple event loops.
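[Editor's note: a minimal, hedged sketch of what "information available throughout the whole task" means in practice, written with the contextvars module that later shipped in Python 3.7 (PEP 567) as a stand-in for the "context arguments" discussed here. The request_id variable and the handler/leaf functions are made-up names for illustration only.]

    import asyncio
    import contextvars

    request_id = contextvars.ContextVar("request_id", default=None)

    async def leaf():
        # No function argument carries request_id here; it is looked up
        # from the context that propagates down the await chain.
        return request_id.get()

    async def handler(rid):
        request_id.set(rid)
        return await leaf()

    async def main():
        # Each task runs in its own copy of the context, so the two
        # handlers do not leak their ids into each other.
        print(await asyncio.gather(handler("a"), handler("b")))  # ['a', 'b']

    asyncio.run(main())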
To start with, probably a more example-friendly case, however, is running an event loop and a coroutine from synchronous code:

    import asyncio

    async def do_something_context_aware():
        do_something_that_depends_on(current_context())

    loop = asyncio.get_event_loop()

    with some_context():
        coro = do_something_context_aware()

    loop.run_until_complete(coro)

Now, if the coroutine function call `do_something_context_aware()` does not save the current context on `coro`, then there is no way some_context() can affect the code that will run inside the coroutine, even if that is what we are explicitly trying to do here.

The easy solution is to delegate the context transfer to the scheduling function (run_until_complete), and require that the context is passed to that function:

    with some_context():
        coro = do_something_context_aware()
        loop.run_until_complete(coro)

This gives the async framework (here asyncio) a chance to make sure the context propagates as expected. In general, I'm in favor of giving async frameworks some freedom in how this is implemented. However, to give the framework even more freedom, the coroutine call, do_something_context_aware(), could save the current context branch on `coro`, which run_until_complete can attach to the Task that gets created.

The bigger question is, what should happen when a coroutine awaits on another coroutine directly, without giving the framework a chance to interfere:

    async def inner():
        do_context_aware_stuff()

    async def outer():
        with first_context():
            coro = inner()

        with second_context():
            await coro

The big question is: In the above, which context should the coroutine be run in? "The" event loop does not have a chance to interfere, so we cannot delegate the decision.

We need both versions: the one that propagates first_context() into the coroutine, and the one that propagates second_context() into it. Or, using my metaphor from the other thread, we need "both the forest and the trees".

A solution to this would be to have two types of context arguments:

1. (calling) context arguments

and

2. execution context arguments

Both of these would have their own stack of (argument, value) assignment pairs, explained in the implementation part of the first PEP 555 draft. While this is a complication, the performance overhead of these is so small that doubling the overhead should not be a performance concern. The question is, do we want these two types of stacks, or do we want to work around it somehow, for instance using context-local storage, implemented on top of the first kind, to implement something like the second kind. However, that again raises some issues of how to propagate the context-local storage down the ambiguous call chain.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
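[Editor's note: to make the "delegate the context transfer to the scheduling call" idea above concrete, here is a hedged sketch using contextvars.copy_context() (from the later PEP 567) as a stand-in for whatever capture primitive PEP 555 would provide. run_until_complete_in_context and the flavor variable are made-up names.]

    import asyncio
    import contextvars

    flavor = contextvars.ContextVar("flavor", default="plain")

    async def do_something_context_aware():
        print("flavor inside the coroutine:", flavor.get())

    def run_until_complete_in_context(loop, coro, ctx):
        # Create the task while `ctx` is active, so the whole task runs
        # in (a copy of) that context.
        task = ctx.run(loop.create_task, coro)
        return loop.run_until_complete(task)

    loop = asyncio.new_event_loop()
    try:
        flavor.set("chocolate")             # plays the role of some_context()
        ctx = contextvars.copy_context()    # explicit capture by the caller
        run_until_complete_in_context(loop, do_something_context_aware(), ctx)
    finally:
        loop.close()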
On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
This was my starting point 2.5 years ago, when Yury was drafting this status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but that there would be a problem, where each API that uses "blocking IO" somewhere under the hood would need a duplicate version for asyncio (and one for each third-party async framework!). I felt it was necessary to think about a solution before PEP 492 was accepted, and this became a fairly short-lived thread here on python-ideas:
Well, it's obvious why the thread was "short-lived". Don't mix non-blocking and blocking code and don't nest asyncio loops. But I believe this new subtopic is a distraction. You should start a new thread on Python-ideas if you want to discuss the acceptance of PEP 492 2.5 years ago. [..]
The bigger question is, what should happen when a coroutine awaits on another coroutine directly, without giving the framework a chance to interfere:
    async def inner():
        do_context_aware_stuff()

    async def outer():
        with first_context():
            coro = inner()

        with second_context():
            await coro
The big question is: In the above, which context should the coroutine be run in?
The real big question is how people usually write code. And the answer is that they *don't write it like that* at all. Many context managers in many frameworks (aiohttp, tornado, and even asyncio) require you to wrap your await expressions in them. Not coroutine instantiation.

A more important point is that existing context solutions for async frameworks can only support a with statement around an await expression. And people that use such solutions know that 'with ...: coro = inner()' isn't going to work at all.

Therefore wrapping coroutine instantiation in a 'with' statement is not a pattern. It can only become a pattern if whatever execution context PEP is accepted in Python 3.7 encourages people to use it.

[..]
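[Editor's note: a tiny runnable illustration of the pattern being described: the helper wraps the await expression, not the coroutine call. The log_span helper below is a made-up stand-in for framework helpers such as timeouts or tracing spans, not a real API.]

    import asyncio
    from contextlib import contextmanager

    @contextmanager
    def log_span(name):
        # Stand-in for the kind of helper frameworks ask you to put
        # around the await expression (timeouts, tracing spans, ...).
        print("enter", name)
        try:
            yield
        finally:
            print("exit", name)

    async def slow():
        await asyncio.sleep(0.01)
        return 42

    async def main():
        with log_span("fetch"):        # the await sits inside the with
            result = await slow()
        print(result)

    asyncio.run(main())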
Both of these would have their own stack of (argument, value) assignment pairs, explained in the implementation part of the first PEP 555 draft. While this is a complication, the performance overhead of these is so small, that doubling the overhead should not be a performance concern.
Please stop handwaving performance. Using big O notation:

PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the total number of all context values for all context keys for the current frame stack. For a recursive function you can easily have a situation where the cache is invalidated often, and code starts to run slower and slower.

PEP 550 v1, worst complexity for uncached lookup: O(1), see [1].

PEP 550 v2+, worst complexity for uncached lookup: O(k), where 'k' is the number of nested generators for the current frame. Usually k=1.

While caching will mitigate PEP 555's bad performance characteristics in *tight loops*, the performance of the uncached path must not be ignored.

Yury

[1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
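[Editor's note: for readers following the complexity argument, here is a minimal toy sketch (not either PEP's actual implementation) of why a plain stack of (key, value) assignments makes an uncached lookup linear in the number of assignments sitting above the one being looked up.]

    class ToyContextStack:
        def __init__(self):
            self._stack = []                 # (key, value) pairs, newest last

        def assign(self, key, value):
            self._stack.append((key, value))

        def lookup(self, key):
            # Uncached lookup: scan from the newest assignment downwards.
            # Worst case walks the whole stack, i.e. O(N) in the number of
            # assignments currently in effect.
            for k, v in reversed(self._stack):
                if k == key:
                    return v
            raise LookupError(key)

    ctx = ToyContextStack()
    ctx.assign("decimal_precision", 42)
    for i in range(1000):
        ctx.assign("unrelated_key_%d" % i, i)   # pushes the wanted key deeper
    print(ctx.lookup("decimal_precision"))      # scans ~1000 entries first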
On Fri, Oct 13, 2017 at 7:38 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
This was my starting point 2.5 years ago, when Yury was drafting this status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but that there would be a problem, where each API that uses "blocking IO" somewhere under the hood would need a duplicate version for asyncio (and one for each third-party async framework!). I felt it was necessary to think about a solution before PEP 492 was accepted, and this became a fairly short-lived thread here on python-ideas:
Well, it's obvious why the thread was "short-lived". Don't mix non-blocking and blocking code and don't nest asyncio loops. But I believe this new subtopic is a distraction.
Nesting is not the only way to have interaction between two event loops. But whenever anyone *does* want to nest two loops, they are perhaps more likely to be loops of different frameworks. You believe that the semantics in async code is a distraction?
You should start a new thread on Python-ideas if you want to discuss the acceptance of PEP 492 2.5 years ago.
I'm definitely not interested in discussing the acceptance of PEP 492.
[..]
The bigger question is, what should happen when a coroutine awaits on another coroutine directly, without giving the framework a chance to interfere:
    async def inner():
        do_context_aware_stuff()

    async def outer():
        with first_context():
            coro = inner()

        with second_context():
            await coro
The big question is: In the above, which context should the coroutine be run in?
The real big question is how people usually write code. And the answer is that they *don't write it like that* at all. Many context managers in many frameworks (aiohttp, tornado, and even asyncio) require you to wrap your await expressions in them. Not coroutine instantiation.
You know very well that I've been talking about how people usually write code etc. But we still need to handle the corner cases too.
A more important point is that existing context solutions for async frameworks can only support a with statement around an await expression. And people that use such solutions know that 'with ...: coro = inner()' isn't going to work at all.
Therefore wrapping coroutine instantiation in a 'with' statement is not a pattern. It can only become a pattern if whatever execution context PEP is accepted in Python 3.7 encourages people to use it.
The code is to illustrate semantics, not an example of real code. The point is to highlight that the context has changed between when the coroutine function was called and when it is awaited. That's certainly a thing that can happen in real code, even if it is not the most typical case. I do mention this in my previous email.
[..]
Both of these would have their own stack of (argument, value) assignment pairs, explained in the implementation part of the first PEP 555 draft. While this is a complication, the performance overhead of these is so small, that doubling the overhead should not be a performance concern.
Please stop handwaving performance. Using big O notation:
There is discussion on performance elsewhere, now also in this other subthread:
https://mail.python.org/pipermail/python-ideas/2017-October/047327.html

PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the total number of all context values for all context keys for the current frame stack.
Not true. See the above link. Lookups are fast (*and* O(1), if we want them to be). PEP 555 stacks are independent of frames, BTW.
For a recursive function you can easily have a situation where cache is invalidated often, and code starts to run slower and slower.
Not true either. The lookups are O(1) in a recursive function, with and without nested contexts. I started this thread for discussion about semantics in an async context. Stefan asked about performance in the other thread, so I posted there. ––Koos
PEP 550 v1, worst complexity for uncached lookup: O(1), see [1].
PEP 550 v2+, worst complexity for uncached lookup: O(k), where 'k' is the number of nested generators for the current frame. Usually k=1.
While caching will mitigate PEP 555's bad performance characteristics in *tight loops*, the performance of the uncached path must not be ignored.
Yury
[1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Fri, Oct 13, 2017 at 1:46 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Fri, Oct 13, 2017 at 7:38 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote: [..]
This was my starting point 2.5 years ago, when Yury was drafting this status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but that there would be a problem, where each API that uses "blocking IO" somewhere under the hood would need a duplicate version for asyncio (and one for each third-party async framework!). I felt it was necessary to think about a solution before PEP 492 was accepted, and this became a fairly short-lived thread here on python-ideas:
Well, it's obvious why the thread was "short-lived". Don't mix non-blocking and blocking code and don't nest asyncio loops. But I believe this new subtopic is a distraction.
Nesting is not the only way to have interaction between two event loops. But whenever anyone *does* want to nest two loops, they are perhaps more likely to be loops of different frameworks.
You believe that the semantics in async code is a distraction?
Discussing blocking calls and/or nested event loops in async code is certainly a distraction :) [..]
The real big question is how people usually write code. And the answer is that they *don't write it like that* at all. Many context managers in many frameworks (aiohttp, tornado, and even asyncio) require you to wrap your await expressions in them. Not coroutine instantiation.
You know very well that I've been talking about how people usually write code etc. But we still need to handle the corner cases too. [..] The code is to illustrate semantics, not an example of real code. The point is to highlight that the context has changed between when the coroutine function was called and when it is awaited. That's certainly a thing that can happen in real code, even if it is not the most typical case. I do mention this in my previous email.
I understand the point and what you're trying to illustrate. I'm saying that people don't write 'with smth: c = coro()' because it's currently pointless. And unless you tell them they should, they won't.
[..]
Both of these would have their own stack of (argument, value) assignment pairs, explained in the implementation part of the first PEP 555 draft. While this is a complication, the performance overhead of these is so small, that doubling the overhead should not be a performance concern.
Please stop handwaving performance. Using big O notation:
There is discussion on performance elsewhere, now also in this other subthread:
https://mail.python.org/pipermail/python-ideas/2017-October/047327.html
PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the total number of all context values for all context keys for the current frame stack.
Quoting you from that link:

"Indeed I do mention performance here and there in the PEP 555 draft. Lookups can be made fast and O(1) in most cases. Even with the simplest unoptimized implementation, the worst-case lookup complexity would be O(n), where n is the number of assignment contexts entered after the one which is being looked up from (or in other words, nested inside the one that is being looked up from). This means that for use cases where the relevant context is entered as the innermost context level, the lookups are O(1) even without any optimizations. It is perfectly reasonable to make an implementation where lookups are *always* O(1). Still, it might make more sense to implement a half-way solution with "often O(1)", because that has somewhat less overhead in case the callees end up not doing any lookups."

So where's the actual explanation of how you can make *uncached* lookups O(1) in your best implementation? I only see you claiming that you know how to do that. And since you're using a stack of values instead of hash tables, your explanation can make a big impact on the CS field :)

It's perfectly reasonable to say that "cached lookups in my optimization are O(1)". Saying that "in most cases it's O(1)" isn't how the big O notation should be used.

BTW, what's the big O for capturing the entire context in PEP 555 (get_execution_context() in PEP 550)? How will that operation be implemented? A shallow copy of the stack?

Also, if I had this:

    with c.assign(o1):
        with c.assign(o2):
            with c.assign(o3):
                ctx = capture_context()

will ctx have references to o1, o2, and o3?
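[Editor's note: to make the capture question concrete, here is a toy sketch of one possible answer, under the assumption (not stated in PEP 555 itself) that capturing is a shallow copy of the assignment stack. ToyContextArg and capture_context are made-up names; this is not PEP 555's API or implementation.]

    from contextlib import contextmanager

    class ToyContextArg:
        def __init__(self):
            self._stack = []                    # assignment stack

        @contextmanager
        def assign(self, value):
            self._stack.append(value)
            try:
                yield
            finally:
                self._stack.pop()

    def capture_context(arg):
        # Shallow copy: the list is new, the assigned objects are shared,
        # so the captured context keeps them alive.
        return list(arg._stack)

    c = ToyContextArg()
    o1, o2, o3 = object(), object(), object()
    with c.assign(o1):
        with c.assign(o2):
            with c.assign(o3):
                ctx = capture_context(c)

    print(len(ctx))                      # 3 -- ctx still references o1, o2, o3
    print(ctx[0] is o1, ctx[2] is o3)    # True True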
Not true. See the above link. Lookups are fast (*and* O(1), if we want them to be).
PEP 555 stacks are independent of frames, BTW.
For a recursive function you can easily have a situation where cache is invalidated often, and code starts to run slower and slower.
Not true either. The lookups are O(1) in a recursive function, with and without nested contexts.
See the above. I claim that you can't say that *uncached* lookups can be O(1) in PEP 555 with the current choice of data structures.

Yury
Let me respond to my own email. I'm sorry I wrote a long email, but I figured I'll have to take the time to write this down carefully (and even in a new thread with a clear title) so that people would know what the discussion was about. Probably I could have done better structuring that email, but I seriously ran out of time.

This is directly related to how "a normal user" writing async code would be affected by the semantics of this (context arguments/variables). It's also related to the semantics of contexts combined with normal generator functions, partly because the situation is somewhat similar, and partly because we might want the same basic rules to apply in both situations.

(Side note: This also has to do with more obscure cases like multiple different async frameworks in the same process (or in the same program, or perhaps the same server, or even larger – whatever the constraints are). Any of the context propagation and isolation/leakage semantics I have described (or that I recall anyone else describing) could be implemented in the PEP 555 approach without any problems. The difference is just an if statement branch or two in C code.)

So, see below for some more discussion (it would be useful if some people could reply to this email and say if and why they agree or disagree with something below -- also non-experts that roughly understand what I'm talking about):

On Fri, Oct 13, 2017 at 6:49 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:

[..]

The bigger question is, what should happen when a coroutine awaits on another coroutine directly, without giving the framework a chance to interfere:

    async def inner():
        do_context_aware_stuff()

    async def outer():
        with first_context():
            coro = inner()

        with second_context():
            await coro

The big question is: In the above, which context should the coroutine be run in? "The" event loop does not have a chance to interfere, so we cannot delegate the decision.

Note that I did not write the above as what real code is expected to look like. It's just to underline the semantic difference that the context can change between the call and the await.

Indeed, one might say that people don't write code like that. And maybe async/await is still sufficiently young that one can sort of say "this is how we showed people how to do it, so that's how they'll do it" [*].

But let's make the above code just a little bit more complicated, so that it becomes easier to believe that the semantic difference here really matters, and cannot be hand-waved away:

    async def outer():
        with some_context():
            a = inner()
        with other_context():
            b = inner()
        await gather(a, b)  # execute coroutines a and b concurrently

If the coroutine function call, inner(), does not save a pointer to the current context at that point, then the code would just ignore the with statements completely and run both coroutines in the outer context, which is clearly not what an author of such code would want the code to do.

It is certainly possible to fix the problem by requiring wrapping the coroutines with stuff, but that would lead to nobody ever knowing what the semantics will be without checking if the coroutine has been wrapped or not. On the other hand, we could make the code *just work*, and that would be completely in line with what I've been promoting also as the semantics for generator functions in this thread:

https://mail.python.org/pipermail/python-ideas/2017-October/047279.html

I am definitely *not* talking about this kind of semantics because of something *I personally* need: in fact, I arrived at these thoughts because my designs for solving "my" original problem had turned into a more general-purpose mechanism (PEP 555) that would eventually also affect how code written by completely normal users of with statements and generator functions would behave. And certainly the situation is *very* similar to the case of coroutine functions, as (only?) Guido seems to acknowledge.

But then how to address "my" original problem where the context would propagate through awaits, and next/send?
From what others have written, it seems there are also other situations where that is desired. There are several ways to solve the problem as an extension to PEP 555, but below is one:
We need both versions: the one that propagates first_context() into the coroutine, and the one that propagates second_context() into it. Or, using my metaphor from the other thread, we need "both the forest and the trees".
A solution to this would be to have two types of context arguments:
1. (calling) context arguments
and
2. execution context arguments
So yes, I'm actually serious about this possibility. Now it would be up to library and framework authors to pick the right variant of the two. And this is definitely something that could be documented very clearly.
Both of these would have their own stack of (argument, value) assignment pairs, explained in the implementation part of the first PEP 555 draft. While this is a complication, the performance overhead of these is so small, that doubling the overhead should not be a performance concern. The question is, do we want these two types of stacks, or do we want to work around it somehow, for instance using context-local storage, implemented on top of the first kind, to implement something like the second kind. However, that again raises some issues of how to propagate the context-local storage down the ambiguous call chain.
This would also reduce the need to decorate and wrap generators and decorator functions, although in some cases that would still be needed.

If something was not clear, but seems relevant to what I'm trying to discuss here, please ask :)

––Koos

[*] Maybe it would not even be too late to make minor changes in the PEP 492 semantics of coroutine functions at this point if there was a good enough reason. But in fact, I think the current semantics might be perfectly fine, and I'm definitely not suggesting any changes to existing semantics here. Only extensions to the existing semantics.

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
On Sun, Oct 15, 2017 at 9:44 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So, see below for some more discussion between (it would be useful if some people could reply to this email and say if and why they agree or disagree with something below -- also non-experts that roughly understand what I'm talking about):
Yes, I understand what you are roughly talking about.

Also, yes, generators are co-routines [though when starting to work with generators, people don't fully realize this].

But then how to address "my" original problem where the context would propagate through awaits, and next/send? From what others have written, it seems there are also other situations where that is desired. There are several ways to solve the problem as an extension to PEP 555, but below is one:
We need both versions: the one that propagates first_context() into the coroutine, and the one that propagates second_context() into it. Or, using my metaphor from the other thread, we need "both the forest and the trees".
A solution to this would be to have two types of context arguments:
1. (calling) context arguments
and
2. execution context arguments
So yes, I'm actually serious about this possibility. Now it would be up to library and framework authors to pick the right variant of the two. And this is definitely something that could be documented very clearly.
This is an interesting idea. I would add that you also need:

3. Shared context: the generator shares the context with its caller, which means:

- If the caller changes the context, the generator sees the changed context the next time its __next__ function is called.
- If the generator changes the context, the caller sees the changed context.
- [This clearly makes changing the context using 'with' totally unusable in both the caller & the generator -- unless we add even odder semantics, that the generator restores the original context when it exits???]
- (As per a previous email by me, I claim this is the most natural way beginners are going to think it works, and needs to be supported; also, in real code this is not often useful.)
- I'm not sure if this would even work with async or not -- *IF* not, I would still have a syntax for the user to attempt this -- and throw a Syntax Error when they do, with a good explanation of why this combination doesn't work for async. I believe good explanations are a great way for people to learn which features can't be combined together & why.
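[Editor's note: a small runnable sketch of this "shared context" behaviour, using the contextvars module that later shipped in Python 3.7 (PEP 567) as a stand-in for the context machinery under discussion; a plain generator there runs in its caller's context, so changes are visible in both directions. The variable and function names are made up.]

    import contextvars

    var = contextvars.ContextVar("var", default=0)

    def gen():
        # The generator sees the caller's latest change...
        print("generator sees:", var.get())
        # ...and its own change leaks back out to the caller.
        var.set(2)
        yield

    var.set(1)
    g = gen()
    next(g)                              # prints "generator sees: 1"
    print("caller sees:", var.get())     # prints "caller sees: 2"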
If something was not clear, but seems relevant to what I'm trying to discuss here, please ask :)
I looked for your quote "we need both the forest & the trees", but didn't find it here. I quite strongly agree we need both; in fact we also need the third case I highlighted above.

As for what Guido wrote, that we might be trying to solve too many problems -- probably. However, these are real issues with contexts, not edge cases.

Thus, as for Guido writing that we don't want to allow yield within a 'with' clause (as it leaks context), I would argue two things:

- There are use cases where we *DO* want this -- rare, true, but they exist (i.e. my #3 above).
- IF, for simplicity's sake, it is decided not to handle this case now, then make it a syntax error in the language; i.e.:

    def f():
        with context() as x:
            yield 1

    Syntax error: 'yield' may not be used inside a 'with' clause.

This would really help new users not to make a mistake that takes hours to debug, & help correct their [initial mistaken] thinking on how contexts & generators interact.
On Sun, Oct 15, 2017 at 5:34 PM, Amit Green <amit.mixie@gmail.com> wrote:
On Sun, Oct 15, 2017 at 9:44 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So, see below for some more discussion between (it would be useful if some people could reply to this email and say if and why they agree or disagree with something below -- also non-experts that roughly understand what I'm talking about):
Yes, I understand what you are roughly talking about.
Also, yes, generators are co-routines [though when starting to work with generators, people don't fully realize this].
But then how to address "my" original problem where the context would
propagate through awaits, and next/send? From what others have written, it seems there are also other situations where that is desired. There are several ways to solve the problem as an extension to PEP 555, but below is one:
We need both versions: the one that propagates first_context() into the coroutine, and the one that propagates second_context() into it. Or, using my metaphor from the other thread, we need "both the forest and the trees".
A solution to this would be to have two types of context arguments:
1. (calling) context arguments
and
2. execution context arguments
So yes, I'm actually serious about this possibility. Now it would be up to library and framework authors to pick the right variant of the two. And this is definitely something that could be documented very clearly.
This is an interesting idea. I would add that you also need:

3. Shared context: the generator shares the context with its caller, which means:

- If the caller changes the context, the generator sees the changed context the next time its __next__ function is called.
- If the generator changes the context, the caller sees the changed context.
- [This clearly makes changing the context using 'with' totally unusable in both the caller & the generator -- unless we add even odder semantics, that the generator restores the original context when it exits???]
- (As per a previous email by me, I claim this is the most natural way beginners are going to think it works, and needs to be supported; also, in real code this is not often useful.)
- I'm not sure if this would even work with async or not -- *IF* not, I would still have a syntax for the user to attempt this -- and throw a Syntax Error when they do, with a good explanation of why this combination doesn't work for async. I believe good explanations are a great way for people to learn which features can't be combined together & why.
Just as a quick note, after skimming through your bullet points: All of this is indeed covered with decorators and other explicit mechanisms in the PEP 555 approach. I don't think we need syntax errors, though.
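[Editor's note: as an illustration of the kind of explicit, decorator-based opt-in being referred to, here is a hypothetical sketch (not PEP 555's actual API): a decorator that makes a generator run each of its steps in the context that was active when the generator object was created, with contextvars.copy_context() standing in for the capture primitive. The decorator name and the example variable are made up, and the driver below ignores send()/throw() for brevity.]

    import contextvars
    import functools

    def run_in_creation_context(genfunc):
        """Hypothetical opt-in: each step of the generator runs in the
        context captured when the generator object was created."""
        @functools.wraps(genfunc)
        def wrapper(*args, **kwargs):
            ctx = contextvars.copy_context()     # capture at creation time
            gen = genfunc(*args, **kwargs)
            def driver():
                while True:
                    try:
                        value = ctx.run(next, gen)   # step inside ctx
                    except StopIteration:
                        return
                    yield value
            return driver()
        return wrapper

    mode = contextvars.ContextVar("mode", default="default")

    @run_in_creation_context
    def report():
        while True:
            yield mode.get()

    mode.set("creation-time")
    g = report()
    mode.set("iteration-time")
    print(next(g))   # 'creation-time' -- the generator kept its creation context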
If something was not clear, but seems relevant to what I'm trying to discuss here, please ask :)
I looked for your quote "we need both the forest & the trees", but didn't find it here. I quite strongly agree we need both; in fact we also need the third case I highlighted above.
The ordering of the archive was indeed thoroughly destroyed. Ordering by date might help. But the quote you ask for is here: https://mail.python.org/pipermail/python-ideas/2017-October/047285.html –-Koos
As for what Guido wrote, that we might be trying to solve too many problems -- probably. However, these are real issues with contexts, not edge cases.
Thus Guido writing we don't want to allow yield within a 'with' clause (as it leaks context) .. I would argue two things:
- There are use cases where we *DO* want this -- rare -- true -- but they exist (i.e.: my #3 above)
- IF, for simplicity, sake, it is decided not to handle this case now; then make it a syntax error in the language; i.e.:
    def f():
        with context() as x:
            yield 1
Syntax error: 'yield' may not be used inside a 'with' clause.
This would really help new users not to make a mistake that takes hours to debug; & help correct their [initial mistaken] thinking on how contexts & generators interact.
-- + Koos Zevenhoven + http://twitter.com/k7hoven +