Long story short, I think we need to roll back our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:

    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())

But it would break if a user wrapped "await bar()" with "wait_for()":

    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())

Therefore, in the current (v4) version of the PEP, we made all coroutines have their own LC (just like generators), which makes both examples always raise an AssertionError. This makes it easier for async/await users to refactor their code: they simply cannot propagate EC changes up the call stack, hence any coroutine can be safely wrapped into a task.

Nathaniel and Stefan Krah argued on the mailing list that this change in semantics makes the PEP harder to understand. Essentially, context changes propagate up the call stack for regular code, but not for asynchronous code. For regular code the PEP behaves like TLS, but for asynchronous code it behaves like dynamic scoping.

IMO, on its own, this argument is not strong enough to roll back to the older PEP 550 semantics, but I think I've discovered a stronger one: asynchronous context managers. With the current version (v4) of the PEP, it's not possible to set context variables in __aenter__ or in @contextlib.asynccontextmanager:

    class Foo:
        async def __aenter__(self):
            context_var.set('aaa')  # won't be visible outside of __aenter__

So I guess we have no choice other than reverting this spec change for coroutines. The very first example in this email should start working again.
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.

BTW, on the topic of dynamic scoping: context manager protocols (both sync and async) are the fundamental reason why we couldn't implement dynamic scoping in Python even if we wanted to. With classical dynamic scoping in a functional language, __enter__ would have its own scope, which the code in the 'with' block would never be able to access.

Thus I propose to stop associating PEP 550 concepts with (dynamic) scoping.

Thanks,
Yury
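The two opening examples can be made runnable with the contextvars module that later shipped in Python 3.7 (PEP 567), whose semantics match what Yury argues for here: a plain "await bar()" shares the caller's context, while wait_for() wraps bar() in a Task that runs in a *copy* of the context. This is a sketch using that later API in place of the draft new_context_var():

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default=None)

async def bar():
    var.set(42)

async def foo_direct():
    await bar()
    return var.get()   # 42: the change propagated up the "call stack"

async def foo_wrapped():
    await asyncio.wait_for(bar(), 1)
    return var.get()   # None: bar() ran in a Task with a copied context

direct = asyncio.run(foo_direct())
wrapped = asyncio.run(foo_wrapped())
```

With these semantics, the first example works and the wait_for() variant keeps the change isolated inside the implicit Task.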
On 29 August 2017 at 07:24, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.
I'm not sure if it was Nathaniel or Stefan that raised it, but I liked the analogy that compared wrapping a coroutine in a new Task to submitting a synchronous function call to a concurrent.futures executor: while the dispatch layer is able to pass down a snapshot of the current execution context to the submitted function or wrapped coroutine, the potential for concurrent execution of multiple activities using that same context means the dispatcher *can't* reliably pass back any changes that the child context makes.

For example, consider the following:

    def func():
        with make_executor() as executor:
            fut1 = executor.submit(other_func1)
            fut2 = executor.submit(other_func2)
            result1 = fut1.result()
            result2 = fut2.result()

    async def coro():
        fut1 = asyncio.ensure_future(other_coro1())
        fut2 = asyncio.ensure_future(other_coro2())
        result1 = await fut1
        result2 = await fut2

For both of these cases, it shouldn't matter which order we use to wait for the results, or if we perform any other operations in between, and the only way to be sure of that outcome is if the dispatch operations (whether that's asyncio.ensure_future or executor.submit) prevent reverse propagation of context changes from the operation being dispatched.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
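Nick's executor analogy can be demonstrated with the contextvars API that later shipped in Python 3.7: a snapshot of the caller's context is passed *down* to the submitted function via `Context.run()`, but changes the worker makes land in that snapshot only and never propagate back to the caller. A minimal sketch:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

var = contextvars.ContextVar("var", default="outer")

def worker():
    var.set("inner")   # mutates only the snapshot it runs in
    return var.get()

def func():
    var.set("set-in-func")
    ctx = contextvars.copy_context()   # snapshot passed down to the worker
    with ThreadPoolExecutor() as executor:
        fut = executor.submit(ctx.run, worker)
        seen_in_worker = fut.result()
    # The worker's change did not propagate back to the caller.
    return seen_in_worker, var.get()

seen, after = func()
```

The worker observes its own change, while the submitting function's context is untouched afterwards, which is exactly the one-way propagation Nick describes.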
On Mon, 28 Aug 2017 17:24:29 -0400 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Long story short, I think we need to rollback our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())
But it would break if a user wrapped "await bar()" with "wait_for()":
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())
[...]
So I guess we have no other choice other than reverting this spec change for coroutines. The very first example in this email should start working again.
What about the second one? Why wouldn't the bar() coroutine inherit the LC at the point it's instantiated (i.e. where the synchronous bar() call is done)?
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.
Hmm, sorry for being a bit slow, but I'm not sure what this sentence implies. How is the user supposed to know whether something will be wrapped into a Task (short of being an expert in asyncio internals perhaps)? Actually, if you could whip up an example of what you mean here, it would be helpful I think :-)
Thus I propose to stop associating PEP 550 concepts with (dynamic) scoping.
Agreed that dynamic scoping is a red herring here.

Regards
Antoine.
On Tue, Aug 29, 2017 at 2:40 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 28 Aug 2017 17:24:29 -0400 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Long story short, I think we need to rollback our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())
But it would break if a user wrapped "await bar()" with "wait_for()":
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())
[...]
So I guess we have no other choice other than reverting this spec change for coroutines. The very first example in this email should start working again.
What about the second one?
Just to be clear: in the next revision of the PEP, the first example will work without an AssertionError; the second example will keep raising an AssertionError.
Why wouldn't the bar() coroutine inherit the LC at the point it's instantiated (i.e. where the synchronous bar() call is done)?
We want tasks to have their own isolated contexts. When a task is started, it runs its code in parallel with its "parent" task. We want each task to have its own isolated EC (the OS thread/TLS vs async task/EC analogy), otherwise the EC of "foo()" will be randomly changed by the tasks it spawned.

wait_for() in the above example creates an asyncio.Task implicitly, and that's why we don't see 'var' changed to '42' in foo().

This is a slightly complicated case, but it's addressable with good documentation and recommended best practices.
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.
Hmm, sorry for being a bit slow, but I'm not sure what this sentence implies. How is the user supposed to know whether something will be wrapped into a Task (short of being an expert in asyncio internals perhaps)?
Actually, if you could whip up an example of what you mean here, it would be helpful I think :-)
__aenter__ won't ever be wrapped in a task because it's called by the interpreter.

    var = new_context_var()

    class MyAsyncCM:
        async def __aenter__(self):
            var.set(42)

    async with MyAsyncCM():
        assert var.get() == 42

The above snippet will always work as expected. We'll update the PEP with a thorough explanation of all these nuances in the semantics.

Yury
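A runnable version of Yury's snippet, using the contextvars module that later shipped in Python 3.7 instead of the draft new_context_var() API: because __aenter__ is awaited directly by the interpreter (never wrapped in a Task), its context change is visible inside the "async with" block.

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default=None)

class MyAsyncCM:
    async def __aenter__(self):
        var.set(42)   # visible to the body of the "async with" block
        return self

    async def __aexit__(self, *exc):
        return False

async def main():
    async with MyAsyncCM():
        return var.get()

result = asyncio.run(main())
```

This is the behaviour the thread converges on: no implicit Task means no context isolation between __aenter__ and the block it guards.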
Le 29/08/2017 à 21:18, Yury Selivanov a écrit :
On Tue, Aug 29, 2017 at 2:40 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 28 Aug 2017 17:24:29 -0400 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Long story short, I think we need to rollback our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())
But it would break if a user wrapped "await bar()" with "wait_for()":
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())
[...]
Why wouldn't the bar() coroutine inherit the LC at the point it's instantiated (i.e. where the synchronous bar() call is done)?
We want tasks to have their own isolated contexts. When a task is started, it runs its code in parallel with its "parent" task.
I'm sorry, but I don't understand what it all means. To pose the question differently: why is example #1 supposed to be different, philosophically, than example #2? Both spawn a coroutine, both wait for its execution to end. There is no reason that adding a wait_for() intermediary (presumably because the user wants to add a timeout) would significantly change the execution semantics of bar().
wait_for() in the above example creates an asyncio.Task implicitly, and that's why we don't see 'var' changed to '42' in foo().
I don't understand why a non-obvious behaviour detail (the fact that wait_for() creates an asyncio.Task implicitly) should translate into a fundamental difference in observable behaviour. I find it counter-intuitive and error-prone.
This is a slightly complicated case, but it's addressable with a good documentation and recommended best practices.
It would be better addressed with consistent behaviour that doesn't rely on specialist knowledge, though :-/
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.
Hmm, sorry for being a bit slow, but I'm not sure what this sentence implies. How is the user supposed to know whether something will be wrapped into a Task (short of being an expert in asyncio internals perhaps)?
Actually, if you could whip up an example of what you mean here, it would be helpful I think :-)
__aenter__ won't ever be wrapped in a task because its called by the interpreter.
    var = new_context_var()

    class MyAsyncCM:
        async def __aenter__(self):
            var.set(42)

    async with MyAsyncCM():
        assert var.get() == 42
The above snippet will always work as expected.
Uh... So I really don't understand what you meant above when you wrote:

"""
This means that PEP 550 will have a caveat for async code: don't rely on context propagation up the call stack, unless you are writing __aenter__ and __aexit__ that are guaranteed to be called without being wrapped into a Task.
"""

To ask the question again: can you showcase how and where the "caveat" applies?

Regards
Antoine.
On Tue, Aug 29, 2017 at 3:32 PM, Antoine Pitrou <antoine@python.org> wrote:
Le 29/08/2017 à 21:18, Yury Selivanov a écrit :
On Tue, Aug 29, 2017 at 2:40 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 28 Aug 2017 17:24:29 -0400 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Long story short, I think we need to rollback our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())
But it would break if a user wrapped "await bar()" with "wait_for()":
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())
[...]
Why wouldn't the bar() coroutine inherit the LC at the point it's instantiated (i.e. where the synchronous bar() call is done)?
We want tasks to have their own isolated contexts. When a task is started, it runs its code in parallel with its "parent" task.
I'm sorry, but I don't understand what it all means.
To pose the question differently: why is example #1 supposed to be different, philosophically, than example #2? Both spawn a coroutine, both wait for its execution to end. There is no reason that adding a wait_for() intermediary (presumably because the user wants to add a timeout) would significantly change the execution semantics of bar().
I see your point.

The currently published version of the PEP (v4) fixes this by saying: each coroutine has its own LC. Therefore, "var.set(42)" cannot be visible to the code that calls "bar()". And therefore, "await wait_for(bar())" and "await bar()" work the same way with regards to execution context semantics.

*Unfortunately*, while this makes the above examples work the same way, setting context vars in "__aenter__" stops working:

    class MyAsyncCM:
        async def __aenter__(self):
            var.set(42)

    async with MyAsyncCM():
        assert var.get() == 42

Because __aenter__ has its own LC, the code wrapped in "async with" will not see the effect of "var.set(42)"!

This absolutely needs to be fixed, and the only way (that I know of) it can be fixed is to revert the "every coroutine has its own LC" statement (going back to the semantics coroutines had in PEP 550 v2 and v3).
wait_for() in the above example creates an asyncio.Task implicitly, and that's why we don't see 'var' changed to '42' in foo().
I don't understand why a non-obvious behaviour detail (the fact that wait_for() creates an asyncio.Task implicitly) should translate into a fundamental difference in observable behaviour. I find it counter-intuitive and error-prone.
"await bar()" and "await wait_for(bar())" are actually quite different. Let me illustrate with an example:

    b1 = bar()
    # bar() is not running yet
    await b1

    b2 = wait_for(bar())
    # bar() was wrapped into a Task and is running right now
    await b2

Usually this difference is subtle, but in asyncio it's perfectly fine to never await on b2 and just let it run until it completes. If you don't "await b1", b1 will simply never run.

All in all, we can't say that "await bar()" and "await wait_for(bar())" are equivalent. The former runs bar() synchronously within the coroutine that awaits it. The latter runs bar() in a completely separate and detached task, in parallel to the coroutine that spawned it.
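The coroutine-vs-Task difference Yury describes is observable in modern asyncio. In this sketch, a plain coroutine object does nothing until awaited, while ensure_future() schedules the wrapped coroutine immediately (ensure_future stands in for the Task-wrapping that wait_for performs internally):

```python
import asyncio

log = []

async def bar():
    log.append("bar ran")

async def main():
    b1 = bar()                         # plain coroutine: not running yet
    t2 = asyncio.ensure_future(bar())  # Task: scheduled to run immediately
    await asyncio.sleep(0)             # yield to the loop so t2 can run
    ran_without_await = list(log)      # only the Task has run so far
    await b1                           # b1 runs only once it is awaited
    return ran_without_await, list(log)

before, after = asyncio.run(main())
```

After one trip through the event loop the Task has already executed, while the bare coroutine waits for its await.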
This is a slightly complicated case, but it's addressable with a good documentation and recommended best practices.
It would be better addressed with consistent behaviour that doesn't rely on specialist knowledge, though :-/
I agree. But I don't see any other solution that would solve the problem *and* satisfy the following requirements:

1. Context variables set in "CM.__aenter__" and "CM.__aexit__" should be visible to code that is wrapped in "async with CM()".

2. Tasks must have isolated contexts -- changes that coroutines make to the EC in one Task should not be visible to other Tasks.

Yury
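Requirement 2, task isolation, is directly checkable with the contextvars API that later shipped in Python 3.7: each Task gets a copy of the context at creation, so sibling tasks never see each other's changes, and the parent is untouched. A minimal sketch:

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="parent")
results = {}

async def child(name, value):
    var.set(value)
    await asyncio.sleep(0)          # give the sibling a chance to run
    results[name] = var.get()       # each task still sees only its own value

async def main():
    t1 = asyncio.ensure_future(child("t1", "a"))
    t2 = asyncio.ensure_future(child("t2", "b"))
    await t1
    await t2
    return var.get()                # the parent's context is unchanged

parent_value = asyncio.run(main())
```

Even though both children mutate the same ContextVar concurrently, neither leaks its change into the other or into the spawning coroutine.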
On Tue, Aug 29, 2017 at 12:32 PM, Antoine Pitrou <antoine@python.org> wrote:
Le 29/08/2017 à 21:18, Yury Selivanov a écrit :
On Tue, Aug 29, 2017 at 2:40 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 28 Aug 2017 17:24:29 -0400 Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Long story short, I think we need to rollback our last decision to prohibit context propagation up the call stack in coroutines. In PEP 550 v3 and earlier, the following snippet would work just fine:
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics

    run_until_complete(foo())
But it would break if a user wrapped "await bar()" with "wait_for()":
    var = new_context_var()

    async def bar():
        var.set(42)

    async def foo():
        await wait_for(bar(), 1)
        assert var.get() == 42  # AssertionError !!!

    run_until_complete(foo())
[...]
Why wouldn't the bar() coroutine inherit the LC at the point it's instantiated (i.e. where the synchronous bar() call is done)?
We want tasks to have their own isolated contexts. When a task is started, it runs its code in parallel with its "parent" task.
I'm sorry, but I don't understand what it all means.
To pose the question differently: why is example #1 supposed to be different, philosophically, than example #2? Both spawn a coroutine, both wait for its execution to end. There is no reason that adding a wait_for() intermediary (presumably because the user wants to add a timeout) would significantly change the execution semantics of bar().
wait_for() in the above example creates an asyncio.Task implicitly, and that's why we don't see 'var' changed to '42' in foo().
I don't understand why a non-obvious behaviour detail (the fact that wait_for() creates an asyncio.Task implicitly) should translate into a fundamental difference in observable behaviour. I find it counter-intuitive and error-prone.
For better or worse, asyncio users generally need to be aware of the distinction between coroutines/Tasks/Futures and which functions create or return which -- it's essentially the same as the distinction between running some code in the current thread versus spawning a new thread to run it (and then possibly waiting for the result).

Mostly the docs tell you when a function converts a coroutine into a Task, e.g. if you look at the docs for 'ensure_future' or 'wait_for' or 'wait' they all say this explicitly. Or in some cases like 'gather' and 'shield', it's implicit because they take arbitrary futures, and creating a task is how you convert a coroutine into a future.

As a rule of thumb, I think it's accurate to say that any function that takes a coroutine object as an argument always converts it into a Task.
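Nathaniel's rule of thumb, that a function accepting a coroutine object wraps it in a Task, has visible context consequences under the semantics that later shipped as contextvars. Here gather() is used as the example wrapper: the implicit Task copies the context, so the change stays isolated from the caller.

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default=None)

async def setter():
    var.set("changed")

async def main():
    # gather() accepts coroutine objects and wraps each in a Task,
    # so setter()'s context change is confined to that Task.
    await asyncio.gather(setter())
    return var.get()

value = asyncio.run(main())
```

The caller still sees the default value afterwards, just as with the wait_for() example earlier in the thread.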
This is a slightly complicated case, but it's addressable with a good documentation and recommended best practices.
It would be better addressed with consistent behaviour that doesn't rely on specialist knowledge, though :-/
This is the core of the Curio/Trio critique of asyncio [1]: in asyncio, operations that implicitly initiate concurrent execution are all over the API. This is the root cause of asyncio's problems with buffering and backpressure, it makes it hard to shut down properly (it's hard to know when everything has finished running), it's related to the "spooky cancellation at a distance" issue where cancelling one task can cause another Task to get a cancelled exception, etc. If you use the recommended "high level" API for streams, then AFAIK it's still impossible to close your streams properly at shutdown (you can request that a close happen "sometime soon", but you can't tell when it's finished).

Obviously asyncio isn't going anywhere, so we should try to solve/mitigate these issues where we can, but asyncio's API fundamentally assumes that users will be very aware and careful about which operations create which kinds of concurrency. So I sort of feel like, if you can use asyncio at all, then you can handle wait_for creating a new LC.

-n

[1] https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-a...

--
Nathaniel J. Smith -- https://vorpus.org
Le 29/08/2017 à 21:59, Yury Selivanov a écrit :
This absolutely needs to be fixed, and the only way (that I know) it can be fixed is to revert the "every coroutine has its own LC" statement (going back to the semantics coroutines had in PEP 550 v2 and v3).
I completely agree with this. What I don't understand is why example #2 can't work the same.
"await bar()" and "await wait_for(bar())" are actually quite different. Let me illustrate with an example:
    b1 = bar()
    # bar() is not running yet
    await b1

    b2 = wait_for(bar())
    # bar() was wrapped into a Task and is running right now
    await b2
Usually this difference is subtle, but in asyncio it's perfectly fine to never await on b2, just let it run until it completes. If you don't "await b1" -- b1 simply will never run.
Perhaps... But still, why doesn't bar() inherit the LC *at the point where it was instantiated* (i.e. foo()'s LC in the examples)? The fact that it's *later* passed to wait_for() shouldn't matter, right? Or should it?

Regards
Antoine.
On Tue, Aug 29, 2017 at 12:59 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
    b2 = wait_for(bar())
    # bar() was wrapped into a Task and is running right now
    await b2
Ah.... not quite. wait_for is itself implemented as a coroutine, so it doesn't spawn off bar() into its own task until you await b2. Though according to the docs you should pretend that you don't know whether wait_for returns a coroutine or a Future, so what you said would also be a conforming implementation.

-n

--
Nathaniel J. Smith -- https://vorpus.org
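Nathaniel's correction, that wait_for() is itself a coroutine and nothing is spawned until it is awaited, can be observed directly (this still holds in current asyncio, though as he notes the docs treat it as an implementation detail):

```python
import asyncio

log = []

async def bar():
    log.append("ran")

async def main():
    c = asyncio.wait_for(bar(), 1)  # creates a coroutine; nothing scheduled yet
    await asyncio.sleep(0)          # yield to the loop
    before_await = list(log)        # still empty: bar() has not been spawned
    await c                         # now wait_for wraps bar() in a Task
    return before_await, list(log)

before_await, after_await = asyncio.run(main())
```

Calling wait_for() merely builds a coroutine object; only awaiting it triggers the Task wrapping and runs bar().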
On Tue, Aug 29, 2017 at 4:10 PM, Antoine Pitrou <antoine@python.org> wrote: [..]
"await bar()" and "await wait_for(bar())" are actually quite different. Let me illustrate with an example:
    b1 = bar()
    # bar() is not running yet
    await b1

    b2 = wait_for(bar())
    # bar() was wrapped into a Task and is running right now
    await b2
Usually this difference is subtle, but in asyncio it's perfectly fine to never await on b2, just let it run until it completes. If you don't "await b1" -- b1 simply will never run.
Perhaps... But still, why doesn't bar() inherit the LC *at the point where it was instantiated* (i.e. foo()'s LC in the examples)? The fact that it's *later* passed to wait_for() shouldn't matter, right? Or should it?
bar() will inherit the lookup chain. Two examples:

1)

    gvar = new_context_var()
    var = new_context_var()

    async def bar():
        # EC = [current_thread_LC_copy, Task_foo_LC]
        var.set(42)
        assert gvar.get() == 'aaaa'

    async def foo():
        # EC = [current_thread_LC_copy, Task_foo_LC]
        gvar.set('aaaa')
        await bar()
        assert var.get() == 42  # with previous PEP 550 semantics
        assert gvar.get() == 'aaaa'

    # EC = [current_thread_LC]
    run_until_complete(foo())  # Task_foo

2)

    gvar = new_context_var()
    var = new_context_var()

    async def bar():
        # EC = [current_thread_LC_copy, Task_foo_LC_copy, Task_wait_for_LC]
        var.set(42)
        assert gvar.get() == 'aaaa'

    async def foo():
        # EC = [current_thread_LC_copy, Task_foo_LC]
        # bar() is wrapped into Task_wait_for implicitly
        await wait_for(bar(), 1)
        assert gvar.get() == 'aaaa'  # OK
        assert var.get() == 42       # AssertionError !!!

    # EC = [current_thread_LC]
    run_until_complete(foo())  # Task_foo

The key difference: in example (1), bar() will have the LC of the Task that runs foo(). Both "foo()" and "bar()" will *share* the same LC; that's why foo() will see changes made in bar(). In example (2), bar() will have the LC of the wait_for() task, and foo() will have a different LC.

Yury
Le 29/08/2017 à 22:20, Yury Selivanov a écrit :
2)
gvar = new_context_var() var = new_context_var()
async def bar(): # EC = [current_thread_LC_copy, Task_foo_LC_copy, Task_wait_for_LC]
Ah, thanks!... That explains things, though I don't expect most users to spontaneously infer this and its consequences from the fact that they used "wait_for()".

This seems actually even more problematic, because if bar() can mutate Task_wait_for_LC, it may unwillingly affect wait_for() (assuming the wait_for() implementation may some day use the EC for whatever purpose, e.g. logging).

It seems framework code like wait_for() should have a way to override the default behaviour and remove their own LCs from "child" coroutines' lookup chains. Perhaps the PEP already allows for this?

Regards
Antoine.
On Tue, Aug 29, 2017 at 4:33 PM, Antoine Pitrou <antoine@python.org> wrote:
Le 29/08/2017 à 22:20, Yury Selivanov a écrit :
2)
    gvar = new_context_var()
    var = new_context_var()

    async def bar():
        # EC = [current_thread_LC_copy, Task_foo_LC_copy, Task_wait_for_LC]
Ah, thanks!... That explains things, though I don't expect most users to spontaneously infer this and its consequences from the fact that they used "wait_for()".
Yeah, we use "# EC=" comments in the PEP to explain how EC is implemented for generators (in the Detailed Specification section), and will now do the same for coroutines (in the next update).
This seems actually even more problematic, because if bar() can mutate Task_wait_for_LC, it may unwillingly affect wait_for() (assuming the wait_for() implementation may some day use EC for whatever purpose, e.g. logging).
In general the pattern is to wrap the passed coroutine into a Task and then attach some callbacks to it (or wrap the coroutine into another coroutine). So while I understand the concern, I can't immediately come up with a realistic example...
It seems framework code like wait_for() should have a way to override the default behaviour and remove their own LCs from "child" coroutines' lookup chains. Perhaps the PEP already allows for this?
Yes, the PEP provides enough APIs to implement any semantics we want. We might want to add "execution_context" kwarg to "asyncio.create_task" to make this customization of EC easy for Tasks. Yury
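The customization Yury gestures at is expressible with the contextvars API that later shipped: framework code can create the Task from inside a Context of its choosing, since a Task copies the current context at creation time. In this sketch an *empty* Context is used, so the wrapped coroutine does not see the caller's values at all. (Python 3.11 later added an explicit `context` keyword to asyncio.create_task for the same purpose.)

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="default")

async def child():
    return var.get()

async def main():
    var.set("parent")
    # A fresh Context contains none of the parent's values; creating the
    # Task from inside it means the Task's implicit context copy is empty.
    empty = contextvars.Context()
    t = empty.run(asyncio.ensure_future, child())
    return await t

value = asyncio.run(main())
```

The child sees the variable's default rather than the parent's "parent" value, which is the "remove our own LC from the child's lookup chain" behaviour Antoine asked about.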
participants (5)
- Antoine Pitrou
- Antoine Pitrou
- Nathaniel Smith
- Nick Coghlan
- Yury Selivanov