Fill in missing contextvars/asyncio task support

With normal synchronous code you can use `contextvars.Context.run()` to change the context that code executes in. However, there is no analogous concept for asyncio code. I'm proposing something similar, for example:
    coro = foo()
    my_context = contextvars.Context()
    await asyncio.run_in_context(my_context, coro)
Currently the workaround is to run the coroutine in a separate task:
    coro = foo()
    my_context = contextvars.Context()
    await my_context.run(asyncio.create_task, coro)
However, this is not exactly the same, as the task will inherit a copy of my_context rather than running directly on my_context. Additionally (obviously) it will also be running in a separate task.
Similarly, it would be nice if create_task and the Task constructor could take an optional context kwarg to use as the task context rather than the default of copying the calling context.
Pull request with sample implementation (although I think it is missing the change to create_task): https://github.com/python/cpython/pull/26664
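The difference between `Context.run()` and the task-based workaround can be seen in a small sketch. The variable `var` and the helpers `set_sync` and `set_async` are made up for illustration; only documented `contextvars` and `asyncio` calls are used.

```python
import asyncio
import contextvars

# Illustrative variable; not part of the original proposal.
var = contextvars.ContextVar("var", default="unset")


def set_sync(value):
    var.set(value)


async def set_async(value):
    var.set(value)
    return var.get()


async def main():
    ctx = contextvars.Context()

    # Synchronous case: the function runs directly in ctx, so the change sticks.
    ctx.run(set_sync, "sync")
    seen_after_sync = ctx.run(var.get)

    # Workaround: create_task() is called while ctx is current, but with no
    # explicit context the new task runs in a *copy* of the current context.
    task = ctx.run(asyncio.create_task, set_async("async"))
    seen_in_task = await task
    seen_after_task = ctx.run(var.get)

    return seen_after_sync, seen_in_task, seen_after_task


result = asyncio.run(main())
print(result)  # ('sync', 'async', 'sync')
```

The last element stays `'sync'`: the value set inside the task landed in the task's copy, not in `ctx` itself.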

On Sun, Jun 20, 2021 at 3:13 PM Mark Gordon <msg555@gmail.com> wrote:
> However this is not exactly the same as the task will inherit a copy of my_context rather than running directly on my_context.
Yeah, it would indeed inherit the copy. We could, theoretically, make asyncio.Task accept context objects and not copy them, but what would that give us? If a coroutine awaits on some code that calls `copy_context()` internally, you will not be able to observe the modifications that code makes to its forked context. *Ultimately, contextvars enable implicit flow of information from outer code to nested code and not vice versa.*
> Additionally (obviously) it will also be running in a separate task.
There's no way around that, unfortunately. Even if we add some kind of helper to run coroutines in a context, there will still be a task object that iterates the coroutine.
I guess we can add a keyword argument to asyncio.create_task() for that. It is an open question whether the task factory would just use the passed context object or would copy it first. I'm leaning towards the latter.
Yury
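The one-way flow described here can be illustrated with synchronous code alone. This is a minimal sketch; `var`, `nested` and `outer` are illustrative names, and `nested` stands in for any library code that forks the context internally.

```python
import contextvars

var = contextvars.ContextVar("var", default="default")


def nested():
    # Sees the value set by the outer code...
    seen = var.get()
    # ...but this change only lands in the forked context.
    var.set("set by nested")
    return seen


def outer():
    var.set("set by outer")
    # Stand-in for code that calls copy_context() internally.
    seen_by_nested = contextvars.copy_context().run(nested)
    return seen_by_nested, var.get()


result = contextvars.Context().run(outer)
print(result)  # ('set by outer', 'set by outer')
```

The outer value reaches the nested code, while the nested assignment never becomes visible to the caller.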

My main thinking was to just be similar to the closest synchronous analog I'm aware of, contextvars.Context.run. I would think an explanation of why the Context object API exists as it does, letting you manipulate and run in contexts directly, would equally motivate the async analogs. Maybe the exception would be if this API exists purely to support async tasks (then maybe it should be private?). At any rate, the issue attached to the pull request gives one example: asyncio tests that provide fixture data through an existing context object. I could also imagine a use case of wanting to track the number of database requests made within a logical request that may span multiple tasks. Having the subtasks inherit the same context could help with this.
> *Ultimately, contextvars enable implicit flow of information from outer code to nested code and not vice versa.*
Just to clarify, are you stating an established position of the Python community or is this your personal view of how context vars should be used?
I was just pointing out that the stated workaround requires creating an additional task to run the called coroutine rather than running directly in the calling task.
My vote would be for not copying, as mentioned above. Having an asyncio.run_in_context(context, coro()) API is more important, as this feature is currently completely missing. So I'm happy to table this if we can't decide on whether, and with what semantics, this task kwarg API change should be made.
-Mark

On Mon, Jun 21, 2021 at 7:20 PM Mark Gordon <msg555@gmail.com> wrote:
To track things like database requests just put a mutable object in the context somewhere at the top level, referenced by a well-known contextvar in your code. That single object will be part of all contexts derived from the top one throughout the application lifecycle.
I'm stating this as the original contextvars PEP author and implementer. I don't see how the reverse flow is possible to implement within all restrictions of the design space. To propagate information from nested calls to outer calls just use mutable objects, as I outlined above.
Yes, and I'm saying that running a coroutine "directly on the stack" within some context is not technically possible without wrapping it into a Task.
Is asyncio.run_in_context() a version of asyncio.run() or a shortcut for Context.run(asyncio.create_task, coro)?
Yury
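The mutable-object approach suggested above might look like the following sketch. `RequestStats`, `stats_var`, `run_query` and `handle_request` are hypothetical names chosen for illustration, not part of any library.

```python
import asyncio
import contextvars


class RequestStats:
    """Mutable per-request bookkeeping object (hypothetical)."""

    def __init__(self):
        self.db_queries = 0


# Well-known context variable holding the current request's stats.
stats_var = contextvars.ContextVar("stats_var")


async def run_query():
    # Every context derived from the request's context refers to the
    # same RequestStats instance, so the increment is visible everywhere.
    stats_var.get().db_queries += 1
    await asyncio.sleep(0)


async def handle_request():
    stats = RequestStats()
    stats_var.set(stats)
    # Each subtask runs in a copy of this context; the copy is shallow,
    # so it still points at the same stats object.
    subtasks = [asyncio.create_task(run_query()) for _ in range(3)]
    await asyncio.gather(*subtasks)
    await run_query()
    return stats.db_queries


total = asyncio.run(handle_request())
print(total)  # 4
```

Copying a context copies the variable-to-value mapping, not the values themselves, which is why the count aggregates across subtasks.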

Yury Selivanov wrote:
Using mutable objects in contexts seems strange as it works against their copy semantics. I agree it will work in this use case.
Oh very cool, thanks for taking the time to look at this.
Is there something wrong with the solution of changing the context associated with a task before execution, yielding, running the coroutine, then swapping it back and yielding again when the coroutine exits? This is the hacky solution that appears to work for me on python 3.8 if I force use of _PyTask (and the linked PR extends this idea to work for the CTask implementation).
    import asyncio.tasks
    # Force the pure-Python Task so that task._context is writable.
    asyncio.tasks.Task = asyncio.tasks._PyTask
    import asyncio
    import contextvars
    var = contextvars.ContextVar('var', default=123)
    async def run_in_context(context, coro):
        task = asyncio.current_task()
        prev_context = task._context
        task._context = context
        # Yield so the next step of this task runs in the new context.
        await asyncio.sleep(0)
        try:
            return await coro
        finally:
            task._context = prev_context
            # Yield again so the restored context takes effect.
            await asyncio.sleep(0)
    async def foo(x):
        old_val = var.get()
        var.set(x)
        return old_val
    async def main():
        var.set(5)
        context = contextvars.Context()
        print(await run_in_context(context, foo(555)))  # 123
        print(var.get())  # 5
        print(await run_in_context(context, foo(999)))  # 555
        print(var.get())  # 5
        print(context.run(var.get))  # 999
    asyncio.run(main())
Yeah, I mean it to be more or less the same except without creating a new task (which maybe is impossible?).
> Yury

On Tue, Jun 22 2021 at 04:40:45 AM -0000, Mark Gordon <msg555@gmail.com> wrote:
That's a feature :) Perhaps we should add an example to the docs.
There's not much wrong with this approach for simple coroutines. But if a coroutine runs its own tasks, or code that forks the context inside, you won't see those changes in your context. In other words, the hack you propose will work for some cases and fail for others. So that's why I'm -1 on changing the API that way.
Yury
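The failure case mentioned here does not depend on the hack itself; it follows from how tasks copy the context. A minimal sketch, with `var` and `set_var` as illustrative names:

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="initial")


async def set_var(value):
    var.set(value)


async def main():
    # Awaited directly: runs in the calling task's context, change is visible.
    await set_var("awaited directly")
    after_await = var.get()

    # Wrapped in a task: runs in a copy of the context, change is not visible.
    await asyncio.create_task(set_var("set in subtask"))
    after_subtask = var.get()

    return after_await, after_subtask


result = asyncio.run(main())
print(result)  # ('awaited directly', 'awaited directly')
```

Any coroutine that internally spawns tasks behaves like the second call, so a caller-supplied context only captures part of what the coroutine did.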

> That's a feature :) Perhaps we should add an example to the docs.
What do you view as the point of the copy semantics, then?
I thought you were against this usage pattern anyway. Not sure what this has to do with the proposed API change.
> In other words, the hack you propose will work for some cases, and fail for others.
It seems like you are assuming a purpose to this API that I did not intend. It really is just so you can change the context that a task is running within without creating a new task. It's really just meant to be an analog of Context.run.

On Wed, Jun 23 2021 at 05:16:00 PM -0000, Mark Gordon <msg555@gmail.com> wrote:
>> That's a feature :) Perhaps we should add an example to the docs.
> What do you view as the point of the copy semantics, then?
Are you asking why the context is using immutable data structures internally in the first place? Please read the PEP; it explains that in great detail. Short answer: performance; it has nothing to do with *what* is stored in the context. It can store mutable objects; that's totally OK for aggregating metrics, for example.
I'm +1 to add a 'context' keyword argument to 'asyncio.create_task()'. It will still be copied. I believe I've explained *why* the copy is still necessary in this thread.
Yury

I've read the PEP and understand what's implemented. However, there is pretty limited discussion about what the design constraints were and what intended/recommended usage would look like. I'll answer my own question:
1. If all we wanted was a version of TLS that worked in an analogous way, extending (synchronous code, threads) to (async code, tasks), then you don't need anything fancy; a simple dictionary backing the "context" will do. This is all that you need to solve the "decimal formatting" problem, for instance.
2. However, the scope of PEP 567 was increased to something greater. It was decided that we want tasks/threads to be able to inherit an existing context. This is a unique feature with no analog in TLS. I believe a motivating use case was a request/response server that may spawn worker tasks off the main task and want to store request context information in contextvars.
3. Additionally, for the "decimal formatting" solution to keep working correctly, it's necessary that no two tasks/threads are running on the same context. This means "inheriting" should mean "running on a copy of".
These constraints strongly suggest an interface of:
    contextvars.get_context() -> Context
    Context.run_in_copy(func) -> Context
    Context.async_run_in_copy(coro) -> Context
    *** No Context.run method, no copy methods needed either ***
So what was the motivation behind having a copy_context() and a non-copying Context.run method? It seems to break the third design constraint, allowing you to have multiple threads try to run on the same Context.
Nonetheless, "asyncio.run_in_context()" is a direct analog of "Context.run" and should be a clear add to the API from a symmetry point of view. We are already beyond the point where constraint three is being strictly enforced. I'm not sure what argument against this API wouldn't apply to "Context.run()" as well.
If it's still a -1 on "asyncio.run_in_context()", what about "asyncio.run_in_context_copy(context, coro) -> Context" that copies the passed context and runs the coroutine in the task using that context? If we go this route, maybe we would plan on deprecating Context.run and replacing it with a Context.run_in_copy method?
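The synchronous half of the interface sketched above can be approximated today with the existing `Context.copy()` and `Context.run()` methods. In the sketch below, `run_in_copy` is a hypothetical free function standing in for the proposed method, and `var` is an illustrative variable. The second part shows a documented property of the current API: `Context.run()` raises `RuntimeError` when the context is already entered.

```python
import contextvars

var = contextvars.ContextVar("var", default="unset")


def run_in_copy(ctx, func, *args, **kwargs):
    """Hypothetical helper: run func in a copy of ctx and return that copy."""
    forked = ctx.copy()
    forked.run(func, *args, **kwargs)
    return forked


base = contextvars.Context()
base.run(var.set, "base")

# The fork starts from base's values; its own changes stay in the fork.
forked = run_in_copy(base, var.set, "forked")
values = (base.run(var.get), forked.run(var.get))
print(values)  # ('base', 'forked')


def try_reenter():
    # base is already entered while this function runs inside base.run().
    try:
        base.run(var.get)
    except RuntimeError:
        return True
    return False


reentry_rejected = base.run(try_reenter)
print(reentry_rejected)  # True
```

Like the proposed signature, `run_in_copy` returns the forked context rather than the function's result; a real API would need to decide how to expose both.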

Participants (4): Mark Gordon, Paul Bryan, yselivanov.ml@gmail.com, Yury Selivanov