Letting context managers react to yields inside their scope

Hi all, Well, after a few days no-one has responded to my post on another thread about this [1], but the more I thought about it the more this seemed like a good idea, so I wrote up a little more-formal proposal (attached) for letting context managers react to 'yield's that occur within their 'with' block. This should in many ways be treated as a straw man proposal -- there's tons I don't know about how async code is written in Python these days -- but it seems like a good idea to me and I'd like to hear what everyone else thinks :-). -n [1] https://mail.python.org/pipermail/python-ideas/2015-April/033176.html -- Nathaniel J. Smith -- http://vorpus.org
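(For concreteness -- the failure mode the proposal targets can be reproduced today with nothing but the stdlib; a minimal sketch, no part of the proposal needed:)

    import decimal

    def thirds():
        # The local context is intended to apply only inside this generator...
        with decimal.localcontext() as ctx:
            ctx.prec = 2
            while True:
                yield decimal.Decimal(1) / decimal.Decimal(3)

    gen = thirds()
    print(next(gen))                                # 0.33 -- as intended
    # ...but while the generator sits suspended at its 'yield', the
    # modified context silently leaks into the caller:
    print(decimal.Decimal(1) / decimal.Decimal(3))  # 0.33, not 0.3333...!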

This seems reasonable, though mostly also non-urgent. Have you thought about how it interacts with PEP 492 yet? On Wed, Apr 29, 2015 at 1:22 PM, Nathaniel Smith <njs@pobox.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On Wed, Apr 29, 2015 at 1:36 PM, Guido van Rossum <guido@python.org> wrote:
This seems reasonable, though mostly also non-urgent. Have you thought about how it interacts with PEP 492 yet?
The interaction with PEP 492 is pretty trivial AFAICT; they're almost entirely orthogonal. The only points of interaction are:

1) PEP 492 adds new suspension points beyond "yield" and "yield from" (i.e., "await" plus the implicit suspension points in "async for" and "async with"). This proposal tweaks the behavior of suspension points (so that they call __suspend__/__resume__). So obviously the solution is that if PEP 492 is accepted then we have to document that yes, the new suspension points also trigger __suspend__/__resume__ calls in the obvious way, just like "yield" does.

2) PEP 492 defines a new asynchronous context manager protocol, which is "the same as the regular context manager protocol, except with the letter 'a' added and they're coroutines instead of regular methods". This proposal adds stuff to the regular context manager protocol, so one would want to make the analogous changes to the asynchronous context manager protocol too. It's not 100% clear to me whether the __asuspend__/__aresume__ versions should be coroutines or not, but this is a pretty simple question to resolve. (I see downthread that Yury thinks not, so okay, I guess not, done :-).)

So there aren't any real gotchas here AFAICT. -n
-- Nathaniel J. Smith -- http://vorpus.org
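(A sketch of what point 2) could look like in practice -- a hypothetical timer that only counts time its task spends actually running. Nothing here exists in any released Python; __asuspend__/__aresume__ are purely the proposed names, written as plain methods per the downthread discussion:)

    import time

    class AwaitAwareTimer:
        # Hypothetical: illustrates the shape of the proposed protocol only.
        def __init__(self):
            self.elapsed = 0.0
            self._t0 = None

        async def __aenter__(self):
            self._t0 = time.monotonic()
            return self

        async def __aexit__(self, *exc_info):
            self.elapsed += time.monotonic() - self._t0

        def __asuspend__(self):
            # Stop the clock while the task is parked at an await...
            self.elapsed += time.monotonic() - self._t0

        def __aresume__(self):
            # ...and restart it when the task is rescheduled, so 'elapsed'
            # measures time spent running, not time spent waiting.
            self._t0 = time.monotonic()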

Hi Nathaniel,

Sorry for not replying to your last email promptly. FWIW I was going to do it later today ;)

My opinion on this subject (and I've implemented lots of local-context kind of objects in different frameworks) is that *just* inserting some kind of suspend/resume points before/after yields does not work. Why:

1. Greenlets, gevent, eventlet, stackless python: they do not have 'yield'. Context switches are invisible to the interpreter. And those frameworks are popular. You want numpy.errstate/decimal.localcontext to work there too.

2. I'm curious how this will be implemented and what the performance impact will be.

3. I think the mechanism should be more generic than just 'with' statements. What if you want to have a decorator that applies some context? What if you can't write it as a generator with @contextlib.contextmanager?

What I would propose: I think that the right approach here would be to have a standard protocol; something similar to how we defined WSGI -- it doesn't require any changes in the interpreter, it is a protocol. For instance, we could have a module in the stdlib with a class Context:

    class Context:
        @classmethod
        def __suspend_contexts(cls):
            for child in cls.__subclasses__():
                child.__suspend_context()

        @classmethod
        def __resume_contexts(cls):
            ..

To inform context objects that the context is about to change, the asyncio event loop would call Context.__suspend_contexts().

I think that it's up to the event loop/library/framework to manage context switches properly. In asyncio, for instance, you can attach callbacks to Futures. I think it would be great if such callbacks were executed in the right context.

I would also suggest thinking about a universal context -- one that works both for asyncio coroutines and the thread pools they might use. This way any framework can implement the context in the most efficient way.

All in all, I'm in favor of an API and a couple of functions added to the stdlib for this.

Thanks,
Yury

On 2015-04-29 4:22 PM, Nathaniel Smith wrote:
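(A runnable adaptation of the protocol sketch above, with two deliberate deviations worth flagging: the method names here use a single leading underscore, because double-underscore names as written would be name-mangled per-class and break the subclass dispatch; and the FakeDecimalState subclass is purely illustrative:)

    class Context:
        @classmethod
        def suspend_contexts(cls):
            for child in cls.__subclasses__():
                child.suspend_context()

        @classmethod
        def resume_contexts(cls):
            for child in cls.__subclasses__():
                child.resume_context()

    class FakeDecimalState(Context):
        # Stands in for real framework state such as the decimal context.
        active = {"prec": 28}
        _stack = []

        @classmethod
        def suspend_context(cls):
            # Park the current task's state before the loop switches away.
            cls._stack.append(cls.active)
            cls.active = {"prec": 28}

        @classmethod
        def resume_context(cls):
            cls.active = cls._stack.pop()

    # An event loop would bracket every task switch like so:
    FakeDecimalState.active["prec"] = 4
    Context.suspend_contexts()       # before handing control elsewhere
    assert FakeDecimalState.active["prec"] == 28
    Context.resume_contexts()        # when the original task is rescheduled
    assert FakeDecimalState.active["prec"] == 4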

On 30 April 2015 at 07:01, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
+1 - I think we have reasonably good precedent for this in the form of atexit, and there's also a proposal for an "atfork" manager: https://bugs.python.org/issue16500

Decimal contexts and (to a much lesser degree) fpectl are also interesting use cases, as is the event loop policy management in asyncio.

One of the more interesting aspects of a context is its scope:

* Can the same context be safely accessed from multiple threads concurrently? At different points in time?
* Is there a "default context" which applies if no other context has been explicitly activated?

By defining a global metacontext, it becomes feasible to sensibly manage things like the active asyncio event loop policy, the decimal context, etc, in ways that multiple frameworks can interoperate with. It's also potentially something that the logging module, pdb, and other tools could integrate with, by being able to better report the active context when requested.

https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack is also an interesting data point, as I can easily envision a variant of that API model that remembers what contexts *were* on the stack (this becoming a "ContextStack", rather than an "ExitStack"), making it easier to switch *between* stacks, rather than throwing them away when you're done.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
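(A rough sketch of what that "ContextStack" variant could look like -- a hypothetical API, built on contextlib.ExitStack only to show the shape, and assuming the stacked context managers tolerate being re-entered:)

    from contextlib import ExitStack

    class ContextStack(ExitStack):
        """Hypothetical: remembers the context managers themselves, so a
        scheduler can leave one stack and later re-enter it, rather than
        throwing the contexts away when it exits them."""

        def __init__(self):
            super().__init__()
            self._cms = []

        def enter_context(self, cm):
            self._cms.append(cm)
            return super().enter_context(cm)

        def switch_out(self):
            self.close()              # exit everything, but keep the list

        def switch_in(self):
            for cm in self._cms:      # re-enter in the original order
                super().enter_context(cm)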

On Wed, Apr 29, 2015 at 6:53 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I've thought about these and other issues for quite some time. :) I have an old implementation of a generic context type kicking around that came out of my initial work on the "Import Engine" PEP. At the time it seemed genuinely useful and would have been used in decimal and maybe for email policies (in addition to import system contexts). I'm spinning back up on the successor to that PEP, which is part of why I brought up "async-local" contexts the other day.
+1
interesting. -eric

On 2015-04-29 8:53 PM, Nick Coghlan wrote:
* Can the same context be safely accessed from multiple threads concurrently? At different points in time?
In one of my context implementations (closed source) I monkey-patch the threading module to actually track when threads are started, to preserve the context information. Internally, the data storage is a tree, which branches with control points each time the context is modified. This way it's possible to see all data that was added to the context "on the way" to the currently executing point.
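(The tree-shaped storage described here maps fairly naturally onto collections.ChainMap, one chain per branch -- a toy sketch, not the closed-source implementation:)

    from collections import ChainMap

    root = ChainMap({"app": "demo"})

    def branch(ctx, **updates):
        # Each control point that modifies the context gets a child node;
        # lookups fall back through every ancestor on the way to the root.
        return ctx.new_child(dict(updates))

    request = branch(root, request_id="r-1")
    handler = branch(request, user="alice")
    print(handler["request_id"], handler["app"])   # r-1 demo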
* Is there a "default context" which applies if no other context has been explicitly activated?
In all implementations I myself wrote, I always had a root context. But I don't like this design, especially for something generic in the standard library, where you want access to this root object to be namespaced (i.e. numpy cannot conflict with decimal by using the same key). I'm OK with some central hidden root context though, exposed only via an API for event loops, pools, etc, to facilitate the context switching. Thanks, Yury

On Wed, Apr 29, 2015 at 2:01 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Very good observation. (I.e. I hadn't thought of this myself. :-)
I'm not sure about this, since currently those callbacks are how task scheduling happens. :-)
I'm curious -- how would you access the current contexts? -- --Guido van Rossum (python.org/~guido)

On 2015-04-29 11:28 PM, Guido van Rossum wrote:
I tried to find my implementation of context for asyncio (it was designed and tested to work with https://codereview.appspot.com/87940044/#ps1) but I couldn't. I'll need to spend some time to prototype this again. And not as a patch to asyncio, but as a generic thing. Yury

On Wed, Apr 29, 2015 at 3:01 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Good point.
Great idea. I've been thinking of something along those lines. I think there also needs to be a recognition of active vs. inactive context. Only active contexts would be used by the event loop/etc. A generic context would also make it easy to clone and to use in context managers. One approach:

    class Context:
        def activate(self): ...    # or open
        def deactivate(self): ...  # or close
        def clone(self): ...
        def __suspend__(self, cs, id): ...
        def __resume__(self, cs, id): ...
        def __enter__(self):
            self.activate()
            return self
        def __exit__(self, ...):
            self.deactivate()

    def active(ctx=Context):
        return ...

The various "context switchers" (event loop, thread pool, etc.) would then call __suspend__ and __resume__ on all the active contexts. The two args (cs for "context switcher") to those methods would allow each context object to adjust to the targeted "linear execution context" uniquely identified by the (cs, id) pair. For example: thread/async-local namespaces.

I've often found it to be cleaner to split the "spec" from the "instance", so:

    class InactiveContext:  # or GlobalContext or DefaultContext
        def activate(self): ...  # return a new cloned Context
        def clone(self): ...     # merge with activate?

    class Context:
        def deactivate(self): ...
        def clone(self): ...
        def __suspend__(self, cm, id): ...
        def __resume__(self, cm, id): ...
        def __enter__(self):
            return self
        def __exit__(self, ...):
            self.deactivate()

This is all rough and the result of divided attention over the last few hours, but the idea of leaving the switching up to the "context switcher" is the key thing. Also, I smell some overlapping concepts with existing types and patterns that could probably distill the idea of a generic Context type into something cleaner.

-eric
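(To make the division of labor concrete, a toy scheduler under this sketch -- everything here, the registry and the generator-based tasks alike, is an assumption for illustration:)

    active_contexts = []   # maintained by Context.activate()/deactivate()

    def run_round_robin(tasks):
        # The "context switcher" -- event loop, thread pool, etc. --
        # brackets every switch with __resume__/__suspend__ calls, passing
        # (cs, id) so each context can keep per-task state.
        tasks = list(enumerate(tasks))
        while tasks:
            task_id, task = tasks.pop(0)
            for ctx in active_contexts:
                ctx.__resume__("toy-loop", task_id)
            try:
                next(task)                  # run the task to its next yield
                tasks.append((task_id, task))
            except StopIteration:
                pass
            finally:
                for ctx in active_contexts:
                    ctx.__suspend__("toy-loop", task_id)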

On Wed, Apr 29, 2015 at 2:01 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Oh well, this way you got a nice clean thread to do it in instead :-)
Before I get into the details of your message, I just want to emphasize: there are clearly cases where inserting some kind of suspend/resume points before/after yields is not just a sufficient solution, but actually the only possible solution.

Consider a streaming data processing pipeline that uses a generator to load data from disk in chunks, and then a series of generators to process, split, filter, etc. this data before passing it on to the next generator in the stack. For certain problems this is 100% the best way to do it, generators are amazing. But right now you kinda... just cannot use certain kinds of 'with' blocks in such code without all kinds of weirdness ensuing. And coming up with fancy APIs for event loops to use won't help, because there are no event loops here at all. The problem here isn't caused by event loops; it's caused by the way adding coroutines to Python created a dissociation between a block's dynamic-scope-in-terms-of-lifetime and its dynamic-scope-in-terms-of-call-chain.

Obviously there may be other cases where you also need something else on top of this proposal (though I'm not 100% convinced... see below). But logically speaking, even if you're right that this proposal is not sufficient, that doesn't imply that it's not necessary :-).
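(The pipeline shape in question, spelled out -- note there is no event loop anywhere in this code, yet every 'yield' is a suspension point where an enclosing 'with' block's state escapes to the consumer:)

    def read_chunks(path, size=4096):
        with open(path, "rb") as f:     # harmless: a file handle is not
            while True:                 # ambient state, so nothing leaks
                chunk = f.read(size)
                if not chunk:
                    return
                yield chunk

    def decode(chunks):
        for chunk in chunks:
            yield chunk.decode("utf-8", errors="replace")

    def nonblank(pieces):
        for piece in pieces:
            if piece.strip():
                yield piece

    # for piece in nonblank(decode(read_chunks("data.txt"))): ...
    # Swap 'open' for decimal.localcontext() or numpy.errstate() in any
    # stage, and that stage's settings silently apply to all the other
    # stages and the consumer whenever it sits suspended at a 'yield'.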
First, this is the only argument you're giving for "why" yield callbacks are insufficient, right? (I'm a little confused by your formatting and want to make sure I'm not missing anything.) Also, just to check: the only actual library that we really have to worry about here is greenlets, right, b/c gevent and eventlet use greenlets to do the heavy lifting, and stackless is a whole-interpreter fork, so for them context switches are visible to the interpreter?

Now, as to the actual point, I have a few reactions:

a) The curmudgeonly reaction: Well, yes, the problem with those frameworks is that switches are invisible to both the interpreter and to user code, which makes it more difficult to manage shared state of all kinds, not just the very specific pseudo-dynamic-scoping state that we're talking about here. If you want a programming model where you can actually keep track of shared state, then maybe you should use something like asyncio instead :-).

b) Okay, fine, if you want it to just work anyway: any implementation of this proposal will require that the interpreter have some way to walk the "with block stack" (so to speak) and call the __suspend__ and __resume__ methods on demand. greenlets could also walk up that stack when suspending execution. (This would require the library to do a bit of nasty mucking about in the CPython internals, but c'mon, this is greenlets we're talking about. Compared to the other stuff it does this would hardly even register.) The one caveat here is that if we want this to be possible then it would require we skip statically optimizing out __suspend__/__resume__ checking in regular functions.

c) Backup plan: greenlets could provide an API like greenlets.register_suspend_resume_callbacks() / greenlets.unregister_suspend_resume_callbacks(), and context managers that wanted to handle both kinds of suspensions could define __suspend__/__resume__ and then do

    def __enter__(self):
        greenlets.register_suspend_resume_callbacks(self.__suspend__, self.__resume__)
        # ... actual code ...

    def __exit__(self, *exc_info):
        greenlets.unregister_suspend_resume_callbacks(self.__suspend__, self.__resume__)
        # ... actual code ...

and everything would AFAICT just work. Kinda annoying, but I'm pretty sure that there is and will always be exactly one greenlets library across the whole Python ecosystem, so it's not much of a limitation.
2. I'm curious how this will be implemented and what's the performance impact.
I am also curious about the performance impact :-). I don't see any reason to expect it to be large, but one never knows for sure until one has an implementation.

As for implementation, the most straightforward way I think would be to:

(1) add a variant of the SETUP_WITH bytecode that also looks up __suspend__ and __resume__ and pushes them onto the value stack, and then uses a unique b_type for its block stack entry,

(2) add a variant of YIELD_VALUE that walks the block stack looking for this unique b_type and calls __suspend__ when found (b_level points to where to find it on the value stack), plus maybe, as an optimization, setting a flag on the frame noting that __resume__ will be needed,

(3) tweak the resumption code so that it checks for this flag and if found walks the block stack calling __resume__,

(4) add a variant of WITH_CLEANUP that knows how to clean up this more complex stack,

(5) make sure that the unwind code knows to treat this new block type the same way it treats SETUP_FINALLY blocks.

That's based on like 30 minutes of thought starting from zero knowledge of how this part of ceval.c works, so knowing ceval.c there are certainly a few dragons lurking that I've missed. But as an initial high-level sketch does it answer your question?
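(The bytecode changes themselves can't be sketched in Python, but the intended observable behavior can. Below is a pure-Python approximation: 'managers' stands in for the with-blocks that would enclose the yields inside 'gen', and the innermost-first/outermost-last ordering is an assumption about what the block-stack walk would do:)

    def suspend_aware(gen, managers):
        # Approximate the proposal: fire __suspend__ each time 'gen'
        # yields, and __resume__ each time it is resumed.
        while True:
            try:
                value = next(gen)
            except StopIteration:
                return
            for m in reversed(managers):   # innermost with-block first
                m.__suspend__()
            yield value                    # 'gen' now sits suspended
            for m in managers:             # outermost first on the way back
                m.__resume__()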
I don't understand the problem here. A coroutine decorator already knows when the wrapped coroutine is suspended/resumed, so there's nothing to fix -- basically for decorators this proposal is already implemented :-).
What if you can't write it as a generator with @contextlib.contextmanager?
Here I just don't understand the sentence :-). Can you rephrase? Answering one possible question you might have intended: @contextmanager indeed is not useful for defining a context manager that has __suspend__ and __resume__ callbacks, but this is just a convenience utility. My suspicion is that these kinds of context managers are already sufficiently rare and tricky to write that having special utilities to make it easier is not that important, but if someone comes up with a nice API for doing so then cool, we could put that in contextlib.
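(Without any helper, the plain-class spelling is at least short. A sketch of one complete four-method context manager under the proposal -- __suspend__/__resume__ are the proposed methods, nothing that exists today -- using a symmetric swap so all four methods share one helper:)

    import decimal

    class local_precision:
        """Set decimal precision for this block's dynamic extent only --
        including across yields, under the proposed protocol."""

        def __init__(self, prec):
            self.prec = prec

        def _swap(self):
            ctx = decimal.getcontext()
            ctx.prec, self.prec = self.prec, ctx.prec

        def __enter__(self):
            self._swap()    # install our precision; remember the old one
            return self

        def __exit__(self, *exc_info):
            self._swap()    # restore the caller's precision

        def __suspend__(self):
            self._swap()    # proposed: give the caller its state back at 'yield'

        def __resume__(self):
            self._swap()    # proposed: reinstall ours when resumed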
I can't see how to usefully implement, well, anything, using this interface as given. Maybe you mean something like

    class Context:
        @classmethod
        def switch_contexts(cls, old_ctx_id, new_ctx_id):
            for child in cls.__subclasses__():
                child.switch_contexts(old_ctx_id, new_ctx_id)

        @classmethod
        def destroy_context(cls, ctx_id):
            for child in cls.__subclasses__():
                child.destroy_context(ctx_id)

? Or a better approach that avoids having to generate and keep track of arbitrary identifiers would be:

    class Context:
        @classmethod
        def new_context(cls):
            return {subcls: subcls.new_context() for subcls in cls.__subclasses__()}

        @classmethod
        def get_context(cls):
            return {subcls: subcls.get_context() for subcls in cls.__subclasses__()}

        @classmethod
        def set_context(cls, context):
            for subcls in cls.__subclasses__():
                subcls.set_context(context[subcls])

And then you could use it like:

    class myframework_context:
        def __enter__(self):
            self.old_context = Context.get_context()
            self.context = Context.new_context()
            Context.set_context(self.context)

        def __suspend__(self):
            Context.set_context(self.old_context)

        def __resume__(self):
            Context.set_context(self.context)

        def __exit__(self, *exc_info):
            Context.set_context(self.old_context)

    def myframework_run_task(coro):
        with myframework_context():
            yield from coro

and voila, it works even if you have multiple event loops calling into each other :-)

Okay, I kid, but my point is that these different approaches really are not very different. The big advantage of this approach over ones that involve doing stuff like main_loop.get_dynamic_namespace() is that it lets each piece of state be stored for regular usage in whatever form makes sense -- e.g. numpy's errstate can live in a plain old TLS variable and be accessed quickly from there, and we only have to do expensive stuff at context switch time.

But this also means that your context switches are now notionally identical to my context switches -- they call some suspend/resume callbacks and let each piece of state push/pop itself as appropriate. The real difference is just that you assume that every Python program has exactly one piece of manager code somewhere that knows about all context switches, so you have it do the calls, and I assume that there may be many context switch points within a single program (because, well, Python has dedicated syntax for such context switches built into the language itself), so I think Python itself should do the calls at each appropriate place.
Again, these approaches aren't necessarily contradictory. For something like a request id to be used in logging, which is tightly coupled to the flow of data through your async application, what you say makes perfect sense. For something like the decimal context, it feels pretty weird to me.
I'm afraid I have no idea what this means either. -n -- Nathaniel J. Smith -- http://vorpus.org

On Wed, Apr 29, 2015 at 2:29 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I couldn't think of any cases either, but said __asuspend__/__aresume__ to play it safe in case I missed any :-). I guess it makes sense, though, that from outside our thread, suspend/resume operations are invisible -- no-one cares whether the coroutine they're talking to is currently scheduled or currently suspended, that's purely internal. Therefore, suspend/resume callbacks should never need to block, yeah. -n -- Nathaniel J. Smith -- http://vorpus.org

participants (5): Eric Snow, Guido van Rossum, Nathaniel Smith, Nick Coghlan, Yury Selivanov