
On Wed, Apr 29, 2015 at 2:01 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi Nathaniel,
Sorry for not replying to your last email promptly. FWIW I was going to do it later today ;)
Oh well, this way you got a nice clean thread to do it in instead :-)
My opinion on this subject (and I've implemented lots of local-context kind of objects in different frameworks) is that *just* inserting some kind of suspend/resume points before/after yields does not work.
Before I get into the details of your message, I just want to emphasize-- there are clearly cases where inserting some kind of suspend/resume points before/after yields is not just a sufficient solution, but actually the only possible solution. Consider a streaming data processing pipeline that uses a generator to load data from disk in chunks, and then a series of generators to process, split, filter, etc. this data before passing it on to the next generator in the stack. For certain problems this is 100% the best way to do it, generators are amazing. But right now you kinda... just cannot use certain kinds of 'with' blocks in such code without all kinds of weirdness ensuing. And coming up with fancy APIs for event loops to use won't help, because there are no event loops here at all. The problem here isn't caused by event loops, it's caused by the way adding coroutines to Python created a dissociation between a block's dynamic-scope-in-terms-of-lifetime and its dynamic-scope-in-terms-of-call-chain. Obviously there may be other cases where you also need something else on top of this proposal (though I'm not 100% convinced... see below). But logically speaking, even if you're right that this proposal is not sufficient, that doesn't imply that it's not necessary :-).
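To make the failure mode concrete, here's a minimal sketch (using decimal, but numpy.errstate or anything else backed by thread-local state behaves the same way): the context set up inside the generator silently stays in force for the *caller* whenever the generator is suspended at a yield:

    import decimal

    def rounded_chunks(chunks):
        # The localcontext is meant to apply only to the body of this
        # generator...
        with decimal.localcontext() as ctx:
            ctx.prec = 3
            for chunk in chunks:
                yield +decimal.Decimal(chunk)   # unary + rounds to ctx.prec

    gen = rounded_chunks(["1.23456", "2.34567"])
    print(next(gen))                        # 1.23, as intended
    # ...but while the generator sits suspended at its yield, the caller is
    # still running with prec=3:
    print(decimal.Decimal("1.23456") + 0)   # 1.23 -- surprise!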
Why:
1. Greenlets, gevent, eventlet, stackless python: they do not have 'yield'. Context switches are invisible to the interpreter. And those frameworks are popular. You want numpy.errstate/decimal.localcontext to work there too.
First, this is the only argument you're giving for "why" yield callbacks are insufficient, right? (I'm a little confused by your formatting and want to make sure I'm not missing anything.) Also, just to check: the only actual library that we really have to worry about here is greenlets, right? gevent and eventlet use greenlets to do the heavy lifting, and stackless is a whole-interpreter fork, so for it context switches are visible to the interpreter.

Now, as to the actual point, I have a few reactions:

a) The curmudgeonly reaction: well, yes, the problem with those frameworks is that switches are invisible to both the interpreter and to user code, which makes it more difficult to manage shared state of all kinds, not just the very specific pseudo-dynamic-scoping state that we're talking about here. If you want a programming model where you can actually keep track of shared state, then maybe you should use something like asyncio instead :-).

b) Okay, fine, if you want it to just work anyway: any implementation of this proposal will require that the interpreter have some way to walk the "with block stack" (so to speak) and call the __suspend__ and __resume__ methods on demand. greenlets could also walk up that stack when suspending execution. (This would require the library to do a bit of nasty mucking about in the CPython internals, but c'mon, this is greenlets we're talking about. Compared to the other stuff it does, this would hardly even register.) The one caveat here is that if we want this to be possible, then it would require that we skip statically optimizing out the __suspend__/__resume__ checking in regular functions.

c) Backup plan: greenlets could provide an API like greenlets.register_suspend_resume_callbacks() / greenlets.unregister_suspend_resume_callbacks(), and context managers that wanted to handle both kinds of suspensions could define __suspend__/__resume__ and then do

    def __enter__(self):
        greenlets.register_suspend_resume_callbacks(
            self.__suspend__, self.__resume__)
        # ... actual code ...

    def __exit__(self, *exc):
        greenlets.unregister_suspend_resume_callbacks(
            self.__suspend__, self.__resume__)
        # ... actual code ...

and everything would AFAICT just work. Kinda annoying, but I'm pretty sure that there is and will always be exactly one greenlets library across the whole Python ecosystem, so it's not much of a limitation.
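For concreteness, here's roughly the registry such an API implies -- to be clear, none of this exists in greenlets today, it's just a sketch of what I'm imagining the library could ship:

    # Hypothetical module-level registry inside greenlets (not a real API).
    # A real version would want one registry per greenlet rather than a
    # global one, but this shows the idea.
    _callbacks = []   # (suspend, resume) pairs, in registration order

    def register_suspend_resume_callbacks(suspend, resume):
        _callbacks.append((suspend, resume))

    def unregister_suspend_resume_callbacks(suspend, resume):
        _callbacks.remove((suspend, resume))

    def _before_switch():
        # called by greenlets just before switching away;
        # innermost registration suspended first, like 'with' block unwinding
        for suspend, _ in reversed(_callbacks):
            suspend()

    def _after_switch():
        # called by greenlets just after switching back
        for _, resume in _callbacks:
            resume()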
2. I'm curious how this will be implemented and what's the performance impact.
I am also curious about the performance impact :-). I don't see any reason to expect it to be large, but one never knows for sure until one has an implementation.

As for implementation, the most straightforward way I think would be to:

(1) add a variant of the SETUP_WITH bytecode that also looks up __suspend__ and __resume__ and pushes them onto the value stack, and then uses a unique b_type for its block stack entry;

(2) add a variant of YIELD_VALUE that walks the block stack looking for this unique b_type and calls __suspend__ when found (b_level points to where to find it on the value stack), plus maybe, as an optimization, setting a flag on the frame noting that __resume__ will be needed;

(3) tweak the resumption code so that it checks for this flag and, if found, walks the block stack calling __resume__;

(4) add a variant of WITH_CLEANUP that knows how to clean up this more complex stack;

(5) make sure that the unwind code knows to treat this new block type the same way it treats SETUP_FINALLY blocks.

That's based on like 30 minutes of thought starting from zero knowledge of how this part of ceval.c works, so knowing ceval.c there are certainly a few dragons lurking that I've missed. But as an initial high-level sketch, does it answer your question?
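To show what those bytecode changes would amount to in terms a user can see, here's the intended behaviour hand-desugared in today's Python -- i.e. the calls that the proposed YIELD_VALUE/resumption variants would insert automatically, written out by hand:

    class LoudCM:
        def __enter__(self):
            print("enter")
            return self
        def __exit__(self, *exc):
            print("exit")
        def __suspend__(self):
            print("suspend")
        def __resume__(self):
            print("resume")

    def gen(cm):
        with cm:
            cm.__suspend__()    # what the new YIELD_VALUE variant would do
            yield 1
            cm.__resume__()     # what the resumption code would do
            cm.__suspend__()
            yield 2
            cm.__resume__()

    for _ in gen(LoudCM()):
        print("  caller runs with cm suspended")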
3. I think that mechanism should be more generic than just 'with' statements. What if you want to have a decorator that applies some context?
I don't understand the problem here. A coroutine decorator already knows when the wrapped coroutine is suspended/resumed, so there's nothing to fix -- basically for decorators this proposal is already implemented :-).
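To spell that out: a decorator wrapping a generator-based coroutine is the one calling send() on it, so it already gets control at every suspension and resumption and can apply whatever context it likes at those points. Rough sketch:

    import functools

    def applies_some_context(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            gen = fn(*args, **kwargs)
            value = None
            while True:
                # <- this is where the decorator would restore its context
                try:
                    yielded = gen.send(value)
                except StopIteration as exc:
                    return exc.value
                # <- and where it would save/reset it before yielding out
                value = yield yielded
        return wrapper

(Exception forwarding via throw() is omitted to keep the sketch short.)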
What if you can't write it as a generator with @contextlib.contextmanager?
Here I just don't understand the sentence :-). Can you rephrase? Answering one possible question you might have intended: @contextmanager indeed is not useful for defining a context manager that has __suspend__ and __resume__ callbacks, but this is just a convenience utility. My suspicion is that these kinds of context managers are already sufficiently rare and tricky to write that having special utilities to make it easier is not that important, but if someone comes up with a nice API for doing so then cool, we could put that in contextlib.
What I would propose:
I think that the right approach here would be to have a standard protocol; something similar to how we defined WSGI -- it doesn't require any changes in the interpreter, it is a protocol.
For instance, we could have a module in the stdlib, with a class Context:
    class Context:
        @classmethod
        def __suspend_contexts(cls):
            for child in cls.__subclasses__():
                child.__suspend_contexts()

        @classmethod
        def __resume_contexts(cls): ..
To inform context objects that the context is about to change, the asyncio event loop would call Context.__suspend_contexts()
I can't see how to usefully implement, well, anything, using this interface as given. Maybe you mean something like

    class Context:
        @classmethod
        def switch_contexts(cls, old_ctx_id, new_ctx_id):
            for child in cls.__subclasses__():
                child.switch_contexts(old_ctx_id, new_ctx_id)

        @classmethod
        def destroy_context(cls, ctx_id):
            for child in cls.__subclasses__():
                child.destroy_context(ctx_id)

? Or a better approach that avoids having to generate and keep track of arbitrary identifiers would be:

    class Context:
        @classmethod
        def new_context(cls):
            return {subcls: subcls.new_context() for subcls in cls.__subclasses__()}

        @classmethod
        def get_context(cls):
            return {subcls: subcls.get_context() for subcls in cls.__subclasses__()}

        @classmethod
        def set_context(cls, context):
            for subcls in cls.__subclasses__():
                subcls.set_context(context[subcls])

And then you could use it like:

    class myframework_context:
        def __enter__(self):
            self.old_context = Context.get_context()
            self.context = Context.new_context()
            Context.set_context(self.context)

        def __suspend__(self):
            Context.set_context(self.old_context)

        def __resume__(self):
            Context.set_context(self.context)

        def __exit__(self, *exc):
            Context.set_context(self.old_context)

    def myframework_run_task(coro):
        with myframework_context():
            yield from coro

and voila, it works even if you have multiple event loops calling into each other :-)

Okay, I kid, but my point is that these different approaches really are not very different. The big advantage of this approach over ones that involve doing stuff like main_loop.get_dynamic_namespace() is that it lets each piece of state be stored for regular usage in whatever form makes sense -- e.g. numpy's errstate can live in a plain old TLS variable and be accessed quickly from there, and we only have to do expensive stuff at context switch time. But this also means that your context switches are now notionally identical to my context switches -- they call some suspend/resume callbacks and let each piece of state push/pop itself as appropriate. The real difference is just that you assume that every Python program has exactly one piece of manager code somewhere that knows about all context switches, so you have it do the calls, and I assume that there may be many context switch points within a single program (because, well, Python has dedicated syntax for such context switches built into the language itself), so I think Python itself should do the calls at each appropriate place.
I think that it's up to the event loop/library/framework to manage context switches properly. In asyncio, for instance, you can attach callbacks to Futures. I think it would be great if such callbacks are executed in the right context.
Again, these approaches aren't necessarily contradictory. For something like a request id to be used in logging, which is tightly coupled to the flow of data through your async application, what you say makes perfect sense. For something like the decimal context, it feels pretty weird to me.
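For the request-id case, that kind of thing really can live entirely in the framework -- e.g., a sketch reusing the Context class from above:

    def call_in_current_context(fn):
        # capture the context at the moment the callback is attached...
        saved = Context.get_context()
        def runner(*args, **kwargs):
            # ...and restore it (then put things back) around the call
            prev = Context.get_context()
            Context.set_context(saved)
            try:
                return fn(*args, **kwargs)
            finally:
                Context.set_context(prev)
        return runner

    # e.g.: fut.add_done_callback(call_in_current_context(on_done))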
I would also suggest thinking about a universal context -- one that works both for asyncio coroutines and the threadpools they might use.
I'm afraid I have no idea what this means either.

-n

--
Nathaniel J. Smith -- http://vorpus.org