[Python-Dev] Timeout for PEP 550 / Execution Context discussion

Wed Oct 18 00:40:24 EDT 2017

On 18 October 2017 at 05:55, Yury Selivanov <yselivanov.ml at gmail.com> wrote:

> On Tue, Oct 17, 2017 at 2:25 PM, Guido van Rossum <guido at python.org>
> wrote:
> > On Tue, Oct 17, 2017 at 8:54 AM, Yury Selivanov <yselivanov.ml at gmail.com
> >
> [..]
> >> My way of thinking about this: "get_execution_context()" returns you a
> >> shallow copy of the current EC (at least conceptually).  So making any
> >> modifications on it won't affect the current environment.  The only
> >> way to actually apply the modified EC object to the environment will
> >> be its 'run(callable)' method.
> >
> >
> > I understand that you don't want to throw away the implementation work
> > you've already done. But I find that the abstractions you've introduced
> > are getting in the way of helping people understand what they can do with
> > context variables, and I really want to go back to a model that is *much*
> > closer to understanding how instance variables are just self.__dict__.
> > (Even though there are possible complications due to __slots__ and
> > @property.)
>
> I don't really care about the implementation work that has already
> been done, it's OK if I write it from scratch again.
>
> I actually like what you did in
> https://github.com/gvanrossum/pep550/blob/master/simpler.py, it seems
> reasonable.  The only thing that I'd change is to remove "set_ctx"
> from the public API and add "Context.run(callable)".  This makes the
> API more flexible to potential future changes and amendments.
>

Yep, with that tweak, I like Guido's suggested API as well.

Attempting to explain why I think we want "Context.run(callable)" rather
than "contextvars.set_ctx()" by drawing an analogy to thread local storage:

1. In C, the compiler & CPU work together to ensure you can't access
another thread's thread locals.
2. In Python's thread locals API, we do the same thing: you can only get
access to the running thread's thread locals, not anyone else's.

At the Python API layer, we don't expose the ability to switch explicitly
to another thread state while remaining within the current function.
Instead, we only offer two options: starting a new thread, and waiting for
a thread to finish execution. The lifecycle of the thread local storage is
then intrinsically linked to the lifecycle of the thread it belongs to.

That intrinsic link makes various aspects of thread local storage easier to
reason about, since the active thread state can't change in the middle of a
running function - even if the current thread gets suspended by the OS,
resuming the function also implies resuming the original thread.
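
As a small illustration of the thread local behaviour described above, the
following sketch uses the standard threading.local API. The names
"local_data", "seen" and "worker" are made up for the example. Each thread
only ever sees the value it stored itself, and the main thread's value is
untouched once the workers have finished:

```python
import threading

local_data = threading.local()
seen = {}

def worker(name):
    # Each thread stores and reads back its own value; it has no way to
    # reach the storage that belongs to another thread.
    local_data.value = name
    seen[name] = local_data.value

local_data.value = "main"
threads = [threading.Thread(target=worker, args=(f"worker-{i}",))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread's storage was not affected by anything the workers did.
main_value = local_data.value
```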

Including a "contextvars.set_ctx" API would be akin to making
PyThreadState_Swap a public Python-level API, rather than only exposing
_thread.start_new_thread the way we do now.

One reason we *don't* do that is that it would make thread locals much
harder to reason about - every function call could have an implicit side
effect of changing the active thread state, which would mean the thread
locals at the start of the function could differ from those at the end of
the function, even if the function itself didn't do anything to change them.

Only offering Context.run(callable) provides a similar "the only changes to
the execution context will be those this function, or a function it called,
explicitly initiates" protection for context variables, and Guido's
requested API simplifications make this aspect even easier to reason about:
after any given function call, you can be certain of being back in the
context you started in, because we wouldn't expose any Python level API
that allowed an execution context switch to persist beyond the frame that
initiated it.
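
A minimal sketch of that guarantee, written against the contextvars module
as it later shipped in Python 3.7 (which may differ in detail from the API
under discussion in this thread). The variable "request_id" and the function
"handler" are invented for the example:

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default="outer")

def handler():
    # This change is recorded in whichever context is active while
    # handler() runs, which here is the copy passed to run().
    request_id.set("inner")
    return request_id.get()

ctx = contextvars.copy_context()
inside = ctx.run(handler)        # value seen while running inside ctx
after = request_id.get()         # value seen by the caller afterwards
kept_in_ctx = ctx[request_id]    # the change lives on in ctx itself
```

The caller is back in its original context as soon as run() returns, while
the modified context object remains available for a later run() call.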

====

The above is my main rationale for preferring contextvars.Context.run() to
contextvars.set_ctx(), but it's not the only reason I prefer it.

At a more abstract design philosophy level, I think the distinction between
symmetric and asymmetric coroutines is relevant here [2]:

* in symmetric coroutines, there's a single operation that says "switch to
running this other coroutine"
* in asymmetric coroutines, there are separate operations for starting or
resuming a coroutine and for suspending the currently running one

Python's native coroutines are asymmetric - we don't provide a "switch to
this coroutine" primitive, we instead provide an API for starting or
resuming a coroutine (via cr.send() & cr.throw()), and an API for
suspending one (via await).
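
To make the asymmetry concrete, here is a small sketch that drives a native
coroutine by hand. The helper "suspend" and the coroutine "child" are
invented for the example; types.coroutine is only used to provide a bare
suspension point for "await" to reach:

```python
import types

@types.coroutine
def suspend(value):
    # The bare yield is the point where control returns to whoever
    # called send() on the coroutine.
    received = yield value
    return received

async def child():
    reply = await suspend("first")   # suspend; resumed by the parent
    return f"got {reply}"

cr = child()
yielded = cr.send(None)      # parent starts the coroutine
try:
    cr.send("hello")         # parent resumes it with a value
except StopIteration as exc:
    result = exc.value       # the coroutine's return value
```

The parent always starts or resumes, the child always suspends or returns:
there is no operation by which the child picks some other coroutine to run.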

The contextvars.set_ctx() API would be suitable for symmetric coroutines,
as there's no implied notion of parent context/child context, just a notion
of switching which context is active.

The Context.run() API aligns better with asymmetric coroutines, as there's
a clear distinction between the parent frame (the one initiating the
context switch) and the child frame (the one running in the designated
context).

As a practical matter, Context.run also composes nicely (in combination
with functools.partial) for use with any existing API based on submitting
functions for delayed execution, or execution in another thread or process:

- sched
- concurrent.futures
- arbitrary callback APIs
- method based protocols (including iteration)
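
A rough sketch of that composition, again using the contextvars module as
it later shipped in Python 3.7 rather than the exact API discussed here. The
names "user", "record" and "results" are invented for the example:

```python
import contextvars
import functools
import sched
from concurrent.futures import ThreadPoolExecutor

user = contextvars.ContextVar("user", default="nobody")
results = []

def record(label):
    results.append(f"{label}: {user.get()}")

user.set("alice")
ctx = contextvars.copy_context()   # snapshot taken while user is "alice"
user.set("bob")                    # later change in the submitting code

# functools.partial turns "run record(label) in ctx" into a plain callable
# that any function-submission API can accept.
with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(functools.partial(ctx.run, record, "pool")).result()

scheduler = sched.scheduler()
scheduler.enter(0, 1, functools.partial(ctx.run, record, "sched"))
scheduler.run()

current = user.get()
```

Both callbacks see the snapshot value, and the submitting code's own
context is left alone.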

By contrast, "contextvars.set_ctx" would need various wrappers to handle
correctly reverting the context change, and would hence be prone to
"changed the active context without changing it back" bugs (which can be
especially fun when you're dealing with a shared pool of worker threads or
processes).
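
Since set_ctx() never became a real API, the following sketch uses a plain
ContextVar.set() without a matching reset as a stand-in for "changed the
active context without changing it back", using the contextvars module from
Python 3.7 and later. The names "tenant", "careless_task" and "later_task"
are invented for the example:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

tenant = contextvars.ContextVar("tenant", default="none")

def careless_task():
    tenant.set("customer-a")   # changed, and never changed back
    return tenant.get()

def later_task():
    return tenant.get()

# Run directly on a pooled worker: the change outlives the task and is
# visible to the next task scheduled on the same worker thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(careless_task).result()
    leaked = pool.submit(later_task).result()

# Run via Context.run: the change is confined to the copied context.
with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(contextvars.copy_context().run, careless_task).result()
    isolated = pool.submit(later_task).result()
```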

Cheers,
Nick.

[1] Technically C extensions can play games with this via
PyThreadState_Swap, but I'm not going to worry about that here
[2]
https://stackoverflow.com/questions/41891989/what-is-the-difference-between-asymmetric-and-symmetric-coroutines

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia