[Python-Dev] PEP 567 v2

Wed Jan 3 18:25:56 EST 2018

2018-01-03 23:01 GMT+01:00 Guido van Rossum <guido at python.org>:
> Heh, you're right, I forgot about that. It should be more like this:
>
> def run(self, func, *args, **kwds):
>     old = _get_current_context()
>     _set_current_context(self)  # <--- changed line
>     try:
>         return func(*args, **kwds)
>     finally:
>         _set_current_context(old)
>
> This version, like the PEP, assumes that the Context object is truly
> immutable (not just in name) and that you should call it like this:
>
> contextvars.copy_context().run(func, <args>)

I don't see how asyncio would use Context.run() to keep the state
(variable values) between callbacks and tasks if run() is
"stateless", that is, if it forgets everything on exit.

I asked if it would be possible to modify run() to return a new
context object with the new state, but Yury confirmed that it's not
doable:

Yury:
> [Context.run()] can't return a new context because the callable you're running can raise an exception. In which case you'd lose modifications prior to the error.

Guido:
> Yury strongly favors an immutable Context, and that's what his reference implementation has (https://github.com/python/cpython/pull/5027). His reasoning is that in the future we *might* want to support automatic context management for generators by default (like described in his original PEP 550), and then it's essential to use the immutable version so that "copying" the context when a generator is created or resumed is super fast (and in particular O(1)).

To get acceptable performance, PEP 550 and PEP 567 require that copying
a context costs O(1), since the API copies contexts frequently
(in asyncio, each task has its own private context, and creating a task
copies the current context). Yury proposed using "Hash Array Mapped
Tries" (HAMT) to get an O(1) copy.
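
To illustrate why an immutable mapping makes the copy O(1), here is a
minimal sketch. ImmutableMapping and copy_context_data() are
hypothetical names, and this is not the HAMT from the reference
implementation: its set() copies a whole dict, which is O(n), whereas a
real HAMT shares most of its tree between versions. The only point shown
is that "copying" an immutable object can be a plain reference share.

```python
class ImmutableMapping:
    """Toy persistent mapping (hypothetical; NOT a real HAMT)."""

    def __init__(self, items=None):
        self._items = dict(items or {})

    def set(self, key, value):
        # Return a new mapping; the receiver is never modified.
        # A real HAMT shares subtrees with the old version here;
        # this toy version copies every item, which is O(n).
        new_items = dict(self._items)
        new_items[key] = value
        return ImmutableMapping(new_items)

    def get(self, key):
        return self._items[key]  # raises KeyError when the key is missing


def copy_context_data(data):
    # Nothing can mutate `data`, so a "copy" is the same object: O(1).
    return data


parent = ImmutableMapping().set("user", "alice")
child = copy_context_data(parent)   # O(1): no data is duplicated
child = child.set("user", "bob")    # new mapping, visible to the child only
```

The parent mapping still holds "alice" after the child rebinds its own
name to the new mapping.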

Each ContextVar.set() call creates a *new* HAMT. Extract from PEP 567:
---
    def set(self, value):
        ts : PyThreadState = PyThreadState_Get()
        data : _ContextData = ts.context_data

        try:
            old_value = data.get(self)
        except KeyError:
            old_value = Token.MISSING

        ts.context_data = data.set(self, value)
        return Token(self, old_value)
---

The link between ContextVar, Context and HAMT (called "context data"
in PEP 567) is not obvious:

* ContextVar.set() creates a new HAMT from
PyThreadState_Get().context_data and writes the new one into
PyThreadState_Get().context_data -- technically, it works on
thread-local storage (TLS)
* Context.run() doesn't set the "current context": in practice, it
sets its internal "context data" as the current context data, and then
saves the *new* context data as its own context data

PEP 567:
---
    def run(self, callable, *args, **kwargs):
        ts : PyThreadState = PyThreadState_Get()
        saved_data : _ContextData = ts.context_data

        try:
            ts.context_data = self._data
            return callable(*args, **kwargs)
        finally:
            self._data = ts.context_data
            ts.context_data = saved_data
---

The key point of the PEP 567 implementation is that there is no
"current context" in practice. There is only a private *current*
context data.

Not having get_current_context() is what allows the trick of keeping
the context data in a TLS. Otherwise, I'm not sure that it would be
possible to synchronize a Context object with a TLS variable.

From the user's point of view, Context.run() does modify the context.
After the call, variable values have changed. A second run() call gives
you the updated context.
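
A short sketch of that behaviour, assuming the contextvars API as
described in the PEP (ContextVar, copy_context() and Context.run()); the
variable and function names are made up for the example:

```python
import contextvars

var = contextvars.ContextVar("var", default="unset")


def set_value():
    var.set("changed inside run()")


def read_value():
    return var.get()


ctx = contextvars.copy_context()
ctx.run(set_value)            # the change is recorded in ctx
second = ctx.run(read_value)  # a second run() sees the updated value
outside = var.get()           # the calling context is not affected
```

Here the second run() observes the value set by the first one, while the
code outside ctx still gets the default.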

I don't think that a mutable context would have an impact on
performance, since copying "context data" would still cost
O(1). IMHO it's just a matter of taste for the API.

Or maybe I missed something.

Victor