data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
This is a second version of PEP 567. A few things have changed: 1. I now have a reference implementation: https://github.com/python/cpython/pull/5027 2. The C API was updated to match the implementation. 3. The get_context() function was renamed to copy_context() to better reflect what it is really doing. 4. Few clarifications/edits here and there to address earlier feedback. Yury PEP: 567 Title: Context Variables Version: $Revision$ Last-Modified: $Date$ Author: Yury Selivanov <yury@magic.io> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 12-Dec-2017 Python-Version: 3.7 Post-History: 12-Dec-2017, 28-Dec-2017 Abstract ======== This PEP proposes a new ``contextvars`` module and a set of new CPython C APIs to support context variables. This concept is similar to thread-local storage (TLS), but, unlike TLS, it also allows correctly keeping track of values per asynchronous task, e.g. ``asyncio.Task``. This proposal is a simplified version of :pep:`550`. The key difference is that this PEP is concerned only with solving the case for asynchronous tasks, not for generators. There are no proposed modifications to any built-in types or to the interpreter. This proposal is not strictly related to Python Context Managers. Although it does provide a mechanism that can be used by Context Managers to store their state. Rationale ========= Thread-local variables are insufficient for asynchronous tasks that execute concurrently in the same OS thread. Any context manager that saves and restores a context value using ``threading.local()`` will have its context values bleed to other code unexpectedly when used in async/await code. A few examples where having a working context local storage for asynchronous code is desirable: * Context managers like ``decimal`` contexts and ``numpy.errstate``. * Request-related data, such as security tokens and request data in web applications, language context for ``gettext``, etc. * Profiling, tracing, and logging in large code bases. Introduction ============ The PEP proposes a new mechanism for managing context variables. The key classes involved in this mechanism are ``contextvars.Context`` and ``contextvars.ContextVar``. The PEP also proposes some policies for using the mechanism around asynchronous tasks. The proposed mechanism for accessing context variables uses the ``ContextVar`` class. A module (such as ``decimal``) that wishes to store a context variable should: * declare a module-global variable holding a ``ContextVar`` to serve as a key; * access the current value via the ``get()`` method on the key variable; * modify the current value via the ``set()`` method on the key variable. The notion of "current value" deserves special consideration: different asynchronous tasks that exist and execute concurrently may have different values for the same key. This idea is well-known from thread-local storage but in this case the locality of the value is not necessarily bound to a thread. Instead, there is the notion of the "current ``Context``" which is stored in thread-local storage, and is accessed via ``contextvars.copy_context()`` function. Manipulation of the current ``Context`` is the responsibility of the task framework, e.g. asyncio. A ``Context`` is conceptually a read-only mapping, implemented using an immutable dictionary. The ``ContextVar.get()`` method does a lookup in the current ``Context`` with ``self`` as a key, raising a ``LookupError`` or returning a default value specified in the constructor. The ``ContextVar.set(value)`` method clones the current ``Context``, assigns the ``value`` to it with ``self`` as a key, and sets the new ``Context`` as the new current ``Context``. Specification ============= A new standard library module ``contextvars`` is added with the following APIs: 1. ``copy_context() -> Context`` function is used to get a copy of the current ``Context`` object for the current OS thread. 2. ``ContextVar`` class to declare and access context variables. 3. ``Context`` class encapsulates context state. Every OS thread stores a reference to its current ``Context`` instance. It is not possible to control that reference manually. Instead, the ``Context.run(callable, *args, **kwargs)`` method is used to run Python code in another context. contextvars.ContextVar ---------------------- The ``ContextVar`` class has the following constructor signature: ``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter is used only for introspection and debug purposes, and is exposed as a read-only ``ContextVar.name`` attribute. The ``default`` parameter is optional. Example:: # Declare a context variable 'var' with the default value 42. var = ContextVar('var', default=42) (The ``_NO_DEFAULT`` is an internal sentinel object used to detect if the default value was provided.) ``ContextVar.get()`` returns a value for context variable from the current ``Context``:: # Get the value of `var`. var.get() ``ContextVar.set(value) -> Token`` is used to set a new value for the context variable in the current ``Context``:: # Set the variable 'var' to 1 in the current context. var.set(1) ``ContextVar.reset(token)`` is used to reset the variable in the current context to the value it had before the ``set()`` operation that created the ``token``:: assert var.get(None) is None token = var.set(1) try: ... finally: var.reset(token) assert var.get(None) is None ``ContextVar.reset()`` method is idempotent and can be called multiple times on the same Token object: second and later calls will be no-ops. contextvars.Token ----------------- ``contextvars.Token`` is an opaque object that should be used to restore the ``ContextVar`` to its previous value, or remove it from the context if the variable was not set before. It can be created only by calling ``ContextVar.set()``. For debug and introspection purposes it has: * a read-only attribute ``Token.var`` pointing to the variable that created the token; * a read-only attribute ``Token.old_value`` set to the value the variable had before the ``set()`` call, or to ``Token.MISSING`` if the variable wasn't set before. Having the ``ContextVar.set()`` method returning a ``Token`` object and the ``ContextVar.reset(token)`` method, allows context variables to be removed from the context if they were not in it before the ``set()`` call. contextvars.Context ------------------- ``Context`` object is a mapping of context variables to values. ``Context()`` creates an empty context. To get a copy of the current ``Context`` for the current OS thread, use the ``contextvars.copy_context()`` method:: ctx = contextvars.copy_context() To run Python code in some ``Context``, use ``Context.run()`` method:: ctx.run(function) Any changes to any context variables that ``function`` causes will be contained in the ``ctx`` context:: var = ContextVar('var') var.set('spam') def function(): assert var.get() == 'spam' var.set('ham') assert var.get() == 'ham' ctx = copy_context() # Any changes that 'function' makes to 'var' will stay # isolated in the 'ctx'. ctx.run(function) assert var.get() == 'spam' Any changes to the context will be contained in the ``Context`` object on which ``run()`` is called on. ``Context.run()`` is used to control in which context asyncio callbacks and Tasks are executed. It can also be used to run some code in a different thread in the context of the current thread:: executor = ThreadPoolExecutor() current_context = contextvars.copy_context() executor.submit( lambda: current_context.run(some_function)) ``Context`` objects implement the ``collections.abc.Mapping`` ABC. This can be used to introspect context objects:: ctx = contextvars.copy_context() # Print all context variables and their values in 'ctx': print(ctx.items()) # Print the value of 'some_variable' in context 'ctx': print(ctx[some_variable]) asyncio ------- ``asyncio`` uses ``Loop.call_soon()``, ``Loop.call_later()``, and ``Loop.call_at()`` to schedule the asynchronous execution of a function. ``asyncio.Task`` uses ``call_soon()`` to run the wrapped coroutine. We modify ``Loop.call_{at,later,soon}`` and ``Future.add_done_callback()`` to accept the new optional *context* keyword-only argument, which defaults to the current context:: def call_soon(self, callback, *args, context=None): if context is None: context = contextvars.copy_context() # ... some time later context.run(callback, *args) Tasks in asyncio need to maintain their own context that they inherit from the point they were created at. ``asyncio.Task`` is modified as follows:: class Task: def __init__(self, coro): ... # Get the current context snapshot. self._context = contextvars.copy_context() self._loop.call_soon(self._step, context=self._context) def _step(self, exc=None): ... # Every advance of the wrapped coroutine is done in # the task's context. self._loop.call_soon(self._step, context=self._context) ... C API ----- 1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``: create a ``ContextVar`` object. 2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``: return ``-1`` if an error occurs during the lookup, ``0`` otherwise. If a value for the context variable is found, it will be set to the ``value`` pointer. Otherwise, ``value`` will be set to ``default_value`` when it is not ``NULL``. If ``default_value`` is ``NULL``, ``value`` will be set to the default value of the variable, which can be ``NULL`` too. ``value`` is always a borrowed reference. 3. ``PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *)``: set the value of the variable in the current context. 4. ``PyContextVar_Reset(PyContextVar *, PyContextToken *)``: reset the value of the context variable. 5. ``PyContext * PyContext_New()``: create a new empty context. 6. ``PyContext * PyContext_Copy()``: get a copy of the current context. 7. ``int PyContext_Enter(PyContext *)`` and ``int PyContext_Exit(PyContext *)`` allow to set and restore the context for the current OS thread. It is required to always restore the previous context:: PyContext *old_ctx = PyContext_Copy(); if (old_ctx == NULL) goto error; if (PyContext_Enter(new_ctx)) goto error; // run some code if (PyContext_Exit(old_ctx)) goto error; Implementation ============== This section explains high-level implementation details in pseudo-code. Some optimizations are omitted to keep this section short and clear. For the purposes of this section, we implement an immutable dictionary using ``dict.copy()``:: class _ContextData: def __init__(self): self._mapping = dict() def get(self, key): return self._mapping[key] def set(self, key, value): copy = _ContextData() copy._mapping = self._mapping.copy() copy._mapping[key] = value return copy def delete(self, key): copy = _ContextData() copy._mapping = self._mapping.copy() del copy._mapping[key] return copy Every OS thread has a reference to the current ``_ContextData``. ``PyThreadState`` is updated with a new ``context_data`` field that points to a ``_ContextData`` object:: class PyThreadState: context_data: _ContextData ``contextvars.copy_context()`` is implemented as follows:: def copy_context(): ts : PyThreadState = PyThreadState_Get() if ts.context_data is None: ts.context_data = _ContextData() ctx = Context() ctx._data = ts.context_data return ctx ``contextvars.Context`` is a wrapper around ``_ContextData``:: class Context(collections.abc.Mapping): def __init__(self): self._data = _ContextData() def run(self, callable, *args, **kwargs): ts : PyThreadState = PyThreadState_Get() saved_data : _ContextData = ts.context_data try: ts.context_data = self._data return callable(*args, **kwargs) finally: self._data = ts.context_data ts.context_data = saved_data # Mapping API methods are implemented by delegating # `get()` and other Mapping calls to `self._data`. ``contextvars.ContextVar`` interacts with ``PyThreadState.context_data`` directly:: class ContextVar: def __init__(self, name, *, default=_NO_DEFAULT): self._name = name self._default = default @property def name(self): return self._name def get(self, default=_NO_DEFAULT): ts : PyThreadState = PyThreadState_Get() data : _ContextData = ts.context_data try: return data.get(self) except KeyError: pass if default is not _NO_DEFAULT: return default if self._default is not _NO_DEFAULT: return self._default raise LookupError def set(self, value): ts : PyThreadState = PyThreadState_Get() data : _ContextData = ts.context_data try: old_value = data.get(self) except KeyError: old_value = Token.MISSING ts.context_data = data.set(self, value) return Token(self, old_value) def reset(self, token): if token._used: return if token._old_value is Token.MISSING: ts.context_data = data.delete(token._var) else: ts.context_data = data.set(token._var, token._old_value) token._used = True class Token: MISSING = object() def __init__(self, var, old_value): self._var = var self._old_value = old_value self._used = False @property def var(self): return self._var @property def old_value(self): return self._old_value Implementation Notes ==================== * The internal immutable dictionary for ``Context`` is implemented using Hash Array Mapped Tries (HAMT). They allow for O(log N) ``set`` operation, and for O(1) ``copy_context()`` function, where *N* is the number of items in the dictionary. For a detailed analysis of HAMT performance please refer to :pep:`550` [1]_. * ``ContextVar.get()`` has an internal cache for the most recent value, which allows to bypass a hash lookup. This is similar to the optimization the ``decimal`` module implements to retrieve its context from ``PyThreadState_GetDict()``. See :pep:`550` which explains the implementation of the cache in a great detail. Summary of the New APIs ======================= * A new ``contextvars`` module with ``ContextVar``, ``Context``, and ``Token`` classes, and a ``copy_context()`` function. * ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``, ``asyncio.Loop.call_soon()``, and ``asyncio.Future.add_done_callback()`` run callback functions in the context they were called in. A new *context* keyword-only parameter can be used to specify a custom context. * ``asyncio.Task`` is modified internally to maintain its own context. Design Considerations ===================== Why contextvars.Token and not ContextVar.unset()? ------------------------------------------------- The Token API allows to get around having a ``ContextVar.unset()`` method, which is incompatible with chained contexts design of :pep:`550`. Future compatibility with :pep:`550` is desired (at least for Python 3.7) in case there is demand to support context variables in generators and asynchronous generators. The Token API also offers better usability: the user does not have to special-case absence of a value. Compare:: token = cv.get() try: cv.set(blah) # code finally: cv.reset(token) with:: _deleted = object() old = cv.get(default=_deleted) try: cv.set(blah) # code finally: if old is _deleted: cv.unset() else: cv.set(old) Rejected Ideas ============== Replication of threading.local() interface ------------------------------------------ Please refer to :pep:`550` where this topic is covered in detail: [2]_. Backwards Compatibility ======================= This proposal preserves 100% backwards compatibility. Libraries that use ``threading.local()`` to store context-related values, currently work correctly only for synchronous code. Switching them to use the proposed API will keep their behavior for synchronous code unmodified, but will automatically enable support for asynchronous code. Reference Implementation ======================== The reference implementation can be found here: [3]_. References ========== .. [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis .. [2] https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-int... .. [3] https://github.com/python/cpython/pull/5027 Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End:
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Hi, I like the new version of the PEP using "read only mapping" and copy_context(). It's easier to understand. I'm ok with seeing a context as a mapping, but I am confused about a context variable considered as a mapping item. I still see a context variable as a variable, so something which has a value or not. I just propose to rename the default parameter of the ContextVar constructor. Le 28 déc. 2017 7:10 AM, "Yury Selivanov" <yselivanov.ml@gmail.com> a écrit : ContextVar ---------------------- The ``ContextVar`` class has the following constructor signature: ``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter is used only for introspection and debug purposes, and is exposed as a read-only ``ContextVar.name`` attribute. The ``default`` parameter is optional. Example:: # Declare a context variable 'var' with the default value 42. var = ContextVar('var', default=42) In term of API, "default" parameter name is strange. Why not simply calling it "value"? var = ContextVar('var', default=42) and: var = ContextVar('var') var.set (42) behaves the same, no? The implementation explains where the "default" name comes from, but IMHO "value" is a better name. (The ``_NO_DEFAULT`` is an internal sentinel object used to detect if the default value was provided.) I would call it _NOT_SET. * a read-only attribute ``Token.old_value`` set to the value the variable had before the ``set()`` call, or to ``Token.MISSING`` if the variable wasn't set before. Hum, I also suggest to rename Token.MISSING to Token.NOT_SET. It would be more conistent with the last sentence. C API ----- Would it be possible to make this API private? 2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``: (...) ``value`` is always a borrowed reference. I'm not sure that it's a good idea to add a new public C function which returns a borrowed reference. I would prefer to only use (regular) strong references in the public API. I don't want to elaborate here. You may see: http://vstinner.readthedocs.io/python_new_stable_api.html Internally, I don't care, do whatever you want for best performances :-) Victor
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Dec 28, 2017 at 1:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
No, they're different. The second sets the value in the current context. The first sets the value in all contexts that currently exist, and all empty contexts created in the future. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
NLe 28 déc. 2017 11:20 AM, "Nathaniel Smith" <njs@pobox.com> a écrit : On Thu, Dec 28, 2017 at 1:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
No, they're different. The second sets the value in the current context. The first sets the value in all contexts that currently exist, and all empty contexts created in the future. Oh, that's an important information. In this case, "default" is the best name. The PEP may be more explicit about the effect on all contexts. Proposition of documentation: "The optional *default* parameter is the default value in all contexts. If the variable is not set in the current context, it is returned by by context[var_name] and by var.get(), when get() is called without the default parameter." Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Oh, the "Specification" section in the PEP is too brief on several of these subjects. It doesn't really specify what var.get() does if the value is not set, nor does it even mention var.get(<default>) except in the code examples for var.reset(). It's also subtle that ctx[var] returns the default (if there is one). I suppose it will raise if there isn't one -- resulting in the somewhat surprising behavior where `var in ctx` may be true but `ctx[var]` may raise. And what does it raise? (All these questions are answered by the code, but they should be clearly stated in the PEP.) I would really like to invite more people to review this PEP! I expect I'll be accepting it in the next two weeks, but it needs to go through more rigorous review. On Thu, Dec 28, 2017 at 4:48 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
I read again the PEP and I am still very confused by Context.run(). The PEP states multiple times that a context is immutable: * "read-only mapping" * inherit from Mapping, not from MutableMapping But run() does modify the context (or please correct me if I completely misunderstood the PEP! I had to read it 3 times to check if run() mutates or not the context). It would help if the ctx.run() example in the PEP would not only test var.get() but also test ctx.get(var). Or maybe show that the variable value is kept in a second function call, but the variable is "restored" between run() calls. The PEP tries hard to hide "context data", which is the only read only thing in the whole PEP, whereas it's a key concept to understand the implementation. I understood that: * _ContextData is immutable * ContextVar.set() creates a new _ContextData and sets it in the current Python thread state * When the called function completes, Context.run() sets its context data to the new context data from the Python thread state: so run() does modify the "immutable" context The distinction between the internal/hiden *immutable* context data and public/visible "mutable" (from my point of view) context is unclear to me in the PEP. The concept of "current context" is not defined in the PEP. In practice, there is no "current context", there is only a "current context data" in the current Python thread. There is no need for a concrete context instance to store variable variables values. It's also hard to understand that in the PEP. Why Context could not inherit from MutableMapping? (Allow ctx.set(var, value) and ctx [var] = value.) Is it just to keep the API small: changes should only be made using var.set()? Or maybe Context.run() should really be immutable and return the result of the called function *and* a new context? But I dislike such theorical API, since it would be complex to return the new context if the called function raises an exception. Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Hum, it seems like the specification (API) part of the PEP is polluted by its implementation. The PEP just require a few minor changes to better describe the behaviour/API instead of insisting on the read only internal thing which is specific to the proposed implementation which is just one arbitrary implemention (designed for best performances). IMHO the PEP shouldn't state that a context is read only. From my point of view, it's mutable and it's the mapping holding variable values. There is a current context which holds the current values. Context.run() switchs temporarely the current context with another context. The fact that there is no concrete context instance by default doesn't really matter in term of API. Victor Le 3 janv. 2018 00:34, "Victor Stinner" <victor.stinner@gmail.com> a écrit :
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 5:30 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
Yeah, we need some more words. I still hope someone proposes some, but if threats fail I will have to try to find time myself.
I think the issue here is a bit different than Yury's response suggests -- it's more like how a variable containing an immutable value (e.g. a string) can be modified, e.g. x = 'a' x += 'b' In our case the *variable* is the current thread state (in particular the slot therein that holds the context -- this slot can be modified by the C API). The *value* is the Context object. It is a collections.Mapping (or typing.Mapping) which does not have mutating methods. (The mutable type is called MutableMapping.) The *reason* for doing it this way is that Yury doesn't want Context to implement __delitem__, since it would complicate the specification of chained lookups by a future PEP, and chained lookups look to be the best option to extend the Context machinery for generators. We're not doing that in Python 3.7 (PEP 550 v2 and later did this but it was too complex) but we might want to do in 3.8 or 3.9 once we are comfortable with PEP 567. (Or not -- it's not clear to me that generators bite decimal users the way tasks do. Coroutines always run on behalf of a task so they're not a problem.) --Guido
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:34, "Guido van Rossum" <guido@python.org> a écrit : I think the issue here is a bit different than Yury's response suggests -- it's more like how a variable containing an immutable value (e.g. a string) can be modified, e.g. x = 'a' x += 'b' In our case the *variable* is the current thread state (in particular the slot therein that holds the context -- this slot can be modified by the C API). The *value* is the Context object. It is a collections.Mapping (or typing.Mapping) which does not have mutating methods. (The mutable type is called MutableMapping.) I can see a parallel with a Python namespace, like globals and locals arguments of exec(): ns = globals().copy() # ctx = copy_context() exec("x = 'a'", ns, ns) # ctx.run(...) ns['x'] += 'b' # Context ??? print(ns ['x']) # print(ctx[x]) The *reason* for doing it this way is that Yury doesn't want Context to implement __delitem__, since it would complicate the specification of chained lookups by a future PEP, and chained lookups look to be the best option to extend the Context machinery for generators. Again, why not just raise an exception on "del ctx[var]"? Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:30 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
You've convinced me that Context is neither immutable nor read-only, and the PEP should admit this. Its *Mapping* interface doesn't allow mutations but its run() method does. E.g. here's a function that mutates a Context, effectively doing vtx[var] = value: def mutate_context(ctx, var, value): ctx.run(lambda: var.set(value)) However you've not convinced me that it would be better to make Context implement the full MutableMapping interface (with `__delitem__` always raising). There are use cases for inspecting Context, e.g. a debugger that wants to print the Context for some or all suspended tasks. But I don't see a use case for mutating a Context object that's not the current context, and when it is the current context, ContextVar.set() is more direct. I also don't see use cases for other MutableMapping methods like pop() or update(). (And clear() falls under the same prohibition as __delitem__().) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 6:09 PM, Guido van Rossum <guido@python.org> wrote:
I was looking at this again, and I realized there's some confusion. I've been repeating the thing about not wanting to implement __delitem__ too, but actually, I think __delitem__ is not the problem :-). The problem is that we don't want to provide ContextVar.unset() -- that's the operation that adds complication in a PEP 550 world. If you have a stack of Contexts that ContextVar.get() searches, and set/unset are only allowed to mutate the top entry in the stack, then the only way to implement unset() is to do something like context_stack[-1][self] = _MISSING, so it can hide any entries below it in the stack. This is extra complication for a feature that it's not clear anyone cares about. (And if it turns out people do care, we could add it later.) Deleting entries from individual Context objects shouldn't create conceptual problems. OTOH I don't see how it's useful either. I don't think implementing MutableMapping would actually cause problems, but it's a bunch of extra code to write/test/maintain without any clear use cases. This does make me think that I should write up a short PEP for extending PEP 567 to add context lookup, PEP 550 style: it can start out in Status: deferred and then we can debate it properly before 3.8, but at least having the roadmap written down now would make it easier to catch these details. (And it might also help address Paul's reasonable complaint about "unstated requirements".) -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 7:58 PM, Nathaniel Smith <njs@pobox.com> wrote:
Ah yes, that's it. A stack of Contexts could have the same semantics as ChainMap, in which __delitem__ deletes the key from the first mapping if present and otherwise raises KeyError even if it is present in a later mapping. That's enough to implement var.reset(), but not enough to implement arbitrary var.unset(). I'm fine with only having var.reset() and not var.unset().
The implementation would have to maintain cache consistency, but that's not necessarily a big deal, and it only needs to be done in a few places. But I agree that the a case hasn't been indicated yet. (I like being able to specify ContextVar's behavior using Context as a MutableMapping, but that's not a real use case.)
Anything that will help us kill a 550-pound gorilla sounds good to me. :-) It might indeed be pretty short if we follow the lead of ChainMap (even using a different API than MutableMapping to mutate it). Maybe copy_context() would map to new_child()? Using ChainMap as a model we might even avoid the confusion between Lo[gi]calContext and ExecutionContext which was the nail in PEP 550's coffin. The LC associated with a generator in PEP 550 would be akin to a loose dict which can be pushed on top of a ChainMap using cm = cm.new_child(<dict>). (Always taking for granted that instead of an actual dict we'd use some specialized mutable object implementing the Mapping protocol and a custom mutation protocol so it can maintain ContextVar cache consistency.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 9:42 PM, Guido van Rossum <guido@python.org> wrote:
The approach I took in PEP 568 is even simpler, I think. The PEP is a few pages long because I wanted to be exhaustive to make sure we weren't missing any details, but the tl;dr is: The ChainMap lives entirely inside the threadstate, so there's no need to create a LC/EC distinction -- users just see Contexts, or there's the one stack introspection API, get_context_stack(), which returns a List[Context]. Instead of messing with new_child, copy_context is just Context(dict(chain_map)) -- i.e., it creates a flattened copy of the current mapping. (If we used new_child, then we'd have to have a way to return a ChainMap, reintroducing the LC/EC mess. Plus it would allow for changes to the Context's inside one ChainMap to "leak" into the child or vice-versa.) And Context push/pop is just push/pop on the threadstate's ChainMap's '.maps' attribute. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/75a60/75a60da5c274977fef5c33fea963c84db3b74279" alt=""
This sounds reasonable. Although keep in mind that merging hamt is still an expensive operation, so flattening shouldn't always be performed (this is covered in 550). I also wouldn't call LC/EC a "mess". Your pep just names things differently, but otherwise is entirely built on concepts and ideas introduced in pep 550. Yury
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Tue, Jan 9, 2018 at 2:59 AM, Yury Selivanov <yselivanov@gmail.com> wrote:
Right, the PEP mostly focuses on the semantics rather than the implementation and this is an implementation detail (the user can't tell whether a Context internally holds a stack of HAMTs or just one). But there is a note that we might choose to perform the actual flattening lazily if it turns out to be worthwhile.
I also wouldn't call LC/EC a "mess". Your pep just names things differently, but otherwise is entirely built on concepts and ideas introduced in pep 550.
Sorry for phrasing it like that -- I just meant that at the API level, the LC/EC split caused a lot of confusion (and apparently this was it's "nail in the coffin"!). In the PEP 567/568 version, the underlying concepts are the same, but the API ends up being simpler. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Wed, Jan 3, 2018 at 2:36 AM Victor Stinner <victor.stinner@gmail.com> wrote:
tuples in Python are immutable, but you can have a tuple with a dict as its single element. The tuple is immutable, the dict is mutable. At the C level we have APIs that can mutate a tuple though. Now, tuple is not a direct analogy to Context, but there are some parallels. Context is a container like tuple, with some additional APIs on top.
Because that would be confusing to end users. ctx = copy_context() ctx[var] = something What did we just do? Did we modify the 'var' in the code that is currently executing? No, you still need to call Context.run to see the new value for var. Another problem is that MutableMapping defines a __delitem__ method, which i don't want the Context to implement. Deleting variables like that is incompatible with PEP 550, where it's ambiguous (due to the stacked nature of contexts). Now we don't want PEP 550 in 3.7, but I want to keep the door open for its design, in case we want context to work with generators.
It can't return a new context because the callable you're running can raise an exception. In which case you'd lose modifications prior to the error. ps i'm on vacation and don't always have an internet connection. yury
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:05, "Yury Selivanov" <yselivanov.ml@gmail.com> a écrit : tuples in Python are immutable, but you can have a tuple with a dict as its single element. The tuple is immutable, the dict is mutable. At the C level we have APIs that can mutate a tuple though. Now, tuple is not a direct analogy to Context, but there are some parallels. Context is a container like tuple, with some additional APIs on top. Sorry, I don't think that it's a good analogy. Context.run() is a public method accessible in Python which allows to modify the context. A tuple doesn't have such method. While it's technically possible to modify a tuple or a str at C level, it's a bad practice leading to complex bugs when it's not done carefully: see https://bugs.python.org/issue30156 property_descr_get() optimization was fixed twice but still has a bug. I proposed a PR to remove the hack. Why Context could not inherit from MutableMapping? (Allow ctx.set(var,
value) and ctx [var] = value.) Is it just to keep the API small: changes should only be made using var.set()?
Because that would be confusing to end users. ctx = copy_context() ctx[var] = something What did we just do? Did we modify the 'var' in the code that is currently executing? No, you still need to call Context.run to see the new value for var. IMHO it's easy to understand that modifying a *copy* of the current context doesn't impact the current context. It's one the first thing to learn when learning Python: a = [1, 2] b = a.copy() b.append(3) assert a == [1, 2] assert b == [1, 2, 3] Another problem is that MutableMapping defines a __delitem__ method, which i don't want the Context to implement. I wouldn't be shocked if "del ctx [var]" would raise an exception. I almost never use del anyway. I prefer to assign a variable to None, since "del var" looks like C++ destructor whereas it's more complex than a direct call to the destructor. But it's annoying to have to call a function with Context.run() whereas context is just a mutable mapping. It seems overkill to me to have to call run() to modify a context variable: run() changes temporarely the context and requires to use the indirect ContextVar API, while I know that ContextVar.set() modifies the context. Except of del corner case, I don't see any technical reason to prevent direct modification of a context. contextvars isn't new, it extends what we already have: decimal context. And decimal quick start documentation shows how to modify a context and then set it as the current context:
https://docs.python.org/dev/library/decimal.html Well, technically it doesn't modify a context. An example closer to contextvars would be:
Note: "getcontext().prec = 6" does modify the decimal context directly, and it's the *first* example in the doc. But here contextvars is different since there is no API to get the current API. The lack of API to access directly the current contextvars context is the main difference with decimal context, and I'm fine with that. It's easy to see a parallel since decimal context can be copied using Context.copy(), it has also multiple (builtin) "variables", it's just that the API is different (decimal context variables are modified as attributes), and it's possible to set a context using decimal.setcontext(). Victor
data:image/s3,"s3://crabby-images/75a60/75a60da5c274977fef5c33fea963c84db3b74279" alt=""
Do you have any use case for modifying a variable inside some context? numpy, decimal, or some sort of tracing for http requests or async frameworks like asyncio do not need that.
I think you are confusing context in decimal and pep 567. Decimal context is a mutable object. We use threading.local to store it. With pep 567 you will use a context variable behind the scenes to store it. I think it's incorrect to compare decimal contexts to pep567 in any way. Yury
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Do you have any use case for modifying a variable inside some context?
numpy, decimal, or some sort of tracing for http requests or async frameworks like asyncio do not need that.
Maybe I misunderstood how contextvars is supposed to be used. So let me give you an example. I understand that decimal.py will declare its context variable like this: --- contextvar = contextvars.ContextVar('decimal', default=Context(...)) --- Later if I would like to run an asyncio callback with a different decimal context, I would like to write: --- cb_context = contextvars.copy_context() decimal_context = cb_context[decimal.contextvar].copy() decimal_context.prec = 100 cb_context[decimal.contextvar] = decimal_context # <--- HERE loop.call_soon(func, context=cb_context) ---- The overall code would behaves as: --- with localcontext() as ctx: ctx.prec = 100 loop.call_soon(func) --- I don't know if the two code snippets have exactly the same behaviour. I don't want to modify func() to run it with a different decimal context. So I would prefer to not have to call decimal.contextvar.set() or decimal.setcontext() in func(). But I would need contextvars.Context[var]=value to support such use case. Decimal contexts are mutable, so modifying directly the decimal context object would impact all contexts which isn't my intent here. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
I think decimal is a bit of a red herring, because the entire decimal context is stored as a single thread-local variable (_local.__decimal_context__ in _pydecimal.py, where _local is a global threading.local instance; using "___DECIMAL_CTX__" as a key in the C-level dict of thread-locals in the C version). Nevertheless there is the issue of whether it's better to make contextvars.Context a MutableMapping or to make it an (immutable) Mapping. the current context, sets that as the current context, runs the function, and then restores the previous context (the one that it copied). With a truly immutable Context offering only the (immutable) Mapping interface (plus an internal API that returns a new Context that has a different value for one key), ContextVar.set() is a bit more complicated because it has to use set_context() (actually an internal thing that updates the current context in the thread state) and similar for ContextVar.reset(token). (An alternative design is possible where a Context is an immutable-looking wrapper around a plain dict, with private APIs that mutate that dict, but apart from having different invariants about the identities of Context objects it works out about the same from a user's POV.) Anyway, the differences between these are user-visible so we can't make this an implementation detail: We have to choose. Should Context be a MutableMapping or not? Yury strongly favors an immutable Context, and that's what his reference implementation has (https://github.com/python/cpython/pull/5027). His reasoning is that in the future we *might* want to support automatic context management for generators by default (like described in his original PEP 550), and then it's essential to use the immutable version so that "copying" the context when a generator is created or resumed is super fast (and in particular O(1)). I am personally not sure that we'll ever need it but at the same time I'm also not confident that we won't, so I give Yury the benefit of the doubt here -- after all he has spent an enormous amount of time thinking this design through so I value his intuition. In addition I agree that for most users the basic interface will be ContextVar, not Context, and the needs of framework authors are easily met by Context.run(). So I think that Yury's desire for an immutable Context will not harm anyone, and hence I support the current design of the PEP. (Though I want some of the details to be written up clearer -- everyone seems to agree with that. :-) Maybe I should clarify again what run() does. Here's how I think of it in pseudo code: def run(self, func, *args, **kwds): old = _get_current_context() new = old.copy() _set_current_context(new) try: return func(*args, **kwds) finally: _set_current_context(old) If you look carefully at the version in the PEP you'll see that it's the same thing, but the PEP inlines the implementations of _get_current_context() and _set_current_context() (which I made up -- these won't be real APIs) through manipulation of the current thread state. I hope this clarifies everything. --Guido On Wed, Jan 3, 2018 at 2:26 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Wed, Jan 3, 2018 at 12:32 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
Heh, you're right, I forgot about that. It should be more like this: def run(self, func, *args, **kwds): old = _get_current_context() _set_current_context(self) # <--- changed line try: return func(*args, **kwds) finally: _set_current_context(old) This version, like the PEP, assumes that the Context object is truly immutable (not just in name) and that you should call it like this: contextvars.copy_context().run(func, <args>) The PEP definitely needs to be clearer about this -- right now one has to skip back and forth between the specification and the implementation (and sometimes the introduction) to figure out how things really work. Help is wanted! -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
2018-01-03 23:01 GMT+01:00 Guido van Rossum <guido@python.org>:
I don't see how asyncio would use Context.run() to keep the state (variables values) between callbacks and tasks, if run() is "stateless": forgets everything at exit. I asked if it would be possible to modify run() to return a new context object with the new state, but Yury confirmed that it's not doable: Yury:
[Context.run()] can't return a new context because the callable you're running can raise an exception. In which case you'd lose modifications prior to the error.
Guido:
Yury strongly favors an immutable Context, and that's what his reference implementation has (https://github.com/python/cpython/pull/5027). His reasoning is that in the future we *might* want to support automatic context management for generators by default (like described in his original PEP 550), and then it's essential to use the immutable version so that "copying" the context when a generator is created or resumed is super fast (and in particular O(1)).
To get acceptable performances, PEP 550 and 567 require O(1) cost when copying a context, since the API requires to copy contexts frequently (in asyncio, each task has its own private context, creating a task copies the current context). Yury proposed to use "Hash Array Mapped Tries (HAMT)" to get O(1) copy. Each ContextVar.set() change creates a *new* HAMT. Extract of the PEP 567: --- def set(self, value): ts : PyThreadState = PyThreadState_Get() data : _ContextData = ts.context_data try: old_value = data.get(self) except KeyError: old_value = Token.MISSING ts.context_data = data.set(self, value) return Token(self, old_value) --- The link between ContextVar, Context and HAMT (called "context data" in the PEP 567) is non obvious: * ContextVar.set() creates a new HAMT from PyThreadState_Get().context_data and writes the new one into PyThreadState_Get().context_data -- technically, it works on a thread local storage (TLS) * Context.run() doesn't set the "current context": in practice, it sets its internal "context data" as the current context data, and then save the *new* context data in its own context data PEP 567: --- def run(self, callable, *args, **kwargs): ts : PyThreadState = PyThreadState_Get() saved_data : _ContextData = ts.context_data try: ts.context_data = self._data return callable(*args, **kwargs) finally: self._data = ts.context_data ts.context_data = saved_data --- The main key of the PEP 567 implementation is that there is no "current context" in practice. There is only a private *current* context data. Not having get_current_contex() allows the trick of context data handled by a TLS. Otherwise, I'm not sure that it would be possible to synchronize a Context object with a TLS variable.
I don't think that a mutable context would have an impact in performance, since copying "context data" will still have a cost of O(1). IMHO it's just a matter of taste for the API. Or maybe I missed something. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Oh, dang, I forgot about this. ContextVar.set() modifies the current Context in-place using a private API. In the PEP, asyncio copies the Context once and then calls run() repeatedly (for each _step call). So run() isn't stateless, it just saves and restores the notion of the current context (in the thread state). I don't have time right now to respond in more detail, sorry for shedding darkness. :-( Hopefully I'll have time Friday. On Wed, Jan 3, 2018 at 4:25 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Ok, I finally got access to a computer and I was able to test the PEP 567 implementation: see my code snippet below. The behaviour is more tricky than what I expected. While running context.run(), the context object is out of sync of the "current context". It's only synchronized again at run() exit. So ContextVar.set() doesn't immediately modifies the "current context" object (set by Context.run()). Ok, and now something completely different! What if Context looses its whole mapping API and becomes a "blackbox" object only with a run() method and no attribute? It can be documented as an object mapping context variables to their values, but don't give access to these values directly. It would avoid to have explain the weird run() behaviour (context out of sync). It would avoid to have to decide if it's a mutable or immutable mapping. It would avoid to have to explain the internal O(1) copy using HAMT. Code testing Context.run(): --- import contextvars name = contextvars.ContextVar('name', default='name') def assertNotIn(var, context): try: context[var] except LookupError: pass else: raise AssertionError("name is set is context") def func1(): name.set('yury') assert name.get() == 'yury' assertNotIn(name, context) def func2(): assert name.get() == 'yury' assert context[name] == 'yury' context = contextvars.copy_context() assert name.get() == 'name' assertNotIn(name, context) context.run(func1) assert name.get() == 'name' assert context[name] == 'yury' context.run(func2) assert name.get() == 'name' assert context[name] == 'yury' --- Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
That was actually Yury's original design, but I insisted on a Mapping so you can introspect it. (There needs to be some way to introspect a Context, for debugging.) Again, more later. :-( On Wed, Jan 3, 2018 at 4:44 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
2018-01-04 0:44 GMT+01:00 Victor Stinner <victor.stinner@gmail.com>:
A better description of Context.run() behaviour would be: "Create a copy of the context and sets it as the current context. Once the function completes, updates the context from the copy." This description explains why run() doesn't immediately updates the context object. Victor
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Wed, Jan 3, 2018 at 3:44 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
To me this sounds like a oversight (= bug), not intended behavior. At the conceptual level, I think what we want is: - Context is a mutable object representing a mapping - BUT it doesn't allow mutation through the MutableMapping interface; instead, the only way to mutate it is by calling Context.run and then ContextVar.set(). Funneling all 'set' operations through a single place makes it easier to do clever caching tricks, and it lets us avoid dealing with operations that we don't want here (like 'del') just because they happen to be in the MutableMapping interface. - OTOH we do implement the (read-only) Mapping interface because there's no harm in it and it's probably useful for debuggers. (Note that I didn't say anything about HAMTs here, because that's orthogonal implementation detail. It would make perfect sense to have Context be an opaque wrapper around a regular dict; it would just give different performance trade-offs.) -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Wed, Jan 3, 2018 at 6:35 PM, Nathaniel Smith <njs@pobox.com> wrote:
I think that in essence what Victor saw is a cache consistency issue. If you look at the implementation section in the PEP, the ContextVar.set() operation mutates _ContextData, which is a private (truly) immutable data structure that stands in for the HAMT, and the threadstate contains one of these (not a Context). When you call copy_context() you get a fresh Context that wraps the current _ContextData. Because the latter is effectively immutable this is a true clone. ctx.run() manipulates the threadstate to make the current _ContextData the one from ctx, then calls the function. If the function calls var.set(), this will create a new _ContextData that is stored in the threadstate, but it doesn't update the ctx. This is where the current state and ctx go out of sync. Once the function returns or raises, run() takes the _ContextData from the threadstate and stuffs it into ctx, resolving the inconsistency. (It then also restores the previous _ContextData that it had saved before any of this started.) So all in all Context is mutable but the only time it is mutated is when run() returns. I think Yury's POV is that you rarely if ever want to introspect a Context object that's not freshly obtained from copy_context(). I'm not sure if that's really true; it means that introspecting the context stored in an asyncio.Task may give incorrect results if it's the currently running task. Should we declare it a bug? The fix would be complex given the current implementation (either the PEP's pseudo-code or Yury's actual HAMT-based implementation). I think it would involve keeping track of the current Context in the threadstate rather than just the _ContextData, and updating the Context object on each var.set() call. And this is something that Yury wants to avoid, so that he can do more caching for var.get() (IIUC). We could also add extra words to the PEP's spec for run() explaining this temporary inconsistency. I think changing the introspection method from Mapping to something custom won't fix the basic issue (which is that there's a difference between the Context and the _ContextData, and ContextVar actually only manipulates the latter, always accessing it via the threadstate). However there's another problem with the Mapping interface, which is: what should it do with variables that are not set and have no default value? Should they be considered to have a value equal to _NO_DEFAULT or Token.MISSING? Or should they be left out of the keys altogether? The PEP hand-waves on this issue (we didn't think of missing values when we made the design). Should it be possible to introspect a Context that's not the current context?
Agreed, that's how the PEP pseudo-code does it. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 8:30 AM, Guido van Rossum <guido@python.org> wrote:
Yeah, that's a good way to think about it.
I think the fix is a little bit cumbersome, but straightforward, and actually *simplifies* caching. If we track both the _ContextData and the Context in the threadstate, then set() becomes something like: def set(self, value): # These two lines are like the current implementation new_context_data = tstate->current_context_data->hamt_clone_with_new_item(key=self, value=value) tstate->current_context_data = new_context_data # Update the Context to have the new _ContextData tstate->current_context->data = new_context_data # New caching: instead of tracking integer ids, we just need to track the Context object # This relies on the assumption that self->last_value is updated every time any Context is mutated self->last_value = value self->last_context = tstate->current_context And then the caching in get() becomes: def get(self): if tstate->current_context != self->last_context: # Update cache self->last_value = tstate->current_context_data->hamt_lookup(self) self->last_context = tstate->current_context return self->last_value (I think the current cache invalidation logic is necessary for a PEP 550 implementation, but until/unless we implement that we can get away with something simpler.) So I'd say yeah, let's declare it a bug. If it turns out that I'm wrong and there's some reason this is really difficult, then we could consider making introspection on a currently-in-use Context raise an error, instead of returning stale data. This should be pretty easy, since Contexts already track whether they're currently in use (so they can raise an error if you try to use the same Context in two threads simultaneously).
I've been thinking this over, and I don't *think* there are any design constraints that force us towards one approach or another, so it's just about what's most convenient for users. My preference for how missing values / defaults / etc. should be handled is, Context acts just like a dict that's missing its mutation methods, and ContextVar does: class ContextVar: # Note: default=None instead of default=_MAGIC_SENTINEL_VALUE # If users want to distinguish between unassigned and None, then they can # pass their own sentinel value. IME this is very rare though. def __init__(self, name, *, default=None): self.name = name self.default = default # Note: no default= argument here, because merging conflicting default= values # is inherently confusing, and not really needed. def get(self): return current_context().get(self, self.default) Rationale: I've never seen a thread local use case where you wanted *different* default values at different calls to getattr. I've seen lots of thread local use cases that jumped through hoops to make sure they used the same default everywhere, either by defining a wrapper around getattr() or by subclassing local to define fallback values. Likewise, I've seen lots of cases where having to check for whether a thread local attribute was actually defined or not was a nuisance, and never any where it was important to distinguish between missing and None. But, just in case someone does fine such a case, we should make it possible to distinguish. Allowing users to override the default= is enough to do this. And the default= argument is also useful on those occasions where someone wants a default value other than None, which does happen occasionally. For example, django.urls.resolvers.RegexURLResolver._local.populating is semantically a bool with a default value of False. Currently, it's always accessed by writing getattr(_local, "populating", False). With this approach it could instead use ContextVar("populating", default=False) and then just call get(). Everything I just said is about the ergonomics for ContextVar users, so it makes sense to handle all this inside ContextVar. OTOH, Context is a low-level interface for people writing task schedulers and debuggers, so it makes sense to keep it as simple and transparent as possible, and "it's just like a dict" is about as simple and transparent as it gets. Also, this way the pseudocode is really really short.
Should it be possible to introspect a Context that's not the current context?
I think debuggers will definitely want to be able to do things like print Context values from arbitrary tasks. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Your suggestions sound reasonable, but we are now running into a logistical problem -- I don't want to decide this unilaterally but Yury is on vacation until Jan 15. That gives us at most 2 weeks for approval of the PEP and review + commit of the implementation ( https://github.com/python/cpython/pull/5027) before the 3.7.0 feature freeze / beta (Jan 29). Your approach to defaults *may* have another advantage: perhaps we could get rid of reset() and Token. Code to set and restore a context var could just obtain the old value with get() and later restore it with set(). OTOH this would not be entirely transparent, since it would permanently add the var to the keys of the Context if the var was previously not set. [Later] Oh, drat! Writing that paragraph made me realize there was a bug in my own reasoning that led to this question about variables that have no value: they aren't present in the keys of _ContextData, so my concern about keys and values being inconsistent was unfounded. So the only reason to change the situation with defaults would be that it's more convenient. Though I personally kind of like that you can cause var.get() to raise LookupError if the value is not set -- just like all other variables. [Even later] Re: your other suggestion, why couldn't the threadstate contain just the Context? It seems one would just write tstate->current_context->_data everywhere instead of tstate->current_context_data. On Thu, Jan 4, 2018 at 4:18 PM, Nathaniel Smith <njs@pobox.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 4:48 PM, Guido van Rossum <guido@python.org> wrote:
I don't think there are any fundamental disagreements here, so I guess we can sort something out pretty quickly once Yury gets back.
Yeah, that's probably better :-). -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 3:18 PM, Nathaniel Smith <njs@pobox.com> wrote:
Actually, this trick of using the Context* as the cache validation key doesn't work :-/. The problem is that the cache doesn't hold a strong reference to the Context, so there's an ABA problem [1] if the Context is deallocated, and then later the same memory location gets used for a new Context object. It could be fixed by using weakref callbacks, but that's not really simpler than the current counter-based cache validation logic. -n [1] https://en.wikipedia.org/wiki/ABA_problem -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 00:17, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Agreed. This was something that bothered me, too. I mentioned it in my review, but that seemed to get lost in the other comments in this thread... I get the impression that the logic is that the context is immutable, but the ContextVars that it contains aren't, and the copy is deep (at least 1 level deep) so you copy then change the value of a ContextVar. But rereading that sentence, it sounds confused even to me, so it's either not right or the implementation falls foul of "If the implementation is hard to explain, it's a bad idea." :-) Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 3:25 AM, Paul Moore <p.f.moore@gmail.com> wrote:
It was get_context() in an earlier version of PEP 567. We changed it to copy_context() believing that that would clarify that you get a clone that is unaffected by subsequent ContextVar.set() operations (which affect the *current* context rather than the copy you just got). [The discussion is fragmentary because Yury is on vacation until the 15th and I am scrambling for time myself. But your long post is saved, and not forgotten.] -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 15:56, Guido van Rossum <guido@python.org> wrote:
Ah thanks. In which case, simply changing the emphasis to avoid the implication that Context objects are immutable (while that may be true in a technical/implementation sense, it's not really true in a design sense if ContextVar.set modifies the value of a variable in a context) is probably sufficient.
OK. I knew you were short on time, but hadn't realised Yury was on vacation too. Sorry if I sounded like I was nagging. Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 9:27 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Do you have a specific proposal for a wording change? PEP 567 describes Context as "a read-only mapping, implemented using an immutable dictionary." This sounds all right to me -- "read-only" is weaker than "immutable". Maybe the implementation should not be mentioned here? (The crux here is that a given Context acts as a variable referencing an immutable dict -- but it may reference different immutable dicts at different times.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 23:58, Guido van Rossum <guido@python.org> wrote:
I've been struggling to think of good alternative wordings (it's a case of "I'm not sure what you're trying to say, so I can't work out how you should say it", unfortunately). The best I can come up with is """ A Context is a mapping from ContextVar objects to their values. The Context itself exposes the Mapping interface, so cannot be modified directly - to modify the value associated with a variable you need to use the ContextVar.set() method. """ Does that explain things correctly? One thing I am sure of is that we should remove "implemented using an immutable dictionary" - it's an implementation detail, and adds nothing but confusion to mention it here. Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Fri, Jan 5, 2018 at 2:41 AM, Paul Moore <p.f.moore@gmail.com> wrote:
This is clear, but IMO there's one important detail missing: using ContextVar.set() you can only modify the current Context. This part of the PEP (in particular the definition of ContextVar.set() on line 90) also lies about what ContextVar.set() does -- it implies that Context is immutable. If it was, the recommended use of Context.run() in asyncio would make no sense, since run() clearly modifies the Context object in place. This is not an implementation detail -- it is an API detail that affects how frameworks should use Context objects. Re-reading, there's a lot of language in this part of the PEP that needs updating... :-(
Agreed. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 4:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Well, it's not *immutable* (it shouldn't support hash()), but it doesn't follow the MutableMapping protocol -- it only follows the Mapping protocol. Note that the latter carefully doesn't call itself ImmutableMapping. Context is a mutable object that implements the Mapping protocol. The only way to mutate a Context is to use var.set() when that Context is the current context. (Modulo the caching bug discussed in the subthread with Nathaniel.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 2 janv. 2018 18:57, "Guido van Rossum" <guido@python.org> a écrit : Oh, the "Specification" section in the PEP is too brief on several of these subjects. It doesn't really specify what var.get() does if the value is not set, nor does it even mention var.get(<default>) except in the code examples for var.reset(). It's also subtle that ctx[var] returns the default (if there is one). I suppose it will raise if there isn't one -- resulting in the somewhat surprising behavior where `var in ctx` may be true but `ctx[var]` may raise. And what does it raise? (All these questions are answered by the code, but they should be clearly stated in the PEP.) A variable has or has no default value. Would it make sense to expose the default value as a public read-only attribute (it would be equal to _NO_DEFAULT or Token.MISSING if there is no default) and/or add a is_set() method? is_set() returns true if the variable has a default value or if it was set in the "current context". Currently, a custom sentinel is needed to check if var.get(), ctx.get(var) and ctx[var] would raise an exception or not. Example: my_sentinel = object() is_set = (var.get(default=my_sentinel) is not my_sentinel) # no exception if is_set is true ContextVar.get() is non obvious because the variable has an optinal default, get() has an optional default parameter, and the variable can be set or not in the current context. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:45 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
But is there a common use case? For var.get() I'd rather just pass the default or catch the exception if the flow is different. Using ctx[var] is rare (mostly for printing contexts, and perhaps for explaining var.get()). -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:38, "Guido van Rossum" <guido@python.org> a écrit : But is there a common use case? For var.get() I'd rather just pass the default or catch the exception if the flow is different. Using ctx[var] is rare (mostly for printing contexts, and perhaps for explaining var.get()). I don't think that it would be a common use case. Maybe we don't need is_set(), I'm fine with catching an exception. But for introspection at least, it would help to expose the default as a read-only attribute, no? Another example of a mapping with default value: https://docs.python.org/dev/library/collections.html#collections.defaultdict And defaultdict has a default_factory attribute. The difference here is that default_factory is mandatory. ContextVar would be simpler if the default would be mandatory as well :-) Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Why ContextVar.reset(token) does nothing at the second call with the same token? What is the purpose of Token._used? I guess that there is an use case to justify this behaviour. reset() should have a result: true if the variable was restored to its previous state, false if reset() did nothing because the token was already used. And/Or Token should have a read-only "used" property. Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
What is the behaviour of ContextVar.reset(token) if the token was created from a different variable? Raise an exception? token = var1.set("value") var2.reset(token) The PEP states that Token.var only exists for debug or introspection. Victor Le 3 janv. 2018 00:51, "Victor Stinner" <victor.stinner@gmail.com> a écrit : Why ContextVar.reset(token) does nothing at the second call with the same token? What is the purpose of Token._used? I guess that there is an use case to justify this behaviour. reset() should have a result: true if the variable was restored to its previous state, false if reset() did nothing because the token was already used. And/Or Token should have a read-only "used" property. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:51 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
That depends again on the use case. The only real purpose for reset() is to be able to write a context manager that sets and restores a context variable (like in `with decimal.localcontext()`). Handling double resets is about as useful as specifying what happens if __exit__ is called twice. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
PEP: "int PyContext_Enter(PyContext *) and int PyContext_Exit(PyContext *) allow to set and restore the context for the current OS thread." What is the difference between Enter and Exit? Why not having a single Py_SetContext() function? Victor
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
I don't want to expose a SetContext operation because of, again, potential incompatibility with PEP 550, where generators expect to fully control push/pop context operation. Second, Context.run is 100% enough for *any* async framework to add support for PEP 567. And because the PEP is focused just on async, I think that we don't need anything more than 'run'. Third, I have a suspicion that we focus too much on actual Context and Context.run. These APIs are meant for asyncio/twisted/trio/etc maintainers, not for an average Python user. An average person will likely not interact with any of the PEP 567 machinery directly, wven when using PEP 567-enabled libraries like numpy/decimal. Yury On Wed, Jan 3, 2018 at 2:56 AM Victor Stinner <victor.stinner@gmail.com> wrote:
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Do you see any average Python users on this thread? We are trying to pick the PEP apart from the POV of having to use the complicated parts of the API in a framework. Victor's questions are reasonable. On Tue, Jan 2, 2018 at 10:13 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Thu, Dec 28, 2017 at 4:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Thanks, Victor!
As Nathaniel already explained, a 'default' for ContextVars is literally a default -- default value returned when a ContextVar hasn't been assigned a value in a context. So my opinion on this is that 'default' is the less ambiguous name here. [..]
I like MISSING more than NOT_SET, but this is very subjective, of course. If Guido wants to rename it I rename it.
We want _decimal and numpy to use the new API, and they will call ContextVar.get() on basically all operations, so it needs to be as fast as possible. asyncio/uvloop also want the fastest copy_context() and Context.run() possible, as they use them for *every* callback. So I think it's OK for us to add new C APIs here.
Sure, I'll change it. Yury
data:image/s3,"s3://crabby-images/2ffc5/2ffc57797bd7cd44247b24896591b7a1da6012d6" alt=""
I have a couple basic questions around how this API could be used in practice. Both of my questions are for the Python API as applied to Tasks in asyncio. 1) Would this API support looking up the value of a context variable for **another** Task? For example, if you're managing multiple tasks using asyncio.wait() and there is an exception in some task, you might want to examine and report the value of a context variable for that task. 2) Would an appropriate use of this API be to assign a unique task id to each task? Or can that be handled more simply? I'm wondering because I recently thought this would be useful, and it doesn't seem like asyncio means for one to subclass Task (though I could be wrong). Thanks, --Chris On Wed, Dec 27, 2017 at 10:08 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Thu, Dec 28, 2017 at 5:28 AM, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
No, unless that another Task explicitly shares the value or captures its context and shares it. Same as with threading.local.
The API should be used to share one ID between a Task and tasks it creates. You can use it to store individual Task IDs, but a combination of a WeakKeyDictionary and Task.current_task() seems to be a better/easier option. Yury
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 28 December 2017 at 06:08, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This is a second version of PEP 567.
Overall, I like the proposal. It's relatively straightforward to follow, and makes sense. One thing I *don't* see in the PEP is an example of how code using thread-local storage should be modified to use context variables. My impression is that it's simply a matter of replacing the TLS calls with equivalent ContextVar calls, but an example might be helpful. Some detail points below.
I understand how this could happen, having followed the discussions here, but a (simple) example of the issue might be useful.
Accessed by copying it? That seems weird to me. I'd expect either that you'd be able to access the current Context directly, *or* that you'd say that the current Context is not directly accessible by the user, but that a copy can be obtained using copy_context. But given that the Context is immutable, why the need tp copy it? Also, the references to threads in the above are confusing. It says that this is a well-known concept in terms of thread-local storage, but this case is different. It then goes on to say that the current Context is stored in thread local storage, which gives me the impression that the new idea *is* related to thread local storage... I think that the fact that a Context is held in thread-local storage is an implementation detail. Assuming I'm right, don't bother mentioning it - simply say that there's a notion of a current Context and leave it at that.
On first reading, this confused me because I didn't spot that you're saying a *Context* is read-only, but a *ContextVar* has get and set methods. Maybe reword this to say that a Context is a read-only mapping from ContextVars to values. A ContextVar has a get method that looks up its value in the current Context, and a set method that replaces the current Context with a new one that associates the specified value with this ContextVar. (The current version feels confusing to me because it goes into too much detail on how the implementation does this, rather than sticking to the high-level specification)
Context.run() came a bit out of nowhere here. Maybe the part from "It is not possible..." should be in the introduction above? Something like the following, covering this and copy_context: The current Context cannot be accessed directly by user code. If the frameowrk wants to run some code in a different Context, the Context.run(callable, *args, **kwargs) method is used to do that. To construct a new context for this purpose, the current context can be copied via the copy_context function, and manipulated prior to the call to run().
My first thought was that default was the context variable's initial value. But if that's what it is, why not call it that? If the default has another effect as well as being the initial value, maybe clarify here what that is?
get doesn't take an argument. Typo?
same typo?
[...] I haven't commented on these as they aren't my area of expertise.
Again, I'm ignoring this as I don't really have an interest in how the facility is implemented.
Would it be worth exposing this data structure elsewhere, in case other uses for it exist?
Should the cache (or at least the performance guarantees it implies) be part of the spec? Do we care if other implementations fail to implement a cache?
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Some inline responses to Paul (I snipped everything I agree with). On Wed, Jan 3, 2018 at 3:34 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Because it's not immutable. (I think by now people following this thread understand that.) The claims or implications in the PEP that Context is immutable are wrong (and contradict the recommended use of run() for asyncio, for example).
The PEP's language does seem confused. This is because it doesn't come out and define the concept of "task". The correspondence is roughly that thread-locals are to threads what Contexts and ContextVars are to tasks. (This still requires quite a bit of squinting due to the API differences but it's better than what the PEP says.)
No, actually it's important that each thread has its own current context. It is even possible to pass Contexts between threads, e.g. if you have a ThreadExecutor e, you can call e.submit(ctx.run, some_function, args...). However, run() is not thread-safe! @Yury: There's an example that does almost exactly this in the PEP, but I think it could result in a race condition if you called run() concurrently on that same context in a different thread -- whichever run() finishes last will overwrite the other's state. I think we should just document this and recommend always using a fresh copy in such scenarios. Hm, does this mean it would be good to have an explicit Context.copy() method? Or should we show how to create a copy of an arbitrary Context using ctx.run(copy_context)?
We went over this passage in another subthread. IMO what it says about ContextVar.set() is incorrect.
I agree that run() comes out of nowhere but I'd suggest a simpler fix -- just say "Instead, Context.run() must be used, see below."
IMO it's more and different than the "initial value". The ContextVar never gets set directly to the default -- you can verify this by checking "var in ctx" for a variable that has a default but isn't set -- it's not present. It really is used as the "default default" by ContextVar.get(). That's not an implementation detail.
Actually it does, the argument specifies a default (to override the "default default" set in the constructor). However this hasn't been mentioned yet at this point (the description of ContextVar.get() earlier doesn't mention it, only the implementation below). It would be good to update the earlier description of ContextVar.get() to mention the optional default (and how it interacts with the "default default").
asyncio -------
(Too bad, since there's an important clue about the mutability of Context hidden in this section! :-)
(Again, too bad, since despite the section heading this acts as a pseudo-code specification that is much more exact than the "specification" section above.)
I've asked Yury this several times, but he's shy about exposing it. Maybe it's better to wait until 3.8 so the implementation and its API can stabilize a bit before the API is constrained by backwards compatibility.
IMO it's a quality-of-implementation issue, but the speed of the CPython implementation plays an important role in acceptance of the PEP (since we don't want to slow down e.g. asyncio task creation). -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Hi, I like the new version of the PEP using "read only mapping" and copy_context(). It's easier to understand. I'm ok with seeing a context as a mapping, but I am confused about a context variable considered as a mapping item. I still see a context variable as a variable, so something which has a value or not. I just propose to rename the default parameter of the ContextVar constructor. Le 28 déc. 2017 7:10 AM, "Yury Selivanov" <yselivanov.ml@gmail.com> a écrit : ContextVar ---------------------- The ``ContextVar`` class has the following constructor signature: ``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter is used only for introspection and debug purposes, and is exposed as a read-only ``ContextVar.name`` attribute. The ``default`` parameter is optional. Example:: # Declare a context variable 'var' with the default value 42. var = ContextVar('var', default=42) In term of API, "default" parameter name is strange. Why not simply calling it "value"? var = ContextVar('var', default=42) and: var = ContextVar('var') var.set (42) behaves the same, no? The implementation explains where the "default" name comes from, but IMHO "value" is a better name. (The ``_NO_DEFAULT`` is an internal sentinel object used to detect if the default value was provided.) I would call it _NOT_SET. * a read-only attribute ``Token.old_value`` set to the value the variable had before the ``set()`` call, or to ``Token.MISSING`` if the variable wasn't set before. Hum, I also suggest to rename Token.MISSING to Token.NOT_SET. It would be more conistent with the last sentence. C API ----- Would it be possible to make this API private? 2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value)``: (...) ``value`` is always a borrowed reference. I'm not sure that it's a good idea to add a new public C function which returns a borrowed reference. I would prefer to only use (regular) strong references in the public API. I don't want to elaborate here. You may see: http://vstinner.readthedocs.io/python_new_stable_api.html Internally, I don't care, do whatever you want for best performances :-) Victor
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Dec 28, 2017 at 1:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
No, they're different. The second sets the value in the current context. The first sets the value in all contexts that currently exist, and all empty contexts created in the future. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
NLe 28 déc. 2017 11:20 AM, "Nathaniel Smith" <njs@pobox.com> a écrit : On Thu, Dec 28, 2017 at 1:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
No, they're different. The second sets the value in the current context. The first sets the value in all contexts that currently exist, and all empty contexts created in the future. Oh, that's an important information. In this case, "default" is the best name. The PEP may be more explicit about the effect on all contexts. Proposition of documentation: "The optional *default* parameter is the default value in all contexts. If the variable is not set in the current context, it is returned by by context[var_name] and by var.get(), when get() is called without the default parameter." Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Oh, the "Specification" section in the PEP is too brief on several of these subjects. It doesn't really specify what var.get() does if the value is not set, nor does it even mention var.get(<default>) except in the code examples for var.reset(). It's also subtle that ctx[var] returns the default (if there is one). I suppose it will raise if there isn't one -- resulting in the somewhat surprising behavior where `var in ctx` may be true but `ctx[var]` may raise. And what does it raise? (All these questions are answered by the code, but they should be clearly stated in the PEP.) I would really like to invite more people to review this PEP! I expect I'll be accepting it in the next two weeks, but it needs to go through more rigorous review. On Thu, Dec 28, 2017 at 4:48 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
I read again the PEP and I am still very confused by Context.run(). The PEP states multiple times that a context is immutable: * "read-only mapping" * inherit from Mapping, not from MutableMapping But run() does modify the context (or please correct me if I completely misunderstood the PEP! I had to read it 3 times to check if run() mutates or not the context). It would help if the ctx.run() example in the PEP would not only test var.get() but also test ctx.get(var). Or maybe show that the variable value is kept in a second function call, but the variable is "restored" between run() calls. The PEP tries hard to hide "context data", which is the only read only thing in the whole PEP, whereas it's a key concept to understand the implementation. I understood that: * _ContextData is immutable * ContextVar.set() creates a new _ContextData and sets it in the current Python thread state * When the called function completes, Context.run() sets its context data to the new context data from the Python thread state: so run() does modify the "immutable" context The distinction between the internal/hiden *immutable* context data and public/visible "mutable" (from my point of view) context is unclear to me in the PEP. The concept of "current context" is not defined in the PEP. In practice, there is no "current context", there is only a "current context data" in the current Python thread. There is no need for a concrete context instance to store variable variables values. It's also hard to understand that in the PEP. Why Context could not inherit from MutableMapping? (Allow ctx.set(var, value) and ctx [var] = value.) Is it just to keep the API small: changes should only be made using var.set()? Or maybe Context.run() should really be immutable and return the result of the called function *and* a new context? But I dislike such theorical API, since it would be complex to return the new context if the called function raises an exception. Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Hum, it seems like the specification (API) part of the PEP is polluted by its implementation. The PEP just require a few minor changes to better describe the behaviour/API instead of insisting on the read only internal thing which is specific to the proposed implementation which is just one arbitrary implemention (designed for best performances). IMHO the PEP shouldn't state that a context is read only. From my point of view, it's mutable and it's the mapping holding variable values. There is a current context which holds the current values. Context.run() switchs temporarely the current context with another context. The fact that there is no concrete context instance by default doesn't really matter in term of API. Victor Le 3 janv. 2018 00:34, "Victor Stinner" <victor.stinner@gmail.com> a écrit :
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 5:30 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
Yeah, we need some more words. I still hope someone proposes some, but if threats fail I will have to try to find time myself.
I think the issue here is a bit different than Yury's response suggests -- it's more like how a variable containing an immutable value (e.g. a string) can be modified, e.g. x = 'a' x += 'b' In our case the *variable* is the current thread state (in particular the slot therein that holds the context -- this slot can be modified by the C API). The *value* is the Context object. It is a collections.Mapping (or typing.Mapping) which does not have mutating methods. (The mutable type is called MutableMapping.) The *reason* for doing it this way is that Yury doesn't want Context to implement __delitem__, since it would complicate the specification of chained lookups by a future PEP, and chained lookups look to be the best option to extend the Context machinery for generators. We're not doing that in Python 3.7 (PEP 550 v2 and later did this but it was too complex) but we might want to do in 3.8 or 3.9 once we are comfortable with PEP 567. (Or not -- it's not clear to me that generators bite decimal users the way tasks do. Coroutines always run on behalf of a task so they're not a problem.) --Guido
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:34, "Guido van Rossum" <guido@python.org> a écrit : I think the issue here is a bit different than Yury's response suggests -- it's more like how a variable containing an immutable value (e.g. a string) can be modified, e.g. x = 'a' x += 'b' In our case the *variable* is the current thread state (in particular the slot therein that holds the context -- this slot can be modified by the C API). The *value* is the Context object. It is a collections.Mapping (or typing.Mapping) which does not have mutating methods. (The mutable type is called MutableMapping.) I can see a parallel with a Python namespace, like globals and locals arguments of exec(): ns = globals().copy() # ctx = copy_context() exec("x = 'a'", ns, ns) # ctx.run(...) ns['x'] += 'b' # Context ??? print(ns ['x']) # print(ctx[x]) The *reason* for doing it this way is that Yury doesn't want Context to implement __delitem__, since it would complicate the specification of chained lookups by a future PEP, and chained lookups look to be the best option to extend the Context machinery for generators. Again, why not just raise an exception on "del ctx[var]"? Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:30 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
You've convinced me that Context is neither immutable nor read-only, and the PEP should admit this. Its *Mapping* interface doesn't allow mutations but its run() method does. E.g. here's a function that mutates a Context, effectively doing vtx[var] = value: def mutate_context(ctx, var, value): ctx.run(lambda: var.set(value)) However you've not convinced me that it would be better to make Context implement the full MutableMapping interface (with `__delitem__` always raising). There are use cases for inspecting Context, e.g. a debugger that wants to print the Context for some or all suspended tasks. But I don't see a use case for mutating a Context object that's not the current context, and when it is the current context, ContextVar.set() is more direct. I also don't see use cases for other MutableMapping methods like pop() or update(). (And clear() falls under the same prohibition as __delitem__().) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 6:09 PM, Guido van Rossum <guido@python.org> wrote:
I was looking at this again, and I realized there's some confusion. I've been repeating the thing about not wanting to implement __delitem__ too, but actually, I think __delitem__ is not the problem :-). The problem is that we don't want to provide ContextVar.unset() -- that's the operation that adds complication in a PEP 550 world. If you have a stack of Contexts that ContextVar.get() searches, and set/unset are only allowed to mutate the top entry in the stack, then the only way to implement unset() is to do something like context_stack[-1][self] = _MISSING, so it can hide any entries below it in the stack. This is extra complication for a feature that it's not clear anyone cares about. (And if it turns out people do care, we could add it later.) Deleting entries from individual Context objects shouldn't create conceptual problems. OTOH I don't see how it's useful either. I don't think implementing MutableMapping would actually cause problems, but it's a bunch of extra code to write/test/maintain without any clear use cases. This does make me think that I should write up a short PEP for extending PEP 567 to add context lookup, PEP 550 style: it can start out in Status: deferred and then we can debate it properly before 3.8, but at least having the roadmap written down now would make it easier to catch these details. (And it might also help address Paul's reasonable complaint about "unstated requirements".) -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 7:58 PM, Nathaniel Smith <njs@pobox.com> wrote:
Ah yes, that's it. A stack of Contexts could have the same semantics as ChainMap, in which __delitem__ deletes the key from the first mapping if present and otherwise raises KeyError even if it is present in a later mapping. That's enough to implement var.reset(), but not enough to implement arbitrary var.unset(). I'm fine with only having var.reset() and not var.unset().
The implementation would have to maintain cache consistency, but that's not necessarily a big deal, and it only needs to be done in a few places. But I agree that the a case hasn't been indicated yet. (I like being able to specify ContextVar's behavior using Context as a MutableMapping, but that's not a real use case.)
Anything that will help us kill a 550-pound gorilla sounds good to me. :-) It might indeed be pretty short if we follow the lead of ChainMap (even using a different API than MutableMapping to mutate it). Maybe copy_context() would map to new_child()? Using ChainMap as a model we might even avoid the confusion between Lo[gi]calContext and ExecutionContext which was the nail in PEP 550's coffin. The LC associated with a generator in PEP 550 would be akin to a loose dict which can be pushed on top of a ChainMap using cm = cm.new_child(<dict>). (Always taking for granted that instead of an actual dict we'd use some specialized mutable object implementing the Mapping protocol and a custom mutation protocol so it can maintain ContextVar cache consistency.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 9:42 PM, Guido van Rossum <guido@python.org> wrote:
The approach I took in PEP 568 is even simpler, I think. The PEP is a few pages long because I wanted to be exhaustive to make sure we weren't missing any details, but the tl;dr is: The ChainMap lives entirely inside the threadstate, so there's no need to create a LC/EC distinction -- users just see Contexts, or there's the one stack introspection API, get_context_stack(), which returns a List[Context]. Instead of messing with new_child, copy_context is just Context(dict(chain_map)) -- i.e., it creates a flattened copy of the current mapping. (If we used new_child, then we'd have to have a way to return a ChainMap, reintroducing the LC/EC mess. Plus it would allow for changes to the Context's inside one ChainMap to "leak" into the child or vice-versa.) And Context push/pop is just push/pop on the threadstate's ChainMap's '.maps' attribute. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/75a60/75a60da5c274977fef5c33fea963c84db3b74279" alt=""
This sounds reasonable. Although keep in mind that merging hamt is still an expensive operation, so flattening shouldn't always be performed (this is covered in 550). I also wouldn't call LC/EC a "mess". Your pep just names things differently, but otherwise is entirely built on concepts and ideas introduced in pep 550. Yury
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Tue, Jan 9, 2018 at 2:59 AM, Yury Selivanov <yselivanov@gmail.com> wrote:
Right, the PEP mostly focuses on the semantics rather than the implementation and this is an implementation detail (the user can't tell whether a Context internally holds a stack of HAMTs or just one). But there is a note that we might choose to perform the actual flattening lazily if it turns out to be worthwhile.
I also wouldn't call LC/EC a "mess". Your pep just names things differently, but otherwise is entirely built on concepts and ideas introduced in pep 550.
Sorry for phrasing it like that -- I just meant that at the API level, the LC/EC split caused a lot of confusion (and apparently this was it's "nail in the coffin"!). In the PEP 567/568 version, the underlying concepts are the same, but the API ends up being simpler. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Wed, Jan 3, 2018 at 2:36 AM Victor Stinner <victor.stinner@gmail.com> wrote:
tuples in Python are immutable, but you can have a tuple with a dict as its single element. The tuple is immutable, the dict is mutable. At the C level we have APIs that can mutate a tuple though. Now, tuple is not a direct analogy to Context, but there are some parallels. Context is a container like tuple, with some additional APIs on top.
Because that would be confusing to end users. ctx = copy_context() ctx[var] = something What did we just do? Did we modify the 'var' in the code that is currently executing? No, you still need to call Context.run to see the new value for var. Another problem is that MutableMapping defines a __delitem__ method, which i don't want the Context to implement. Deleting variables like that is incompatible with PEP 550, where it's ambiguous (due to the stacked nature of contexts). Now we don't want PEP 550 in 3.7, but I want to keep the door open for its design, in case we want context to work with generators.
It can't return a new context because the callable you're running can raise an exception. In which case you'd lose modifications prior to the error. ps i'm on vacation and don't always have an internet connection. yury
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:05, "Yury Selivanov" <yselivanov.ml@gmail.com> a écrit : tuples in Python are immutable, but you can have a tuple with a dict as its single element. The tuple is immutable, the dict is mutable. At the C level we have APIs that can mutate a tuple though. Now, tuple is not a direct analogy to Context, but there are some parallels. Context is a container like tuple, with some additional APIs on top. Sorry, I don't think that it's a good analogy. Context.run() is a public method accessible in Python which allows to modify the context. A tuple doesn't have such method. While it's technically possible to modify a tuple or a str at C level, it's a bad practice leading to complex bugs when it's not done carefully: see https://bugs.python.org/issue30156 property_descr_get() optimization was fixed twice but still has a bug. I proposed a PR to remove the hack. Why Context could not inherit from MutableMapping? (Allow ctx.set(var,
value) and ctx [var] = value.) Is it just to keep the API small: changes should only be made using var.set()?
Because that would be confusing to end users. ctx = copy_context() ctx[var] = something What did we just do? Did we modify the 'var' in the code that is currently executing? No, you still need to call Context.run to see the new value for var. IMHO it's easy to understand that modifying a *copy* of the current context doesn't impact the current context. It's one the first thing to learn when learning Python: a = [1, 2] b = a.copy() b.append(3) assert a == [1, 2] assert b == [1, 2, 3] Another problem is that MutableMapping defines a __delitem__ method, which i don't want the Context to implement. I wouldn't be shocked if "del ctx [var]" would raise an exception. I almost never use del anyway. I prefer to assign a variable to None, since "del var" looks like C++ destructor whereas it's more complex than a direct call to the destructor. But it's annoying to have to call a function with Context.run() whereas context is just a mutable mapping. It seems overkill to me to have to call run() to modify a context variable: run() changes temporarely the context and requires to use the indirect ContextVar API, while I know that ContextVar.set() modifies the context. Except of del corner case, I don't see any technical reason to prevent direct modification of a context. contextvars isn't new, it extends what we already have: decimal context. And decimal quick start documentation shows how to modify a context and then set it as the current context:
https://docs.python.org/dev/library/decimal.html Well, technically it doesn't modify a context. An example closer to contextvars would be:
Note: "getcontext().prec = 6" does modify the decimal context directly, and it's the *first* example in the doc. But here contextvars is different since there is no API to get the current API. The lack of API to access directly the current contextvars context is the main difference with decimal context, and I'm fine with that. It's easy to see a parallel since decimal context can be copied using Context.copy(), it has also multiple (builtin) "variables", it's just that the API is different (decimal context variables are modified as attributes), and it's possible to set a context using decimal.setcontext(). Victor
data:image/s3,"s3://crabby-images/75a60/75a60da5c274977fef5c33fea963c84db3b74279" alt=""
Do you have any use case for modifying a variable inside some context? numpy, decimal, or some sort of tracing for http requests or async frameworks like asyncio do not need that.
I think you are confusing context in decimal and pep 567. Decimal context is a mutable object. We use threading.local to store it. With pep 567 you will use a context variable behind the scenes to store it. I think it's incorrect to compare decimal contexts to pep567 in any way. Yury
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Do you have any use case for modifying a variable inside some context?
numpy, decimal, or some sort of tracing for http requests or async frameworks like asyncio do not need that.
Maybe I misunderstood how contextvars is supposed to be used. So let me give you an example. I understand that decimal.py will declare its context variable like this: --- contextvar = contextvars.ContextVar('decimal', default=Context(...)) --- Later if I would like to run an asyncio callback with a different decimal context, I would like to write: --- cb_context = contextvars.copy_context() decimal_context = cb_context[decimal.contextvar].copy() decimal_context.prec = 100 cb_context[decimal.contextvar] = decimal_context # <--- HERE loop.call_soon(func, context=cb_context) ---- The overall code would behaves as: --- with localcontext() as ctx: ctx.prec = 100 loop.call_soon(func) --- I don't know if the two code snippets have exactly the same behaviour. I don't want to modify func() to run it with a different decimal context. So I would prefer to not have to call decimal.contextvar.set() or decimal.setcontext() in func(). But I would need contextvars.Context[var]=value to support such use case. Decimal contexts are mutable, so modifying directly the decimal context object would impact all contexts which isn't my intent here. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
I think decimal is a bit of a red herring, because the entire decimal context is stored as a single thread-local variable (_local.__decimal_context__ in _pydecimal.py, where _local is a global threading.local instance; using "___DECIMAL_CTX__" as a key in the C-level dict of thread-locals in the C version). Nevertheless there is the issue of whether it's better to make contextvars.Context a MutableMapping or to make it an (immutable) Mapping. the current context, sets that as the current context, runs the function, and then restores the previous context (the one that it copied). With a truly immutable Context offering only the (immutable) Mapping interface (plus an internal API that returns a new Context that has a different value for one key), ContextVar.set() is a bit more complicated because it has to use set_context() (actually an internal thing that updates the current context in the thread state) and similar for ContextVar.reset(token). (An alternative design is possible where a Context is an immutable-looking wrapper around a plain dict, with private APIs that mutate that dict, but apart from having different invariants about the identities of Context objects it works out about the same from a user's POV.) Anyway, the differences between these are user-visible so we can't make this an implementation detail: We have to choose. Should Context be a MutableMapping or not? Yury strongly favors an immutable Context, and that's what his reference implementation has (https://github.com/python/cpython/pull/5027). His reasoning is that in the future we *might* want to support automatic context management for generators by default (like described in his original PEP 550), and then it's essential to use the immutable version so that "copying" the context when a generator is created or resumed is super fast (and in particular O(1)). I am personally not sure that we'll ever need it but at the same time I'm also not confident that we won't, so I give Yury the benefit of the doubt here -- after all he has spent an enormous amount of time thinking this design through so I value his intuition. In addition I agree that for most users the basic interface will be ContextVar, not Context, and the needs of framework authors are easily met by Context.run(). So I think that Yury's desire for an immutable Context will not harm anyone, and hence I support the current design of the PEP. (Though I want some of the details to be written up clearer -- everyone seems to agree with that. :-) Maybe I should clarify again what run() does. Here's how I think of it in pseudo code: def run(self, func, *args, **kwds): old = _get_current_context() new = old.copy() _set_current_context(new) try: return func(*args, **kwds) finally: _set_current_context(old) If you look carefully at the version in the PEP you'll see that it's the same thing, but the PEP inlines the implementations of _get_current_context() and _set_current_context() (which I made up -- these won't be real APIs) through manipulation of the current thread state. I hope this clarifies everything. --Guido On Wed, Jan 3, 2018 at 2:26 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Wed, Jan 3, 2018 at 12:32 PM, Glenn Linderman <v+python@g.nevcal.com> wrote:
Heh, you're right, I forgot about that. It should be more like this: def run(self, func, *args, **kwds): old = _get_current_context() _set_current_context(self) # <--- changed line try: return func(*args, **kwds) finally: _set_current_context(old) This version, like the PEP, assumes that the Context object is truly immutable (not just in name) and that you should call it like this: contextvars.copy_context().run(func, <args>) The PEP definitely needs to be clearer about this -- right now one has to skip back and forth between the specification and the implementation (and sometimes the introduction) to figure out how things really work. Help is wanted! -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
2018-01-03 23:01 GMT+01:00 Guido van Rossum <guido@python.org>:
I don't see how asyncio would use Context.run() to keep the state (variables values) between callbacks and tasks, if run() is "stateless": forgets everything at exit. I asked if it would be possible to modify run() to return a new context object with the new state, but Yury confirmed that it's not doable: Yury:
[Context.run()] can't return a new context because the callable you're running can raise an exception. In which case you'd lose modifications prior to the error.
Guido:
Yury strongly favors an immutable Context, and that's what his reference implementation has (https://github.com/python/cpython/pull/5027). His reasoning is that in the future we *might* want to support automatic context management for generators by default (like described in his original PEP 550), and then it's essential to use the immutable version so that "copying" the context when a generator is created or resumed is super fast (and in particular O(1)).
To get acceptable performances, PEP 550 and 567 require O(1) cost when copying a context, since the API requires to copy contexts frequently (in asyncio, each task has its own private context, creating a task copies the current context). Yury proposed to use "Hash Array Mapped Tries (HAMT)" to get O(1) copy. Each ContextVar.set() change creates a *new* HAMT. Extract of the PEP 567: --- def set(self, value): ts : PyThreadState = PyThreadState_Get() data : _ContextData = ts.context_data try: old_value = data.get(self) except KeyError: old_value = Token.MISSING ts.context_data = data.set(self, value) return Token(self, old_value) --- The link between ContextVar, Context and HAMT (called "context data" in the PEP 567) is non obvious: * ContextVar.set() creates a new HAMT from PyThreadState_Get().context_data and writes the new one into PyThreadState_Get().context_data -- technically, it works on a thread local storage (TLS) * Context.run() doesn't set the "current context": in practice, it sets its internal "context data" as the current context data, and then save the *new* context data in its own context data PEP 567: --- def run(self, callable, *args, **kwargs): ts : PyThreadState = PyThreadState_Get() saved_data : _ContextData = ts.context_data try: ts.context_data = self._data return callable(*args, **kwargs) finally: self._data = ts.context_data ts.context_data = saved_data --- The main key of the PEP 567 implementation is that there is no "current context" in practice. There is only a private *current* context data. Not having get_current_contex() allows the trick of context data handled by a TLS. Otherwise, I'm not sure that it would be possible to synchronize a Context object with a TLS variable.
I don't think that a mutable context would have an impact in performance, since copying "context data" will still have a cost of O(1). IMHO it's just a matter of taste for the API. Or maybe I missed something. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Oh, dang, I forgot about this. ContextVar.set() modifies the current Context in-place using a private API. In the PEP, asyncio copies the Context once and then calls run() repeatedly (for each _step call). So run() isn't stateless, it just saves and restores the notion of the current context (in the thread state). I don't have time right now to respond in more detail, sorry for shedding darkness. :-( Hopefully I'll have time Friday. On Wed, Jan 3, 2018 at 4:25 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Ok, I finally got access to a computer and I was able to test the PEP 567 implementation: see my code snippet below. The behaviour is more tricky than what I expected. While running context.run(), the context object is out of sync of the "current context". It's only synchronized again at run() exit. So ContextVar.set() doesn't immediately modifies the "current context" object (set by Context.run()). Ok, and now something completely different! What if Context looses its whole mapping API and becomes a "blackbox" object only with a run() method and no attribute? It can be documented as an object mapping context variables to their values, but don't give access to these values directly. It would avoid to have explain the weird run() behaviour (context out of sync). It would avoid to have to decide if it's a mutable or immutable mapping. It would avoid to have to explain the internal O(1) copy using HAMT. Code testing Context.run(): --- import contextvars name = contextvars.ContextVar('name', default='name') def assertNotIn(var, context): try: context[var] except LookupError: pass else: raise AssertionError("name is set is context") def func1(): name.set('yury') assert name.get() == 'yury' assertNotIn(name, context) def func2(): assert name.get() == 'yury' assert context[name] == 'yury' context = contextvars.copy_context() assert name.get() == 'name' assertNotIn(name, context) context.run(func1) assert name.get() == 'name' assert context[name] == 'yury' context.run(func2) assert name.get() == 'name' assert context[name] == 'yury' --- Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
That was actually Yury's original design, but I insisted on a Mapping so you can introspect it. (There needs to be some way to introspect a Context, for debugging.) Again, more later. :-( On Wed, Jan 3, 2018 at 4:44 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
2018-01-04 0:44 GMT+01:00 Victor Stinner <victor.stinner@gmail.com>:
A better description of Context.run() behaviour would be: "Create a copy of the context and sets it as the current context. Once the function completes, updates the context from the copy." This description explains why run() doesn't immediately updates the context object. Victor
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Wed, Jan 3, 2018 at 3:44 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
To me this sounds like a oversight (= bug), not intended behavior. At the conceptual level, I think what we want is: - Context is a mutable object representing a mapping - BUT it doesn't allow mutation through the MutableMapping interface; instead, the only way to mutate it is by calling Context.run and then ContextVar.set(). Funneling all 'set' operations through a single place makes it easier to do clever caching tricks, and it lets us avoid dealing with operations that we don't want here (like 'del') just because they happen to be in the MutableMapping interface. - OTOH we do implement the (read-only) Mapping interface because there's no harm in it and it's probably useful for debuggers. (Note that I didn't say anything about HAMTs here, because that's orthogonal implementation detail. It would make perfect sense to have Context be an opaque wrapper around a regular dict; it would just give different performance trade-offs.) -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Wed, Jan 3, 2018 at 6:35 PM, Nathaniel Smith <njs@pobox.com> wrote:
I think that in essence what Victor saw is a cache consistency issue. If you look at the implementation section in the PEP, the ContextVar.set() operation mutates _ContextData, which is a private (truly) immutable data structure that stands in for the HAMT, and the threadstate contains one of these (not a Context). When you call copy_context() you get a fresh Context that wraps the current _ContextData. Because the latter is effectively immutable this is a true clone. ctx.run() manipulates the threadstate to make the current _ContextData the one from ctx, then calls the function. If the function calls var.set(), this will create a new _ContextData that is stored in the threadstate, but it doesn't update the ctx. This is where the current state and ctx go out of sync. Once the function returns or raises, run() takes the _ContextData from the threadstate and stuffs it into ctx, resolving the inconsistency. (It then also restores the previous _ContextData that it had saved before any of this started.) So all in all Context is mutable but the only time it is mutated is when run() returns. I think Yury's POV is that you rarely if ever want to introspect a Context object that's not freshly obtained from copy_context(). I'm not sure if that's really true; it means that introspecting the context stored in an asyncio.Task may give incorrect results if it's the currently running task. Should we declare it a bug? The fix would be complex given the current implementation (either the PEP's pseudo-code or Yury's actual HAMT-based implementation). I think it would involve keeping track of the current Context in the threadstate rather than just the _ContextData, and updating the Context object on each var.set() call. And this is something that Yury wants to avoid, so that he can do more caching for var.get() (IIUC). We could also add extra words to the PEP's spec for run() explaining this temporary inconsistency. I think changing the introspection method from Mapping to something custom won't fix the basic issue (which is that there's a difference between the Context and the _ContextData, and ContextVar actually only manipulates the latter, always accessing it via the threadstate). However there's another problem with the Mapping interface, which is: what should it do with variables that are not set and have no default value? Should they be considered to have a value equal to _NO_DEFAULT or Token.MISSING? Or should they be left out of the keys altogether? The PEP hand-waves on this issue (we didn't think of missing values when we made the design). Should it be possible to introspect a Context that's not the current context?
Agreed, that's how the PEP pseudo-code does it. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 8:30 AM, Guido van Rossum <guido@python.org> wrote:
Yeah, that's a good way to think about it.
I think the fix is a little bit cumbersome, but straightforward, and actually *simplifies* caching. If we track both the _ContextData and the Context in the threadstate, then set() becomes something like: def set(self, value): # These two lines are like the current implementation new_context_data = tstate->current_context_data->hamt_clone_with_new_item(key=self, value=value) tstate->current_context_data = new_context_data # Update the Context to have the new _ContextData tstate->current_context->data = new_context_data # New caching: instead of tracking integer ids, we just need to track the Context object # This relies on the assumption that self->last_value is updated every time any Context is mutated self->last_value = value self->last_context = tstate->current_context And then the caching in get() becomes: def get(self): if tstate->current_context != self->last_context: # Update cache self->last_value = tstate->current_context_data->hamt_lookup(self) self->last_context = tstate->current_context return self->last_value (I think the current cache invalidation logic is necessary for a PEP 550 implementation, but until/unless we implement that we can get away with something simpler.) So I'd say yeah, let's declare it a bug. If it turns out that I'm wrong and there's some reason this is really difficult, then we could consider making introspection on a currently-in-use Context raise an error, instead of returning stale data. This should be pretty easy, since Contexts already track whether they're currently in use (so they can raise an error if you try to use the same Context in two threads simultaneously).
I've been thinking this over, and I don't *think* there are any design constraints that force us towards one approach or another, so it's just about what's most convenient for users. My preference for how missing values / defaults / etc. should be handled is, Context acts just like a dict that's missing its mutation methods, and ContextVar does: class ContextVar: # Note: default=None instead of default=_MAGIC_SENTINEL_VALUE # If users want to distinguish between unassigned and None, then they can # pass their own sentinel value. IME this is very rare though. def __init__(self, name, *, default=None): self.name = name self.default = default # Note: no default= argument here, because merging conflicting default= values # is inherently confusing, and not really needed. def get(self): return current_context().get(self, self.default) Rationale: I've never seen a thread local use case where you wanted *different* default values at different calls to getattr. I've seen lots of thread local use cases that jumped through hoops to make sure they used the same default everywhere, either by defining a wrapper around getattr() or by subclassing local to define fallback values. Likewise, I've seen lots of cases where having to check for whether a thread local attribute was actually defined or not was a nuisance, and never any where it was important to distinguish between missing and None. But, just in case someone does fine such a case, we should make it possible to distinguish. Allowing users to override the default= is enough to do this. And the default= argument is also useful on those occasions where someone wants a default value other than None, which does happen occasionally. For example, django.urls.resolvers.RegexURLResolver._local.populating is semantically a bool with a default value of False. Currently, it's always accessed by writing getattr(_local, "populating", False). With this approach it could instead use ContextVar("populating", default=False) and then just call get(). Everything I just said is about the ergonomics for ContextVar users, so it makes sense to handle all this inside ContextVar. OTOH, Context is a low-level interface for people writing task schedulers and debuggers, so it makes sense to keep it as simple and transparent as possible, and "it's just like a dict" is about as simple and transparent as it gets. Also, this way the pseudocode is really really short.
Should it be possible to introspect a Context that's not the current context?
I think debuggers will definitely want to be able to do things like print Context values from arbitrary tasks. -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Your suggestions sound reasonable, but we are now running into a logistical problem -- I don't want to decide this unilaterally but Yury is on vacation until Jan 15. That gives us at most 2 weeks for approval of the PEP and review + commit of the implementation ( https://github.com/python/cpython/pull/5027) before the 3.7.0 feature freeze / beta (Jan 29). Your approach to defaults *may* have another advantage: perhaps we could get rid of reset() and Token. Code to set and restore a context var could just obtain the old value with get() and later restore it with set(). OTOH this would not be entirely transparent, since it would permanently add the var to the keys of the Context if the var was previously not set. [Later] Oh, drat! Writing that paragraph made me realize there was a bug in my own reasoning that led to this question about variables that have no value: they aren't present in the keys of _ContextData, so my concern about keys and values being inconsistent was unfounded. So the only reason to change the situation with defaults would be that it's more convenient. Though I personally kind of like that you can cause var.get() to raise LookupError if the value is not set -- just like all other variables. [Even later] Re: your other suggestion, why couldn't the threadstate contain just the Context? It seems one would just write tstate->current_context->_data everywhere instead of tstate->current_context_data. On Thu, Jan 4, 2018 at 4:18 PM, Nathaniel Smith <njs@pobox.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 4:48 PM, Guido van Rossum <guido@python.org> wrote:
I don't think there are any fundamental disagreements here, so I guess we can sort something out pretty quickly once Yury gets back.
Yeah, that's probably better :-). -n -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/e7510/e7510abb361d7860f4e4cc2642124de4d110d36f" alt=""
On Thu, Jan 4, 2018 at 3:18 PM, Nathaniel Smith <njs@pobox.com> wrote:
Actually, this trick of using the Context* as the cache validation key doesn't work :-/. The problem is that the cache doesn't hold a strong reference to the Context, so there's an ABA problem [1] if the Context is deallocated, and then later the same memory location gets used for a new Context object. It could be fixed by using weakref callbacks, but that's not really simpler than the current counter-based cache validation logic. -n [1] https://en.wikipedia.org/wiki/ABA_problem -- Nathaniel J. Smith -- https://vorpus.org
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 00:17, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Agreed. This was something that bothered me, too. I mentioned it in my review, but that seemed to get lost in the other comments in this thread... I get the impression that the logic is that the context is immutable, but the ContextVars that it contains aren't, and the copy is deep (at least 1 level deep) so you copy then change the value of a ContextVar. But rereading that sentence, it sounds confused even to me, so it's either not right or the implementation falls foul of "If the implementation is hard to explain, it's a bad idea." :-) Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 3:25 AM, Paul Moore <p.f.moore@gmail.com> wrote:
It was get_context() in an earlier version of PEP 567. We changed it to copy_context() believing that that would clarify that you get a clone that is unaffected by subsequent ContextVar.set() operations (which affect the *current* context rather than the copy you just got). [The discussion is fragmentary because Yury is on vacation until the 15th and I am scrambling for time myself. But your long post is saved, and not forgotten.] -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 15:56, Guido van Rossum <guido@python.org> wrote:
Ah thanks. In which case, simply changing the emphasis to avoid the implication that Context objects are immutable (while that may be true in a technical/implementation sense, it's not really true in a design sense if ContextVar.set modifies the value of a variable in a context) is probably sufficient.
OK. I knew you were short on time, but hadn't realised Yury was on vacation too. Sorry if I sounded like I was nagging. Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 9:27 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Do you have a specific proposal for a wording change? PEP 567 describes Context as "a read-only mapping, implemented using an immutable dictionary." This sounds all right to me -- "read-only" is weaker than "immutable". Maybe the implementation should not be mentioned here? (The crux here is that a given Context acts as a variable referencing an immutable dict -- but it may reference different immutable dicts at different times.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 4 January 2018 at 23:58, Guido van Rossum <guido@python.org> wrote:
I've been struggling to think of good alternative wordings (it's a case of "I'm not sure what you're trying to say, so I can't work out how you should say it", unfortunately). The best I can come up with is """ A Context is a mapping from ContextVar objects to their values. The Context itself exposes the Mapping interface, so cannot be modified directly - to modify the value associated with a variable you need to use the ContextVar.set() method. """ Does that explain things correctly? One thing I am sure of is that we should remove "implemented using an immutable dictionary" - it's an implementation detail, and adds nothing but confusion to mention it here. Paul
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Fri, Jan 5, 2018 at 2:41 AM, Paul Moore <p.f.moore@gmail.com> wrote:
This is clear, but IMO there's one important detail missing: using ContextVar.set() you can only modify the current Context. This part of the PEP (in particular the definition of ContextVar.set() on line 90) also lies about what ContextVar.set() does -- it implies that Context is immutable. If it was, the recommended use of Context.run() in asyncio would make no sense, since run() clearly modifies the Context object in place. This is not an implementation detail -- it is an API detail that affects how frameworks should use Context objects. Re-reading, there's a lot of language in this part of the PEP that needs updating... :-(
Agreed. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Thu, Jan 4, 2018 at 4:56 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Well, it's not *immutable* (it shouldn't support hash()), but it doesn't follow the MutableMapping protocol -- it only follows the Mapping protocol. Note that the latter carefully doesn't call itself ImmutableMapping. Context is a mutable object that implements the Mapping protocol. The only way to mutate a Context is to use var.set() when that Context is the current context. (Modulo the caching bug discussed in the subthread with Nathaniel.) -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 2 janv. 2018 18:57, "Guido van Rossum" <guido@python.org> a écrit : Oh, the "Specification" section in the PEP is too brief on several of these subjects. It doesn't really specify what var.get() does if the value is not set, nor does it even mention var.get(<default>) except in the code examples for var.reset(). It's also subtle that ctx[var] returns the default (if there is one). I suppose it will raise if there isn't one -- resulting in the somewhat surprising behavior where `var in ctx` may be true but `ctx[var]` may raise. And what does it raise? (All these questions are answered by the code, but they should be clearly stated in the PEP.) A variable has or has no default value. Would it make sense to expose the default value as a public read-only attribute (it would be equal to _NO_DEFAULT or Token.MISSING if there is no default) and/or add a is_set() method? is_set() returns true if the variable has a default value or if it was set in the "current context". Currently, a custom sentinel is needed to check if var.get(), ctx.get(var) and ctx[var] would raise an exception or not. Example: my_sentinel = object() is_set = (var.get(default=my_sentinel) is not my_sentinel) # no exception if is_set is true ContextVar.get() is non obvious because the variable has an optinal default, get() has an optional default parameter, and the variable can be set or not in the current context. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:45 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
But is there a common use case? For var.get() I'd rather just pass the default or catch the exception if the flow is different. Using ctx[var] is rare (mostly for printing contexts, and perhaps for explaining var.get()). -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Le 3 janv. 2018 06:38, "Guido van Rossum" <guido@python.org> a écrit : But is there a common use case? For var.get() I'd rather just pass the default or catch the exception if the flow is different. Using ctx[var] is rare (mostly for printing contexts, and perhaps for explaining var.get()). I don't think that it would be a common use case. Maybe we don't need is_set(), I'm fine with catching an exception. But for introspection at least, it would help to expose the default as a read-only attribute, no? Another example of a mapping with default value: https://docs.python.org/dev/library/collections.html#collections.defaultdict And defaultdict has a default_factory attribute. The difference here is that default_factory is mandatory. ContextVar would be simpler if the default would be mandatory as well :-) Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
Why ContextVar.reset(token) does nothing at the second call with the same token? What is the purpose of Token._used? I guess that there is an use case to justify this behaviour. reset() should have a result: true if the variable was restored to its previous state, false if reset() did nothing because the token was already used. And/Or Token should have a read-only "used" property. Victor
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
What is the behaviour of ContextVar.reset(token) if the token was created from a different variable? Raise an exception? token = var1.set("value") var2.reset(token) The PEP states that Token.var only exists for debug or introspection. Victor Le 3 janv. 2018 00:51, "Victor Stinner" <victor.stinner@gmail.com> a écrit : Why ContextVar.reset(token) does nothing at the second call with the same token? What is the purpose of Token._used? I guess that there is an use case to justify this behaviour. reset() should have a result: true if the variable was restored to its previous state, false if reset() did nothing because the token was already used. And/Or Token should have a read-only "used" property. Victor
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
On Tue, Jan 2, 2018 at 4:51 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
That depends again on the use case. The only real purpose for reset() is to be able to write a context manager that sets and restores a context variable (like in `with decimal.localcontext()`). Handling double resets is about as useful as specifying what happens if __exit__ is called twice. -- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/b3d87/b3d872f9a7bbdbbdbd3c3390589970e6df22385a" alt=""
PEP: "int PyContext_Enter(PyContext *) and int PyContext_Exit(PyContext *) allow to set and restore the context for the current OS thread." What is the difference between Enter and Exit? Why not having a single Py_SetContext() function? Victor
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
I don't want to expose a SetContext operation because of, again, potential incompatibility with PEP 550, where generators expect to fully control push/pop context operation. Second, Context.run is 100% enough for *any* async framework to add support for PEP 567. And because the PEP is focused just on async, I think that we don't need anything more than 'run'. Third, I have a suspicion that we focus too much on actual Context and Context.run. These APIs are meant for asyncio/twisted/trio/etc maintainers, not for an average Python user. An average person will likely not interact with any of the PEP 567 machinery directly, wven when using PEP 567-enabled libraries like numpy/decimal. Yury On Wed, Jan 3, 2018 at 2:56 AM Victor Stinner <victor.stinner@gmail.com> wrote:
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Do you see any average Python users on this thread? We are trying to pick the PEP apart from the POV of having to use the complicated parts of the API in a framework. Victor's questions are reasonable. On Tue, Jan 2, 2018 at 10:13 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Thu, Dec 28, 2017 at 4:51 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Thanks, Victor!
As Nathaniel already explained, a 'default' for ContextVars is literally a default -- default value returned when a ContextVar hasn't been assigned a value in a context. So my opinion on this is that 'default' is the less ambiguous name here. [..]
I like MISSING more than NOT_SET, but this is very subjective, of course. If Guido wants to rename it I rename it.
We want _decimal and numpy to use the new API, and they will call ContextVar.get() on basically all operations, so it needs to be as fast as possible. asyncio/uvloop also want the fastest copy_context() and Context.run() possible, as they use them for *every* callback. So I think it's OK for us to add new C APIs here.
Sure, I'll change it. Yury
data:image/s3,"s3://crabby-images/2ffc5/2ffc57797bd7cd44247b24896591b7a1da6012d6" alt=""
I have a couple basic questions around how this API could be used in practice. Both of my questions are for the Python API as applied to Tasks in asyncio. 1) Would this API support looking up the value of a context variable for **another** Task? For example, if you're managing multiple tasks using asyncio.wait() and there is an exception in some task, you might want to examine and report the value of a context variable for that task. 2) Would an appropriate use of this API be to assign a unique task id to each task? Or can that be handled more simply? I'm wondering because I recently thought this would be useful, and it doesn't seem like asyncio means for one to subclass Task (though I could be wrong). Thanks, --Chris On Wed, Dec 27, 2017 at 10:08 PM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
data:image/s3,"s3://crabby-images/983b1/983b1e0f0dbf564edf66ca509e63491851f04e82" alt=""
On Thu, Dec 28, 2017 at 5:28 AM, Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
No, unless that another Task explicitly shares the value or captures its context and shares it. Same as with threading.local.
The API should be used to share one ID between a Task and tasks it creates. You can use it to store individual Task IDs, but a combination of a WeakKeyDictionary and Task.current_task() seems to be a better/easier option. Yury
data:image/s3,"s3://crabby-images/8e91b/8e91bd2597e9c25a0a8c3497599699707003a9e9" alt=""
On 28 December 2017 at 06:08, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This is a second version of PEP 567.
Overall, I like the proposal. It's relatively straightforward to follow, and makes sense. One thing I *don't* see in the PEP is an example of how code using thread-local storage should be modified to use context variables. My impression is that it's simply a matter of replacing the TLS calls with equivalent ContextVar calls, but an example might be helpful. Some detail points below.
I understand how this could happen, having followed the discussions here, but a (simple) example of the issue might be useful.
Accessed by copying it? That seems weird to me. I'd expect either that you'd be able to access the current Context directly, *or* that you'd say that the current Context is not directly accessible by the user, but that a copy can be obtained using copy_context. But given that the Context is immutable, why the need tp copy it? Also, the references to threads in the above are confusing. It says that this is a well-known concept in terms of thread-local storage, but this case is different. It then goes on to say that the current Context is stored in thread local storage, which gives me the impression that the new idea *is* related to thread local storage... I think that the fact that a Context is held in thread-local storage is an implementation detail. Assuming I'm right, don't bother mentioning it - simply say that there's a notion of a current Context and leave it at that.
On first reading, this confused me because I didn't spot that you're saying a *Context* is read-only, but a *ContextVar* has get and set methods. Maybe reword this to say that a Context is a read-only mapping from ContextVars to values. A ContextVar has a get method that looks up its value in the current Context, and a set method that replaces the current Context with a new one that associates the specified value with this ContextVar. (The current version feels confusing to me because it goes into too much detail on how the implementation does this, rather than sticking to the high-level specification)
Context.run() came a bit out of nowhere here. Maybe the part from "It is not possible..." should be in the introduction above? Something like the following, covering this and copy_context: The current Context cannot be accessed directly by user code. If the frameowrk wants to run some code in a different Context, the Context.run(callable, *args, **kwargs) method is used to do that. To construct a new context for this purpose, the current context can be copied via the copy_context function, and manipulated prior to the call to run().
My first thought was that default was the context variable's initial value. But if that's what it is, why not call it that? If the default has another effect as well as being the initial value, maybe clarify here what that is?
get doesn't take an argument. Typo?
same typo?
[...] I haven't commented on these as they aren't my area of expertise.
Again, I'm ignoring this as I don't really have an interest in how the facility is implemented.
Would it be worth exposing this data structure elsewhere, in case other uses for it exist?
Should the cache (or at least the performance guarantees it implies) be part of the spec? Do we care if other implementations fail to implement a cache?
data:image/s3,"s3://crabby-images/3c3b2/3c3b2a6eec514cc32680936fa4e74059574d2631" alt=""
Some inline responses to Paul (I snipped everything I agree with). On Wed, Jan 3, 2018 at 3:34 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Because it's not immutable. (I think by now people following this thread understand that.) The claims or implications in the PEP that Context is immutable are wrong (and contradict the recommended use of run() for asyncio, for example).
The PEP's language does seem confused. This is because it doesn't come out and define the concept of "task". The correspondence is roughly that thread-locals are to threads what Contexts and ContextVars are to tasks. (This still requires quite a bit of squinting due to the API differences but it's better than what the PEP says.)
No, actually it's important that each thread has its own current context. It is even possible to pass Contexts between threads, e.g. if you have a ThreadExecutor e, you can call e.submit(ctx.run, some_function, args...). However, run() is not thread-safe! @Yury: There's an example that does almost exactly this in the PEP, but I think it could result in a race condition if you called run() concurrently on that same context in a different thread -- whichever run() finishes last will overwrite the other's state. I think we should just document this and recommend always using a fresh copy in such scenarios. Hm, does this mean it would be good to have an explicit Context.copy() method? Or should we show how to create a copy of an arbitrary Context using ctx.run(copy_context)?
We went over this passage in another subthread. IMO what it says about ContextVar.set() is incorrect.
I agree that run() comes out of nowhere but I'd suggest a simpler fix -- just say "Instead, Context.run() must be used, see below."
IMO it's more and different than the "initial value". The ContextVar never gets set directly to the default -- you can verify this by checking "var in ctx" for a variable that has a default but isn't set -- it's not present. It really is used as the "default default" by ContextVar.get(). That's not an implementation detail.
Actually it does, the argument specifies a default (to override the "default default" set in the constructor). However this hasn't been mentioned yet at this point (the description of ContextVar.get() earlier doesn't mention it, only the implementation below). It would be good to update the earlier description of ContextVar.get() to mention the optional default (and how it interacts with the "default default").
asyncio -------
(Too bad, since there's an important clue about the mutability of Context hidden in this section! :-)
(Again, too bad, since despite the section heading this acts as a pseudo-code specification that is much more exact than the "specification" section above.)
I've asked Yury this several times, but he's shy about exposing it. Maybe it's better to wait until 3.8 so the implementation and its API can stabilize a bit before the API is constrained by backwards compatibility.
IMO it's a quality-of-implementation issue, but the speed of the CPython implementation plays an important role in acceptance of the PEP (since we don't want to slow down e.g. asyncio task creation). -- --Guido van Rossum (python.org/~guido)
participants (10)
-
Chris Jerdonek
-
Glenn Linderman
-
Greg Ewing
-
Guido van Rossum
-
Guido van Rossum
-
Nathaniel Smith
-
Paul Moore
-
Victor Stinner
-
Yury Selivanov
-
Yury Selivanov