[Python-Dev] PEP 567 v2

Guido van Rossum guido at python.org
Fri Jan 5 12:54:37 EST 2018


Some inline responses to Paul (I snipped everything I agree with).

On Wed, Jan 3, 2018 at 3:34 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 28 December 2017 at 06:08, Yury Selivanov <yselivanov.ml at gmail.com>
> wrote:
> > This is a second version of PEP 567.
> [...]
>
> > The notion of "current value" deserves special consideration:
> > different asynchronous tasks that exist and execute concurrently
> > may have different values for the same key.  This idea is well-known
> > from thread-local storage but in this case the locality of the value is
> > not necessarily bound to a thread.  Instead, there is the notion of the
> > "current ``Context``" which is stored in thread-local storage, and
> > is accessed via ``contextvars.copy_context()`` function.
>
> Accessed by copying it? That seems weird to me. I'd expect either that
> you'd be able to access the current Context directly, *or* that you'd say
> that the current Context is not directly accessible by the user, but that
> a copy can be obtained using copy_context. But given that the Context is
> immutable, why the need to copy it?
>

Because it's not immutable. (I think by now people following this thread
understand that.) The claims or implications in the PEP that Context is
immutable are wrong (and contradict the recommended use of run() for
asyncio, for example).
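
A minimal sketch of that mutability, assuming the contextvars API as it
later shipped in Python 3.7 (the names var and setter are made up for
illustration). A set() performed inside run() is visible afterwards on the
very Context object that was passed in, while the caller's own context is
left alone:

```python
import contextvars

var = contextvars.ContextVar("var", default=42)

def setter():
    # Runs with ctx as the current context, so the set() lands in ctx.
    var.set(1)

ctx = contextvars.copy_context()
seen_before = var in ctx     # nothing has been set in ctx yet
ctx.run(setter)
seen_after = var in ctx      # run() changed the Context object itself
value_in_ctx = ctx[var]      # the value stored by setter()
value_outside = var.get()    # the caller's context still sees the default
```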


> Also, the references to threads in the above are confusing. It says that
> this is a well-known concept in terms of thread-local storage, but this
> case is different. It then goes on to say that the current Context is
> stored in thread local storage, which gives me the impression that the new
> idea *is* related to thread local storage...
>

The PEP's language does seem confused. This is because it doesn't come out
and define the concept of "task". The correspondence is roughly that
thread-locals are to threads what Contexts and ContextVars are to tasks.
(This still requires quite a bit of squinting due to the API differences
but it's better than what the PEP says.)
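
To make the correspondence concrete, here is a small sketch, assuming the
asyncio integration as it later shipped in Python 3.7, where every task
runs in its own copy of the context it was created from. The names var,
worker and main are illustrative only:

```python
import asyncio
import contextvars

var = contextvars.ContextVar("var", default="unset")

async def worker(name):
    # Each task has its own Context, so this set() is private to the task.
    var.set(name)
    await asyncio.sleep(0)   # yield so the two tasks interleave
    return var.get()

async def main():
    # gather() wraps each coroutine in a task.
    results = await asyncio.gather(worker("a"), worker("b"))
    return results, var.get()

results, outer = asyncio.run(main())
```

Both tasks run in the same thread, yet neither sees the other's value,
which is the behaviour a thread-local cannot provide.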


> I think that the fact that a Context is held in thread-local storage is an
> implementation detail. Assuming I'm right, don't bother mentioning it -
> simply say that there's a notion of a current Context and leave it at
> that.
>

No, actually it's important that each thread has its own current context.
It is even possible to pass Contexts between threads, e.g. if you have a
ThreadPoolExecutor e, you can call e.submit(ctx.run, some_function, args...).

However, run() is not thread-safe! @Yury: There's an example that does
almost exactly this in the PEP, but I think it could result in a race
condition if you called run() concurrently on that same context in a
different thread -- whichever run() finishes last will overwrite the
other's state. I think we should just document this and recommend always
using a fresh copy in such scenarios. Hm, does this mean it would be good
to have an explicit Context.copy() method? Or should we show how to create
a copy of an arbitrary Context using ctx.run(copy_context)?
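
A sketch of the "always use a fresh copy" recommendation, assuming the
API as later released in Python 3.7. The names job, work, base and
fresh_copy are hypothetical; fresh_copy() uses the ctx.run(copy_context)
trick mentioned above to copy an arbitrary Context:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

job = contextvars.ContextVar("job", default="none")

def work(n):
    # Mutates only the Context it is running in.
    job.set(f"job-{n}")
    return job.get()

base = contextvars.copy_context()

def fresh_copy(ctx):
    # Copy an arbitrary Context by running copy_context() inside it.
    return ctx.run(contextvars.copy_context)

with ThreadPoolExecutor(max_workers=2) as e:
    # Every submission gets its own copy, so no two threads share a Context.
    futures = [e.submit(fresh_copy(base).run, work, n) for n in range(4)]
    results = [f.result() for f in futures]
```

Since each worker only ever touches its own copy, base is never modified
and there is nothing for concurrent run() calls to race on.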


> > Manipulation of the current ``Context`` is the responsibility of the
> > task framework, e.g. asyncio.
> >
> > A ``Context`` is conceptually a read-only mapping, implemented using
> > an immutable dictionary.  The ``ContextVar.get()`` method does a
> > lookup in the current ``Context`` with ``self`` as a key, raising a
> > ``LookupError``  or returning a default value specified in
> > the constructor.
> >
> > The ``ContextVar.set(value)`` method clones the current ``Context``,
> > assigns the ``value`` to it with ``self`` as a key, and sets the
> > new ``Context`` as the new current ``Context``.
> >
>
> On first reading, this confused me because I didn't spot that you're
> saying a *Context* is read-only, but a *ContextVar* has get and set
> methods.
>
> Maybe reword this to say that a Context is a read-only mapping from
> ContextVars to values. A ContextVar has a get method that looks up its
> value in the current Context, and a set method that replaces the current
> Context with a new one that associates the specified value with this
> ContextVar.
>
> (The current version feels confusing to me because it goes into too much
> detail on how the implementation does this, rather than sticking to the
> high-level specification)
>

We went over this passage in another subthread. IMO what it says about
ContextVar.set() is incorrect.


> > Specification
> > =============
> >
> > A new standard library module ``contextvars`` is added with the
> > following APIs:
> >
> > 1. ``copy_context() -> Context`` function is used to get a copy of
> >    the current ``Context`` object for the current OS thread.
> >
> > 2. ``ContextVar`` class to declare and access context variables.
> >
> > 3. ``Context`` class encapsulates context state.  Every OS thread
> >    stores a reference to its current ``Context`` instance.
> >    It is not possible to control that reference manually.
> >    Instead, the ``Context.run(callable, *args, **kwargs)`` method is
> >    used to run Python code in another context.
>
> Context.run() came a bit out of nowhere here. Maybe the part from "It
> is not possible..." should be in the introduction above? Something
> like the following, covering this and copy_context:
>
>     The current Context cannot be accessed directly by user code. If the
>     framework wants to run some code in a different Context, the
>     Context.run(callable, *args, **kwargs) method is used to do that. To
>     construct a new context for this purpose, the current context can be
>     copied via the copy_context function, and manipulated prior to the
>     call to run().
>

I agree that run() comes out of nowhere but I'd suggest a simpler fix --
just say "Instead, Context.run() must be used, see below."


> >
> > contextvars.ContextVar
> > ----------------------
> >
> > The ``ContextVar`` class has the following constructor signature:
> > ``ContextVar(name, *, default=_NO_DEFAULT)``.  The ``name`` parameter
> > is used only for introspection and debug purposes, and is exposed
> > as a read-only ``ContextVar.name`` attribute.  The ``default``
> > parameter is optional.  Example::
> >
> >     # Declare a context variable 'var' with the default value 42.
> >     var = ContextVar('var', default=42)
> >
> > (The ``_NO_DEFAULT`` is an internal sentinel object used to
> > detect if the default value was provided.)
>
> My first thought was that default was the context variable's initial
> value. But if that's what it is, why not call it that? If the default has
> another effect as well as being the initial value, maybe clarify here what
> that is?
>

IMO it's more than, and different from, the "initial value". The ContextVar never
gets set directly to the default -- you can verify this by checking "var in
ctx" for a variable that has a default but isn't set -- it's not present.
It really is used as the "default default" by ContextVar.get(). That's not
an implementation detail.
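
The check can be spelled out as follows (a sketch assuming the API as
later released in Python 3.7; the variable names are illustrative). The
default is returned by get() but never stored, so the variable only shows
up in the Context after an explicit set():

```python
import contextvars

var = contextvars.ContextVar("var", default=42)
ctx = contextvars.copy_context()

present_before = var in ctx        # the default alone does not add the key
got = ctx.run(var.get)             # get() falls back to the default
present_after_get = var in ctx     # reading the default stores nothing
ctx.run(var.set, 42)               # an explicit set(), even to the same value
present_after_set = var in ctx     # now the key exists in ctx
```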


> > ``ContextVar.get()`` returns a value for context variable from the
> > current ``Context``::
> >
> >     # Get the value of `var`.
> >     var.get()
> >
> > ``ContextVar.set(value) -> Token`` is used to set a new value for
> > the context variable in the current ``Context``::
> >
> >     # Set the variable 'var' to 1 in the current context.
> >     var.set(1)
> >
> > ``ContextVar.reset(token)`` is used to reset the variable in the
> > current context to the value it had before the ``set()`` operation
> > that created the ``token``::
> >
> >     assert var.get(None) is None
>
> get doesn't take an argument. Typo?
>

Actually it does: the argument specifies a default (to override the
"default default" set in the constructor). However this hasn't been
mentioned yet at this point (the description of ContextVar.get() earlier
doesn't mention it, only the implementation below). It would be good to
update the earlier description of ContextVar.get() to mention the optional
default (and how it interacts with the "default default").
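
A short sketch of how the two defaults interact, assuming the API as later
released in Python 3.7 (with_default and no_default are made-up names).
A value that has been set wins over both defaults, the argument to get()
wins over the constructor default, and LookupError is raised only when
neither exists:

```python
import contextvars

with_default = contextvars.ContextVar("with_default", default=42)
no_default = contextvars.ContextVar("no_default")

a = with_default.get()           # falls back to the constructor default
b = with_default.get(None)       # the argument overrides that default
c = no_default.get("fallback")   # the argument also works with no default

try:
    no_default.get()             # no value, no default of either kind
    raised = False
except LookupError:
    raised = True

token = with_default.set(1)
d = with_default.get(None)       # a set value beats both defaults
with_default.reset(token)        # back to the "not set" state
e = with_default.get(None)
```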

> > asyncio
> > -------
> >
> [...]
> >
> > C API
> > -----
> >
> [...]
>
> I haven't commented on these as they aren't my area of expertise.
>

(Too bad, since there's an important clue about the mutability of Context
hidden in this section! :-)


> > Implementation
> > ==============
> >
> > This section explains high-level implementation details in
> > pseudo-code.  Some optimizations are omitted to keep this section
> > short and clear.
>
> Again, I'm ignoring this as I don't really have an interest in how the
> facility is implemented.
>

(Again, too bad, since despite the section heading this acts as a
pseudo-code specification that is much more exact than the "specification"
section above.)


> > Implementation Notes
> > ====================
> >
> > * The internal immutable dictionary for ``Context`` is implemented
> >   using Hash Array Mapped Tries (HAMT).  They allow for O(log N)
> >   ``set`` operation, and for O(1) ``copy_context()`` function, where
> >   *N* is the number of items in the dictionary.  For a detailed
> >   analysis of HAMT performance please refer to :pep:`550` [1]_.
>
> Would it be worth exposing this data structure elsewhere, in case
> other uses for it exist?
>

I've asked Yury this several times, but he's shy about exposing it. Maybe
it's better to wait until 3.8 so the implementation and its API can
stabilize a bit before the API is constrained by backwards compatibility.


> > * ``ContextVar.get()`` has an internal cache for the most recent
> >   value, which allows to bypass a hash lookup.  This is similar
> >   to the optimization the ``decimal`` module implements to
> >   retrieve its context from ``PyThreadState_GetDict()``.
> >   See :pep:`550` which explains the implementation of the cache
> >   in a great detail.
> >
>
> Should the cache (or at least the performance guarantees it implies) be
> part of the spec? Do we care if other implementations fail to implement a
> cache?
>

IMO it's a quality-of-implementation issue, but the speed of the CPython
implementation plays an important role in acceptance of the PEP (since we
don't want to slow down e.g. asyncio task creation).

-- 
--Guido van Rossum (python.org/~guido)