[Python-Dev] PEP 567 pre v3

Tue Jan 9 02:02:07 EST 2018

On Mon, Jan 8, 2018 at 11:34 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> 1. Proposal: ContextVar has default set to None.
>
> From the typing point of view that would mean that if a context
> variable is declared without an explicit default, its type would be
> Optional.  E.g. say we have a hypothetical web framework that allows
> to access the current request object through a context variable:
>
>   request_var: ContextVar[Optional[Request]] = \
>       ContextVar('current_request')
>
> When we need to get the current request object, we would write:
>
>   request: Optional[Request] = request_var.get()
>
> And we'd also need to explicitly handle when 'request' is set to None.
> Of course we could create request_var with its default set to some
> "InvalidRequest" object, but that would complicate things.  It would
> be easier to just state that the framework always sets the current
> request and it's a bug if it's not set.
>
> Therefore, in my opinion, it's better to keep the current behaviour:
> if a context variable was created without a default value,
> ContextVar.get() can raise a LookupError.

All the different behaviors here can work, so I don't want to make a
huge deal about this. But the current behavior is bugging me, and I
don't think anyone has brought up the reason why, so here goes :-).

Right now, a ContextVar's valid states are: it can hold
any Python object, or it can be undefined. However, the only way it
can be in the "undefined" state is in a new Context where it has never
had a value; once it leaves the undefined state, it can never return
to it.

This makes me itch. It's very weird to have a mutable variable with a
valid state that you can't reach by mutating it. I see two
self-consistent ways to make me stop itching: (a) double down on
undefined as being part of ContextVar's domain, or (b) reduce the
domain so that undefined is never a valid state.
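
To make the itch concrete, here is a minimal toy model of the current
behaviour. It is only a sketch: a plain module-level dict (the
hypothetical `_context`) stands in for the current Context, Token
handling is left out, and `_NOT_GIVEN` is an illustrative private
sentinel, not a name from the PEP.

```python
# Toy model of the current draft's semantics. A plain dict stands in
# for the current Context (an assumption made for illustration), and
# Token handling is left out.
_NOT_GIVEN = object()
_context = {}  # hypothetical stand-in for current_context()


class ContextVar:
    def __init__(self, name, *, default=_NOT_GIVEN):
        self.name = name
        self.default = default

    def set(self, value):
        _context[self] = value

    def get(self, default=_NOT_GIVEN):
        if self in _context:
            return _context[self]
        if default is not _NOT_GIVEN:
            return default
        if self.default is not _NOT_GIVEN:
            return self.default
        raise LookupError(self.name)


var = ContextVar("var")

# State 1: undefined, so get() raises.
try:
    var.get()
except LookupError:
    was_undefined = True
else:
    was_undefined = False

# State 2: holds a value. Any Python object is allowed, including None.
var.set(None)
holds_none = var.get() is None

# This class offers no operation on `var` that leads back to state 1.
has_unset = hasattr(var, "unset")
```

Once `set()` has been called, every later `get()` in this context
succeeds, whatever value is stored.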

# Option 1

In the first approach, we conceptualize ContextVar as being a
container that either holds a value or is empty (and then there's one
of these containers for each context). We also want to be able to
define an initial value that the container takes on when a new context
materializes, because that's really convenient. And then after that we
provide ways to get the value (if present), or control the value
(either set it to a particular value or unset it). So something like:

var1 = ContextVar("var1")  # no initial value
var2 = ContextVar("var2", initial_value="hello")

with assert_raises(SomeError):
    var1.get()
# get's default lets us give a different outcome in cases where it
# would otherwise raise
assert var1.get(None) is None
assert var2.get() == "hello"
# If get() doesn't raise, then the argument is ignored
assert var2.get(None) == "hello"

# We can set to arbitrary values
for var in [var1, var2]:
    var.set("new value")
    assert var.get() == "new value"

# We can unset again, so get() will raise
for var in [var1, var2]:
    var.unset()
    with assert_raises(SomeError):
        var.get()
    assert var.get(None) is None

To fulfill all that, we need an implementation like:

MISSING = make_sentinel()
# Private marker meaning "no default was passed to get()"
_NOT_GIVEN = make_sentinel()

class ContextVar:
    def __init__(self, name, *, initial_value=MISSING):
        self.name = name
        self.initial_value = initial_value

    def set(self, value):
        if value is MISSING: raise TypeError
        current_context()._dict[self] = value
        # Token handling elided because it's orthogonal to this issue
        return Token(...)

    def unset(self):
        current_context()._dict[self] = MISSING
        # Token handling elided because it's orthogonal to this issue
        return Token(...)

    def get(self, default=_NOT_GIVEN):
        value = current_context().get(self, self.initial_value)
        if value is MISSING:
            if default is _NOT_GIVEN:
                raise ...
            else:
                return default
        else:
            return value

Note that the implementation here is somewhat tricky and non-obvious.
In particular, to preserve the illusion of a simple container with an
optional initial value, we have to encode a logically undefined
ContextVar as one that has Context[var] set to MISSING, and a missing
entry in Context encodes the presence of the initial value. If we
defined unset() as 'del current_context()._dict[self]', then we'd have:

var2.unset()
assert var2.get() == "hello"

which would be very surprising to users who just want to think about
ContextVars and ignore all that stuff about Contexts. This, in turn,
means that we need to expose the MISSING sentinel in general, because
anyone introspecting Context objects directly needs to know how to
recognize this magic value to interpret things correctly.

AFAICT this is the minimum complexity required to get a complete and
internally-consistent set of operations for a ContextVar that's
conceptualized as being a container that either holds an arbitrary
value or is empty.

# Option 2

The other complete and coherent conceptualization I see is to say that
a ContextVar always holds a value. If we eliminate the "unset" state
entirely, then there's no "missing unset method" -- there just isn't
any concept of an unset value in the first place, so there's nothing
to miss. This idea shows up in lots of types in Python, actually --
e.g. for any exception object, obj.__context__ is always defined. Its
value might be None, but it has a value. In this approach,
ContextVars are similar.

To fulfill all that, we need an implementation like:

class ContextVar:
    # Or maybe it'd be better to make initial_value mandatory, like this?
    #     def __init__(self, name, *, initial_value):
    def __init__(self, name, *, initial_value=None):
        self.name = name
        self.initial_value = initial_value

    def set(self, value):
        current_context()._dict[self] = value
        # Token handling elided because it's orthogonal to this issue
        return Token(...)

    def get(self):
        return current_context().get(self, self.initial_value)

This is also a complete and internally consistent set of operations,
but this time for a somewhat different way of conceptualizing
ContextVar.

Actually, the more I think about it, the more I think that if we take
this approach and say that every ContextVar always has a value, it
makes sense to make initial_value= a mandatory argument instead of
defaulting it to None. Then the typing works too, right? Something
like:

ContextVar(name: str, *, initial_value: T) -> ContextVar[T]
ContextVar.get() -> T
ContextVar.set(T) -> Token

? And it's hardly a burden on users to type 'ContextVar("myvar",
initial_value=None)' if that's what they want.
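
As a rough check that the typing works out, here is a minimal sketch
of Option 2 with a mandatory initial_value, written as a Generic
class. It is illustrative only: the module-level `_context` dict is a
hypothetical stand-in for the current Context, and set() returns None
here because Token handling is left out.

```python
from typing import Any, Dict, Generic, Optional, TypeVar

T = TypeVar("T")

# Hypothetical stand-in for the current Context; Tokens are left out.
_context: Dict[Any, Any] = {}


class ContextVar(Generic[T]):
    def __init__(self, name: str, *, initial_value: T) -> None:
        self.name = name
        self.initial_value = initial_value

    def set(self, value: T) -> None:
        _context[self] = value

    def get(self) -> T:
        # There is always a value: either one that was set, or the
        # initial value supplied at construction time.
        return _context.get(self, self.initial_value)


# A type checker should infer ContextVar[int] here, so get() is an int
# and never Optional.
counter = ContextVar("counter", initial_value=0)
before = counter.get()
counter.set(before + 1)
after = counter.get()

# Wanting "no value yet" has to be spelled out, and shows up in the type.
maybe_name: ContextVar[Optional[str]] = ContextVar(
    "maybe_name", initial_value=None
)
```

Leaving out initial_value fails at construction time with a TypeError,
so there is no later point at which get() could find nothing.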

Anyway... between these two options, I like Option 2 better because
it's substantially simpler without (AFAICT) any meaningful reduction
in usability. But I'd prefer either of them to the current PEP 567,
which seems like an internally-contradictory hybrid of these ideas. It
makes sense if you know how the code and Contexts work. But if I was
talking to someone who wanted to ignore those details and just use a
ContextVar, and they asked me for a one sentence summary of how it
worked, I wouldn't know what to tell them.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org