[Python-Dev] PEP 550 v4

Nathaniel Smith njs at pobox.com
Mon Aug 28 21:07:44 EDT 2017


On Mon, Aug 28, 2017 at 3:14 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Sat, Aug 26, 2017 at 3:09 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> You might be interested in these notes I wrote to motivate why we need
>> a chain of namespaces, and why simple "async task locals" aren't
>> sufficient:
>>
>>     https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb
>
> Thanks, Nathaniel!  That helped me understand the rationale, though
> I'm still unconvinced chained lookup is necessary for the stated goal
> of the PEP.
>
> (The rest of my reply is not specific to Nathaniel.)
>
> tl;dr Please:
>   * make the chained lookup aspect of the proposal more explicit (and
> distinct) in the beginning sections of the PEP (or drop chained
> lookup).
>   * explain why normal frames do not get to take advantage of chained
> lookup (or allow them to).
>
> --------------------
>
> If I understood right, the problem is that we always want context vars
> resolved relative to the current frame and then to the caller's frame
> (and on up the call stack).  For generators, "caller" means the frame
> that resumed the generator.  Since we don't know what frame will
> resume the generator beforehand, we can't simply copy the current LC
> when a generator is created and bind it to the generator's frame.
>
> However, I'm still not convinced that's the semantics we need.  The
> key statement is "and then to the caller's frame (and on up the call
> stack)", i.e. chained lookup.  On the linked page Nathaniel explained
> the position (quite clearly, thank you) using sys.exc_info() as an
> example of async-local state.  I posit that that example isn't
> particularly representative of what we actually need.  Isn't the point
> of the PEP to provide an async-safe alternative to threading.local()?
>
> Any existing code using threading.local() would not expect any kind of
> chained lookup since threads don't have any.  So introducing chained
> lookup in the PEP is unnecessary and consequently not ideal since it
> introduces significant complexity.

There's a lot of Python code out there, and it's hard to know what it
all wants :-). But I don't think we should get hung up on matching
threading.local() -- no-one sits down and says "okay, what my users
want is for me to write some code that uses a thread-local", i.e.,
threading.local() is a mechanism, not an end-goal.

My hypothesis is that in most cases, when people reach for
threading.local(), it's because they have some "contextual" variable,
and they want to be able to do things like set it to a value that
affects all and only the code that runs inside a 'with' block. So far
the only way to approximate this in Python has been to use
threading.local(), but chained lookup would work even better.
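
To make the pattern concrete, here is a minimal sketch of the usual
threading.local() approximation. The names _state, get_precision() and
precision() are made up for illustration and don't come from any real
library:

```python
import threading
from contextlib import contextmanager

# Hypothetical module-level storage for one "contextual" variable.
_state = threading.local()

def get_precision():
    # Fall back to a default if nothing has been set in this thread.
    return getattr(_state, "precision", 28)

@contextmanager
def precision(value):
    # The save/restore trick: remember the old value, set the new one,
    # and put the old one back when the 'with' block exits.
    old = get_precision()
    _state.precision = value
    try:
        yield
    finally:
        _state.precision = old

def compute():
    # Code called from inside the block sees the block's value.
    return get_precision()

with precision(5):
    inside = compute()
outside = compute()
```

Here 'inside' picks up the value set by the 'with' block and 'outside'
sees the default again, which is the "all and only the code inside the
block" behavior people are after.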

As evidence for this hypothesis: something like chained lookup is
important for exc_info() [1] and for Trio's cancellation semantics,
and I'm pretty confident that it's what users naturally expect for use
cases like 'with decimal.localcontext(): ...' or 'with
numpy.errstate(...): ...'. And it works fine for cases like Flask's
request-locals that get set once near the top of a callstack and then
treated as read-only by most of the code.
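
For reference, this is roughly what that expectation looks like with
the standard decimal module (a small sketch; third() is just a
throwaway helper, and it assumes the default context hasn't been
modified elsewhere):

```python
import decimal

def third():
    # Uses whatever decimal context is current when it is called.
    return decimal.Decimal(1) / decimal.Decimal(3)

with decimal.localcontext() as ctx:
    ctx.prec = 5          # affects only code running inside this block
    inside = third()

outside = third()         # back to the precision in effect before the block
```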

I'm not aware of any alternative to chained lookup that fulfills all
of these use cases -- are you? And I'm not aware of any use cases that
require something more than threading.local() but less than chained
lookup -- are you?

[1] I guess I should say something about including sys.exc_info() as
evidence that chained lookup is useful, given that CPython probably
won't share code between its PEP 550 implementation and its
sys.exc_info() implementation. I'm mostly citing it as evidence that
this is a real kind of need that can arise when writing programs -- if
it happens once, it'll probably happen again. But I can also imagine
that other implementations might want to share code here, and it's
certainly nice if the Python-the-language spec can just say
"exc_info() has semantics 'as if' it were implemented using PEP 550
storage" and leave it at that. Plus it's kind of rude for the
interpreter to claim semantics for itself that it won't let anyone
else implement :-).

> As the PEP is currently written, chained lookup is a key part of the
> proposal, though it does not explicitly express this.  I suppose this
> is where my confusion has been.
>
> At this point I think I understand one rationale for the chained
> lookup functionality; it takes advantage of the cooperative scheduling
> characteristics of generators, et al.  Unlike with threads, a
> programmer can know the context under which a generator will be
> resumed.  Thus it may be useful to the programmer to allow (or expect)
> the resumed generator to fall back to the calling context.  However,
> given the extra complexity involved, is there enough evidence that
> such capability is sufficiently useful?  Could chained lookup be
> addressed separately (in another PEP)?
>
> Also, wouldn't it be equally useful to support chained lookup for
> function calls?  Programmers have the same level of knowledge about
> the context stack with function calls as with generators.  I would
> expect evidence in favor of chained lookups for generators to also
> favor the same for normal function calls.

The important difference between generators/coroutines and normal
function calls is that with normal function calls, the link between
the caller and callee is fixed for the entire lifetime of the inner
frame, so there's no way for the context to shift under your feet. If
all we had were normal function calls, then (green-) thread locals
using the save/restore trick would be enough to handle all the use
cases above -- it's only for generators/coroutines where the
save/restore trick breaks down. This means that pushing/popping LCs
when crossing into/out of a generator frame is the minimum needed to
get the desired semantics, and it keeps the LC stack small (important
since lookups can be O(n) in the worst case), and it minimizes the
backcompat breakage for operations like decimal.setcontext() where
people *do* expect to call it in a subroutine and have the effects be
visible in the caller.
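
Here's a rough sketch of how the save/restore trick breaks down once a
generator is involved. As before, _state, get_precision() and
precision() are hypothetical names used only for illustration; this is
plain threading.local() code, not the PEP 550 API:

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # hypothetical storage, as in a typical library

def get_precision():
    return getattr(_state, "precision", 28)

@contextmanager
def precision(value):
    # Save/restore trick: fine for plain function calls.
    old = get_precision()
    _state.precision = value
    try:
        yield
    finally:
        _state.precision = old

def gen():
    with precision(5):
        yield get_precision()
        yield get_precision()

g = gen()
first = next(g)             # generator is now suspended inside its 'with'
leaked = get_precision()    # the caller sees the generator's setting

with precision(10):
    second = next(g)        # the generator sees the caller's setting

g.close()                   # runs the generator's 'finally' and restores
restored = get_precision()
```

The generator's setting leaks out to the caller while it is suspended
('leaked'), and the resumer's setting leaks into the generator
('second'). With a plain function call neither can happen, because the
callee always runs to completion before the caller continues.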

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

