[Python-Dev] PEP 550 leak-in vs leak-out, why not just a ChainMap

Nathaniel Smith njs at pobox.com
Thu Aug 24 04:22:52 EDT 2017


On Wed, Aug 23, 2017 at 9:32 PM, Jim J. Jewett <jimjjewett at gmail.com> wrote:
>> While the context is defined conceptually as a nested chain of
>> key:value mappings, we avoid using the mapping syntax because of the
>> way the values can shift dynamically out from under you based on who
>> called you
> ...
>> instead of having the problem of changes inside the
>> generator leaking out, we instead had the problem of
>> changes outside the generator *not* making their way in
>
> I still don't see how this is different from a ChainMap.
>
> If you are using a stack(chain) of [d_global, d_thread, d_A, d_B, d_C,
> d_mine]  maps as your implicit context, then a change to d_thread map
> (that some other code could make) will be visible unless it is masked.
>
> Similarly, if the fallback for d_C changes from d_B to d_B1 (which
> points directly to d_thread), that will be visible for any keys that
> were previously resolved in d_A or d_B, or are now resolved in d_B1.
>
> Those seem like exactly the cases that would (and should) cause
> "shifting values".
>
> This does mean that you can't cache everything in the localmost map,
> but that is a problem with the optimization regardless of how the
> implementation is done.

It's crucial that the only thing that can affect the result of calling
ContextKey.get() is other method calls on that same ContextKey within
the same thread. That's what enables ContextKey to efficiently cache
repeated lookups, which is an important optimization for libraries like
decimal or numpy that need to access their local context *extremely*
quickly (like, a dict lookup is too slow). In fact this caching trick
is just preserving what decimal does right now in its thread-local
handling (because accessing the threadstate dict is too slow for
it).

So we can't expose any kind of mutable API to individual maps, because
then someone might call __setitem__ on some map that's lower down in
the stack, and break caching.
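
To make the failure mode concrete, here is a minimal single-threaded
sketch. Everything in it is hypothetical: this toy ContextKey, its
lookup_uncached() helper, and the plain list-of-dicts stack are stand-ins
I made up for illustration, not the PEP's API, and it ignores the
invalidation a real implementation would need when contexts are pushed
or popped.

```python
class ContextKey:
    """Toy key that caches its own lookups (hypothetical, not PEP 550's API)."""

    _MISSING = object()

    def __init__(self, name, stack):
        self.name = name
        self._stack = stack            # list of dicts, innermost last
        self._cached = self._MISSING

    def lookup_uncached(self, default=None):
        # Slow path: walk the stack from the innermost map outwards.
        for mapping in reversed(self._stack):
            if self in mapping:
                return mapping[self]
        return default

    def get(self, default=None):
        if self._cached is self._MISSING:
            value = self.lookup_uncached(self._MISSING)
            if value is self._MISSING:
                return default
            self._cached = value
        return self._cached            # fast path: no dict lookup at all

    def set(self, value):
        # The only supported way to change what get() returns.
        self._stack[-1][self] = value
        self._cached = value


outer, inner = {}, {}
prec = ContextKey("prec", [outer, inner])
outer[prec] = 28                 # initial setup, before any lookup is cached

first = prec.get()               # walks the stack, then caches 28
outer[prec] = 10                 # direct __setitem__ on a lower map
stale = prec.get()               # the cache was never told: still 28
actual = prec.lookup_uncached()  # a fresh walk would find 10
```

The cache is only sound as long as every change goes through
ContextKey.set(); once the maps themselves are writable, get() and the
real stack can disagree.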

> And, of course, using a ChainMap means that the keys do NOT have to be
> predefined ... so the Key class really can be skipped.

The motivations for the Key class are to eliminate the need to worry
about accidental key collisions between unrelated libraries, to
provide some important optimizations (as per above), and to make it
easy and natural to provide convenient APIs, e.g. for saving and
restoring the state of a value inside a context manager. Those are all
orthogonal to whether the underlying structure is implemented as a
ChainMap or as something more specialized.

But I tend to agree with your general argument that the current PEP is
trying a bit too hard to hide away all this structure where no-one can
see it. The above constraints mean that simply exposing a ChainMap as
the public API is probably a bad idea. Even if there are compelling
performance advantages to a fancy immutable-HAMT implementation (I'm in
wait-and-see mode on this myself), there are still a lot more
introspection-y operations that could be provided, like:

- make a LocalContext out of a {ContextKey: value} dict
- get a {ContextKey: value} dict out of a LocalContext
- get the underlying list of LocalContexts out of an ExecutionContext
- create an ExecutionContext out of a list of LocalContexts
- given a ContextKey and a LocalContext, get the current value of the
  key in the context
- given a ContextKey and an ExecutionContext, get out a list of values
  at each level
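
For concreteness, the operations above could look something like the
following. This is a sketch, not a proposal for the actual spelling:
the method names (as_dict, lookup, local_contexts, values_at_each_level)
are invented here, and plain dict copies stand in for whatever immutable
structure ends up underneath. Everything handed out is a copy, so none
of it gives callers a way to mutate a map behind a ContextKey's back.

```python
class ContextKey:
    def __init__(self, name):
        self.name = name


class LocalContext:
    def __init__(self, values=()):
        # Make a LocalContext out of a {ContextKey: value} dict (private copy).
        self._values = dict(values)

    def as_dict(self):
        # Get a {ContextKey: value} dict back out; a snapshot, not a view.
        return dict(self._values)

    def lookup(self, key, default=None):
        # Current value of the key in this one context.
        return self._values.get(key, default)


class ExecutionContext:
    def __init__(self, local_contexts=()):
        # Create an ExecutionContext out of a list of LocalContexts.
        self._stack = tuple(local_contexts)

    def local_contexts(self):
        # Get the underlying list of LocalContexts (outermost first).
        return list(self._stack)

    def values_at_each_level(self, key, missing=None):
        # One entry per level; `missing` marks levels that don't set the key.
        return [lc.lookup(key, missing) for lc in self._stack]


prec = ContextKey("prec")
outer = LocalContext({prec: 28})
middle = LocalContext()
innermost = LocalContext({prec: 10})
ec = ExecutionContext([outer, middle, innermost])

levels = ec.values_at_each_level(prec)

snapshot = outer.as_dict()
snapshot[prec] = 99              # mutating the snapshot changes nothing
still = outer.lookup(prec)
```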

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

