
I know I'm not the only one who is confused by at least some of the alternative terminology choices. I suspect I'm not the only one who sometimes missed part of the argument because I was distracted figuring out what the objects were, and forgot to verify what was being done and why. I also suspect that it could be much simpler to follow if the API were designed in the abstract, with the implementation left for later.

So is the following API missing anything important?

(1) Get the current (writable) context. Currently proposed as a sys.* call, but I think injecting to __builtins__ or globals would work as well.

(2) Get a value from the current context, by string key. Currently proposed as key.get, rather than env.__getitem__

(3) Write a value to the current context, by string key. Currently proposed as key.set, rather than env.__setitem__

(4) Create a new (writable) empty context.

(5) Create a copy of the current context, so that changes can be isolated. The copy will not be able to change anything in the current context, though it can shadow keys.

(6) Choose which context to use when calling another function/generator/iterator/etc.

At this point, it looks an awful lot like a subset of ChainMap, except that:

(A) The current mapping is available through a series of sys.* calls. (Why not a builtin? Or at least a global, injected when a different environment is needed?)

(B) Concurrency APIs are supposed to ensure that each process/thread/Task/worker is using its own private context, unless the call explicitly requests a shared or otherwise different context.

(C) The current API requires users to initialize every key before it can be added to a context. This is presumably to support limits of the proposed implementation.

If the semantics are right, and collections.ChainMap is rejected only for efficiency, please say so in the PEP. If the semantics are wrong, please explain how they differ.
Sample code:

    olduser = env["username"]
    env["reason"] = "Spanish Inquisition"

    with env.copy():
        env["username"] = "secret admin"
        foo()
        print("debugging", env["foodebug"])

    bar()

    with env.empty():
        assert "username" not in env

    assert env["username"] is olduser

-jJ
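The sample above can be made runnable with a small wrapper around collections.ChainMap. This is only a sketch of the semantics being asked about, not anything from the PEP: the Env class, its copy() and empty() context managers, and the module-level env object are all hypothetical names. It covers items (1) to (5) of the list; item (6) and the per-thread/per-Task isolation in (B) are not modelled.

```python
from collections import ChainMap
from contextlib import contextmanager


class Env:
    """Hypothetical stand-in for the 'current context'; not a PEP 550 API."""

    def __init__(self):
        self._chain = ChainMap()

    # (2) and (3): read and write by string key.
    def __getitem__(self, key):
        return self._chain[key]

    def __setitem__(self, key, value):
        # ChainMap writes only to its first map, so parents are never changed.
        self._chain[key] = value

    def __contains__(self, key):
        return key in self._chain

    @contextmanager
    def copy(self):
        # (5): push a child map; writes shadow the parent keys.
        saved = self._chain
        self._chain = saved.new_child()
        try:
            yield self
        finally:
            self._chain = saved

    @contextmanager
    def empty(self):
        # (4): use a fresh, empty context for the duration of the block.
        saved = self._chain
        self._chain = ChainMap()
        try:
            yield self
        finally:
            self._chain = saved


# (1): the "current context" is simply a module-level object here.
env = Env()
env["username"] = "jim"
olduser = env["username"]


def foo():
    env["foodebug"] = "set inside the copy"


with env.copy():
    env["username"] = "secret admin"
    foo()
    seen_inside = (env["username"], env["foodebug"])

with env.empty():
    empty_has_user = "username" in env

leaked = "foodebug" in env
```

After the copy() block exits, env["username"] is the original object again and "foodebug" is gone, which is the isolation that item (5) describes.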

Hi Jim,

In short, yes, we can "dumb down" PEP 550 to a chain of maps.

PEP 550 does the following on top of that dumbed down version:

0. Adds execution_context "chain" root to PyThreadState.

1. Extends (async-)generator objects to support this chaining -- each generator has its own "env" to accumulate its changes.

2. ContextKey is an object that we use to work with EC. Compared to using strings, using an object allows us to implement caching (important for numpy and decimal-like libs) and avoids name clashes.

3. Yes, efficiency is important. If you start an asyncio.Task, or schedule an asyncio callback, or want to run some code in a separate OS thread, you need to capture the current EC -- make a shallow copy of all LCs in it. That's expensive, and the PEP solves this problem by using special data structures (a), and providing just enough APIs to work with the EC so that those data structures are not exposed to the end user (b).

4. Provides common APIs that will be used by asyncio, decimal, numpy, etc.
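The name-clash part of point 2 can be illustrated with a toy key class. This is a sketch under assumptions, not the PEP's ContextKey: the class name, its get/set signatures, and the plain dict used as the context are all hypothetical, and the real key.get / key.set work on the current context rather than taking one as an argument.

```python
class ContextKey:
    """Hypothetical key object: its identity, not its name, selects the slot."""

    def __init__(self, name, default=None):
        self.name = name          # only used for debugging output
        self.default = default

    def get(self, ctx):
        return ctx.get(self, self.default)

    def set(self, ctx, value):
        ctx[self] = value

    def __repr__(self):
        return f"<ContextKey {self.name!r}>"


# With string keys, two libraries that pick the same name overwrite each other.
string_ctx = {}
string_ctx["precision"] = 28      # written by library A
string_ctx["precision"] = 64      # written by library B, clobbers A

# With key objects, each library holds its own key, so the slots stay separate.
object_ctx = {}
key_a = ContextKey("precision")
key_b = ContextKey("precision")
key_a.set(object_ctx, 28)
key_b.set(object_ctx, 64)
```

Both keys carry the name "precision", but because plain objects hash by identity, key_a.get(object_ctx) still returns 28 after library B has written its own value.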
This was never proposed :) I decided to put the new APIs in the sys module, as we are usually conservative about adding new globals, and the feature is low-level (like working with frames).
If the semantics are right, and collections.ChainMap is rejected only for efficiency, please say so in the PEP.
`collections.ChainMap` on its own is not a solution; it's one of the possible implementations. Efficiency is indeed the reason why using ChainMap is not an option (see (3) above).

This whole "capturing of execution context" topic is not covered well enough in the PEP, and is something that we'll fix in the next version (soon).

Yury
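The cost argument in (3) can be sketched as follows. The thread does not say which data structures the PEP uses, so this only illustrates the general trade-off; every function name here is hypothetical. If the maps in the chain are mutable, capturing means copying each of them. If the maps are never mutated after creation, capturing is just keeping a reference, and the cost moves to the write path instead.

```python
def capture_by_copy(chain):
    """Snapshot a list of mutable dicts: every map has to be copied."""
    return [dict(m) for m in chain]


def capture_by_reference(chain):
    """Snapshot a tuple of never-mutated maps: sharing it is enough."""
    return chain


def set_copy_on_write(chain, key, value):
    """Return a new chain whose top map has `key` set; the old chain is untouched."""
    top = dict(chain[-1])
    top[key] = value
    return chain[:-1] + (top,)


# Mutable maps: the snapshot must be a full copy to survive later writes.
mutable_chain = [{"user": "jim"}, {"reason": "debug"}]
snap1 = capture_by_copy(mutable_chain)
mutable_chain[-1]["reason"] = "changed"

# Immutable-by-convention maps: the snapshot shares everything.
immutable_chain = ({"user": "jim"}, {"reason": "debug"})
snap2 = capture_by_reference(immutable_chain)
immutable_chain = set_copy_on_write(immutable_chain, "reason", "changed")
```

In this naive version each write copies the top map, so a real implementation would want a structure in which both capture and update are cheap; that is presumably the role of the "special data structures" mentioned in (3).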

On Tue, Aug 22, 2017 at 2:56 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
Hi Jim,
In short, yes, we can "dumb down" PEP 550 to a chain of maps.
I think it's also good to think about the actual problem(s) that are being solved, without going too deeply into the implementation. It might be useful to look at all the motivating use cases and to make sure this is really the best way to provide a solution to them.
PEP 550 does the following on top of that dumbed down version:
[...]
How exactly is caching dependent on the proposed ContextKey thing? To avoid a dict-lookup or similar to get the cached value? But now we need to look up the key object from somewhere? [...]
4. Provides common APIs that will be used by asyncio, decimal, numpy, etc.
Which APIs? The C API you mean? Something that is not in Jim's list? Something that is (not) in the PEP? People need to get a clear picture of what is being proposed.

-- Koos

+ Koos Zevenhoven + http://twitter.com/k7hoven +
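On the caching question raised above, one way a key object could help is by remembering the last value it returned together with a stamp of the context it came from. This is purely an illustration and is not taken from the PEP: the Context and CachingKey classes, the version counter, and the misses attribute are all hypothetical.

```python
class Context:
    """Hypothetical context: a dict plus a counter bumped on every write."""

    def __init__(self):
        self.data = {}
        self.version = 0

    def write(self, key, value):
        self.data[key] = value
        self.version += 1


class CachingKey:
    """Hypothetical key that caches its last lookup result."""

    def __init__(self, name):
        self.name = name
        self._cached_ctx = None
        self._cached_version = None
        self._cached_value = None
        self.misses = 0           # counts real lookups, for demonstration only

    def get(self, ctx):
        # Fast path: same context object and no write since the last lookup.
        if self._cached_ctx is ctx and self._cached_version == ctx.version:
            return self._cached_value
        # Slow path: do the real lookup and remember the result.
        self.misses += 1
        value = ctx.data.get(self)
        self._cached_ctx = ctx
        self._cached_version = ctx.version
        self._cached_value = value
        return value


ctx = Context()
precision = CachingKey("precision")
ctx.write(precision, 28)

first = precision.get(ctx)        # real lookup
second = precision.get(ctx)       # served from the cache
misses_before_write = precision.misses

ctx.write(precision, 50)          # bumps the version, invalidating the cache
third = precision.get(ctx)
```

With string keys there is no per-key object to hold this state. The key object itself would typically be a module-level name in the library that owns it, so finding it is an ordinary variable lookup.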

Jim J. Jewett wrote:
You're definitely not alone! I think I get the gist of the proposal, and its motivation, but I'm definitely confused by the terminology. As I stated elsewhere, the word "context" has a well-established meaning in Python, with context managers, their protocols, and contextlib. When talking with another Pythonista three years from now, I don't want to have to resolve which context they're talking about based on context. ;)

I think you have a point too about designing the abstract behavior and API first, and only then worrying about implementation details (in fact, maybe take implementation discussions out of the PEP for now, and hash that out in a PR).

I also think you're on to something when you suggest that sys may not be the best place for these new APIs. sys is already a mishmash of lots of random stuff, and the concepts defined in PEP 550 are advanced enough that many Python developers will never need to worry about them. Putting them in sys leads to cognitive overload. I'm not sure I'd put them in builtins either, but a new module makes a lot of sense to me. Plus, it means that we can choose more natural names for the APIs since they'll be namespaced away in a separate module.

Cheers,
-Barry

On Thu, Aug 24, 2017 at 10:04:58AM -0400, Barry Warsaw wrote:
I'm not happy about "context" either. I'd prefer something more pedantic, like: TaskLocalStorage, TaskLocalStorageStack, even when generators aren't tasks. At least that's what people are used to from ThreadLocalStorage.

The .NET terminology is explained here:

https://blogs.msdn.microsoft.com/pfxteam/2012/06/15/executioncontext-vs-sync...

But that is more of an OO approach --- there are more "subclasses" of ExecutionContexts like SecurityContext, HostExecutionContext, CallContext and there's colorful terminology like "flowing the Execution Context".

Stefan Krah

participants (5)
- Barry Warsaw
- Jim J. Jewett
- Koos Zevenhoven
- Stefan Krah
- Yury Selivanov