[replying to the list]
On Sun, Aug 13, 2017 at 6:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 13 August 2017 at 16:01, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
[..]
As Nathaniel suggested, getting/setting/deleting individual items in
the current context would be implemented as methods on the ContextItem
objects, allowing the return value of "get_context_items" to be a
plain dictionary, rather than a special type that directly supported
updates to the underlying context.
The current PEP 550 design returns a "snapshot" of the current EC with
sys.get_execution_context().
I.e. if you do
ec = sys.get_execution_context()
ec['a'] = 'b'
# sys.get_execution_context_item('a') will return None
You did get a snapshot and you modified it -- but your modifications
are not visible anywhere. You can run a function in that modified EC
with `ec.run(function)` and that function will see that new 'a' key,
but that's it. There's no "magical" updates to the underlying context.
In that case, I think "get_execution_context()" is quite misleading as
a name, and is going to be prone to exactly the confusion we currently
have with the mapping returned by locals(), which is that regardless
of whether writes to it affect the target namespace or not, it's going
to be surprising in at least some situations.
So despite being initially in favour of exposing a mapping-like API at
the Python level, I'm now coming around to Armin Ronacher's point of
view: the copy-on-write semantics for the active context are
sufficiently different from any other mapping type in Python that we
should just avoid the use of __setitem__ and __delitem__ as syntactic
sugar entirely.
I agree. I'll be redesigning the PEP to use the following API (please
ignore the naming peculiarities, there are so many proposals at this
point that I'll just stick to something I have in my head):
1. sys.new_execution_context_key('description') -> sys.ContextItem (or
maybe we should just expose the sys.ContextItem type and let people
instantiate it?)
A key (or "token") to use with the execution context. Besides
eliminating the name collision issue, it'll also perform slightly
better, because its __hash__ method will always return a
constant. (Strings cache their __hash__, but other types don't).
2. ContextItem.has(), ContextItem.get(), ContextItem.set(),
ContextItem.delete() -- pretty self-explanatory.
3. sys.get_active_context() -> sys.ExecutionContext -- an immutable
object, has no methods to modify the context.
3a. sys.ExecutionContext.run(callable, *args) -- run a callable(*args)
in some execution context.
3b. sys.ExecutionContext.items() -- an iterator of ContextItem ->
value for introspection and debugging purposes.
4. No sys.set_execution_context() method. At this point I'm not sure
it's a good idea to allow users to change the current execution
context to something else entirely. For use cases like enabling
concurrent.futures to run your function within the current EC, you
just use the sys.get_active_context()/ExecutionContext.run
combination. If anything, we can add this function later.
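To make the shape of that API concrete, here is a rough pure-Python sketch. It assumes a single module global in place of real per-thread interpreter state, and the constructor signatures are guesses for illustration only:

```python
# Hypothetical model of the API listed above. A single module global
# stands in for the per-thread state the interpreter would manage.
_active = {}          # treated as immutable: every write replaces it


class ContextItem:
    def __init__(self, description):
        self.description = description

    def has(self):
        return self in _active

    def get(self, default=None):
        return _active.get(self, default)

    def set(self, value):
        global _active
        _active = {**_active, self: value}    # copy-on-write

    def delete(self):
        global _active
        _active = {k: v for k, v in _active.items() if k is not self}


class ExecutionContext:
    def __init__(self, data):
        self._data = data                     # never mutated after creation

    def run(self, func, *args):
        global _active
        saved, _active = _active, self._data
        try:
            return func(*args)
        finally:
            _active = saved

    def items(self):
        return iter(self._data.items())


def get_active_context():
    return ExecutionContext(_active)


request_id = ContextItem('request id')
request_id.set('12345')
ctx = get_active_context()       # captures the current state
request_id.set('67890')          # later write; ctx is unaffected
seen = ctx.run(request_id.get)   # reads the captured value
```

Because every set() builds a new mapping, the captured context keeps seeing '12345' while the active one has moved on to '67890'. That is also why no set_execution_context() is needed for the concurrent.futures case: capturing with get_active_context() and calling run() is enough.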
Instead, we'd lay out the essential primitive operations that *only*
the interpreter can provide and define procedural interfaces for
those, and if anyone wanted to build a higher level object-oriented
interface on top of those primitives, they'd be free to do so, with
the procedural API acting as the abstraction layer that decouples "how
interpreters actually implement it" (e.g. copy-on-write mappings) from
"how libraries and frameworks model it for their own use" (e.g. rich
application context objects). That way, each interpreter would also be
free to define their *internal* object model in whichever way made the
most sense for them, rather than enshrining a point-in-time snapshot of
CPython's preferred implementation model as part of the language
definition.
I agree. I like that this idea gives us more flexibility with the
exact implementation strategy.
[..]
The essential capabilities for active context manipulation would then be:
- get_active_context_token()
- set_active_context(context_token)
As I mentioned above, at this point I'm not entirely sure that we even
need "set_active_context". The only useful thing for it that I can
imagine is creating a decorator that isolates any changes of the
context, but the only use case I see for this is unit tests.
But even for unit tests, a better solution is to use a decorator that
detects keys that were added but not deleted during the test (leaks).
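Such a leak-detecting decorator could look roughly like the sketch below. It assumes a plain dict as a stand-in for the active context; with the proposed API the before/after key sets would presumably come from something like ExecutionContext.items(). The decorator name is made up:

```python
import functools

# Hypothetical stand-in for the active context.
_active = {}


def assert_no_context_leaks(test_func):
    """Fail if the test leaves behind keys it added to the context."""
    @functools.wraps(test_func)
    def wrapper(*args, **kwds):
        before = set(_active)
        try:
            return test_func(*args, **kwds)
        finally:
            leaked = set(_active) - before
            if leaked:
                raise AssertionError(
                    f'leaked context keys: {sorted(leaked)}')
    return wrapper


@assert_no_context_leaks
def clean_test():
    _active['tmp'] = 1
    del _active['tmp']            # cleaned up: no complaint


@assert_no_context_leaks
def leaky_test():
    _active['tmp'] = 1            # never removed: reported as a leak
```

Calling clean_test() passes silently, while leaky_test() raises AssertionError naming the 'tmp' key.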
- implicitly saving and reverting the active context around various operations
Usually we need to save/revert one particular context item, not the
whole context.
- accessing the active context id for suspended coroutines and
generators (so parent contexts can opt-in to seeing changes made in
child contexts)
Yes, this might be useful, let's keep it.
Running commands in a particular context *wouldn't* be a primitive
operation given those building blocks, since you can implement that
for yourself using the above primitives:
def run_in_context(target_context_token, func, *args, **kwds):
    old_context_token = get_active_context_token()
    set_active_context(target_context_token)
    try:
        # Pass the callable's result back to the caller
        return func(*args, **kwds)
    finally:
        set_active_context(old_context_token)
I'd still prefer to implement this as part of the spec. There are
some tricks that I want to use to make ExecutionContext.run() much
faster than a pure Python version. This is a highly performance
critical part of the PEP -- call_soon in asyncio is a VERY frequent
thing.
Besides, having ExecutionContext.run eliminates the need for
sys.set_active_context() -- again, we need to discuss this, but I see
less and less utility for it now.
The public manipulation API here would be deliberately based on opaque
tokens to make it clear that creating and mutating execution contexts
is entirely within the realm of the interpreter implementation, and
user level code can only control *which* execution context is active
in the current thread, not create arbitrary new execution contexts of
its own (at least, not without writing a CPython-specific C
extension).
For manipulation of values within the active context, looking at other
comparable APIs, I think the main prior art within the language would
be:
1. threading.local(), which uses the descriptor protocol to handle
arbitrary attributes
2. Cell variable references in function `__closure__` attributes,
which also use the descriptor protocol by way of the "cell_contents"
attribute
In 3.7, those two examples are being brought closer by way of
`cell_contents` becoming a read/write attribute:
>>> def f(i):
...     def g():
...         nonlocal i
...         return i
...     return g
...
>>> g = f(0)
>>> g()
0
>>> cell = g.__closure__[0]
>>> cell.cell_contents
0
>>> cell.cell_contents = 5
>>> g()
5
>>> del cell.cell_contents
>>> g()
Traceback (most recent call last):
...
NameError: free variable 'i' referenced before assignment in enclosing scope
>>> cell.cell_contents = 0
>>> g()
0
This is very similar to the way manipulation of entries within a
thread local namespace works, but with each cell containing exactly
one attribute.
For context items, I agree with Nathaniel that the cell-style
one-value-per-item approach is likely to be the way to go. To
emphasise that changes to that attribute only affect the *active*
context, I think "active_value" would be a good name:
>>> request_id = sys.create_context_item(
...     "my_web_framework.request_id",
...     "Request identifier for my_web_framework")
>>> request_id.active_value
Traceback (most recent call last):
...
RuntimeError: Context item "my_web_framework.request_id" not set in context <context token>
>>> request_id.active_value = "12345"
>>> request_id.active_value
'12345'
I myself prefer a functional API to __getattr__. I don't like the
"del local.x" syntax. I don't think we are forced to follow the
threading.local() API here, are we?
Yury
Finally, given opaque context tokens, and context items that worked
like closure cells (only accessing the active context rather than
lexically scoped variables), the one introspection primitive the
*interpreter* would need to provide is either:
1. Given a context token, return a mapping from context items to their
defined values in the given context
2. A way to get a listing of the context items defined in the active context
Since either of those can be defined in terms of the other, my own
preference goes to the first one, since using it to implement the
second alternative just requires a simple
`sys.get_active_context_token()` call, while implementing the first
one in terms of the second one requires a helper like
`run_in_context()` above to manipulate the active context in the
current thread.
The first one also makes it fairly straightforward to *diff* a given
context against the active one - get the mappings for both contexts,
check which keys they have in common, compare the values for the
common keys, and then report on
- keys that appear in one context but not the other
- values which differ between them for common keys
- (optionally) values which are the same for common keys
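As a rough illustration of that diff, assuming the introspection primitive hands back plain dicts mapping context items to values (the function name and report layout are invented for this sketch, and strings stand in for context items):

```python
def diff_contexts(a, b):
    """Compare two context mappings (plain dicts in this sketch)."""
    common = a.keys() & b.keys()
    return {
        # keys that appear in one context but not the other
        'only_in_a': a.keys() - b.keys(),
        'only_in_b': b.keys() - a.keys(),
        # values which differ between them for common keys
        'changed': {k: (a[k], b[k]) for k in common if a[k] != b[k]},
        # values which are the same for common keys
        'same': {k: a[k] for k in common if a[k] == b[k]},
    }


active = {'request_id': '12345', 'locale': 'en', 'user': 'nick'}
other = {'request_id': '67890', 'locale': 'en', 'debug': True}
report = diff_contexts(active, other)
```

Here `report['changed']` holds the request_id pair, `report['same']` holds locale, and the two `only_in_*` sets hold 'user' and 'debug' respectively.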
Cheers,
Nick.