[Python-ideas] PEP 550 v2

Fri Aug 18 12:17:11 EDT 2017

On Fri, Aug 18, 2017 at 1:09 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 17 August 2017 at 01:22, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>>> Coroutine Object Modifications
>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> To achieve this, a small set of modifications to the coroutine object
>>>> is needed:
>>>>
>>>> * New ``cr_local_context`` attribute.  This attribute is readable
>>>>   and writable for Python code.
>>>
>>> For ease of introspection, it's probably worth using a common
>>> `__local_context__` attribute name across all the different types that
>>> support one, and encouraging other object implementations to do the
>>> same.
>>>
>>> This isn't like cr_await and gi_yieldfrom, where we wanted to use
>>> different names because they refer to different kinds of objects.
>>
>> We also have cr_code and gi_code, which are used for introspection
>> purposes but refer to CodeObject.
>
> Right, hence https://bugs.python.org/issue31230 :)
>
> (That suggestion is prompted by the fact that if we'd migrated gi_code
> to __code__ in 3.0, the same way we migrated func_code, then cr_code
> and ag_code would almost certainly have followed the same
> dunder-naming convention, and
> https://github.com/python/cpython/pull/3077 would never have been
> necessary)
>
>> I myself don't like the mess the C-style convention created for our
>> Python code (think of what the "dis" and "inspect" modules have to go
>> through), so I'm +0 for having "__local_context__".
>
> I'm starting to think this should be __private_context__ (to convey
> the *intent* of the attribute), rather than naming it after the type
> that it's expected to store.

I've been thinking a lot about the terminology, and I have another
variant to consider:  ExecutionContext is a stack of LogicalContexts.
Coroutines/generators will thus have a __logical_context__ attribute.
I think that the term "logical" conveys the meaning better than
"private" or "dynamic".
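
To make the "stack of LogicalContexts" idea concrete, here is a rough
sketch of one possible reading of it.  The class and method names below
(LogicalContext, ExecutionContext, push, pop, get, set) are made up for
illustration and are not the API proposed in the PEP; the assumption is
that reads search the stack from the top down and writes go to the
topmost LogicalContext only.

```python
class LogicalContext(dict):
    """Hypothetical: one level of context, modelled as a plain mapping."""


class ExecutionContext:
    """Hypothetical: a stack of LogicalContexts."""

    def __init__(self):
        self._stack = [LogicalContext()]

    def push(self):
        # Entering e.g. a generator or coroutine would push a new level.
        self._stack.append(LogicalContext())

    def pop(self):
        # Leaving it drops that level together with any values set in it.
        return self._stack.pop()

    def get(self, key, default=None):
        # Reads walk the stack from the innermost level outwards.
        for lc in reversed(self._stack):
            if key in lc:
                return lc[key]
        return default

    def set(self, key, value):
        # Writes only ever touch the innermost level.
        self._stack[-1][key] = value


ec = ExecutionContext()
ec.set("precision", 10)
ec.push()
inherited = ec.get("precision")   # found in the outer level
ec.set("precision", 2)            # shadows the outer value
shadowed = ec.get("precision")
ec.pop()
after_pop = ec.get("precision")   # outer value is untouched
```

In this model a change made inside the pushed level never leaks back
into the level below it.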

>
> Thinking about this particular attribute name did prompt the question
> of how we want PEP 550 to interact with the exec builtin, though, as
> well as raising some questions around a number of other code execution
> cases:
>
> 1. What is the execution context for top level code in a module?

The execution context of whichever thread is importing the module,
which is usually the main thread.

> 2. What is the execution context for the import machinery in an import
> statement?
> 3. What is the execution context for the import machinery when invoked
> via importlib?

The execution context of whatever invoked the import machinery, be it
"__import__()", an "import" statement, or "importlib.import_module()".

> 4. What is the execution context for the import machinery when invoked
> via the C API?
> 5. What is the execution context for the import machinery when invoked
> via the runpy module?
> 6. What is the execution context for things like the timeit module,
> templating engines, etc?
> 7. What is the execution context for codecs and codec error handlers?
> 8. What is the execution context for __del__ methods and weakref callbacks?

In general, EC behaves just like TLS in all of these cases; there's
literally no difference.
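
As a point of comparison, here is how a codec error handler sees TLS
today.  This is a small sketch using only threading.local and
codecs.register_error; the handler name "tls_replace_demo" and the
helper functions are made up for the example.  The handler simply reads
whatever state belongs to the thread that triggered the encoding.

```python
import codecs
import threading

_tls = threading.local()


def tls_replace(exc):
    # Runs in the thread that called encode(), so it sees that thread's state.
    marker = getattr(_tls, "marker", "?")
    return (marker, exc.end)


codecs.register_error("tls_replace_demo", tls_replace)


def encode_with_marker(text, marker):
    _tls.marker = marker
    return text.encode("ascii", errors="tls_replace_demo")


def encode_in_thread(text, marker):
    # Same call, but made from a separate thread with its own TLS.
    result = []
    worker = threading.Thread(
        target=lambda: result.append(encode_with_marker(text, marker))
    )
    worker.start()
    worker.join()
    return result[0]


main_result = encode_with_marker("caf\u00e9", "#")
worker_result = encode_in_thread("caf\u00e9", "!")
main_marker_afterwards = _tls.marker
```

The worker thread's marker does not affect the main thread's value, and
the handler needs no special treatment to find the right one.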

> 9. What is the execution context for trace hooks and other really low
> level machinery?
> 10. What is the execution context for displayhook and excepthook?

Speaking of sys.displayhook and sys.stdout -- these APIs are
fundamentally incompatible with PEP 550 or any possible context
isolation.  These things are essentially *global* variables in the sys
module, and there's tons of code out there that *expects* them to
behave like globals.  If a user changes displayhook, they expect the
change to apply across all threads.
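
A quick way to see the "global variable" behaviour is the sketch below.
It only uses sys.displayhook and threading; the function names are made
up for the example.

```python
import sys
import threading


def quiet_hook(value):
    # Hypothetical replacement hook that prints nothing.
    pass


def hook_seen_by_worker():
    # Ask a separate thread which displayhook it currently sees.
    seen = []
    worker = threading.Thread(target=lambda: seen.append(sys.displayhook))
    worker.start()
    worker.join()
    return seen[0]


def demo():
    original = sys.displayhook
    sys.displayhook = quiet_hook      # changed in the calling thread only
    try:
        return hook_seen_by_worker()  # ...but visible in every thread
    finally:
        sys.displayhook = original


worker_hook = demo()
```

The worker thread observes the hook installed by the calling thread,
which is exactly the behaviour existing code relies on.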

If we want to make displayhook/sys.stdout context-aware, we will need
new APIs for them with new properties/expectations.  Simply forcing
them to use the execution context would be backwards incompatible.

PEP 550 won't try to change how displayhooks, excepthooks, trace
functions, sys.stdout, etc. work -- this is out of its scope.  We can't
refactor half of the sys module as part of one PEP.

>
> I think a number of those (top level module code executed via the
> import system, the timeit module, templating engines) can be addressed
> by saying that the exec builtin always creates a completely fresh
> execution context by default (with no access to the parent's execution
> context), and will gain a new keyword-only parameter that allows you
> to specify an execution context to use. That way, exec'ed code will be
> independent by default, but users of exec() will be able to opt in to
> handling it like a normal function call by passing in the current
> context.

"exec" uses outer globals/locals if you don't pass them explicitly --
the code isn't isolated by default. Isolation for "exec" is opt-in:

   ]]] a = 1
   ]]] exec('print(a); b = 2')
   1
   ]]] b
   2

Therefore, with regard to PEP 550, it should execute the code with
the current EC/LC.  We should also add new keyword arguments to
provide a custom LC and EC (same as we do for locals/globals).
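
For reference, this is what the existing opt-in isolation looks like
for globals today.  The sketch only uses the current exec() signature;
the helper name run_isolated is made up, and no EC/LC keyword arguments
are shown because they do not exist yet.

```python
def run_isolated(source):
    # Passing a fresh dict opts in to isolation: the code sees no outer names.
    namespace = {}
    exec(source, namespace)
    namespace.pop("__builtins__", None)  # exec() adds this key by itself
    return namespace


# A namespace passed explicitly is shared: the code reads "a" and leaves "b".
outer = {"a": 1}
exec("b = a + 1", outer)

isolated = run_isolated("b = 2")

try:
    run_isolated("b = a + 1")   # "a" is not visible in a fresh namespace
    saw_outer_name = True
except NameError:
    saw_outer_name = False
```

New EC/LC arguments would presumably follow the same pattern: use the
caller's by default, isolate only when asked to.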

> The default REPL, the code module and the IDLE shell window
> would need to be updated so that they use a shared context for
> evaluating the user supplied code snippets, while keeping their own
> context separate.
>
> While top-level code would always run in a completely fresh context
> for imports, the runpy module would expose the same setting as the
> exec builtin, so the executed code would be isolated by default, but
> you could opt in to using a particular execution context if you wanted
> to.
>
> Codecs and codec error handlers I think will be best handled in a way
> similar to generators, where they have their own private context (so
> they can't alter the caller's context), but can *read* the caller's
> context (so the context can be used as a way of providing
> context-dependent codec settings).
>
> That "read-only" access model also feels like the right option for the
> import machinery - regardless of whether it's accessed via the import
> statement, importlib, the C API, or the runpy module, the import
> machinery should be able to *read* the dynamic context, but not make
> persistent changes to it.
>
> Since they can be executed at arbitrary points in the code, it feels
> to me that __del__ methods and weakref callbacks should *always* be
> executed in a completely pristine execution context, with no access
> whatsoever to any thread's dynamic context.
>
> I think we should leave the execution context alone for the really low
> level hooks, and simply point out that yes, these have the ability to
> do weird things to the execution context, just as they have the power
> to do weird things to local variables, so they need to be handled with
> care.
>
> For displayhook and excepthook, I don't have a particularly strong
> intuition, so my default recommendation would be the read-only access
> proposed for generators, codecs, and the import machinery.

I really think that in 3.7 we should just implement PEP 550 with its
current scope, and defer system refactorings to 3.8.  Many such
refactorings will probably deserve their own PEP; changing sys.stdout
semantics, for example, is a really complex topic.  At this point we
are trying to solve the problem of creating a replacement for TLS that
supports generators and async.
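
To illustrate the problem being solved, here is a small sketch of how
plain TLS interacts badly with generators.  It only uses
threading.local; the "precision" setting and the function names are
made up for the example.

```python
import threading

_state = threading.local()


def low_precision_values():
    # Temporarily change a thread-local setting inside a generator.
    old = _state.precision
    _state.precision = 2
    try:
        yield "first"
        yield "second"
    finally:
        _state.precision = old


def demo():
    _state.precision = 10
    gen = low_precision_values()
    next(gen)                    # the generator is now suspended at a yield
    leaked = _state.precision    # the caller sees the generator's change
    gen.close()                  # runs the finally block in the generator
    restored = _state.precision
    return leaked, restored


leaked, restored = demo()
```

While the generator is suspended, its change to the thread-local value
is visible to the caller, which is the kind of leak a generator-aware
context is meant to prevent.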

Yury