[Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?

Sat Jan 13 10:48:48 EST 2018

On 01/12/2018 06:52 PM, Brett Cannon wrote:
> So obviously implementing get_code() for the extension module loader 
> would be great. :) So the question becomes how?

Marcel took a quick look at it already. It seems to be quite a simple 
addition, and it makes the tests developed for PEP 547 pass. Hopefully we 
can have a PR early next week :)

> On Thu, 11 Jan 2018 at 22:57 Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>     (cc'ed a couple of folks that I expect will be interested in this
>     question, but may not be subscribed to import-sig)
> 
>     The current version of PEP 547 (supporting the -m switch for extension
>     modules) works by defining a new optional "exec_in_module" API for
>     loaders to implement, and then updating runpy._run_module_as_main to
>     call it.
> 
>     However, reviewing Mario Corchero's patches for
>     https://bugs.python.org/issue9325 (adding "-m" switch support to
>     assorted modules) has highlighted a potential challenge with that
>     approach: it turns out the most useful private API in runpy for
>     emulating the -m switch is "mod_name, mod_spec, code =
>     runpy._get_module_details(module_name)".
> 
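For context, a minimal sketch of how that private helper behaves today. It assumes CPython's current runpy internals (the helper is undocumented and may change), and `code_for_dash_m` is a hypothetical wrapper name used only for illustration:

```python
import runpy
import types


def code_for_dash_m(module_name):
    """Return the code object "-m" would run, or None if none is available."""
    try:
        # Private runpy helper; returns (mod_name, mod_spec, code).
        mod_name, mod_spec, code = runpy._get_module_details(module_name)
    except ImportError:
        # runpy reports a missing code object (loader.get_code() returned
        # None) as an ImportError.
        return None
    return code


# A pure-Python stdlib module yields a real code object.
pure_python = code_for_dash_m("timeit")
print(type(pure_python).__name__)
```

A loader whose get_code() returns None, as ExtensionFileLoader's does today, ends up in the ImportError branch, which is the gap discussed here.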
>     That means that if we can figure out a way to have
>     ExtensionFileLoader.get_code() emit a Python code object that
>     delegates to Py_mod_exec, then we'd be well on our way to supporting
>     "python -m <extension module>" without making *any changes to runpy*
>     (or the other modules that are gaining "-m" equivalents).
> 
>     If we did decide to go down that path, the main way I could see it
>     working without any new features in the C interface is to structure
>     things such that the extension module would still run in its own
>     namespace, with the interface adaptation code returned from get_code()
>     (after compilation) looking something like:
> 
>          ns = globals()
>          if ns is not locals():
>              raise RuntimeError("Cannot execute extension module "
>                  "{<interpolated_name>} with separate local namespace")
>          module = _imp.create_dynamic(<interpolated_spec_details>)
>          module.__dict__.update(ns)
>          _imp.exec_dynamic(module)
>          ns.update(module.__dict__)
> 
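To make the "generate a code object from a template" idea concrete, here is a rough sketch that runs without a real extension module. Everything in it is an assumption for illustration: `fake_create` and `fake_exec` are pure-Python stand-ins for `_imp.create_dynamic` and `_imp.exec_dynamic`, and `ADAPTER_TEMPLATE` and `get_adapter_code` are hypothetical names, not part of any proposed API.

```python
import types


def fake_create(name):
    # Stand-in for _imp.create_dynamic(spec): build the module object.
    return types.ModuleType(name)


def fake_exec(module):
    # Stand-in for _imp.exec_dynamic(module), i.e. the Py_mod_exec step:
    # it populates the module's own namespace.
    module.answer = 42
    module.seen_name = module.__name__


# Hypothetical adapter source; the helpers are looked up in the target
# namespace here, where real code would "import _imp" instead.
ADAPTER_TEMPLATE = '''
ns = globals()
if ns is not locals():
    raise RuntimeError(
        "Cannot execute extension module {name!r} with separate local namespace")
module = _create({name!r})
module.__dict__.update(ns)   # copy the target namespace in (e.g. __name__)
_exec(module)
ns.update(module.__dict__)   # copy the results back out
'''


def get_adapter_code(name):
    """Compile the adapter, as a get_code() implementation might."""
    source = ADAPTER_TEMPLATE.format(name=name)
    return compile(source, "<extension adapter for %s>" % name, "exec")


main_ns = {"__name__": "__main__", "_create": fake_create, "_exec": fake_exec}
exec(get_adapter_code("demo_ext"), main_ns)
print(main_ns["answer"], main_ns["seen_name"])
```

The stand-in "extension" sees `__name__ == "__main__"` because the target namespace is copied into the module before the exec step runs, and its attributes become visible in the target namespace afterwards.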
>     The biggest advantages of this approach are that it would still work
>     for Cython (and other) modules that defined Py_mod_create, and it
>     would implicitly interoperate (at least to some degree) with anything
>     that relied on the "get code and exec it" model of interacting with
>     Python modules.
> 
>     Alternatively, we could instead push the decision on how to handle
>     this case down to extension module authors as follows:
> 
>     1. Define a new Py_mod_exec_in_namespace slot that accepts a target
>     namespace as its parameter instead of a pre-existing module
>     2. Add a new "_imp.exec_dynamic_in_namespace(spec, namespace)" API
>     3. When Py_mod_exec_in_namespace is defined, make the adapter code
>     look something like:
> 
>          ns = globals()
>          if ns is not locals():
>              import collections
>              ns = collections.ChainMap(locals(), ns)
>          _imp.exec_dynamic_in_namespace(<interpolated_spec_details>, ns)
> 
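A small sketch of what the ChainMap wrapping would buy in the separate-locals case: lookups fall through to the globals, while writes land in the locals. `exec_in_namespace` below is a hypothetical pure-Python stand-in for the proposed slot, not an existing API.

```python
import collections


def exec_in_namespace(ns):
    # Stand-in for a Py_mod_exec_in_namespace implementation: it reads from
    # and writes to whatever mapping it is handed.
    ns["result"] = ns["base"] + 1


global_ns = {"base": 41}
local_ns = {}

# Same shape as the adapter code: locals first, globals as the fallback.
combined = collections.ChainMap(local_ns, global_ns)
exec_in_namespace(combined)

print(local_ns)    # the new name was written to the first mapping
print(global_ns)   # the globals were only read, not modified
```

One consequence worth noting: ChainMap is not a dict subclass, so a slot implemented in C would presumably have to use the generic mapping protocol rather than the dict-specific C API on the namespace it receives.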
>     (There are several ways the functionality could be split up between
>     the generated code and the _imp module; this is just one example,
>     meant to show that the idea is technically feasible.)
> 
>     The nice thing about including the new slot in the design is that it
>     gives extension modules a way to avoid the overhead of copying
>     attributes in and out, as would be needed if relying solely on the PEP
>     489 APIs.
> 
>     Cheers,
>     Nick.
> 
>     P.S. Given these changes we could technically define "get_source()" on
>     extension modules as well, but that doesn't seem especially useful.
>