From ncoghlan at gmail.com  Fri Jan 12 01:55:48 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 12 Jan 2018 16:55:48 +1000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
Message-ID:

(cc'ed a couple of folks that I expect will be interested in this
question, but may not be subscribed to import-sig)

The current version of PEP 547 (supporting the -m switch for extension
modules) works by defining a new optional "exec_in_module" API for
loaders to implement, and then updating runpy._run_module_as_main to
call it.

However, reviewing Mario Corchero's patches for
https://bugs.python.org/issue9325 (adding "-m" switch support to
assorted modules) has highlighted a potential challenge with that
approach: it turns out the most useful private API in runpy for
emulating the -m switch is "mod_name, mod_spec, code =
runpy._get_module_details(module_name)".

That means that if we can figure out a way to have
ExtensionFileLoader.get_code() emit a Python code object that
delegates to Py_mod_exec, then we'd be well on our way to supporting
"python -m " without making *any changes to runpy*
(or the other modules that are gaining "-m" equivalents).
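To make the `_get_module_details` point concrete, here is a rough sketch
of how code emulating "-m" can lean on that helper. It assumes a CPython
version where the private, undocumented `runpy._get_module_details(mod_name)`
still returns a `(mod_name, spec, code)` triple, and it uses the
pure-Python `timeit` module purely as a stand-in target.

```python
import runpy
import types

# Private runpy helper (may change between Python versions): it resolves
# the module name, finds the spec, and asks the loader for a code object
# through get_code().
mod_name, mod_spec, code = runpy._get_module_details("timeit")

print(mod_name)                          # resolved module name
print(type(mod_spec.loader).__name__)    # loader that produced the code
print(isinstance(code, types.CodeType))  # a real code object came back
```

As far as I can tell, the same call fails with ImportError for an
extension module, because the loader's get_code() returns None; that is
the gap a usable get_code() would close.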
If we did decide to go down that path, the main way I could see it
working without any new features in the C interface is to structure
things such that the extension module would still run in its own
namespace, with the interface adaptation code returned from get_code()
(after compilation) looking something like:

    ns = globals()
    if ns is not locals():
        raise RuntimeError("Cannot execute extension module {} with separate local namespace")
    module = _imp.create_dynamic()
    module.__dict__.update(ns)
    _imp.exec_dynamic(module)
    ns.update(module.__dict__)

The biggest advantages of this approach are that it would still work
for Cython (and other) modules that defined Py_mod_create, and it
would implicitly interoperate (at least to some degree) with anything
that relied on the "get code and exec it" model of interacting with
Python modules.

Alternatively, we could instead push the decision on how to handle
this case down to extension module authors as follows:

1. Define a new Py_mod_exec_in_namespace slot that accepts a target
   namespace as its parameter instead of a pre-existing module
2. Add a new "_imp.exec_dynamic_in_namespace(spec, namespace)" API
3. When Py_mod_exec_in_namespace is defined, make the adapter code
   look something like:

    ns = globals()
    if ns is not locals():
        import collections
        ns = collections.ChainMap(locals(), ns)
    _imp.exec_dynamic_in_namespace(, ns)

(There are several ways the functionality could be split up between
the generated code and the _imp module, this is just an example that
suggests the idea is technically feasible)

The nice thing about including the new slot in the design is that it
gives extension modules a way to avoid the overhead of copying
attributes in and out, as would be needed if relying solely on the PEP
489 APIs.

Cheers,
Nick.

P.S. Given these changes we could technically define "get_source()" on
extension modules as well, but that doesn't seem especially useful.
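A pure-Python sketch of the "copy in, run, copy out" adapter from the
first option. Since `_imp.create_dynamic` and `_imp.exec_dynamic` need a
real extension module spec, the sketch substitutes two hypothetical
stand-ins (`fake_create` and `fake_exec`, names invented here) built on
`types.ModuleType`. It only illustrates the data flow, not actual
extension loading.

```python
import types

def fake_create(name):
    # Stand-in (assumption) for _imp.create_dynamic(spec).
    return types.ModuleType(name)

def fake_exec(module):
    # Stand-in (assumption) for _imp.exec_dynamic(module): the module's
    # init code populates the module's own namespace.
    module.answer = 42
    module.seen_name = module.__dict__.get("__name__")

ADAPTER_SOURCE = """
ns = globals()
if ns is not locals():
    raise RuntimeError("Cannot execute extension module with separate local namespace")
module = fake_create(__name__)
module.__dict__.update(ns)   # copy the caller-provided globals in
fake_exec(module)
ns.update(module.__dict__)   # copy the results back out
"""

# This is the kind of object a get_code() implementation would return.
adapter_code = compile(ADAPTER_SOURCE, "<extension adapter>", "exec")

run_globals = {
    "__name__": "__main__",
    "fake_create": fake_create,
    "fake_exec": fake_exec,
}
exec(adapter_code, run_globals)
print(run_globals["answer"], run_globals["seen_name"])
```

Running the same code object with a separate locals mapping, as in
`exec(adapter_code, run_globals, {})`, trips the RuntimeError guard.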
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Fri Jan 12 12:52:39 2018
From: brett at python.org (Brett Cannon)
Date: Fri, 12 Jan 2018 17:52:39 +0000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References:
Message-ID:

So obviously implementing get_code() for the extension module loader would
be great. :) So the question becomes how?
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> https://mail.python.org/mailman/listinfo/import-sig

From encukou at gmail.com  Sat Jan 13 10:48:48 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Sat, 13 Jan 2018 16:48:48 +0100
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References:
Message-ID: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>

On 01/12/2018 06:52 PM, Brett Cannon wrote:
> So obviously implementing get_code() for the extension module loader
> would be great. :) So the question becomes how?

Marcel took a quick look at it already. It seems it's quite a simple
addition, and it makes tests developed for PEP 547 pass. Hopefully we
can have a PR early next week :)

From brett at python.org  Sat Jan 13 14:49:10 2018
From: brett at python.org (Brett Cannon)
Date: Sat, 13 Jan 2018 19:49:10 +0000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
Message-ID:

Awesome! Thanks for looking into it.

From gmarcel.plch at gmail.com  Tue Jan 16 08:43:06 2018
From: gmarcel.plch at gmail.com (Marcel Plch)
Date: Tue, 16 Jan 2018 14:43:06 +0100
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
Message-ID:

I took a look at the get_code() and it works basically just as is in
Nick's mail. Spec and name are accessible through passed globals.
Unlike the current proposal for -m switch for extension modules [PR
1761], this approach does not require multiphase initialization. That
means that every module can be run with this switch just as it is
right now, including the standard library:

    $ python -im math
    >>> print(e)
    2.718281828459045

Should I open a bug for this, or reuse [bpo-30403]?
Or does this need a PEP?
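As a point of reference, a small sketch of the stock behaviour that such
a patch would change: on the CPython versions I'm aware of,
`ExtensionFileLoader.get_code()` simply returns None. The candidate
module names below are picked arbitrarily and are an assumption; whether
any of them ships as a shared library (rather than being built in)
depends on the build, so the sketch allows for finding none.

```python
import importlib.machinery
import importlib.util

def first_extension_spec(candidates=("math", "_socket", "_json", "array", "select")):
    """Return the spec of the first candidate loaded from an extension file, if any."""
    for name in candidates:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            continue
        if spec is not None and isinstance(
            spec.loader, importlib.machinery.ExtensionFileLoader
        ):
            return spec
    return None

spec = first_extension_spec()
if spec is None:
    print("no file-based extension module found among the candidates")
else:
    # The stock loader has no code object to hand out for an extension module.
    print(spec.name, spec.loader.get_code(spec.name))
```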

You can see the required changes here:
https://github.com/Traceur759/cpython/pull/6/files

[PR 1761]: https://github.com/python/cpython/pull/1761
[bpo-30403]: https://bugs.python.org/issue30403

From brett at python.org  Tue Jan 16 12:08:57 2018
From: brett at python.org (Brett Cannon)
Date: Tue, 16 Jan 2018 17:08:57 +0000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
Message-ID:

On Tue, 16 Jan 2018 at 05:43 Marcel Plch wrote:

> I took a look at the get_code() and it works basically just as is in
> Nick's mail. Spec and name are accessible through passed globals.
> Unlike the current proposal for -m switch for extension modules [PR
> 1761], this approach does not require multiphase initialization.
> That means that every module can be run with this switch just as it is
> right now, including the standard library:
>
>     $ python -im math
>     >>> print(e)
>     2.718281828459045
>
> Should I open a bug for this, or reuse [bpo-30403]?

If you want, although a new issue is also totally justified.

> Or does this need a PEP?

I don't think so since you're just implementing a pre-existing API and this
doesn't require changing any pre-existing semantics.

-Brett

From ncoghlan at gmail.com  Tue Jan 16 23:45:43 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 17 Jan 2018 14:45:43 +1000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
Message-ID:

On 17 January 2018 at 03:08, Brett Cannon wrote:
> On Tue, 16 Jan 2018 at 05:43 Marcel Plch wrote:
>> I took a look at the get_code() and it works basically just as is in
>> Nick's mail. Spec and name are accessible through passed globals.
>> Unlike the current proposal for -m switch for extension modules [PR
>> 1761], this approach does not require multiphase initialization. That
>> means that every module can be run with this switch just as it is
>> right now, including the standard library:
>>
>>     $ python -im math
>>     >>> print(e)
>>     2.718281828459045
>>
>> Should I open a bug for this, or reuse [bpo-30403]?
>
> If you want, although a new issue is also totally justified.
>
>> Or does this need a PEP?
>
> I don't think so since you're just implementing a pre-existing API and this
> doesn't require changing any pre-existing semantics.

While this is technically true, I think handling this as a revision of
PEP 547 would be a better idea, as I'd like to get more folks to think
through the implications of that proposed "ns.update(module.__dict__)"
step.

For example, it may be better to rely on PEP 562's module level
__getattr__ and __dir__ instead (delegating both to the underlying
"real" module) rather than duplicating the entire namespace, but that
would mean that Marcel's "python -im math" example wouldn't work any
more (since the __getattr__ hook doesn't get invoked for internal
access from *within* the module). On the positive side, it would mean
that the underlying module attributes can't accidentally overwrite the
attributes in the wrapper module.

Whether we use namespace duplication or PEP 562, both of them have the
problem that attribute *rebinding* won't work, since they'll only
affect the wrapper namespace. I think we can live with that, but we'll
likely want to expose a dunder-name to let folks access the underlying
"real" module (e.g.
make the variable name "__module__" rather than "module", and then use
the PEP 562 approach so there's no risk of accidentally overwriting it).

There's also an open question around how this would interact with PEP
499 if that was implemented - assuming the wrapper module gets
injected into sys.modules under the given name, should it replace that
with the inner implementation module?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From encukou at gmail.com  Wed Jan 17 08:25:17 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 17 Jan 2018 14:25:17 +0100
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com>
Message-ID: <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>

On 01/17/2018 05:45 AM, Nick Coghlan wrote:
> For example, it may be better to rely on PEP 562's module level
> __getattr__ and __dir__ instead (delegating both to the underlying
> "real" module) rather than duplicating the entire namespace, but that
> would mean that Marcel's "python -im math" example wouldn't work any
> more (since the __getattr__ hook doesn't get invoked for internal
> access from *within* the module). On the positive side, it would mean
> that the underlying module attributes can't accidentally overwrite the
> attributes in the wrapper module.
>
> Whether we use namespace duplication or PEP 562, both of them have the
> problem that attribute *rebinding* won't work, since they'll only
> affect the wrapper namespace. I think we can live with that, but we'll
> likely want to expose a dunder-name to let folks access the underlying
> "real" module (e.g. make the variable name "__module__" rather than
> "module", and then use the PEP 562 approach so there's no risk of
> accidentally overwriting it).

Hm, the more I think about it, the more I don't like the namespace copy.
The `python -im math` is cute, but doesn't really solve any immediate
problem. `from math import *` practically does the same thing, and is way
more obvious.
The main rationale behind making -m work for extension modules was to make
them behave like pure-Python ones -- e.g. if you Cythonize something,
everything will keep working as before. That's not the case here, and
adding `__module__` would be just piling on workarounds.

> There's also an open question around how this would interact with PEP
> 499 if that was implemented - assuming the wrapper module gets
> injected into sys.modules under the given name, should it replace that
> with the inner implementation module?
>
> Cheers,
> Nick.

From ncoghlan at gmail.com  Wed Jan 17 11:08:30 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 Jan 2018 02:08:30 +1000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To: <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com> <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>
Message-ID:

On 17 January 2018 at 23:25, Petr Viktorin wrote:
> On 01/17/2018 05:45 AM, Nick Coghlan wrote:
>> Whether we use namespace duplication or PEP 562, both of them have the
>> problem that attribute *rebinding* won't work, since they'll only
>> affect the wrapper namespace. I think we can live with that, but we'll
>> likely want to expose a dunder-name to let folks access the underlying
>> "real" module (e.g. make the variable name "__module__" rather than
>> "module", and then use the PEP 562 approach so there's no risk of
>> accidentally overwriting it).
>
> Hm, the more I think about it, the more I don't like the namespace copy.
> The `python -im math` is cute, but doesn't really solve any immediate
> problem. `from math import *` practically does the same thing, and is way
> more obvious.
> The main rationale behind making -m work for extension modules was to make
> them behave like pure-Python ones -- e.g. if you Cythonize something,
> everything will keep working as before. That's not the case here, and adding
> `__module__` would be just piling on workarounds.

Aye, the namespace copy idea is cute, but I don't think it's a path we
want to go down due to the state consistency management problems that
it creates.

I definitely prefer the idea of handling the importlib/runpy side of
PEP 547 via `get_code()` though - this thread was prompted by asking
myself whether or not I'd approve the PEP in its current form, and
deciding that runpy et al needing to be aware of the new capability in
order to benefit from it genuinely bothered me.

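[Illustrative sketch, not part of the original message.] The
get_code() contract mentioned above can be shown in a few lines: for a
pure-Python module the loader hands back a code object that a runner
executes in a fresh namespace, which is roughly what runpy does for
"-m", while loaders for compiled modules currently return None. The
helper name run_via_get_code is hypothetical, and the built-in "sys"
module is used as the "no code object" case on the assumption that
extension module loaders behave the same way today.

```python
import importlib.util


def run_via_get_code(mod_name, run_name):
    """Hypothetical helper: run a module the way a get_code()-based
    runner would, and return the namespace it executed in."""
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        raise ImportError(f"No module named {mod_name!r}")
    code = spec.loader.get_code(mod_name)
    if code is None:
        # Loaders for compiled modules have no code object to offer.
        raise ImportError(f"No code object available for {mod_name}")
    namespace = {
        "__name__": run_name,
        "__spec__": spec,
        "__loader__": spec.loader,
        "__file__": spec.origin,
    }
    exec(code, namespace)
    return namespace


# A pure-Python stdlib module runs without the runner knowing anything
# about it beyond the loader API.
ns = run_via_get_code("colorsys", run_name="colorsys_demo")
print(sorted(name for name in ns if name.startswith("rgb_to_")))

# A compiled (here: built-in) module cannot be run this way today.
try:
    run_via_get_code("sys", run_name="sys_demo")
except ImportError as exc:
    print("cannot run:", exc)
```

Anything that teaches the extension loader to return a real code
object would make the second call succeed without touching the runner,
which is the attraction of the approach.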
So the question then is what the module execution code would need to
look like for the following cases:

- multi-phase init with Py_mod_exec only
- multi-phase init with Py_mod_create as well
- single-phase init

Where things get tricky with this approach is that by the time the
synthesised code object is running, it doesn't have access to the
module itself any more, only the module namespace. We could get around
that in the Py_mod_exec-only case by looking __name__ up in
sys.modules, but that doesn't help with either of the other two cases
where the module creation happens outside the import system's control,
and would be a surprising discrepancy between extension modules and
pure Python ones.

As far as I can see, that leaves us with only one potential design
direction we haven't explored yet: what if we provided a way for an
existing namespace to be passed in when creating a module object? If
we did that, then it would be possible to create a hidden module in
the synthesised code such that "globals() is
_private_module.__dict__". That might not get us all the way to
supporting single-phase init, but it would make it feasible to define
a new Py_mod_create_with_namespace slot, such that "-m" would be
supported for multi-phase modules that either didn't define
Py_mod_create, or else defined Py_mod_create_with_namespace.

I'm fairly sure that wouldn't actually work right though, as I expect
the descriptor protocol would lead to the "wrong" module getting
passed in to the extension module functions :(

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From encukou at gmail.com Wed Jan 17 12:19:35 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Wed, 17 Jan 2018 18:19:35 +0100
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com> <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>
Message-ID:

On 01/17/2018 05:08 PM, Nick Coghlan wrote:
> On 17 January 2018 at 23:25, Petr Viktorin wrote:
>> On 01/17/2018 05:45 AM, Nick Coghlan wrote:
>>> Whether we use namespace duplication or PEP 562, both of them have the
>>> problem that attribute *rebinding* won't work, since they'll only
>>> affect the wrapper namespace. I think we can live with that, but we'll
>>> likely want to expose a dunder-name to let folks access the underlying
>>> "real" module (e.g. make the variable name "__module__" rather than
>>> "module", and then use the PEP 562 approach so there's no risk of
>>> accidentally overwriting it).
>>
>> Hm, the more I think about it, the more I don't like the namespace copy.
>> The `python -im math` is cute, but doesn't really solve any immediate
>> problem. `from math import *` practically does the same thing, and is way
>> more obvious.
>> The main rationale behind making -m work for extension modules was to make
>> them behave like pure-Python ones -- e.g. if you Cythonize something,
>> everything will keep working as before. That's not the case here, and adding
>> `__module__` would be just piling on workarounds.
>
> Aye, the namespace copy idea is cute, but I don't think it's a path we
> want to go down due to the state consistency management problems that
> it creates.
>
> I definitely prefer the idea of handling the importlib/runpy side of
> PEP 547 via `get_code()` though - this thread was prompted by asking
> myself whether or not I'd approve the PEP in its current form, and
> deciding that runpy et al needing to be aware of the new capability in
> order to benefit from it genuinely bothered me.
>
> So the question then is what the module execution code would need to
> look like for the following cases:
>
> - multi-phase init with Py_mod_exec only
> - multi-phase init with Py_mod_create as well
> - single-phase init
>
> Where things get tricky with this approach is that by the time the
> synthesised code object is running, it doesn't have access to the
> module itself any more, only the module namespace. We could get around
> that in the Py_mod_exec-only case by looking __name__ up in
> sys.modules, but that doesn't help with either of the other two cases
> where the module creation happens outside the import system's control,
> and would be a surprising discrepancy between extension modules and
> pure Python ones.
>
> As far as I can see, that leaves us with only one potential design
> direction we haven't explored yet: what if we provided a way for an
> existing namespace to be passed in when creating a module object? If
> we did that, then it would be possible to create a hidden module in
> the synthesised code such that "globals() is
> _private_module.__dict__". That might not get us all the way to
> supporting single-phase init, but it would make it feasible to define
> a new Py_mod_create_with_namespace slot, such that "-m" would be
> supported for multi-phase modules that either didn't define
> Py_mod_create, or else defined Py_mod_create_with_namespace.
>
> I'm fairly sure that wouldn't actually work right though, as I expect
> the descriptor protocol would lead to the "wrong" module getting
> passed in to the extension module functions :(

Let me suggest another potential direction we (or at least I) haven't
explored yet: what about working to make the __main__ module either
replaceable, or unused until we know what it should be?

I remember you saying that's not feasible, so I haven't tried anything, but
I don't remember an explanation. How sure are you that that rabbit hole is
deeper than the one we're in now?

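[Illustrative sketch, not part of the original thread.] The quoted
point that a synthesised code object only sees the module namespace,
and can recover the module object solely by looking __name__ up in
sys.modules, can be made concrete as follows. The module name
"demo_mod" and the executed source are made up, and plain Python
module objects stand in for extension modules.

```python
import sys
import types

# Stand-in for a module that the import system created and registered.
module = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = module

# Code run via exec() receives only a dict; the owning module object
# has to be recovered by name.
source = """
import sys
recovered = sys.modules[__name__]
same_namespace = recovered.__dict__ is globals()
"""
code = compile(source, "<synthesised>", "exec")

exec(code, module.__dict__)
print(module.recovered is module)   # True: the lookup finds the owner
print(module.same_namespace)        # True

# A module created outside the import system's control (same name, but
# never stored in sys.modules) cannot be recovered this way.
hidden = types.ModuleType("demo_mod")
exec(code, hidden.__dict__)
print(hidden.recovered is hidden)   # False: the lookup found `module`

del sys.modules["demo_mod"]
```

The last case is a loose analogue of the Py_mod_create and
single-phase situations, where the object whose namespace is being
executed is not necessarily the one registered under that name.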
From ncoghlan at gmail.com Wed Jan 17 22:30:37 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 Jan 2018 13:30:37 +1000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com> <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>
Message-ID:

On 18 January 2018 at 03:19, Petr Viktorin wrote:
> On 01/17/2018 05:08 PM, Nick Coghlan wrote:
>> I'm fairly sure that wouldn't actually work right though, as I expect
>> the descriptor protocol would lead to the "wrong" module getting
>> passed in to the extension module functions :(
>
> Let me suggest another potential direction we (or at least I) haven't
> explored yet: what about working to make the __main__ module either
> replaceable, or unused until we know what it should be?
>
> I remember you saying that's not feasible, so I haven't tried anything, but
> I don't remember an explanation. How sure are you that that rabbit hole is
> deeper than the one we're in now?

If I remember rightly, the main challenges with that were ensuring that:

- "python -i" ended up dropping back in to the right module
- multiprocessing startup still did the right thing

It's likely worth taking another look at the idea in light of the
startup refactoring that's happened in 3.7, but I also expect that
approach to have similar problems to the
"spec.loader.exec_in_module(mod)" case: it's an approach that would
require changes on the code execution side, rather than being
something we could enable transparently through existing importlib
APIs.

That said, I'd be a lot more amenable to that outcome if it gave us
the ability to execute *all* extension modules, even those using
Py_mod_create or single-phase initialisation.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From encukou at gmail.com Mon Jan 22 05:09:39 2018
From: encukou at gmail.com (Petr Viktorin)
Date: Mon, 22 Jan 2018 11:09:39 +0100
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To:
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com> <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com>
Message-ID: <610b9d09-ac99-ded9-61fa-c5beea3cc44b@gmail.com>

On 01/18/2018 04:30 AM, Nick Coghlan wrote:
> On 18 January 2018 at 03:19, Petr Viktorin wrote:
>> On 01/17/2018 05:08 PM, Nick Coghlan wrote:
>>> I'm fairly sure that wouldn't actually work right though, as I expect
>>> the descriptor protocol would lead to the "wrong" module getting
>>> passed in to the extension module functions :(
>>
>> Let me suggest another potential direction we (or at least I) haven't
>> explored yet: what about working to make the __main__ module either
>> replaceable, or unused until we know what it should be?
>>
>> I remember you saying that's not feasible, so I haven't tried anything, but
>> I don't remember an explanation. How sure are you that that rabbit hole is
>> deeper than the one we're in now?
>
> If I remember rightly, the main challenges with that were ensuring that:
>
> - "python -i" ended up dropping back in to the right module
> - multiprocessing startup still did the right thing
>
> It's likely worth taking another look at the idea in light of the
> startup refactoring that's happened in 3.7, but I also expect that
> approach to have similar problems to the
> "spec.loader.exec_in_module(mod)" case: it's an approach that would
> require changes on the code execution side, rather than being
> something we could enable transparently through existing importlib
> APIs.
>
> That said, I'd be a lot more amenable to that outcome if it gave us
> the ability to execute *all* extension modules, even those using
> Py_mod_create or single-phase initialisation.

I'm not sure that's something to try too hard to design for. What are the
benefits?
If a module is to do something useful with -m, it will need to be changed
anyway. If it just sets up a namespace, using `import *` instead of -i
should work fine.

From ncoghlan at gmail.com Mon Jan 22 18:51:47 2018
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 23 Jan 2018 09:51:47 +1000
Subject: [Import-SIG] PEP 547: Could we implement a usable "get_code()" for extension modules?
In-Reply-To: <610b9d09-ac99-ded9-61fa-c5beea3cc44b@gmail.com>
References: <8266e82a-aa1e-5dd1-c85e-1a88d84ba4df@gmail.com> <031fb2cb-e5f1-78ed-0b76-9c67f07bb55e@gmail.com> <610b9d09-ac99-ded9-61fa-c5beea3cc44b@gmail.com>
Message-ID:

On 22 January 2018 at 20:09, Petr Viktorin wrote:
> On 01/18/2018 04:30 AM, Nick Coghlan wrote:
>> If I remember rightly, the main challenges with that were ensuring that:
>>
>> - "python -i" ended up dropping back in to the right module
>> - multiprocessing startup still did the right thing
>>
>> It's likely worth taking another look at the idea in light of the
>> startup refactoring that's happened in 3.7, but I also expect that
>> approach to have similar problems to the
>> "spec.loader.exec_in_module(mod)" case: it's an approach that would
>> require changes on the code execution side, rather than being
>> something we could enable transparently through existing importlib
>> APIs.
>>
>> That said, I'd be a lot more amenable to that outcome if it gave us
>> the ability to execute *all* extension modules, even those using
>> Py_mod_create or single-phase initialisation.
>
>
> I'm not sure that's something to try too hard to design for. What are the
> benefits?
> If a module is to do something useful with -m, it will need to be changed
> anyway. If it just sets up a namespace, using `import *` instead of -i
> should work fine.

Single phase isn't a big deal, but I think Py_mod_create compatibility
will be important for Cython - otherwise folks will need to choose
between "fast C level access to module globals" (via Py_mod_create and
preallocated slots) and "-m switch compatibility".

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
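
[Illustrative sketch, not part of the original thread.] For reference,
the two wrapper strategies weighed earlier in the thread (namespace
duplication versus PEP 562 delegation) can be compared in a few lines.
The names copy_wrapper and proxy_wrapper are made up, plain module
objects stand in for the synthesised wrapper, "math" plays the
underlying "real" module, and Python 3.7 or later is assumed for the
module-level __getattr__ hook.

```python
import math
import types

# Strategy 1: namespace duplication, as in ns.update(module.__dict__).
copy_wrapper = types.ModuleType("copy_wrapper")
copy_wrapper.__dict__.update(
    {k: v for k, v in vars(math).items() if not k.startswith("__")}
)

# Code executed inside the wrapper namespace can see the copied names.
exec("seen = e", copy_wrapper.__dict__)
print(copy_wrapper.seen == math.e)

# Rebinding only touches the wrapper; the real module is unchanged.
copy_wrapper.pi = 3
print(math.pi == 3)

# Strategy 2: PEP 562 module-level __getattr__ delegating to the real
# module, so nothing is copied.
proxy_wrapper = types.ModuleType("proxy_wrapper")
proxy_wrapper.__dict__["__getattr__"] = lambda name: getattr(math, name)

# Attribute access from outside is delegated.
print(proxy_wrapper.e == math.e)

# Name lookup from code running *inside* the namespace bypasses the
# hook, so the "python -im math" style of use stops working.
try:
    exec("seen = e", proxy_wrapper.__dict__)
except NameError as exc:
    print("internal lookup failed:", exc)
```

Both variants leave the real module untouched when a name is rebound
on the wrapper, which is the state consistency concern raised above.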