Re: [Python-ideas] Provide a way to import module without exec body

On Fri, 1 Dec 2017 at 10:11 Neil Schemenauer <neil@python.ca> wrote:
I have always assumed the call signature for __import__() was because the import-related opcodes pushed so much logic into the function instead of doing it in opcodes (I actually blogged about this at https://snarky.ca/if-i-were-designing-imort-from-scratch/). Heck, the thing takes in locals() and yet never uses them (and its use of globals() is restricted to specific values so it really doesn't need to be quite so broad). Basically I wished __import__() looked like importlib.import_module().
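
To make the signature difference concrete, here is a small sketch comparing the two calls. It uses the standard-library package xml.dom purely as an example target; the variable names are illustrative only.

```python
import importlib

# __import__() with an empty fromlist returns the TOP-LEVEL package,
# which is what a plain "import xml.dom" statement needs to bind.
top = __import__("xml.dom")

# With a non-empty fromlist it returns the named submodule instead,
# which is what "from xml.dom import Node" needs.
via_fromlist = __import__("xml.dom", globals(), locals(), ["Node"], 0)

# importlib.import_module() always returns the module that was named.
leaf = importlib.import_module("xml.dom")

# Relative names are resolved against an explicit 'package' argument
# rather than being derived from a globals() dict.
relative = importlib.import_module(".dom", package="xml")

print(top.__name__, via_fromlist.__name__, leaf.__name__, relative.__name__)
```

The locals() argument is passed only to match the historical signature; the standard implementation does not use it.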
I have always wanted to at least break up getting the module and handling the fromlist into separate opcodes, so +1 for that. Name resolution could potentially be done as an opcode, as it relies on execution state pulled from the globals of the module, but the logic also isn't difficult, so +0 for that (i.e. making an opcode that calls something more like importlib.import_module() is more critical to me than eliminating the 'package' argument to that call, but I don't view it as a bad thing to have another opcode for that either). As for completely separating the loading and execution, I don't have a need for what's being proposed, so I don't have an opinion. I basically made sure Eric Snow structured specs so that lazy loading as currently supported works, so I got what I wanted for basic lazy importing (short of the PyPI package I keep talking about writing to add a nicer API around lazy importing :) ).
-Brett
Regards,
Neil
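
The lazy loading Brett refers to is available through importlib.util.LazyLoader. The following is a minimal sketch of how it can be wired up, assuming a module whose loader supports exec_module(); the helper name lazy_import is hypothetical and colorsys is only an example target.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose body runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    # Wrap the real loader so that exec_module() only arms the lazy trigger.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # does not run the module body yet
    return module


colorsys = lazy_import("colorsys")
# The body of colorsys is executed here, on the first attribute lookup.
converter = colorsys.rgb_to_hsv
```

Note that this defers execution of the body but still creates the module and registers it in sys.modules up front, so it is not the same thing as the opcode-level split discussed in this thread.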

On 2 December 2017 at 07:55, Brett Cannon <brett@python.org> wrote:
In PEP 451 terms, I can definitely see the value in having CREATE_MODULE and EXEC_MODULE be separate opcodes (rather than having them be jammed together in IMPORT_MODULE the way they are now). While there'd still be some import machinery on the frame stack when the module code ran (due to the way the "exec_module" API is defined), there'd be substantially less of it. There'd be some subtleties around handling backwards compatibility with __import__ overrides (essentially, CREATE_MODULE would have to revert to doing all the work, while EXEC_MODULE would become a no-op), but the basic idea seems plausible.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
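
The create/exec split described above already exists at the Python level in the PEP 451 loader API. As a rough sketch (not the proposed opcodes themselves), the two steps can be driven separately like this; the module name demo_mod and its contents are made up for the example.

```python
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # A throwaway module whose body defines one name.
    path = os.path.join(tmpdir, "demo_mod.py")
    with open(path, "w") as f:
        f.write("VALUE = 42\n")

    spec = importlib.util.spec_from_file_location("demo_mod", path)

    # "Create" step: builds the module object and sets __spec__, __name__,
    # etc., but does not run the module body.
    module = importlib.util.module_from_spec(spec)
    has_value_before = hasattr(module, "VALUE")

    # "Exec" step: runs the body in the module's namespace.
    spec.loader.exec_module(module)
    has_value_after = hasattr(module, "VALUE")

print(has_value_before, has_value_after)
```

This sketch deliberately skips sys.modules, so it shows only the loader-level split and none of the cache handling that a real import performs.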

On 3 December 2017 at 13:22, Nick Coghlan <ncoghlan@gmail.com> wrote:
Re-reading my own post reminded me of another potentially harder problem: IMPORT_MODULE also hides all the import cache management from the eval loop. If you try to split creation and execution apart, then that cache management becomes the eval loop's problem (since it needs to know whether the module is already fully initialised or not after the "GET_OR_CREATE_MODULE" step). That cache locking is fairly intricate already, and exposing these to the eval loop as distinct operations wouldn't make that any easier.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
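
To illustrate the bookkeeping that would move into the caller, here is a deliberately simplified sketch of a "get or create" step. The function name, the single global lock, and the needs_exec flag are all assumptions made for the example; importlib's real implementation uses per-module locks and handles many more cases.

```python
import importlib.util
import sys
import threading

# Hypothetical stand-in for importlib's per-module import locks.
_sketch_lock = threading.Lock()


def get_or_create_module(abs_name):
    """Return (module, needs_exec) for an absolute module name."""
    with _sketch_lock:
        cached = sys.modules.get(abs_name)
        if cached is not None:
            # Caveat: a cached module may still be mid-execution in another
            # thread. Detecting that is the hard part this sketch omits.
            return cached, False
        spec = importlib.util.find_spec(abs_name)
        if spec is None:
            raise ModuleNotFoundError(f"No module named {abs_name!r}", name=abs_name)
        module = importlib.util.module_from_spec(spec)
        sys.modules[abs_name] = module
        return module, True


sys.modules.pop("colorsys", None)  # make sure the example starts uncached
mod, needs_exec = get_or_create_module("colorsys")
if needs_exec:
    # The caller is now responsible for running the body (and, in a real
    # implementation, for removing the cache entry if execution fails).
    mod.__spec__.loader.exec_module(mod)
again, needs_exec_again = get_or_create_module("colorsys")
```

The second call returns the cached module with needs_exec set to False, which is only correct because the first caller finished executing the body before anyone else looked.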

On 2017-12-03, Nick Coghlan wrote:
Right now (half-baked ideas), I'm thinking:

    IMPORT_RESOLVE
        Gives the abs_name for a module (to feed to _find_and_load()).

    IMPORT_LOAD
        Calls _find_and_load() with abs_name as argument. The body of the module is not executed yet. Could return a spec or a module with the spec that contains the code object of the body.

    IMPORT_EXEC
        Executes the body of the module.

    IMPORT_FROM
        Calls _handle_fromlist().

Props to Brett for making importlib in such a way that this clean separation should be relatively easy to do.

To handle a custom __import__ hook, I think we can do the following. Have each opcode detect if __import__ is overridden. There is already such a test (the import_name fast path). If it is overridden, IMPORT_RESOLVE and IMPORT_LOAD will gather up info and then IMPORT_EXEC will call __import__() using compatible arguments.

Initially, the benefit of making these changes is not some performance improvement or some functionality we didn't previously have. importlib does all this already and probably just as quickly. The benefit is that the import system becomes more understandable.

If we decide it is a good idea, we could expose hooks for these opcodes. Not like __import__ though. Maybe there should be a function like sys.set_import_hook(<op>, func). That will keep ceval fast as it will know if there is a hook or not, without having to crawl around in builtins.

Regards,
Neil
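
The four proposed steps can be approximated in pure Python using public importlib APIs. This is only a sketch under stated assumptions: the function names mirror the proposed opcode names but are hypothetical, the private _find_and_load() and _handle_fromlist() helpers are replaced by simpler public equivalents, and no import locking or __import__ override detection is attempted.

```python
import importlib.util
import sys


def import_resolve(name, package=None):
    # Stand-in for IMPORT_RESOLVE: turn a possibly relative name into abs_name.
    return importlib.util.resolve_name(name, package)


def import_load(abs_name):
    # Stand-in for IMPORT_LOAD: find and create the module, body not yet run.
    if abs_name in sys.modules:
        return sys.modules[abs_name], False
    spec = importlib.util.find_spec(abs_name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {abs_name!r}", name=abs_name)
    return importlib.util.module_from_spec(spec), True


def import_exec(module):
    # Stand-in for IMPORT_EXEC: register the module, then run its body.
    name = module.__name__
    sys.modules[name] = module
    try:
        module.__spec__.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)  # do not leave a half-initialised entry
        raise
    return module


def import_from(module, fromlist):
    # Simplified stand-in for IMPORT_FROM: attribute lookup only. The real
    # _handle_fromlist() also imports submodules named in the fromlist.
    return [getattr(module, attr) for attr in fromlist]


# Rough equivalent of "from colorsys import rgb_to_hsv", step by step.
sys.modules.pop("colorsys", None)  # start uncached for the demonstration
abs_name = import_resolve("colorsys")
module, needs_exec = import_load(abs_name)
body_ran_before_exec = hasattr(module, "rgb_to_hsv")
if needs_exec:
    import_exec(module)
(rgb_to_hsv,) = import_from(module, ["rgb_to_hsv"])
```

Between import_load() and import_exec() the caller holds a module object whose body has not run, which is the state the subject of this thread asks for.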

participants (3): Brett Cannon, Neil Schemenauer, Nick Coghlan