On 2017-12-01, Chris Angelico wrote:
Can you elaborate on where this is useful, please?
Introspection tools, for example, might want to look at the module without executing it. Also, it is a building block to make lazy loading of modules work. As Nick points out, importlib can do this already.
Currently, the IMPORT_NAME opcode both loads the code for a module and executes it. The exec happens fairly deep in the guts of importlib, which makes import.c and ceval.c mutually recursive and complicates the locking. There are hacks like _call_with_frames_removed() to hide the recursion.
Instead, we could have two separate opcodes, one that gets the module but does not exec it (i.e. a function like __import__() that returns a future) and another opcode that actually does the execution. Figuring out all the details is complicated.
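To make the split concrete, here is a rough sketch of the two phases using what importlib already exposes today. It is not the proposed opcodes, just an illustration: the helper name load_unexecuted() and the throwaway module demo_mod are made up for the example.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_unexecuted(name, path):
    """Phase 1: build a module object from a file without running its code.

    Hypothetical helper for illustration; not part of importlib.
    """
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    return spec, module


with tempfile.TemporaryDirectory() as tmp:
    # A throwaway module whose body defines a single name.
    path = Path(tmp) / "demo_mod.py"
    path.write_text("VALUE = 42\n")

    spec, module = load_unexecuted("demo_mod", path)
    # The module object exists, but its body has not run yet.
    executed_before = hasattr(module, "VALUE")

    # Phase 2: run the module body in the module's namespace.
    spec.loader.exec_module(module)
    executed_after = module.VALUE
```

An introspection tool could stop after phase 1 and look at the spec (origin, loader, search locations) without triggering any side effects from the module body.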
Possible benefits:
- simplifies importlib
- reduces the amount of stack space used (removing recursion by "continuation passing style")
- makes profiling Python easier. Tools like valgrind get confused by the call cycle between ceval.c and import.c.
- makes lazy loading of modules easier to implement (not necessarily a standard Python feature, but it would make 3rd party implementations cleaner)
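For reference, lazy loading can already be approximated with importlib.util.LazyLoader. The sketch below follows the recipe in the importlib documentation; the function name lazy_import() and the choice of colorsys as the target are just for the example.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so that exec_module() only sets up the deferral.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # does not run the module body yet
    return module


colorsys = lazy_import("colorsys")
# The first attribute access below is what triggers the real execution.
converter = colorsys.rgb_to_hls
```

Note that this still goes through the existing loader machinery; the load/exec split proposed here would presumably make this kind of wrapper less of a trick.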
I'm CCing Brett as I'm sure he has thoughts on this, given his intimate knowledge of importlib. To me, it seems like __import__() has a terribly complicated API because it does so many different things.
Maybe even two opcodes are not enough. Maybe we should have one to resolve relative imports (i.e. import.c:resolve_name), one to load but not exec a module given its absolute name (i.e. _find_and_load() without the exec), one to exec a loaded module, and one or more to handle the horror of "fromlist" (i.e. _handle_fromlist()).
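As a rough illustration of how the first and last of those steps look as separate functions, here is a sketch in pure Python. importlib.util.resolve_name() is the real public API; resolve() and handle_fromlist() are hypothetical names, and the fromlist handling is much simplified compared to what _handle_fromlist() actually does.

```python
import importlib
import importlib.util


def resolve(name, package):
    """Step 1: turn a possibly relative name into an absolute one."""
    return importlib.util.resolve_name(name, package)


def handle_fromlist(module, fromlist):
    """Last step (simplified): collect the names of 'from module import ...'."""
    result = {}
    for attr in fromlist:
        if not hasattr(module, attr):
            # The name may be a submodule that has not been imported yet.
            importlib.import_module(f"{module.__name__}.{attr}")
        result[attr] = getattr(module, attr)
    return result


# Equivalent of "from ..util import find_spec" written inside importlib.abc.
absolute = resolve("..util", "importlib.abc")
module = importlib.import_module(absolute)
names = handle_fromlist(module, ["find_spec"])
```

The middle steps (load without exec, then exec) are the part that currently has no clean separation at the opcode level.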