lazy import via __future__ or compiler analysis

This is a half-baked idea that perhaps could work. Maybe call it a 2-stage module load instead of lazy. Introduce a lazy module import process that modules can opt in to. The opt-in would either be with a __future__ statement, or the compiler would statically analyze the module and determine that it is safe, e.g. if the module has no module-level statements besides imports.

.pyc files get some extra bits of information:

A) whether the module has opted for lazy import (IS_LAZY)
B) the modules imported by the module (i.e. top-level imports, IMPORT_LIST)

Make __import__ understand this data and do lazy loading for modules that want it. Sub-modules that have import side effects will still execute as normal, and the side effects will happen when the parent module is imported. This would consist of a recursive process, something like:

```
def load_module(name):
    if not IS_LAZY(name):
        ...  # import as usual
    else:
        ...  # create lazy version of module 'name'
        for subname in IMPORT_LIST(name):
            load_module(subname)
```

An additional idea from Barry W: if a module wants lazy loading but wants to do some init when the module is "woken up", define a top-level __init__ function. Python would call that function when attributes of the module are first actually used.

My plan was to implement this with a Python __import__ implementation. I would unmarshal the .pyc and compute IS_LAZY and IMPORT_LIST at import time, so not gaining a lot of speedup. It would prove whether the idea works in terms of not causing application crashes, etc. I could try running it with bigger apps and see how many modules are flagged for lazy loading.
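[Editor's sketch: a minimal, hedged illustration of how such a two-stage load might behave. LazyModule, the injected find_spec callable, and the IS_LAZY/IMPORT_LIST lookups are hypothetical stand-ins (in the real design the last two would come from the .pyc); spec.loader.exec_module() is the real importlib API. This is not the prototype, just one way the pieces could fit.]

```python
import sys
import types


class LazyModule(types.ModuleType):
    """Executes the real module body on first attribute access."""

    def __getattr__(self, name):
        self.__class__ = types.ModuleType       # become a plain module first
        self.__spec__.loader.exec_module(self)  # run the module body now
        init = self.__dict__.get('__init__')    # Barry W's wake-up hook
        if callable(init):
            init()
        return getattr(self, name)


def load_module(name, find_spec, IS_LAZY, IMPORT_LIST):
    """Hypothetical two-stage load; IS_LAZY/IMPORT_LIST come from the .pyc."""
    if name in sys.modules:
        return sys.modules[name]
    if not IS_LAZY(name):
        __import__(name)                        # import as usual
        return sys.modules[name]
    spec = find_spec(name)                      # e.g. importlib.util.find_spec
    module = LazyModule(name)                   # lazy version of module 'name'
    module.__spec__ = spec
    sys.modules[name] = module
    for subname in IMPORT_LIST(name):           # recurse into top-level imports
        load_module(subname, find_spec, IS_LAZY, IMPORT_LIST)
    return module
```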

Neil Schemenauer wrote:
and `def` and `class` of course. There are a few other things that might end up marking a module as "industrious" (my thesaurus's antonym for "lazy"). There will likely be assignments of module globals such as:

```
MY_CONST = 'something'
```

and it may even be a little more complicated:

```
COLORS = dict(
    red=1,
    blue=2,
    green=3,
)
REVERSE = {value: key for key, value in COLORS.items()}
```

A naive evaluation of such a module might not notice them as lazy, but I think they could still be treated as such. Function and class decorators might also be false positives. E.g.

```
@public
def my_public_function():
    pass
```

or even

```
@mungify
class Munged:
    pass
```

Maybe that's just the cost of doing business, and if they clear the lazy flag, so be it. But it feels like doing so will leave quite a bit of lazy-loading opportunity on the table. And I'm not sure you can solve all of those by moving things to a module-level __init__().

Cheers,
-Barry

Barry Warsaw <barry@python.org> wrote:
There are a few other things that might end up marking a module as "industrious" (my thesaurus's antonym for "lazy").
Good points. The analysis can be simple at first and then we can enhance it to be smarter about what is okay and still lazy load. We may evolve it over time too, making things that are not strictly safe still load lazily anyhow, rather than triggering the "industrious" load.

Another idea is to introduce __lazy__ or some such in the global namespace of the module. If present, e.g.

```
__lazy__ = True
```

then the analysis doesn't do anything except return True. The module has explicitly stated that it is okay for side effects in the top-level code to happen in a lazy fashion. Perhaps with a little bit of smarts in the analysis and a little sprinkling of __lazy__ flags, we can get a big chunk of modules to lazy load.
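[Editor's sketch: one way a first-cut analysis plus the __lazy__ override could look. The whitelist of "safe" top-level statement types is illustrative, not the prototype's actual rules; it deliberately rejects decorated defs, per Barry's false-positive concern.]

```python
import ast

# Top-level statement kinds tentatively treated as side-effect free.
SAFE_TOPLEVEL = (ast.Import, ast.ImportFrom, ast.FunctionDef,
                 ast.AsyncFunctionDef, ast.ClassDef, ast.Assign)


def is_lazy_safe(source):
    """Return True if a module looks safe to lazy-load."""
    tree = ast.parse(source)
    for node in tree.body:
        # Explicit opt-in short-circuits the analysis: __lazy__ = True
        if (isinstance(node, ast.Assign)
                and any(isinstance(t, ast.Name) and t.id == '__lazy__'
                        for t in node.targets)):
            return True
    for node in tree.body:
        if not isinstance(node, SAFE_TOPLEVEL):
            return False  # "industrious": an unknown statement kind
        # Decorators run at import time, so be conservative about them.
        if getattr(node, 'decorator_list', []):
            return False
    return True


# A module of imports, defs and simple assignments passes:
assert is_lazy_safe("import os\nMY_CONST = 'something'\ndef f(): pass\n")
# A decorated function clears the lazy flag:
assert not is_lazy_safe("@public\ndef f(): pass\n")
```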

Replying here, although this was written in response to the other thread:

Hey Neil,

In general this won't work. It's not generally possible to know whether a given statement has side effects or not. As an example, one normally wouldn't expect function or class definition to have side effects, but if a function is decorated, the decorators are evaluated at function "compilation"/import time, and may have side effects. As another example, one can put arbitrary expressions in a function annotation, and those are evaluated at import time.

As a result of this, you can't even know if an import is safe, because that module may have side effects. That is, the module foo.py:

```
import bar
```

isn't known to be lazy, because bar may import and start the logging module, as an example. While in general you might think that global-state-modifying decorators or annotations are a bad idea, they are used (Flask). As a result, it's not possible to implement this without breaking behavior for certain users. While I'm not a core dev and thus can't say with certainty, I will say that that makes it very unlikely for this change (or others like it) to be implemented.

It might be possible (from a language standpoint) to implement a `lazy` keyword (i.e. `lazy import foo; lazy def bar(): ...`), but if I recall, there's been discussion of that and it's never gotten very far.

--Josh

On Thu, Sep 7, 2017 at 1:46 PM Neil Schemenauer <nas@arctrix.com> wrote:
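[Editor's example: Josh's decorator point in miniature, in the style of Flask's @app.route (the names here are illustrative). The global registry is mutated the moment the def statement executes, i.e. at import time:]

```python
ROUTES = {}  # module-global state


def route(path):
    def decorator(func):
        ROUTES[path] = func   # side effect: runs while the module body executes
        return func
    return decorator


@route('/hello')              # evaluated at import time, not call time
def hello():
    return 'hello world'


# By the time the import finishes, the side effect has already happened:
assert ROUTES == {'/hello': hello}
```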

On Sat, Sep 9, 2017 at 1:27 AM, Joshua Morton <joshua.morton13@gmail.com> wrote:
Laziness has to be complete - or, looking the other way, eager importing is infectious. For foo to be lazy, bar also has to be lazy; if you don't know for absolute certain that bar is lazy-loadable, then you assume it isn't, and foo becomes eagerly loaded. ChrisA

On 2017-09-09, Chris Angelico wrote:
Laziness has to be complete - or, looking the other way, eager importing is infectious. For foo to be lazy, bar also has to be lazy;
Not with the approach I'm proposing. bar will be loaded in non-lazy fashion at the right time, foo can still be lazy.

On Sat, Sep 9, 2017 at 2:36 AM, Neil Schemenauer <nas-python-ideas@arctrix.com> wrote:
Ah, that's cool then! I suppose part of the confusion I had was in the true meaning of "lazy"; obviously you have to still load up the module to some extent. I'm not entirely sure how much you defer and how much you do immediately, and it looks like you have more in the 'defer' category than I thought. ChrisA

On Fri, Sep 8, 2017 at 11:36 AM, Neil Schemenauer <nas-python-ideas@arctrix.com> wrote:
I'll bring the conversation back here instead of co-opting the PEP 562 thread. On Sun, Sep 10, 2017 at 2:45 PM, Neil Schemenauer <neil@python.ca> wrote:
I'm not sure I follow the `exec(code, module)` part from the other thread. `exec` needs a dict to exec code into; the import protocol expects you to exec code into a module.__dict__, and even the related type.__prepare__ requires a dict so it can `exec` the class body there. Code wants a dict so functions created by the code string can bind it to function.__globals__.

How do you handle lazy loading when a defined function requests a global via LOAD_NAME? Are you suggesting to change function.__globals__ to something not-a-dict, and/or change LOAD_NAME to bypass function.__globals__ and instead do something like:

```
getattr(sys.modules[function.__globals__['__name__']], lazy_identifier)
```

?

All this chatter about modifying opcodes, adding future statements, lazy module opt-in mechanisms, special handling of __init__ or __getattr__ or SOME_CONSTANT suggesting modules-are-almost-a-class-but-not-quite feels like an awful lot of work to me, adding even more cognitive load to an already massively complex import system. They seem to make modules even less like other objects or types.

It would be really *really* nice if ModuleType got closer to being a simple class, instead of farther away. Maybe we start treating new modules like a subclass of ModuleType instead of all the half-way or special-case solutions... HEAR ME OUT :-) Demo below (also appended to end):

https://gist.github.com/anthonyrisinger/b04f40a3611fd7cde10eed6bb68e8824

```
# from os.path import realpath as rpath
# from spam.ham import eggs, sausage as saus
# print(rpath)
# print(rpath('.'))
# print(saus)
$ python deferred_namespace.py
<function realpath at 0x7f03db6b99d8>
/home/anthony/devel/deferred_namespace
Traceback (most recent call last):
  File "deferred_namespace.py", line 73, in <module>
    class ModuleType(metaclass=MetaModuleType):
  File "deferred_namespace.py", line 88, in ModuleType
    print(saus)
  File "deferred_namespace.py", line 48, in __missing__
    resolved = deferred.__import__()
  File "deferred_namespace.py", line 9, in __import__
    module = __import__(*self.args)
ModuleNotFoundError: No module named 'spam'
```

Lazy-loading can be achieved by giving modules a __dict__ namespace that is import-aware. This parallels heavily with classes using __prepare__ to make their namespace order-aware (ignore the fact they are now order-aware by default). What if we brought the two closer together?

I feel like the Python object data model already has all the tools we need. The above uses __prepare__ and a module metaclass, but it could also use a custom __dict__ descriptor for ModuleType that returns an import-aware namespace (like DeferredImportNamespace in my gist). Or ModuleType.__new__ can reassign its own __dict__ (currently read-only). In all these cases we only need to make 2 small changes to Python:

* Change `__import__` to call `globals.__defer__` (or similar) when appropriate instead of importing.
* Create a way to make a non-binding class type so `module.function.__get__` doesn't create a bound method.

The metaclass path also opens the door for passing keyword arguments to __prepare__ and __new__:

```
from spam.ham import eggs using methods: True
```

... which might mean:

```
GeneratedModuleClassName(ModuleType, methods=True):
    # module code ...
    # methods=True passed to __prepare__ and __new__,
    # allowing the module to implement bound methods!
```

... or even:

```
import . import CustomMetaModule
from spam.ham import (
    eggs,
    sausage as saus,
) via CustomMetaModule using {
    methods: True,
    other: feature,
}
```

... which might mean:

```
GeneratedModuleClassName(ModuleType,
                         metaclass=CustomMetaModule,
                         methods=True,
                         other=feature):
    # module code ...
```

Making modules work like a real type/class means we get __init__, __getattr__, and every other __*__ method *for free*, especially when combined with an extension to the import protocol allowing methods=True (or similar, like above). We could even subclass the namespace for each module, allowing us to effectively revert the module's __dict__ to a normal dict, and completely remove any possible overhead. Python types are powerful, let's do more of them!

At the end of the day, I believe we should strive for these 3 things:

* MUST work with function.__globals__[deferred], module.__dict__[deferred], and module.deferred.
* SHOULD bring modules closer to normal objects, and maybe accept the fact they are more like class definitions.
* SHOULD NOT require opt-in! Virtually every existing module will work fine.

Thanks,

```python
class Import:

    def __init__(self, args, attr):
        self.args = args
        self.attr = attr
        self.done = False

    def __import__(self):
        module = __import__(*self.args)
        if not self.attr:
            return module
        try:
            return getattr(module, self.attr)
        except AttributeError as e:
            raise ImportError(f'getattr({module!r}, {self.attr!r})') from e


class DeferredImportNamespace(dict):

    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        self.deferred = {}

    def __defer__(self, args, *names):
        # If __import__ is called and globals.__defer__() is defined, names to
        # bind are non-empty, each name is either missing from globals.deferred
        # or still marked done=False, then it should call:
        #
        #   globals.__defer__(args, *names)
        #
        # where `args` are the original arguments and `names` are the bindings:
        #
        #   from spam.ham import eggs, sausage as saus
        #   __defer__(('spam.ham', self, self, ['eggs', 'sausage'], 0),
        #             'eggs', 'saus')
        #
        # Records the import and what names would have been used.
        for i, name in enumerate(names):
            if name not in self.deferred:
                attr = args[3][i] if args[3] else None
                self.deferred[name] = Import(args, attr)

    def __missing__(self, name):
        # Raise KeyError if not a deferred import.
        deferred = self.deferred[name]
        try:
            # Replay original __import__ call.
            resolved = deferred.__import__()
        except KeyError as e:
            # KeyError -> ImportError so it's not swallowed by __missing__.
            raise ImportError(f'{name} = __import__{deferred.args}') from e
        else:
            # TODO: Still need a way to avoid binds... or maybe opt-in?
            #
            #   from spam.ham import eggs, sausage using methods=True
            #
            # Save the import to namespace!
            self[name] = resolved
        finally:
            # Set after import to avoid recursion.
            deferred.done = True
        # Return import to original requestor.
        return resolved


class MetaModuleType(type):

    @classmethod
    def __prepare__(cls, name, bases, defer=True, **kwds):
        return DeferredImportNamespace() if defer else {}


class ModuleType(metaclass=MetaModuleType):

    # Simulate what we want to happen in a module block!
    __defer__ = locals().__defer__

    # from os.path import realpath as rpath
    __defer__(('os.path', locals(), locals(), ['realpath'], 0), 'rpath')

    # from spam.ham import eggs, sausage as saus
    __defer__(('spam.ham', locals(), locals(), ['eggs', 'sausage'], 0),
              'eggs', 'saus')

    # Good import.
    print(rpath)
    print(rpath('.'))

    # Bad import.
    print(saus)
```

-- C Anthony
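[Editor's aside on the second required change above (a "non-binding" function type): the class and names below are hypothetical, but the behavior they demonstrate is standard Python. Any plain function stored on a class binds to instances, which is exactly what a class-like module must avoid:]

```python
class FakeModule:
    # Intended as a module-level function taking one argument.
    def helper(x):
        return x * 2


m = FakeModule()
try:
    m.helper(21)   # helper binds to m, so x = m and 21 becomes an extra argument
except TypeError:
    pass           # helper() takes 1 positional argument but 2 were given

# staticmethod is the closest existing "non-binding" tool:
class FakeModule2:
    helper = staticmethod(lambda x: x * 2)


assert FakeModule2().helper(21) == 42
```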

On 2017-09-11, C Anthony Risinger wrote:
I propose to make function.__namespace__ be a module (or other namespace object). function.__globals__ would be a property that calls vars(function.__namespace__).

Implementing this is a lot of work; we need to fix LOAD_NAME, LOAD_GLOBAL and a whole heap of other things. I have a partly done proof-of-concept implementation. It crashes immediately on Python startup at this point, but so far I have not seen any insurmountable issues. Doing it while preserving backwards compatibility will be a challenge. Doing it without losing performance (LOAD_GLOBAL exploits the fact that f_globals is an honest 'dict') is also hard. At this point, I think there is a chance we can do it. It is a conceptual simplification of Python that gives the language more consistency and more power.
I disagree. It would make for less cognitive load, as LOAD_ATTR would be very similar to LOAD_NAME/LOAD_GLOBAL. It makes modules *more* like other objects and types.

I'm busy with "real work" this week and so can't follow the discussion closely or work on my proof-of-concept prototype. I hope we can come up with an elegant solution and not some special hack just to make module properties work.

Regards, Neil
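[Editor's sketch of the relationship Neil describes. The real change would live in the interpreter; the Function class below is a toy stand-in, not a proposed API, showing how __globals__ could become a view derived from __namespace__:]

```python
import types


class Function:
    """Toy stand-in for a function object under the __namespace__ proposal."""

    def __init__(self, namespace):
        self.__namespace__ = namespace       # a module, not a dict

    @property
    def __globals__(self):
        # Legacy API: derive the dict view from the namespace object.
        return vars(self.__namespace__)


mod = types.ModuleType('example')
mod.x = 1
f = Function(mod)
assert f.__globals__['x'] == 1               # old dict-based access still works
assert f.__namespace__ is mod                # new: the module itself is reachable
```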

On Mon, Sep 11, 2017 at 1:09 PM, Neil Schemenauer <nas-python-ideas@arctrix.com> wrote:
Oh interesting, I kinda like that.
I do agree it makes module access more uniform if both defined functions and normal code end up effectively calling getattr(...), instead of directly reaching into __dict__.
I'm not sure about this though. Anything that special-cases dunder methods to sort of look like their counterpart on types, e.g. __init__ or __getattr__ or __getattribute__ or whatever else, is a hack to me. The only way I see to remedy this discrepancy is to make modules a real subclass of ModuleType, giving them full access to the power of the type system:

```
DottedModuleName(ModuleType, bound_methods=False):
    # something like this:
    # sys.modules[__class__.__name__] = __class__._proxy_during_import() ???
    # ... module code here ...

sys.modules[DottedModuleName.__name__] = DottedModuleName(
    DottedModuleName.__name__,
    DottedModuleName.__doc__,
)
```

I've done this a few times in the past, and it works even better on python3 (python2 function.__globals__ didn't trigger __missing__ IIRC). I guess all I'm getting at is: can we find a way to make modules a real type, so dunder methods are activated? This would make modules phenomenally powerful instead of just a namespace (or resorting to after-the-fact __class__ reassignment hacks).
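[Editor's note: the after-the-fact __class__ reassignment mentioned here is already possible (since Python 3.5) and gives a module working properties and dunders today. A minimal example with an illustrative module name:]

```python
# example_module.py -- a hypothetical module using the __class__ trick
import sys
import types


class _ThisModule(types.ModuleType):
    """Custom module class so properties and dunders work."""

    @property
    def answer(self):   # a computed module attribute
        return 42


# Reassign after the fact: this module instance adopts the new class.
sys.modules[__name__].__class__ = _ThisModule
```

After `import example_module`, accessing `example_module.answer` runs the property and returns 42.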
Agree, and same, but take a look at what I posted prior. I have a ton of interest around lazy/deferred module loading, have made it work a few times in a couple ways, and am properly steeping in import lore. I have bandwidth to work towards a goal that gives modules full access to dunder methods. I'll also try to properly patch Python in the way I described. Ultimately I want deferred loading everywhere, even if it means modules can't do all the other things types can do. I'm less concerned with how we get there :-) -- C Anthony

On 2017-09-11, C Anthony Risinger wrote:
My __namespace__ idea will allow this. A module can be a singleton instance of a singleton ModuleType instance. So, you can assign a property like:

```
<this module>.__class__.prop = <property>
```

and have it just work. Each module would have a singleton class associated with it to store the properties. The spelling of <this module> will need to be worked out. It could be sys.modules[__name__].__class__, or perhaps we can have a weakref, so this:

```
__module__.__class__.prop = ...
```

Need to think about this.

I have done import hooks before and I know the pain involved. importlib cleans things up a lot. However, if my early prototype work is an indication, the import stuff gets a whole lot simpler. Instead of passing around a dict and then grubbing around in sys.modules because the module is actually what you want, you just pass the module around directly.

Thanks for your feedback.

Regards, Neil

On 2017-09-11, Neil Schemenauer wrote:
A module can be a singleton instance of a singleton ModuleType instance.
Maybe more accurate to say each module would have its own unique __class__ associated with it. So, you can add properties to the class without affecting other modules. For backwards compatibility, we can create anonymous modules as needed if people are passing 'dict' objects to the legacy APIs.

On Sep 11, 2017 2:32 PM, "Neil Schemenauer" <nas-python-ideas@arctrix.com> wrote:
On 2017-09-11, Neil Schemenauer wrote:
A module can be a singleton instance of a singleton ModuleType instance.
Maybe more accurate to say each module would have its own unique __class__ associated with it. So, you can add properties to the class without affecting other modules. For backwards compatibility, we can create anonymous modules as needed if people are passing 'dict' objects to the legacy APIs.

FYI, you should be able to try this out using a custom loader that implements a create_module() method. See importlib.abc.Loader.

-eric
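[Editor's sketch of Eric's suggestion. UniqueClassLoader and the wrapping arrangement are illustrative, not an existing API; create_module()/exec_module() are the real loader protocol hooks:]

```python
import importlib.abc
import types


class UniqueClassLoader(importlib.abc.Loader):
    """Wrap another loader, giving each module its own unique __class__."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def create_module(self, spec):
        # One fresh ModuleType subclass per module, so properties added to
        # mod.__class__ never leak into any other module.
        cls = type('Module_' + spec.name.replace('.', '_'),
                   (types.ModuleType,), {})
        return cls(spec.name)

    def exec_module(self, module):
        self.wrapped.exec_module(module)
```

Installed behind a sys.meta_path finder that rewrites found specs to use this loader, `mod.__class__.prop = property(...)` would then affect only that one module.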

On 2017-09-08, Joshua Morton wrote:
In general this won't work. It's not generally possible to know if a given statement has side effects or not.
That's true, but with the AST static analysis we find anything that has potential side effects. The question is whether any useful subset of real modules passes these checks. If we flag everything as not safe for lazy import then we don't gain anything.
Decorators are handled in my latest prototype (module is not lazy).
As another example, one can put arbitrary expressions in a function annotation, and those are evaluated at import time.
Not handled yet but no reason they can't be.
That is handled as well. We only need to know if the current module is lazy-safe or not. Imports of submodules that have side effects will have those side effects happen like they do now.

The major challenge I see right now is 'from .. import' and class bases (i.e. metaclass behavior). If we do the safe thing, then all from-imports make the module unsafe for lazy loading, and any class definition that has a base class is also unsafe.

I think the idea is not yet totally dead though. We could have a command-line option to enable it. Modules that depend on side effects of from-imports and of base classes could let the compiler know about that somehow (make it explicit). That would allow a good fraction of modules to be lazy-import safe.

Regards, Neil
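[Editor's example: the two problem cases in miniature (abc.ABC stands in for any base class whose metaclass might run arbitrary code). Both statements must do real work the moment the module body executes, so a conservative analysis has to treat them as unsafe:]

```python
import abc

# A from-import must resolve the attribute *now* to bind the local name,
# which forces os.path to be fully loaded at import time:
from os.path import join


# A class with a base class runs that base's metaclass at class-creation
# time; ABCMeta happens to be harmless, but the analyzer can't know that:
class Plugin(abc.ABC):
    pass
```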

participants (7):

- Barry Warsaw
- C Anthony Risinger
- Chris Angelico
- Eric Snow
- Joshua Morton
- Neil Schemenauer
- Neil Schemenauer