Re: [Python-ideas] Python hook just before NameError

On Mon, Dec 29, 2014 at 08:39:41AM -0800, Rick Johnson wrote:
> [...]
> All you've done is to replace "import decimal" with "decimal = lazy_import('decimal')" -- what's the advantage?
It delays the actual import of the module until you try to use it. For decimal, that's not much of an advantage, but some modules are quite expensive to import the first time, and you might not want to pay that cost at application start-up.

Personally, I have no use for this, but people have been talking about lazy importing recently, and I wanted to see how hard it would be to do.

--
Steven
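
For illustration, a minimal sketch of such a lazy_import() helper -- not Steven's actual code, just one way the deferral could work; the names are made up -- might look like this:

    import importlib

    class _LazyModule(object):
        """Proxy that performs the real import on first attribute access."""
        def __init__(self, name):
            self._name = name
            self._module = None

        def __getattr__(self, attr):
            # only reached for attributes the proxy itself doesn't have,
            # i.e. anything belonging to the wrapped module
            if self._module is None:
                self._module = importlib.import_module(self._name)
            return getattr(self._module, attr)

    def lazy_import(name):
        return _LazyModule(name)

    decimal = lazy_import('decimal')   # nothing imported yet
    print(decimal.Decimal('1.5'))      # the real 'decimal' module loads here

The binding itself is cheap at start-up; the import cost is only paid if and when the module is actually touched.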

On Tue, Dec 30, 2014 at 3:58 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Even more so if this is done for _every module in the system_.

I'm thinking of testing this out: creating a lazy import object for every single findable module, although before I start that, I'll see if I can hunt down a tab-completion routine rather than manually searching sys.path. In theory, enumerating modules shouldn't take too long (it's not like I have network mounts in PYTHONPATH), so if that's sufficiently fast, I might toss that into a permanent startup script, rather than having my current "import on NameError" trap.

Small downside: this *does* require that the modules all exist at process start and never get renamed or deleted. I'm sure that'll be a problem in some obscure case somewhere, but probably not a practical issue :)

ChrisA
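
For illustration, a rough sketch of that "pre-seed everything" experiment -- hypothetical code, not Chris's startup script -- using pkgutil to enumerate findable modules without importing any of them:

    import importlib
    import pkgutil
    import sys

    class _Lazy(object):
        """Tiny proxy: really imports the module on first attribute access."""
        def __init__(self, name):
            self._name = name
            self._mod = None
        def __getattr__(self, attr):
            if self._mod is None:
                self._mod = importlib.import_module(self._name)
            return getattr(self._mod, attr)

    def preload_lazy(namespace):
        # bind a lazy proxy for every top-level module findable *right now*;
        # per the caveat above, modules renamed or deleted later will break
        for finder, name, ispkg in pkgutil.iter_modules():   # scans sys.path, imports nothing
            namespace.setdefault(name, _Lazy(name))
        for name in sys.builtin_module_names:                # builtins aren't on sys.path
            namespace.setdefault(name, _Lazy(name))

    preload_lazy(globals())
    print(json.dumps({"lazy": True}))   # no explicit import; json is loaded here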

On Mon, Dec 29, 2014 at 10:58 PM, Steven D'Aprano <steve@pearwood.info> wrote:
the startup savings are significant, e.g. xacto (CLI generation tool) uses lazy importing to rapidly import all "tools" without triggering cascading imports:

https://github.com/xtfxme/xacto/blob/master/xacto/__init__.py#L240

...for the purpose of scanning their signatures and generating argparse subcommands. without aggressive lazy imports, a simple `./mytool --help` can take a very long time.

the method xacto uses is a bit different, and imo at least, superior to using proxy/lazy objects; instead of loading a proxy object at import-time, xacto executes code in a namespace that implements `__missing__`, therefore allowing NameErrors to become `namespace.__missing__(key)`:

* install meta_path hook
* code tries to import something *not* already imported:
      from foo import bar as baz
* meta importer returns a special str representing the import instead:
      'foo:bar'
* interpreter tries to update namespace:
      namespace.__setitem__('baz', 'foo:bar')
* namespace detects/remembers the details, but DISCARDS THE KEY!
* next time code accesses global 'baz':
      namespace.__missing__('baz')
* namespace performs import and updates itself

...this pattern allows the module *itself* to be lazy imported, rather than a proxy, so long as the module is only imported and not interacted with (or more precisely, referenced in any way), and is also zero-overhead after the first reference.

sadly this only half works in python2 (but to my surprise, works fine in python 3.4!). in python2, module-level code will trigger __missing__, but functions (with their __globals__ bound to a custom dict) WILL NOT:

python 2.7:

>>> from collections import defaultdict
>>> ns = defaultdict(list)
>>> ne = eval('lambda: name_error', ns)
>>> assert ne.__globals__ is ns
>>> ne()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <lambda>
NameError: global name 'name_error' is not defined

python 3.4:

>>> from collections import defaultdict
>>> ns = defaultdict(list)
>>> ne = eval('lambda: name_error', ns)
>>> assert ne.__globals__ is ns
>>> ne()
[]

...just some ancillary info some folks may find useful/interesting :)

--
C Anthony
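
For illustration, here is a stripped-down Python 3 sketch of the namespace half of that pattern -- not xacto's actual code; the meta_path finder that produces the 'module:attr' marker strings is omitted, and the names (LazyNamespace, defer) are made up:

    import importlib

    class LazyNamespace(dict):
        """Globals dict that resolves deferred imports in __missing__."""
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._deferred = {}            # name -> 'module:attr' spec

        def defer(self, name, spec):
            # what the real pattern does in __setitem__ when it sees a marker
            # string: remember the details, but discard the key itself
            self._deferred[name] = spec

        def __missing__(self, name):
            spec = self._deferred.get(name)
            if spec is None:
                raise KeyError(name)       # truly unknown name
            modname, _, attr = spec.partition(':')
            module = importlib.import_module(modname)
            value = getattr(module, attr) if attr else module
            self[name] = value             # zero overhead after the first lookup
            return value

    ns = LazyNamespace()
    ns.defer('urljoin', 'urllib.parse:urljoin')   # stands in for: from urllib.parse import urljoin as urljoin

    # code compiled against this namespace only pays the import cost on first use
    fn = eval('lambda: urljoin("http://example.com/a/", "b")', ns)
    print(fn())   # urllib.parse is imported here, on the first reference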

participants (3):
- C Anthony Risinger
- Chris Angelico
- Steven D'Aprano