[Python-Dev] __import__ problems
Mart Somermaa
mrts at mrts.pri.ee
Thu Nov 27 15:40:39 CET 2008
Python programmers often need to dynamically load a submodule rather
than a top-level module: given a string naming a module hierarchy,
e.g. 'foo.bar.baz', they need the tail module 'baz', not 'foo'.
Currently, the common hack for that is to use
>>> modname = 'foo.bar.baz'
>>> mod = __import__(modname, {}, {}, [''])
This, however, is an undocumented hack and, what's worse,
causes 'baz' to be imported twice, as 'baz' and 'baz.' (with tail
dot). The problem is reported in [1] and the idiom pops up in about
2000 (sic!) lines in Google Code Search [2].
There are at least two workarounds:
* the getattr approach documented in [3]
* use __import__(modname, {}, {}, [modname.rsplit(".", 1)[-1]])
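For illustration, the two workarounds can be sketched as small helpers.
The function names import_tail_getattr and import_tail_fromlist are made
up for this example, and the standard library module 'xml.dom.minidom'
stands in for 'foo.bar.baz'.

```python
def import_tail_getattr(modname):
    # __import__ with an empty fromlist returns the top-level package,
    # so walk down the remaining dotted components with getattr.
    mod = __import__(modname)
    for part in modname.split('.')[1:]:
        mod = getattr(mod, part)
    return mod


def import_tail_fromlist(modname):
    # A non-empty fromlist makes __import__ return the tail module;
    # the last dotted component is used as the fromlist entry.
    return __import__(modname, {}, {}, [modname.rsplit('.', 1)[-1]])


print(import_tail_getattr('xml.dom.minidom').__name__)
print(import_tail_fromlist('xml.dom.minidom').__name__)
```

Both calls should hand back the same 'xml.dom.minidom' module object.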
As both of them are clumsy and inefficient, I created a simple patch
for __import__ [4] that adds yet another argument, 'submodule'
(maybe 'tail_module' would have been more appropriate) that caters
for that particular use case:
>>> __import__('foo.bar.baz') # submodule=False by default
<module 'foo' from 'foo/__init__.py'>
>>> __import__('foo.bar.baz', submodule=True)
<module 'foo.bar.baz' from 'foo/bar/baz.py'>
>>> __import__('foo.bar.baz', fromlist=['baz'])
<module 'foo.bar.baz' from 'foo/bar/baz.py'>
---
While I was doing that, I noticed that the import_module_level()
function that does the gruntwork behind __import__ does not entirely
match the documentation [3].
Namely, [3] states "the statement from spam.ham import eggs results in
__import__('spam.ham', globals(), locals(), ['eggs'], -1)."
This is incorrect:
>>> __import__('foo.bar', globals(), locals(), ['baz'], -1)
<module 'foo.bar' from 'foo/bar/__init__.py'>
i.e. 'bar' is imported, not 'baz' (or, in terms of the documentation's
example, 'ham' and not 'eggs').
As a matter of fact, anything can be in 'fromlist' (the reason for
the API abuse seen in [2]):
>>> __import__('foo.bar.baz', globals(), locals(),
... ['this_is_a_bug'], -1)
<module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>
So, effectively, 'fromlist' is already functioning as a boolean that
indicates whether the tail or toplevel module is imported.
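A minimal way to see this boolean-like behaviour on any installation is
to use a standard library module in place of the hypothetical 'foo'
package. The name 'no_such_name' is deliberately bogus, and the 'level'
argument is left at its default here.

```python
# Empty fromlist: the top-level package is returned.
top = __import__('xml.dom.minidom')

# Non-empty fromlist: the tail module is returned, even though the
# name in the list does not exist anywhere.
tail = __import__('xml.dom.minidom', {}, {}, ['no_such_name'])

print(top.__name__)
print(tail.__name__)
```

The first name printed should be 'xml' and the second 'xml.dom.minidom',
regardless of what the fromlist actually contains.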
Proposal:
* either fix __import__ to behave as documented:

    # from foo.bar import baz
    >>> __import__('foo.bar', fromlist=['baz'])
    <module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>

    # from foo.bar import baz, baq
    >>> __import__('foo.bar', fromlist=['baz', 'baq'])
    (<module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>,
     <module 'foo.bar.baq' from 'foo/bar/baq/__init__.py'>)

  and add the 'submodule' argument to support the common
  __import__ use case [4],

* or, if that is not feasible, retain the current boolean behaviour
  and make that explicit by renaming 'fromlist' to 'submodule' (and
  require the latter to be a boolean, not a list).
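To make the first option concrete, here is a rough sketch of the
proposed return value, written as a wrapper on top of the existing
__import__. The helper name import_from is hypothetical, this is not the
patch in [4], and 'xml.dom' again stands in for 'foo.bar'.

```python
def import_from(modname, fromlist):
    # Hypothetical helper: return the objects named in fromlist, the way
    # "from modname import a, b" would bind them.
    # A non-empty fromlist makes __import__ return the tail module and
    # import any submodules named in the list as a side effect.
    module = __import__(modname, {}, {}, list(fromlist))
    found = tuple(getattr(module, name) for name in fromlist)
    # Mirror the proposal: one name gives one object, several give a tuple.
    return found[0] if len(found) == 1 else found


print(import_from('xml.dom', ['minidom']).__name__)
print([m.__name__ for m in import_from('xml.dom', ['minidom', 'pulldom'])])
```

A name that is neither an attribute nor a submodule of the package
raises AttributeError in this sketch, which is stricter than what
__import__ does today.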
Too bad I couldn't come up with this before; 3.0 would have been a
perfect opportunity to get things right (one way or the other).
---
References:
[1] http://bugs.python.org/issue2090
[2] http://google.com/codesearch?hl=en&lr=&q=__import__.*%5C%5B%27%27%5C%5D
[3] http://docs.python.org/library/functions.html#__import__
[4] http://bugs.python.org/issue4438