[Python-Dev] __import__ problems

Mart Somermaa mrts at mrts.pri.ee
Thu Nov 27 15:40:39 CET 2008


Python programmers need to dynamically load submodules instead of
top-level modules -- given a string with module hierarchy, e.g.
'foo.bar.baz', access to the tail module 'baz' is required instead
of 'foo'.

Currently, the common hack for that is to use

>>> modname = 'foo.bar.baz'
>>> mod = __import__(modname, {}, {}, [''])

This, however, is an undocumented hack and, what's worse, causes
'baz' to be imported twice, as 'baz' and 'baz.' (with a trailing
dot). The problem is reported in [1] and the idiom pops up in about
2000 (sic!) lines in Google Code Search [2].

There are at least two workarounds:
  * the getattr approach documented in [3]
  * use __import__(modname, {}, {}, [modname.rsplit(".", 1)[-1]])
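
For illustration, the two workarounds might look roughly like the
sketch below. It uses the standard-library module 'xml.dom.minidom'
in place of the hypothetical 'foo.bar.baz' so that it can be run
as-is. The helper names import_tail_getattr and import_tail_fromlist
are made up for this example and are not part of any API.

```python
def import_tail_getattr(modname):
    """Workaround 1: import the top-level package, then walk down with getattr."""
    mod = __import__(modname)            # returns the top-level package
    for part in modname.split('.')[1:]:  # skip the top-level name itself
        mod = getattr(mod, part)
    return mod


def import_tail_fromlist(modname):
    """Workaround 2: pass the tail name as a non-empty fromlist."""
    tail = modname.rsplit('.', 1)[-1]
    return __import__(modname, {}, {}, [tail])


# 'xml.dom.minidom' stands in for 'foo.bar.baz' (an assumption of this sketch)
mod_a = import_tail_getattr('xml.dom.minidom')
mod_b = import_tail_fromlist('xml.dom.minidom')
print(mod_a.__name__, mod_a is mod_b)
```

Both helpers should hand back the same module object; the difference
is only in how much ceremony is needed to get there.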

As both of them are clumsy and inefficient, I created a simple patch
for __import__ [4] that adds yet another argument, 'submodule'
(maybe 'tail_module' would have been more appropriate) that caters
for that particular use case:

>>> __import__('foo.bar.baz') # submodule=False by default
<module 'foo' from 'foo/__init__.py'>

>>> __import__('foo.bar.baz', submodule=True)
<module 'foo.bar.baz' from 'foo/bar/baz.py'>

>>> __import__('foo.bar.baz', fromlist=['baz'])
<module 'foo.bar.baz' from 'foo/bar/baz.py'>

---

While I was doing that, I noticed that the import_module_level()
function that does the gruntwork behind __import__ does not entirely
match the documentation [3].

Namely, [3] states "the statement from spam.ham import eggs results in
__import__('spam.ham', globals(), locals(), ['eggs'], -1)."

This is incorrect:

>>> __import__('foo.bar', globals(), locals(), ['baz'], -1)
<module 'foo.bar' from 'foo/bar/__init__.py'>

i.e. 'bar' is returned, not 'baz' (or, in the documentation's terms,
'ham' and not 'eggs').

As a matter of fact, anything can be in 'fromlist' (the reason for
the API abuse seen in [2]):

>>> __import__('foo.bar.baz', globals(), locals(),
... ['this_is_a_bug'], -1)
<module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>

So, effectively, 'fromlist' is already functioning as a boolean that
indicates whether the tail or toplevel module is imported.
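
The same effect can be observed with a standard-library package. The
sketch below assumes 'xml.dom' as a stand-in for 'foo.bar' and passes
level 0 (absolute import) instead of -1 so that it also runs on
interpreters that reject negative levels. The name 'no_such_name' is
deliberately bogus.

```python
# 'xml.dom' stands in for 'foo.bar'; 'no_such_name' is deliberately bogus.
top = __import__('xml.dom', globals(), locals(), [], 0)
tail = __import__('xml.dom', globals(), locals(), ['no_such_name'], 0)

print(top.__name__)   # top-level package, because fromlist is empty
print(tail.__name__)  # tail package, because fromlist is non-empty
print(hasattr(tail, 'no_such_name'))  # whether the bogus name got bound
```

Only the emptiness of 'fromlist' decides which module comes back; its
contents are not validated.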

Proposal:

  * either fix __import__ to behave as documented:

     # from foo.bar import baz
     >>> __import__('foo.bar', fromlist=['baz'])
     <module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>

     # from foo.bar import baz, baq
     >>> __import__('foo.bar', fromlist=['baz', 'baq'])
     (<module 'foo.bar.baz' from 'foo/bar/baz/__init__.py'>,
     <module 'foo.bar.baq' from 'foo/bar/baq/__init__.py'>)

    and add the 'submodule' argument to support the common
    __import__ use case [4],

  * or if that is not feasible, retain the current boolean behaviour
    and make that explicit by renaming 'fromlist' to 'submodule' (and
    require the latter to be a boolean, not a list).
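
To make the first option more concrete, here is a rough pure-Python
sketch of what the "as documented" semantics could look like when
layered on top of the current __import__. The function name
import_from is hypothetical, and 'xml.dom' with 'minidom' and
'pulldom' are stand-ins for 'foo.bar' with 'baz' and 'baq'. This is
not the patch in [4].

```python
def import_from(name, fromlist):
    """Return the objects named in fromlist, as 'from name import ...' binds them.

    Hypothetical helper: one name yields a single object, several yield a
    tuple. A name that cannot be found raises AttributeError.
    """
    module = __import__(name, {}, {}, list(fromlist), 0)  # tail module
    found = tuple(getattr(module, attr) for attr in fromlist)
    return found[0] if len(found) == 1 else found


# from xml.dom import minidom
print(import_from('xml.dom', ['minidom']))

# from xml.dom import minidom, pulldom
print(import_from('xml.dom', ['minidom', 'pulldom']))
```

Unlike the current behaviour, a bogus name in 'fromlist' fails loudly
here instead of being silently ignored.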

Too bad I couldn't come up with this earlier; 3.0 would have been a
perfect opportunity to get things right (one way or the other).

---

References:
[1] http://bugs.python.org/issue2090
[2] http://google.com/codesearch?hl=en&lr=&q=__import__.*%5C%5B%27%27%5C%5D
[3] http://docs.python.org/library/functions.html#__import__
[4] http://bugs.python.org/issue4438

