[Import-SIG] Dabeaz's weird import discovery

Barry Warsaw barry at python.org
Wed Apr 22 17:59:59 CEST 2015


So I've been trying to catch up on Pycon 2015 videos.  David Beazley is always
entertaining so I figured I'd spend a little time on his three hour tour of
modules and packages:

https://www.youtube.com/watch?v=0oTh1CXRaQ0

About half an hour in, I got shipwrecked on an oddity of the import system.
That it surprised dabeaz too gave me some satisfaction, and like a Professor I
got curious and did some experimentation.

(ObMoratorium from here out: Gilligan's Island reference.)

The weirdness is evident in asyncio/__init__.py where you have a bunch of
explicit relative from-import-*'s and then seemingly out of nowhere, __all__
makes references to the named submodules.

That's damn surprising if you understand how name bindings happen in import
statements, which I thought I did. ;)  There's no explicit name binding to
those submodules, so the __all__ assignment should raise NameError.  E.g. in

    from string import *

you don't expect, nor do you get, 'string' bound in the current namespace.
Yet when asyncio/__init__.py does

    from .base_events import *

you *do* get a name binding for 'base_events'.
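
To make the contrast concrete, here is a minimal sketch that can be pasted
into a Python 3 session.  It assumes only the standard library; the variable
names (ns, string_bound and so on) are made up for illustration.

```python
import asyncio

# Run the star-import in a throwaway namespace so we can inspect what it binds.
ns = {}
exec("from string import *", ns)

string_bound = "string" in ns            # is the module name itself bound?
names_bound = "ascii_letters" in ns      # a name listed in string.__all__
submodule_bound = hasattr(asyncio, "base_events")  # submodule on its parent

print(string_bound)     # False
print(names_bound)      # True
print(submodule_bound)  # True
```

The first check is the behavior most people expect; the last one is the
surprise.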

David notes the weirdness in his talk, but his explanation was unsatisfying.

Let's look at a short example:

-----snip snip-----
spam/
    __init__.py
    foo.py
    bar.py


spam/__init__.py
================
from .foo import *
print(foo)
from .bar import *
print(bar)
__all__ = foo.__all__ + bar.__all__


spam/foo.py
===========
print('foo')
__all__ = ['Foo']
class Foo:
    pass


spam/bar.py
===========
print('bar')
__all__ = ['Bar']
class Bar:
    pass


$ python3
>>> from spam import *
foo
<module 'spam.foo' from '/private/tmp/spam/foo.py'>
bar
<module 'spam.bar' from '/private/tmp/spam/bar.py'>
-----snip snip-----

As it turns out, it's not the from-import-* that does the name binding; it's
the importing of the submodule itself.  Any other spelling of a submodule
import produces the same binding.  This includes

    import spam.foo
    from spam.foo import Foo
    __import__('spam.foo')
    importlib.import_module('spam.foo')
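
A rough way to check each of these spellings independently is sketched below.
It writes one throwaway package per spelling into a temporary directory, so
none of them can benefit from a binding made by another.  The package names
(pkg_a through pkg_d) and the make_package helper are hypothetical, invented
just for this test.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, pkg):
    """Create root/pkg/ with an empty __init__.py and a small foo.py."""
    pkg_dir = Path(root) / pkg
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "foo.py").write_text("class Foo:\n    pass\n")


# One fresh package per spelling, so the results don't interfere.
spellings = {
    "pkg_a": "import pkg_a.foo",
    "pkg_b": "from pkg_b.foo import Foo",
    "pkg_c": "__import__('pkg_c.foo')",
    "pkg_d": "import importlib; importlib.import_module('pkg_d.foo')",
}

results = {}
with tempfile.TemporaryDirectory() as root:
    sys.path.insert(0, root)
    try:
        for pkg, stmt in spellings.items():
            make_package(root, pkg)
            importlib.invalidate_caches()  # notice the newly written files
            exec(stmt, {})                 # run in a throwaway namespace
            parent = sys.modules[pkg]
            # Did the parent package gain an attribute named 'foo' that is
            # the very same object as the imported submodule?
            results[stmt] = getattr(parent, "foo", None) is sys.modules[pkg + ".foo"]
    finally:
        sys.path.remove(root)

for stmt, bound in results.items():
    print(f"{stmt!r}: submodule bound on parent -> {bound}")
```

Note that each statement runs in its own empty namespace, so the binding on
the parent package can't be a side effect of whatever names the statement
binds locally.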

Poking around in Lib/importlib/_bootstrap.py, I think you can see where this
happens.  In _find_and_load_unlocked(), 'round about line 2224 (in 3.5's
hg:95593), you see this:

    if parent:
        # Set the module as an attribute on its parent.
        parent_module = sys.modules[parent]
        setattr(parent_module, name.rpartition('.')[2], module)

It's clearly intentional, and fundamental to importlib so I don't think it's
dependent on finder or loader.  No matter how it happens, if a submodule is
imported, its parent namespace gets a name binding to the submodule.
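
One way to probe the "not dependent on finder or loader" hunch is to import a
submodule through a custom importer that never touches the file system.  The
sketch below is only an illustration under that assumption: the package name
memspam, the SOURCES dict, and the InMemoryImporter class are all made up
here, and it only exercises one custom importer, not every possible one.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "files": a package and one submodule.
SOURCES = {
    "memspam": "",
    "memspam.foo": "class Foo:\n    pass\n",
}


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, serving modules from SOURCES."""

    def find_spec(self, fullname, path, target=None):
        if fullname not in SOURCES:
            return None
        is_package = fullname == "memspam"
        return importlib.util.spec_from_loader(
            fullname, self, is_package=is_package)

    def create_module(self, spec):
        return None  # let the import machinery create a default module

    def exec_module(self, module):
        exec(SOURCES[module.__name__], module.__dict__)


importer = InMemoryImporter()
sys.meta_path.insert(0, importer)
try:
    importlib.import_module("memspam.foo")
finally:
    sys.meta_path.remove(importer)

parent_module = sys.modules["memspam"]
# Nothing in InMemoryImporter sets this attribute; the import machinery did.
bound = parent_module.foo is sys.modules["memspam.foo"]
print(bound)  # True
```

The importer above never calls setattr() on the parent, so if the binding
shows up it has to come from the generic import code rather than from the
finder or loader.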

What was the motivation for this?  Was the intent really to bind submodule
names in the parent module seemingly magically?

AFAICT, this also isn't actually documented anywhere.  I've looked in the
Language Reference under the import system[*] and the import statement, and
in the Library Reference under __import__(), and found nothing.  There's lots
of material there, so I could be missing it.

I don't know whether any of the alternative implementations also implement
this behavior, but they'll have to.

I think this needs to be documented in the Language Reference, and after some
feedback here, I'll open a docs bug and write some text to fix it.

Cheers,
-Barry

[*] which I wrote, and I'm still surprised! :)