[Python-ideas] Packages and Import

Mon Feb 12 22:43:55 CET 2007

On 2/11/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Brett Cannon" <brett at python.org> wrote:
> > On 2/11/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > >
> > > "Brett Cannon" <brett at python.org> wrote:
> > > >
> > > > On 2/11/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > > >
> > > > > Josiah Carlson <jcarlson at uci.edu> wrote:
> > > > > > Anyways...I hear where you are coming from with your statements of 'if
> > > > > > __name__ could be anything, and we could train people to use ismain(),
> > > > > > then all of this relative import stuff could *just work*'.  It would
> > > > > > require inserting a bunch of (fake?) packages in valid Python name
> > > > > > parent paths (just in case people want to do cousin, etc., imports from
> > > > > > __main__).
> > > > > >
> > > > > > You have convinced me.
> > > > >
> > > > > And in that vein, I have implemented a bit of code that mangles the
> > > > > __name__ of the __main__ module, sets up pseudo-packages for parent
> > > > > paths with valid Python names, imports __init__.py modules in ancestor
> > > > > packages, adds an ismain() function to builtins, etc.
> > > > >
> > > > > It allows for crazy things like...
> > > > >
> > > > >     from ..uncle import cousin
> > > > >     from ..parent import sibling
> > > > >     #the above equivalent to:
> > > > >     from . import sibling
> > > > >     from .sibling import nephew
> > > > >
> > > > > ...all executed within the __main__ module (which gets a new __name__).
> > > > > Even better, it works with vanilla Python 2.5, and doesn't even require
> > > > > an import hook.
> > > > >
> > > > > The only unfortunate thing is that because you cannot predict how far up
> > > > > the tree relative imports go, you cannot know how far up the paths one
> > > > > should go in creating the ancestral packages.  My current (simple)
> > > > > implementation goes as far up as the root, or the parent of the deepest
> > > > > path with an __init__.py[cw] .
> > > > >
> > > >
> > > > Just to make sure that I understand this correctly, __name__ is set to
> > > > __main__ for the module that is being executed.  Then other modules in
> > > > the package are also called __main__, but with the proper dots and
> > > > such to resolve to the proper depth in the package?
> > >
> > > No.  Say, for example, that you had a tree like the following.
> > >
> > >     .../
> > >         pk1/
> > >             pk2/
> > >                 __init__.py
> > >                 pk3/
> > >                     __init__.py
> > >                     run.py
> > >
> > > Also say that run.py was run from the command line, and the relative
> > > import code that I have written gets executed.  The following assumes
> > > that at least a "dummy" module is inserted into sys.modules['__main__'].
> > >
> > > 1) A fake package called 'pk1' with __path__ == ['../pk1'] is inserted
> > > into sys.modules.
> > > 2) 'pk1.pk2' is imported as per package rules (__init__.py is executed),
> > > and gets a __path__ == ['../pk1/pk2/'] .
> > > 3) 'pk1.pk2.pk3' is imported as per package rules (__init__.py is
> > > executed), and gets a __path__ == ['../pk1/pk2/pk3'] .
> > > 4) We fetch sys.modules['__main__'], give it a new __name__ of
> > > 'pk1.pk2.pk3.__main__', but don't give it a path.  Also insert the
> > > module into sys.modules['pk1.pk2.pk3.__main__'].
> > > 5) Add ismain() to builtins.
> > > 6) The remainder of run.py is executed.
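
Steps 1 through 4 above can be sketched roughly as follows. This is a minimal illustration written for current Python 3 using importlib, not the Python 2.5 code discussed in this thread; the function name install_ancestors and its parameters are hypothetical, and it assumes the package names and the directory that contains the root package are already known.

```python
import importlib
import os
import sys
import types


def install_ancestors(base_dir, names, main_module):
    """Sketch of steps 1-4: fake root package, real imports, renamed main.

    base_dir is the directory that contains the root package; names lists
    the packages from the root down, e.g. ["pk1", "pk2", "pk3"].
    (Hypothetical helper, not the implementation from the thread.)
    """
    # Step 1: a pseudo-package with a __path__; no __init__.py is executed.
    root = types.ModuleType(names[0])
    root.__path__ = [os.path.join(base_dir, names[0])]
    sys.modules[names[0]] = root

    # Steps 2-3: import each real package so that its __init__.py runs.
    dotted = names[0]
    for name in names[1:]:
        dotted = dotted + "." + name
        importlib.import_module(dotted)

    # Step 4: rename the main module and register it under the new name.
    main_module.__name__ = dotted + ".__main__"
    sys.modules[main_module.__name__] = main_module
    return main_module.__name__
```

For the tree above, calling install_ancestors with names ["pk1", "pk2", "pk3"] and sys.modules['__main__'] would leave the main module registered as 'pk1.pk2.pk3.__main__'. Later Python versions resolve relative imports through __package__ (PEP 366), so a present-day version would probably need to set that as well.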
> > >
> >
> > Ah, OK.  Didn't realize you had gone ahead and done step 5.
>
> Yep, it was easy:
>
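>     import sys
>
>     def ismain():
>         # Raise and catch an exception to reach the caller's frame.
>         try:
>             raise ZeroDivisionError()
>         except ZeroDivisionError:
>             f = sys.exc_info()[2].tb_frame.f_back
>         # True only if the caller's module is the object stored as __main__.
>         try:
>             return sys.modules[f.f_globals['__name__']] is sys.modules['__main__']
>         except KeyError:
>             return False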
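
For step 5, a rough sketch of how such a function might be installed into builtins on current Python 3 is shown here. It assumes the module is named builtins (Python 2.5 called it __builtin__) and it swaps the exception trick for sys._getframe(1), which is a CPython implementation detail; neither choice comes from the code in the thread.

```python
import builtins
import sys


def ismain():
    """Return True when the calling module is the object stored as __main__."""
    caller = sys._getframe(1)  # frame of whoever called ismain()
    try:
        name = caller.f_globals["__name__"]
        return sys.modules[name] is sys.modules["__main__"]
    except KeyError:
        # The caller has no __name__, or it is not registered in sys.modules.
        return False


# Make ismain() available everywhere without an import (step 5).
builtins.ismain = ismain
```

A script could then write "if ismain():" where it would otherwise compare __name__ against '__main__', and the check would keep working after the module's __name__ has been changed.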
>
> With the current semantics, reload would also need to be changed to
> update both __main__ and whatever.__main__ in sys.modules.
>
>
> > It's in the sandbox under import_in_py if you want the Python version.
>
> Great, found it.
>
> One issue with the code that I've been writing is that it more or less
> relies on the idea of a "root package", and that discovering the root
> package can be done in a straightforward way.  In a filesystem import,
> it looks at the path in which the __main__ module lies and ancestors up
> to the root, or the parent path of a path with an __init__.py[cw] module.
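
The filesystem discovery described above might look something like the following sketch. The names package_chain, has_init, and INIT_NAMES are hypothetical, and the stopping rule (climb while an __init__ file is present, then take one more parent as the root package) is an assumption based on the pk1/pk2/pk3 example earlier in the thread. It does not check that directory names are valid Python identifiers.

```python
import os

# File names that mark a directory as a package: __init__.py[cw].
INIT_NAMES = ("__init__.py", "__init__.pyc", "__init__.pyw")


def has_init(directory):
    """True if the directory holds an __init__.py, .pyc, or .pyw file."""
    return any(os.path.isfile(os.path.join(directory, n)) for n in INIT_NAMES)


def package_chain(script_path):
    """Return (base_dir, names) for the ancestor packages of a script.

    Climbs from the script's directory while each directory has an
    __init__ file, then includes one more parent as the root package.
    """
    directory = os.path.dirname(os.path.abspath(script_path))
    names = []
    while has_init(directory):
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            break
        names.append(os.path.basename(directory))
        directory = parent
    # The parent of the topmost real package becomes the pseudo-package.
    names.append(os.path.basename(directory))
    names.reverse()
    return os.path.dirname(directory), names
```

For run.py in the earlier tree, this would yield the names pk1, pk2, and pk3, with pk1 as the root package whose __init__ is never run.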
>
> For code in which __init__.py[cw] modules aren't merely placeholders to
> turn a path into a package, this could result in "undesirable" code
> being run prior to the __main__ module.
>
> It is also ambiguous when confronted with database imports in which the
> command line is something like 'python -m dbimport.sub1.sub2.runme'.  Do
> we also create/insert pseudo packages for the current path in the
> filesystem, potentially changing the "name" to something like
> "pkg1.pkg2.dbimport.sub1.sub2.runme"?  And really, this question is
> applicable to any 'python -m' command line.
>
>
> We obviously have a few options.  Among them:
> 1) make the above behavior optional with a __future__ import, which must
> be done at the top of a __main__ module (ignored in all other cases)
> 2) along with 1, only perform the above when we use imports in a
> filesystem (zip imports are fine).
> 3) allow for a module variable to define how many ancestral paths are
> inserted (to prevent unwanted/unnecessary __init__ modules from being
> executed).
> 4) come up with a semantic for database and other non-filesystem imports.
> 5) toss the stuff I've hacked and more or less proposed.
>

Beats me.  =)  My brain is fried at the moment, so I don't have a good
answer.

-Brett