[Python-ideas] Packages and Import

Josiah Carlson jcarlson at uci.edu
Sat Feb 10 02:57:44 CET 2007


"Brett Cannon" <brett at python.org> wrote:
> On 2/9/07, Josiah Carlson <jcarlson at uci.edu> wrote:
[snip]

> > Now, I tried the first of those lines in Python 2.5 and I was surprised
> > that having two files foo and goo, goo importing foo via the first
> > example above, didn't work.  What is even worse is that over a year ago
> > I was working on an import semantic for relative imports that would have
> > made the above do as I would have expected.
> >
> > This leads me to believe that *something* about relative imports is
> > broken, but being that I mostly skipped the earlier portions of this
> > particular thread, I'm not certain what it is.  I would *guess* that it
> > has to do with the way the current importer comes up with package
> > relative imports, and I believe it could be fixed by switching to a
> > path-relative import.
> >
> > That is, when module goo is performing 'from . import foo', you don't
> > look at goo.__name__ to determine where to look for foo, you look at
> > goo.__file__ .
> 
> But what about modules stored in a sqlite3 database?  How is that
> supposed to work?  What made basing relative imports off of __name__
> so nice is that it allowed the import machinery to figure out what the
> resulting absolute module name was.  That allowed the modules to be
> stored any way that you wanted without making any assumptions about
> how the modules are stored or their location is determined by an
> importer's find_module method.

How is it done now?  Presumably if you have some module imported from a
database (who does this, really?), it gets a name like
dbimported.name1.name2, which an import hook can recognize as being
imported from a database.  Now, is dbimported.name1.name2 really the
content of an __init__.py file (if it was a package), or is it a module
in dbimported.name1?


Right now, we can't distinguish (based on __name__) between the cases of...

    foo/
        __init__.py

and

    foo.py

But we can, trivially, distinguish just by *examining* __file__ (at
least for files from a filesystem). For example:

    >>> import bar
    >>> import baz
    >>> bar.__name__, bar.__file__
    ('bar', 'bar.py')
    >>> baz.__name__, baz.__file__
    ('baz', 'baz\\__init__.py')
    >>>

It's pretty obvious to me which one is a package and which one is a
module.

If we codify the requirement that __file__ must end with '.../__init__.X'
(where X can be py, pyw, pyc, so, dll, pyd, etc.) if the thing we
imported is a package, then the import hooks don't need to use the
__file__ attribute for anything other than discerning between "is this a
package, or is this a module", and can then handle the __name__ mangling
as per import semantics.  The only trick is if someone were to
specifically import an __init__ module (from .package import __init__),
but even then, the results are garbage (you get the module's __init__
method).
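
As a minimal sketch of the __file__ check described above (the helper name
is_package_file is hypothetical, and it assumes __file__ is a plain string
using either path separator):

```python
def is_package_file(file_attr):
    """Return True if a module's __file__ names an __init__.X file."""
    # Accept both separators, since the session above shows a Windows path.
    basename = file_attr.replace("\\", "/").rsplit("/", 1)[-1]
    # Compare only the part before the first dot, so any extension X matches.
    return basename.split(".", 1)[0] == "__init__"


print(is_package_file("bar.py"))            # False
print(is_package_file("baz\\__init__.py"))  # True
```

An import hook would only need this yes/no answer; the __name__ handling
stays as it is today.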

> But it isn't a file path, it's an absolute module name that you are after.

If one were to just do "path" manipulations, afterwards you can
translate that to an absolute name (perhaps based on the path of
__main__). Pretending that you have a path can make the semantic
unambiguous, but I prefer the alternative I just described.
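
For illustration only, the path-to-name translation could look roughly like
this.  The function name and the '/proj' paths are made up, and it assumes
POSIX-style paths with the root being the directory of __main__:

```python
from pathlib import PurePosixPath


def module_name_from_path(module_file, root):
    """Translate a module's file path into a dotted name relative to root."""
    # relative_to raises ValueError if module_file is not under root.
    rel = PurePosixPath(module_file).relative_to(PurePosixPath(root))
    parts = list(rel.with_suffix("").parts)
    # A package is named after its directory, not after its __init__ file.
    if parts and parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


print(module_name_from_path("/proj/pa/bar.py", "/proj"))        # pa.bar
print(module_name_from_path("/proj/baz/__init__.py", "/proj"))  # baz
```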


> > It also naturally leads to a __name__ semantic that Guido had suggested
> > to me when I was talking about relative imports:
> >
> >     goo.__name__ == '__main__'
> >     foo.__name__ == '__main__.foo'
> >     baz.__name__ == '__main__..bar.baz'
> >
> > Which could more or less be used with the current importer; it just
> > needs a special-casing of 'from . import ...' in the __main__ module.
> 
> And I am trying to avoid special-casing for this.

And it's only really an issue because we can't currently discern between
a module and a package, right?  So let us choose some semantic for
determining "is this a module or package?", perhaps set by whatever did
the importing, and skip the special cases for __main__, etc.

We can use the __file__ semantic I described earlier.  Or we can specify
a new attribute on modules: __package__.  If __package__ == __name__,
then the current module is a package.  If __package__ != __name__, then
the current module is not a package.  Regardless, when doing relative
imports, the 'name' we start out with is __package__.

For example, say we have a file called foo.py that we have run from the
command line.  Its __name__ should be '__main__', as per Python history.
However, __package__ will be ''.  When foo.py performs 'from . import
goo', we know precisely what "package" we are in, the same package (and
"path") as the '__main__' module.

Two remaining cases:
1) If goo is a module (goo.py sits next to foo.py, or equivalently in a
database, etc.):
    goo.__package__ == ''
    goo.__name__ == 'goo'
2) If goo is a package:
    goo.__package__ == 'goo'
    goo.__name__ == 'goo'
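
A rough sketch of how the __package__ semantic could drive relative imports.
This only models the proposal in this message, not what any importer does
today; the function names and the 'helper' module are hypothetical:

```python
def is_package(name, package):
    """Under the proposal, a module is a package iff __package__ == __name__."""
    return package == name


def resolve_relative(package, level, target):
    """Resolve 'from <level dots> import target', starting from __package__."""
    parts = package.split(".") if package else []
    up = level - 1  # a single dot means "the current package"
    if up > len(parts):
        raise ImportError("relative import beyond top-level package")
    base = parts[:len(parts) - up]
    return ".".join(base + [target])


# foo.py run as a script: __name__ == '__main__', __package__ == ''
print(is_package("__main__", ""))            # False
print(resolve_relative("", 1, "goo"))        # goo

# Case 2, goo is a package: 'from . import helper' inside goo/__init__.py
print(is_package("goo", "goo"))              # True
print(resolve_relative("goo", 1, "helper"))  # goo.helper
```

Note that __name__ is never consulted when resolving, so '__main__' needs no
special case.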


On the other hand, say foo.py sits in 'bin' and did 'from ..pa import bar',
while bar did 'from ..bin import foo'; we now have an issue.  How do we
determine that the foo that bar imports is the same foo that was run
from the command line?  However, this is a problem regardless of your
'package or module' semantic, once you fix 'from .. import baz' being
run from the '__main__' module.
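
To make the identity problem concrete, here is a small demonstration written
against a current Python 3 importlib (not the 2.5 importer discussed here).
The load_as helper and the throwaway foo.py are made up for the example; it
loads one source file under two names and gets two independent module
objects:

```python
import importlib.util
import os
import tempfile


def load_as(name, path):
    """Load the source file at path as a module called name."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.py")
    with open(path, "w") as f:
        f.write("state = []\n")
    as_main = load_as("__main__", path)  # foo as run from the command line
    as_pkg = load_as("bin.foo", path)    # foo as imported by bar

as_main.state.append(1)
print(as_main.__file__ == as_pkg.__file__)  # True: same source file
print(as_main is as_pkg)                    # False: two module objects
print(as_pkg.state)                         # []: state is not shared
```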


 - Josiah



