[Python-ideas] Packages and Import

Wed Feb 7 20:21:08 CET 2007

On 2/4/07, Ron Adam <rrr at ronadam.com> wrote:
>
> After exploring this a bit further on comp.lang.python, I was able to organize
> these ideas better.  The more I thought about it, the more '+'s I found, and
> about the only '-' I can think of is the work required to actually make a patch
> to do it.
>
> It's also good to keep in mind that since most people still rely on the old
> relative import behavior, most people have not run into some of the issues I
> mention here.  But they will at some point.
>
> I did mean to keep this short, but clarity won out. (At least it's clear to me,
> but that's an entirely subjective opinion on my part.)
>
> Maybe someone will adopt this and make a real PEP out of it.  :-)
>
> Cheers,
>    Ron
>
>
>
> PROPOSAL
> ========
>
> Make Python's concept of a package (currently an informal type) stronger
> than that of the underlying file system search path and directory structure.
>

So you mean make packages more of an official thing than just having a
__path__ attribute on a module, right?

>
> Where the following hold true in python 3.X, or when absolute_import behavior is
> imported from __future__ in python 2.X:
>
>
> (1) Python first determines if a module or package is part of a package and then
> runs that module or package in the context of the package it belongs to. (See
> items below.)
>

I don't quite follow this statement.  What do you mean by "runs" here?
Do you mean executing a module with runpy or something similar, where
__name__ is set to '__main__'?

>
> (2)  import this_package.module
>       import this_package.sub_package
>
> If this_package is the same name as the current package, then do not look on
> sys.path. Use the location of this_package.
>

Already does this (at least in my pure Python implementation).
Searches are done on __path__ when you are within a package.
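
A minimal sketch of that behavior, assuming a Python 3 interpreter. The
package name demo_pkg_path and its helper module are made up for
illustration. It builds a throwaway package and shows that the package
object carries a __path__ pointing at its own directory, which is where
the submodule is then found:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (names are hypothetical).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg_path")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "helper.py"), "w") as f:
    f.write("VALUE = 42\n")

# Make the parent directory importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

pkg = importlib.import_module("demo_pkg_path")
# The submodule is located through pkg.__path__, not by scanning sys.path.
helper = importlib.import_module("demo_pkg_path.helper")

package_search_path = list(pkg.__path__)
print(package_search_path)
print(helper.__file__)
```

The printed __path__ entry is the package directory itself, and the
submodule's file lives inside it.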

>
> (3)  import other_package.module
>       import other_package.sub_package
>
> If other_package is a different name from the current package (this_package),
> then do not look in this_package and exclude searches in sys.path locations that
> are inside this_package including the current directory.
>

This change would require importers to do more.  Since the absolute
import semantics automatically make this kind of import start at the
top level (i.e., sys.path), the importer for each entry on sys.path
would need to be told which package the import is coming from, check
whether its entry lies inside that package, and skip the entry if it does.

That seems like a lot of work that I know I don't want to have to
implement for every importer I ever write.
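
To give a rough idea of the extra bookkeeping, here is a sketch of the
filtering step alone. Nothing like this exists in the import machinery;
is_inside() and entries_to_search() are hypothetical helpers, and the
paths are invented. Every importer would need the equivalent of this
check, plus a way to learn the current package's location:

```python
import os


def is_inside(path_entry, package_dir):
    """Return True if path_entry is package_dir or a directory below it."""
    # An empty sys.path entry means the current working directory.
    entry = os.path.realpath(path_entry or os.getcwd())
    package = os.path.realpath(package_dir)
    try:
        return os.path.commonpath([entry, package]) == package
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def entries_to_search(path_entries, current_package_dir):
    """Drop the entries that item (3) says must be skipped."""
    return [e for e in path_entries if not is_inside(e, current_package_dir)]


# Invented layout: this_package lives in /opt/demo.
package_dir = os.path.join(os.sep, "opt", "demo", "this_package")
path_entries = [
    os.path.join(os.sep, "opt", "demo"),
    package_dir,
    os.path.join(package_dir, "sub_package"),
    os.path.join(os.sep, "opt", "other"),
]
kept = entries_to_search(path_entries, package_dir)
print(kept)
```

Only the two entries outside this_package survive the filter.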

>
> (4)  import module
>       import package
>
> Module and package are not in a package, so don't look in any packages, even
> this one or sys.path locations inside of packages.
>

This is already done.  With absolute imports, these statements do a
shallow check of each sys.path entry for the module or package name.
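
A small sketch of what "shallow" means here, assuming a Python 3
interpreter and made-up names (demo_shallow_pkg, demo_nested_mod). A
module that lives only inside a package directory is not found by its
bare name, because the search does not descend into packages:

```python
import importlib
import importlib.util
import os
import sys
import tempfile

# Throwaway layout: root/demo_shallow_pkg/demo_nested_mod.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_shallow_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "demo_nested_mod.py"), "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Bare name: only the top level of each sys.path entry is checked.
top_level_spec = importlib.util.find_spec("demo_nested_mod")
# Qualified name: the package is found first, then its __path__ is used.
qualified_spec = importlib.util.find_spec("demo_shallow_pkg.demo_nested_mod")

print(top_level_spec)
print(qualified_spec.name)
```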

>
> (5) For behaviors other than these, like when you do actually want to run a
> module belonging to a package in a different context, a mechanism such as a
> command line switch, or a settable import attribute should be used.
>
>
> MOTIVATION
> ==========
>
> (A) Added reliability.
>
> There will be much less chance of errors (silent or otherwise) due to
> path/import conflicts which are sometimes difficult to diagnose.
>

Probably, but I don't know if the implementation complexity warrants
worrying about this.  But then again, how many people have actually
needed to implement the import machinery?  =)  I could be labeled as
jaded.

> There may also be some added security benefits because it would be much
> harder for someone to create a same-named module or package and insert it by
> putting it on the path, or by altering sys.path to do the same. [*]
>
> [* - If this can happen there are probably more serious security issues, but not
> everyone has the most secure setup, so this point is still probably a good
> point. General reliable execution of modules is the first concern, this may be a
> side benefit of that.]
>
>
> (B) Reduce the need for special checks and editing sys.path.
>
> Currently some authors have to edit sys.path or do special if os.path.exists()
> checks to ensure proper operation in some situations, such as running tests.
> These suggestions would reduce the need for such special testing and modifications.
>

This might minimize some sys.path hacks in some instances, but it also
complicates imports overall in terms of implementation and semantics.

Where is point C?

>
> (D) Easier editing and testing.
>
> While you are editing modules in a package, you could then run the module
> directly (as you can with old style relative imports) and still get the correct
> package-relative behavior instead of something else (like an exception or wrong
> output). Many editors support running the file being edited, including IDLE.
> It can also be difficult to write scripts for the editors to determine the
> correct context to run a module in.
>

How is this directly solved, though?  You mentioned "running" a module
as if it is in a package, but there is no direct explanation of how
you would want to change the import machinery to pull this off.
Basically you either need a way for a module named __main__ to get
its canonical name for import purposes, or you need to leave __name__
alone and set some other global or something to flag that it is the
__main__ module.
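
As an illustration of the second option, here is a sketch assuming a
Python 3 interpreter, where runpy.run_module() leaves __name__ as
'__main__' and records the package in a separate global, __package__.
The package demo_main_pkg and its module tool are invented for the
example, and this is only one way the flag could be carried:

```python
import importlib
import os
import runpy
import sys
import tempfile

# Throwaway layout: root/demo_main_pkg/tool.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_main_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "tool.py"), "w") as f:
    # Record what the module sees while it is being run.
    f.write("NAME = __name__\nPACKAGE = __package__\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Run the module as the main module, but locate it by its package name.
result = runpy.run_module("demo_main_pkg.tool", run_name="__main__")

print(result["NAME"])           # the module still sees '__main__'
print(result["PACKAGE"])        # the package is carried in a separate global
print(result["__spec__"].name)  # the canonical dotted name
```

The module keeps the name '__main__', while the package context needed
for imports travels alongside it rather than replacing __name__.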

Regardless, I am not seeing how you are proposing to go about solving
this problem.

I understand the desire to fix this __main__ issue with absolute
imports and I totally support it, but I just need a more concrete
solution in front of me (assuming I am not totally blind and it is
actually in this doc).

-Brett