
After exploring this a bit further on comp.lang.python, I was able to organize these ideas better. The more I thought about it, the more '+'s I found, and about the only '-'s I can think of is the work required to actually make a patch to do it.
It's also good to keep in mind that since most people still rely on the old relative import behavior, most people have not run into some of the issues I mention here. But they will at some point.
I did mean to keep this short, but clarity won out. (At least it's clear to me, but that's an entirely subjective opinion on my part.)
Maybe someone will adopt this and make a real PEP out of it. :-)
Cheers, Ron
PROPOSAL ========
Make pythons concept of a package, (currently an informal type), be stronger than that of the underlying file system search path and directory structure.
Where the following hold true in python 3.X, or when absolute_import behavior is imported from __future__ in python 2.X:
(1) Python first determines if a module or package is part of a package and then runs that module or package in the context of the package they belong to. (see items below)
(2) import this_package.module import this_package.sub_package
If this_package is the same name as the current package, then do not look on sys.path. Use the location of this_package.
(3) import other_package.module import other_package.sub_package
If other_package is a different name from the current package (this_package), then do not look in this_package and exclude searches in sys.path locations that are inside this_package including the current directory.
(4) import module import package
Module and package are not in a package, so don't look in any packages, even this one or sys.path locations inside of packages.
(5) For behaviors other than these, like when you do actually want to run a module belonging to a package in a different context, a mechanism such as a command line switch, or a settable import attribute should be used.
MOTIVATION ==========
(A) Added reliability.
There will be much less chance of errors (silent or otherwise) due to path/import conflicts which are sometimes difficult to diagnose.
There may also be some added security benefits as well because it would much harder for someone to create a same named module or package and insert it by putting it on the path. Or by altering sys.path to do the same. [*]
[* - If this can happen there are probably more serious security issues, but not everyone has the most secure setup, so this point is still probably a good point. General reliable execution of modules is the first concern, this may be a side benefit of that.]
(B) Reduce the need for special checks and editing sys.path.
Currently some authors have edit sys.path or do special if os.path.exists() checks to ensure proper operations in some situations such as running tests. These suggestions would reduce the need for such special testing and modifications.
(D) Easier editing and testing.
While you are editing modules in a package, you could then run the module directly (as you can with old style relative imports) and still get the correct package-relative behavior instead of something else. (like an exception or wrong output). Many editors support running the file being edited, including idle. It's also can be difficult to write scripts for the editors to determine the correct context to run a module in.
(E) Consistency with from ... import ... relative imports.
A relative import also needs to find it's home package(s). These suggestions are consistent with relative import needs and would also enable relative imports to work if a module is run directly from and editor (like idle) while editing it. [*]
[* - Consistency isn't a major issue, but it's nice to have.]
(F) It would make things much easier for me. ;-)
(Insert "Me Too's" here.)
DISCUSSION ==========
(I) Python has a certain minimalist quality where it tries to do a lot with a minimum amount of resources. (Which I generally love.) But in the case of packages, that might not be the best thing. It is not difficult for python to detect if a module is located in a package.
With the move to explicit absolute/relative imports, it would make since if Python also were a little smarter in this area. Packages are becoming used more often and so it may also be useful to formalize the concept of a package in a stronger way.
(II) Many of the problems associated with imports are a side effect of using the OS's directory structure to represent a python "package" structure. This creates some external dependence on the operating system that can effect how python programs run. Some of these issues include:
- Importing the wrong package or module.
- Getting an error due to a package or module not being found.
- Getting an error due to a package not being loaded or initialized first.
- Having to run modules or packages within a very specific OS file context.
- Needing a package location to be in the systems search path.
By making the concept of a package have priority over the OS's search path and directory structure, the dependence on the OS's environment is lessoned and it would insure a module runs in the correct context or give meaningful exceptions in more cases than presently.
(III) If a package was represented as a combined single file. Then the working directory would always be the package directory. The suggestions presented here would have that effect and also reduce or eliminate most if not all of these problem situations.
(IV) The suggested changes would change the precise meaning of an absolute import.
Given the following example of an un-dotted import:
>>> import foo
The current meaning is:
"A module or package that is located in sys.path or the current directory".
But maybe a narrower interpretation of "absolute import" would be better:
"A module or package found in a specific package."
I believe that this latter definition is what most people will think of while editing their programs. When dotted imports are used, the left most part of the name is always a top level package or module in this case.
(V) Requirements to be on the search path.
It is quite reasonable to have python modules and packages not in the search path. Conversely it is not reasonable to require all python modules or packages to be in locations listed in sys.path.
While this isn't a true requirement, it is often put forth as a solution to some of the problems that occur with respect to imports.
(VI) Clearer errors messages.
In cases where a wrong module or package is imported you often get attribute exceptions further in the code. These changes would move that up to the import statement because the wrong module would not be imported.
(VII) Setting a __package__ attribute.
Would it be a good idea to have a simple way for modules to determine parent packages, and their absolute locations? Python could set these when it starts or imports a module. That may make it easier to write alternate importers that are package aware.
PROBLEMS AND ISSUES:
- Someone needs to make it happen.
I really can't think of any more than that. But I'm sure there are some as most things like this are usually a trade off of something.