Trying to understand 'import' a bit better
frank at chagford.com
Sun Mar 4 09:15:57 CET 2012
I have been using 'import' for ages without particularly thinking about it -
it just works.
Now I am having to think about it a bit harder, and I realise it is a bit
more complicated than I had thought - not *that* complicated, but there are
some subtleties.
I don't know the correct terminology, but I want to distinguish between the
following two scenarios -
1. A python 'program', that is self-contained, has some kind of startup,
invokes certain functionality, and then closes.
2. A python 'library', that exposes functionality to other python programs,
but relies on the other program to invoke its functionality.
The first scenario has the following characteristics -
- it can consist of a single script or a number of modules
- if the latter, the modules can all be in the same directory, or in one
or more sub-directories
- if they are in sub-directories, the sub-directory must contain
__init__.py, and is referred to as a sub-package
- the startup script will normally be in the top directory, and will be
executed directly by the user
When python executes a script, it automatically places the directory
containing the script into 'sys.path'. Therefore the script can import a
top-level module using 'import <module>', and a sub-package module using
'import <sub-package>.<module>'.
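To make scenario 1 concrete, here is a rough sketch that builds a throwaway
project in a temporary directory and runs its startup script. The names
main.py, helpers_demo and subpkg_demo are made up for the illustration, and I
am assuming a default interpreter setup (no option that suppresses the script
directory in sys.path).

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical layout, created on the fly:
#   <top>/main.py                  startup script
#   <top>/helpers_demo.py          top-level module
#   <top>/subpkg_demo/__init__.py  marks the sub-package
#   <top>/subpkg_demo/tools.py     sub-package module
MAIN_SOURCE = (
    "import os, sys\n"
    "import helpers_demo\n"
    "import subpkg_demo.tools\n"
    "script_dir = os.path.dirname(os.path.realpath(__file__))\n"
    "print(os.path.realpath(sys.path[0]) == script_dir)\n"
    "print(helpers_demo.NAME, subpkg_demo.tools.NAME)\n"
)


def run_demo():
    """Build the project, run main.py as a script, return its output lines."""
    top = tempfile.mkdtemp()
    os.mkdir(os.path.join(top, "subpkg_demo"))
    sources = {
        "main.py": MAIN_SOURCE,
        "helpers_demo.py": "NAME = 'helpers_demo'\n",
        os.path.join("subpkg_demo", "__init__.py"): "",
        os.path.join("subpkg_demo", "tools.py"): "NAME = 'subpkg_demo.tools'\n",
    }
    for relative_path, source in sources.items():
        with open(os.path.join(top, relative_path), "w") as handle:
            handle.write(source)
    result = subprocess.run(
        [sys.executable, os.path.join(top, "main.py")],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The first line printed reports whether sys.path[0] is the script's own
directory; the second shows that both imports resolved without any manual
changes to sys.path.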
The second scenario has similar characteristics, except it will not have a
startup script. In order for a python program to make use of the library, it
has to import it. In order for python to find it, the directory containing
it has to be in sys.path. In order for python to recognise the directory as
a valid container, it has to contain __init__.py, and is referred to as a
package.
To access a module of the package, the python program must use 'import
<package>.<module>' (or 'from <package> import <module>'), and to access a
sub-package module it must use 'import <package>.<sub-package>.<module>'.
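A small sketch of scenario 2, again using a temporary directory. The package
name demo_pkg and its modules core and extras.fmt are invented for the
example; the only thing placed in sys.path is the directory that contains the
package.

```python
import importlib
import os
import sys
import tempfile


def make_library():
    """Create <container>/demo_pkg with one module and one sub-package."""
    container = tempfile.mkdtemp()
    pkg = os.path.join(container, "demo_pkg")
    sub = os.path.join(pkg, "extras")
    os.makedirs(sub)
    sources = {
        os.path.join(pkg, "__init__.py"): "",
        os.path.join(pkg, "core.py"): "def greet():\n    return 'hello from core'\n",
        os.path.join(sub, "__init__.py"): "",
        os.path.join(sub, "fmt.py"): "def shout(text):\n    return text.upper()\n",
    }
    for path, source in sources.items():
        with open(path, "w") as handle:
            handle.write(source)
    return container


# Only the directory *containing* the package goes into sys.path.
sys.path.insert(0, make_library())
importlib.invalidate_caches()

import demo_pkg.core              # import <package>.<module>
from demo_pkg import core         # from <package> import <module>
import demo_pkg.extras.fmt        # import <package>.<sub-package>.<module>

print(demo_pkg.core.greet())            # hello from core
print(core is demo_pkg.core)            # True
print(demo_pkg.extras.fmt.shout("ok"))  # OK
```

Both import spellings bind the same module object, which is why the second
line prints True.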
So far so uncontroversial (I hope).
The subtlety arises when the package wants to access its own modules.
Instead of using 'import <module>' it must use 'import <package>.<module>'.
This is because the directory containing the package is in sys.path, but the
package itself is not. It is possible to insert the package directory name
into sys.path as well, but as was pointed out recently, this is dangerous,
because you can end up with the same module imported twice under different
names, with potentially disastrous consequences.
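The double-import problem can be reproduced with a sketch like the following.
The names dup_pkg and shared_state are hypothetical; the point is only that
putting both the containing directory and the package directory into sys.path
yields two separate module objects for one file.

```python
import importlib
import os
import sys
import tempfile


def import_twice():
    """Import one file under two names and return both module objects."""
    container = tempfile.mkdtemp()
    pkg = os.path.join(container, "dup_pkg")
    os.mkdir(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as handle:
        handle.write("")
    with open(os.path.join(pkg, "shared_state.py"), "w") as handle:
        handle.write("registry = []\n")

    sys.path.insert(0, container)  # normal: directory containing the package
    sys.path.insert(0, pkg)        # risky: the package directory itself
    importlib.invalidate_caches()

    # Equivalent to 'import dup_pkg.shared_state' and 'import shared_state'.
    by_package = importlib.import_module("dup_pkg.shared_state")
    by_bare_name = importlib.import_module("shared_state")
    return by_package, by_bare_name


by_package, by_bare_name = import_twice()
by_package.registry.append("registered")

print(os.path.samefile(by_package.__file__, by_bare_name.__file__))  # True
print(by_package is by_bare_name)                                    # False
print(by_bare_name.registry)                                         # []
```

The same source file has been executed twice, so state stored through one
name is invisible through the other.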
Therefore, as I see it, if you are developing a project using scenario 1
above, and then want to change it to scenario 2, you have to go through the
entire project and change all import references by prepending the package
name.
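As a sketch of what that change looks like, assuming Python 3 (where a bare
'import <module>' inside a package is no longer resolved relative to the
package): the package app_pkg and its modules below are invented, with
old_style.py keeping the scenario 1 import and new_style.py using the
prefixed form.

```python
import importlib
import os
import sys
import tempfile


def build_package():
    """Create <container>/app_pkg containing both import styles."""
    container = tempfile.mkdtemp()
    pkg = os.path.join(container, "app_pkg")
    os.mkdir(pkg)
    sources = {
        "__init__.py": "",
        "demo_settings.py": "VALUE = 42\n",
        # Scenario 1 style: only works if the package directory is in sys.path.
        "old_style.py": "import demo_settings\n",
        # Scenario 2 style: package name prepended.
        "new_style.py": (
            "import app_pkg.demo_settings\n"
            "VALUE = app_pkg.demo_settings.VALUE\n"
        ),
    }
    for name, source in sources.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(source)
    return container


def try_import(name):
    """Return True if the named module imports cleanly, False otherwise."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Only the containing directory is in sys.path, as in scenario 2.
sys.path.insert(0, build_package())
importlib.invalidate_caches()

print(try_import("app_pkg.new_style"))  # True
print(try_import("app_pkg.old_style"))  # False
```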
Have I got this right?