[issue11553] Docs for: import, packages, site.py, .pth files
report at bugs.python.org
Tue Mar 15 12:39:51 CET 2011
New submission from Graham Wideman <initcontact at grahamwideman.com>:
The overall scope of this issue is that current Python documentation gives vague, sometimes incorrect, information about the set of Python features involved in modularizing functionality. This issue presents an obstacle to programmers making smooth transitions from a single module, to collections of modules and packages, then on to neatly organized common packages shared between projects.
The problem affects documentation of:
import and from...import statements
The Language Reference is way too complicated for the mainstream case. Exactly what variants of arguments are possible, and what are their effects? What are the interactions with package features, such as whether or not modules have been explicitly imported into package __init__.py?
Typical constituents; range of alternatives for adding more dirs
Multiple serious errors in the file docstring, relating to site-packages directories and .pth files
Incorrectly described in site.py, and then vaguely described in other docs.
Are .pth files processed everywhere on sys.path? Can they be iterative? (No to both).
Details of package structure have evidently changed over Python versions. Current docs are unclear on points such as:
-- is __init__.py needed on subpackage directories?
-- the __all__ variable: Does it act generally to limit visibility of a module or package's attributes, or does it pertain only to the 'from...import *' statement?
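The __all__ question can be checked directly. The following is a minimal sketch, not taken from the docs: it builds a throwaway in-memory module (the names all_demo_mod, public_name and other_name are made up for illustration) and compares 'from...import *' with an explicit import.

```python
import sys
import types

# Build a throwaway module in memory; all names here are hypothetical.
demo = types.ModuleType("all_demo_mod")
demo.public_name = "listed in __all__"
demo.other_name = "not listed in __all__"
demo.__all__ = ["public_name"]
sys.modules["all_demo_mod"] = demo

# 'from ... import *' copies only the names listed in __all__.
star_namespace = {}
exec("from all_demo_mod import *", star_namespace)
star_names = {n for n in star_namespace if not n.startswith("__")}

# Other forms of import still reach the unlisted attribute.
from all_demo_mod import other_name
import all_demo_mod

# Remove the demo module from the import system again.
del sys.modules["all_demo_mod"]

print(star_names)               # only the name listed in __all__
print(other_name)               # reachable via explicit import
print(all_demo_mod.other_name)  # reachable via attribute access
```

In this sketch the star import binds only public_name, while other_name stays reachable through the other import forms, which suggests __all__ is not a general visibility control.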
The description of the import statement is extensive, but dauntingly complicated for the reader trying to understand the mainstream case of simply importing modules or packages that are on sys.path. This is because the algorithm for finding modules tries numerous esoteric strategies before falling back on the plain-old-file-system method.
(Even now that I have a good understanding of the plain-old-file variations of import, I reread this and find it hard to comprehend, and disorganized and incomplete in presenting the available variations of the statement.)
Grammar issue: the grammar shown for the import statement shows:
relative_module ::= "."* module | "."+
... which implies that relative module could have zero leading dots. I believe an actual relative path is required to have at least one dot (PEP 328). Evidently, in this grammar, 'relative_module' really means "relative or absolute path to module or package", so it would be quite helpful to change to:
relative_path ::= "."+ module | "."+
from_path ::= (relative_path | module)
etc. (Really 'module' is not quite right here either since it's used to mean module-or-package.)
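One way to see the absolute/relative distinction the grammar blurs is to ask the parser itself. This is only an illustrative sketch using the standard ast module; the helper name import_level and the sample statements are my own, not part of any doc.

```python
import ast


def import_level(statement):
    """Return (level, module) for a single 'from ... import' statement.

    'level' is the number of leading dots the parser recorded:
    0 means an absolute path, 1 or more means a relative path.
    """
    node = ast.parse(statement).body[0]
    if not isinstance(node, ast.ImportFrom):
        raise ValueError("expected a 'from ... import' statement")
    return node.level, node.module


for stmt in ("from pkg.mod import name",
             "from .mod import name",
             "from .. import name"):
    print(stmt, "->", import_level(stmt))
```

The zero-dot form comes back with level 0, i.e. it is an absolute path that happens to be parsed by the 'relative_module' production, which is the naming confusion described above.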
Module site.py implements the site-package related features. The docstring has multiple problems with consequences in other docs.
1. Does not mention user-specific site-package directories (implemented by addusersitepackages() )
2. Seriously misleading discussion of .pth files. In the docstring the example shows using pth files, called "package configuration files" in their comments, to point to actual package directories bar and foo located within the site-packages directory. This is an absolutely incorrect use of pth files: If foo and bar are packages in .../site-packages/, they do not need to be pointed to, they are already on sys.path.
If the package dirs ARE pointed to by foo.pth and bar.pth, the modules inside them will be exposed directly to sys.path, possibly precipitating name collisions. Further, programmers following this example will create packages in which import statements will appear to magically perform relative imports without leading dots, leading to confusion over how the import statement is supposed to work.
It may be that this discussion is held over from a time when "package" perhaps meant "Just a Bunch of Files in a Directory"?
3. The docstring (or other docs) should make clear that .pth files are ONLY processed within site-package directories (ie: only by site.py).
4. Bug: Minor: In addsitepackages(), the library directory for Windows (the else clause) is shown as lower-case 'lib' instead of 'Lib'. This has some possibility of causing problems when running from a case-sensitive server. In any case, if read as documentation it is misleading.
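The consequence described in point 2 can be reproduced in a temporary directory. This is a rough sketch under stated assumptions: the temporary directory stands in for site-packages, site.addsitedir() stands in for what site.py does at startup, and the names grdemo_pkg and grdemo_helper are invented for the demonstration.

```python
import importlib
import os
import site
import sys
import tempfile


def demo_pth_pointing_at_package():
    """Show what a .pth entry naming a package directory does to imports."""
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as site_dir:
        # A real package (hypothetical name) inside the stand-in site dir.
        pkg_dir = os.path.join(site_dir, "grdemo_pkg")
        os.mkdir(pkg_dir)
        files = (("__init__.py", ""), ("grdemo_helper.py", "VALUE = 1\n"))
        for filename, body in files:
            with open(os.path.join(pkg_dir, filename), "w") as f:
                f.write(body)
        # A .pth file that points at the package directory itself,
        # as in the site.py docstring example.
        with open(os.path.join(site_dir, "grdemo_pkg.pth"), "w") as f:
            f.write("grdemo_pkg\n")
        try:
            site.addsitedir(site_dir)
            importlib.invalidate_caches()
            as_submodule = importlib.import_module("grdemo_pkg.grdemo_helper")
            as_top_level = importlib.import_module("grdemo_helper")
            return (as_submodule.__name__,
                    as_top_level.__name__,
                    as_submodule is as_top_level)
        finally:
            # Undo the changes to the import system.
            sys.path[:] = saved_path
            for name in ("grdemo_pkg", "grdemo_pkg.grdemo_helper",
                         "grdemo_helper"):
                sys.modules.pop(name, None)


print(demo_pth_pointing_at_package())
```

With the .pth entry in place, the helper is importable both as grdemo_pkg.grdemo_helper and as a bare top-level grdemo_helper, and the two are separate module objects. That is the name-collision and "magic relative import" hazard described above.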
6. Modules: http://docs.python.org/py3k/tutorial/modules.html
1. Discussion (6.1.2. The Module Search Path) is good as far as it goes, but it doesn't mention the site-package directories.
2. Section 6.4. Packages: Discussion of __init__.py does describe the purpose of these files. However, the discussion in relation to subpackages should mention that for subdirectories to be accessible they must in fact be made into subpackages. I.e.: there is no form of import that can reach into a subdir of a package *unless* it's flagged as a subpackage using __init__.py. I have read elsewhere that there were some versions of Python where __init__.py was not needed on subdirs within a package.
3. Section 6.4. Packages: The discussion of __all__ should note that it works *only* in conjunction with 'from...import *', and is not a general mechanism for limiting visibility of attributes. Attributes not in the __all__ list are still accessible using other forms of import.
4. Section 6.4. Packages: "Note that when using from package import item" and following. Draws a contrast between:
from package import item vs
However, this muddles the roles of the arguments to import, and notably uses 'item' in two different ways. Instead the discussion can be presented as a comparison of:
from PackageOrModule import ModuleOrAttribute vs
... where it can be pointed out that the PackageOrModule 'dotted path' argument follows the *same* rules in both cases (except for relative paths). The *salient* contrast is that only the 'from' form has a *second* argument which can be an attribute.
5. Footnote: (somewhat unrelated, but...) says "the execution of a module-level function enters the function name in the module’s global symbol table." This is surely incorrect -- it is the execution of the function's *def* that enters the function name in the symbol table.
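The footnote point (5) is easy to check. A minimal sketch, with a made-up function name greet: the source is executed into a fresh namespace, and the namespace is inspected before the function is ever called.

```python
# Execute a module-like source string into a fresh global namespace.
namespace = {}
source = "def greet():\n    return 'hello'\n"
exec(source, namespace)

# The name is already bound: executing the 'def' statement did it.
bound_by_def = "greet" in namespace

# Calling the function adds nothing further to the namespace.
names_before_call = set(namespace)
result = namespace["greet"]()
names_after_call = set(namespace)

print(bound_by_def, result, names_before_call == names_after_call)
```

The name appears as soon as the def statement runs, and calling the function leaves the symbol table unchanged.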
Standard Library Reference
1) 27.13. site — Site-specific configuration hook http://docs.python.org/py3k/library/site.html This is documentation for site.py, and is a page that might well come up in a search for '.pth'.
1a) This page may simply be importing the docstring from site.py? In any case it repeats the omissions and errors noted above for site.py.
2) 27.1. sys — System-specific parameters and functions http://docs.python.org/py3k/library/sys.html Documentation for sys.path. OK as far as it goes, but:
2a) Could helpfully point to a discussion of the typical items to be found in sys.path under normal circumstances
2b) It does point to the site.py module documentation as the authoritative info on using .pth files (which is seriously flawed as noted above).
3) 29.3. pkgutil — Package extension utility. http://docs.python.org/py3k/library/pkgutil.html
3a) The info for pkgutil.extend_path() describes how .pkg files are similar to .pth files, and their entries should point to package directories. As I noted above, so far as I can see, package directories should be within a directory on sys.path, but should not themselves be included in the path, otherwise it breaks their capability to work properly as packages.
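For item 2a, the kind of overview being asked for could be as simple as the following sketch, which labels each sys.path entry using site.getsitepackages() and site.getusersitepackages(). The helper name describe_sys_path and the labels are my own; the output depends entirely on the installation, and some virtual-environment tools have shipped a site module without getsitepackages(), hence the getattr guard.

```python
import site
import sys


def describe_sys_path():
    """Return (kind, entry) pairs classifying each sys.path entry."""
    site_dirs = set()
    # Guarded: some environments provide a site module without this call.
    getsitepackages = getattr(site, "getsitepackages", None)
    if getsitepackages is not None:
        site_dirs.update(getsitepackages())
    user_site = site.getusersitepackages()

    rows = []
    for entry in sys.path:
        if entry in site_dirs:
            kind = "site-packages"
        elif entry == user_site:
            kind = "user site-packages"
        else:
            kind = "other"
        rows.append((kind, entry))
    return rows


for kind, entry in describe_sys_path():
    print("%-20s %r" % (kind, entry))
```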
'Installing Python Modules' document:
http://docs.python.org/py3k/install/index.html This may well be consulted for info on how to organize source files, though it is basically the doc for Distutils.
1. Main problem is that it seems quite out-of-date; "Windows has no concept of a user’s home directory, " and so on.
2. 'How installation works' > table. For Windows suggests 'prefix' (default: C:\Python) as an installation directory. This is indeed one of the possible 'site-package' directories, but surely it is deprecated in favor of C:\Python\Lib\site-packages, which this section does not mention.
3. Does not mention user-specific site-package directories.
4. 'Modifying Python's Search Path' > "The most convenient way is to add a path configuration file to a directory that’s already on Python’s path". This is incorrect. (a) .pth files are only processed in site-package directories. (b) Clarifying an additional point of confusion -- as a consequence of (a) .pth files cannot be chained.
5. Points to docs for site.py... with its flaws noted above.
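The "cannot be chained" part of point 4 can be illustrated with site.addsitedir(), which is the function site.py uses to process a site directory. This is a sketch under assumptions: the temporary directories and the file names first.pth and second.pth are invented, and calling addsitedir() by hand only approximates what happens at interpreter startup.

```python
import os
import site
import sys
import tempfile


def demo_pth_not_chained():
    """Return (dir_a_on_path, dir_b_on_path) after processing site_dir."""
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as root:
        root = os.path.abspath(root)
        site_dir = os.path.join(root, "site_dir")
        dir_a = os.path.join(root, "dir_a")
        dir_b = os.path.join(root, "dir_b")
        for directory in (site_dir, dir_a, dir_b):
            os.mkdir(directory)
        # site_dir/first.pth points at dir_a ...
        with open(os.path.join(site_dir, "first.pth"), "w") as f:
            f.write(dir_a + "\n")
        # ... and dir_a/second.pth points at dir_b.
        with open(os.path.join(dir_a, "second.pth"), "w") as f:
            f.write(dir_b + "\n")
        try:
            site.addsitedir(site_dir)
            return dir_a in sys.path, dir_b in sys.path
        finally:
            # Restore sys.path so the demo leaves no trace.
            sys.path[:] = saved_path


print(demo_pth_not_chained())
```

Here dir_a ends up on sys.path because first.pth sits in the directory that was processed as a site directory, while second.pth is never read: dir_a was merely added to the path, not treated as a site directory itself.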
PEP 302 New Import Hooks
Given the vagueness elsewhere in the docs, one might go hunting for ground truth in the PEPs. One must allow for PEPs having been superseded, but nonetheless, outdated info that is now wrong is an additional stumbling block for learners.
1. Section 'Specification part 1: The importer Protocol'. Discussion says that in an import statement, a path (with no leading '.') is first treated as relative. This is now incorrect (as spelled out in PEP 328.) It would be helpful to insert a note in PEP 302 pointing out that the later revision invalidates this passage.
assignee: docs at python
nosy: docs at python, gwideman
title: Docs for: import, packages, site.py, .pth files
versions: Python 3.1, Python 3.2, Python 3.3
Python tracker <report at bugs.python.org>