[Python-3000] Import system questions to be considered for Py3k
Nick Coghlan
ncoghlan at gmail.com
Sun Jul 16 06:43:35 CEST 2006
Taking the "import system" to be the overall interaction between the Python
module namespace and the file system of the underlying computer, I thought I'd
start compiling a list of the questions we'll want to consider for Py3k. The
answers to some of them may be "the status quo is fine" but we should still
ask the questions.
I'll eventually capture the discussion in a Py3k PEP (although I believe many
of the questions could actually be addressed for 2.6).
The list I've got so far (including some thoughts about possible solutions):
Change to hybrid implementation
-------------------------------
This idea would try to reduce the amount of code in import.c, pushing more of
the logic into Python code. An advantage of this is that much of the PEP 302
structure for the standard import mechanisms already exists in pkgutil (since
PJE consolidated the various emulations that had been added to the standard
library). Additionally, import logic written in Python would automatically
benefit from the full Unicode filename support of the builtin open() function.
The various string manipulation operations involved would also be
significantly easier to handle.
There would be some bootstrapping issues, but I think it would be better to
try to solve them, rather than continuing to maintain the partial file system
access API reimplementation that import.c currently uses (that, for example,
doesn't provide full Unicode filename support on Windows).
Even if most of the logic stays in C code, it would be good to find a way to
use the full filesystem API, rather than the current import-only subset.
Use smarter data structures
---------------------------
Currently, the individual handlers to load a fully identified module are
exposed to Python code in a way that reflects the C-style data structures used
in the current implementation.
Simply switching to more powerful data structures for the file type handlers
(i.e. use a PyTuple for filedescr values, a PyList for _PyImport_FileTab, and
a PyDict instead of a switch statement to go from filedescr values to module
loading/initialisation functions) and manipulating them all as normal Python
objects could make the code in import.c much easier to follow.
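As a rough illustration of the idea (written in Python rather than C, and not a description of what import.c actually does), the table and the dispatch could look like the sketch below. FileDescr, FILE_TAB, LOADERS and the two loader functions are hypothetical names; only the (suffix, mode, type) shape is taken from the existing filedescr structure, and the mode strings are illustrative.

```python
from collections import namedtuple

# Hypothetical stand-in for a filedescr value: (suffix, mode, type).
FileDescr = namedtuple("FileDescr", "suffix mode kind")

# Hypothetical stand-in for _PyImport_FileTab: an ordinary, mutable list.
FILE_TAB = [
    FileDescr(".py", "r", "PY_SOURCE"),
    FileDescr(".pyc", "rb", "PY_COMPILED"),
]


def load_source(name, path):
    # Placeholder loader: a real one would compile and execute the file.
    return "source:%s" % name


def load_compiled(name, path):
    # Placeholder loader: a real one would unmarshal and execute the bytecode.
    return "compiled:%s" % name


# A dict lookup takes the place of the C switch statement.
LOADERS = {"PY_SOURCE": load_source, "PY_COMPILED": load_compiled}


def find_descr(filename):
    """Return the first table entry whose suffix matches, or None."""
    for descr in FILE_TAB:
        if filename.endswith(descr.suffix):
            return descr
    return None


def load(name, filename):
    """Dispatch to the loader registered for the file's type."""
    descr = find_descr(filename)
    if descr is None:
        raise ImportError("no handler for %r" % filename)
    return LOADERS[descr.kind](name, filename)
```

Because the table and the dispatch mapping are plain objects, adding a file type is an append plus a dict assignment instead of a change to C source.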
Extensible file type handling
-----------------------------
If the file type handlers are stored in normal Python data structures as
described above, it becomes feasible to make the import system extensible to
different file types as well as to different file locations.
This could be handled on a per-package basis, e.g. via a __file_types__
special attribute in packages.
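One way this might look, purely as a sketch: the name __file_types__ comes from the suggestion above, but the suffix-to-handler mapping format, DEFAULT_FILE_TYPES and file_types_for() are assumptions made up for illustration.

```python
import types

# Hypothetical default table: file suffix -> handler name.
DEFAULT_FILE_TYPES = {".py": "source", ".pyc": "bytecode"}


def file_types_for(package):
    """Return the suffix -> handler mapping in effect inside `package`.

    Entries in the package's (hypothetical) __file_types__ attribute are
    layered over the defaults; packages without it get the defaults only.
    """
    merged = dict(DEFAULT_FILE_TYPES)
    merged.update(getattr(package, "__file_types__", {}))
    return merged


# A fake package object standing in for a real imported package.
pkg = types.ModuleType("templates")
pkg.__path__ = []
pkg.__file_types__ = {".tmpl": "template"}
```

Here file_types_for(pkg) includes the ".tmpl" entry, while a package that does not define __file_types__ sees only the defaults.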
Locating support files
----------------------
Currently, locating support files is difficult because __loader__ isn't
defined for standard modules, and __file__ may not be defined properly if the
module isn't being executed via load_module(). This needs to be changed so
that there is an obvious way to find support files that live in the same
directory as the current module.
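For reference, the usual workaround today is to derive the directory from __file__, which only works when __file__ is set and points at a real file system path. A minimal sketch (support_file is a hypothetical helper name, not an existing API):

```python
import os


def support_file(module_file, filename):
    """Return the path of `filename` next to the module at `module_file`.

    Assumes `module_file` is a real file system path, which is exactly
    the assumption that fails for modules loaded from zip files or by
    other PEP 302 importers.
    """
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, filename)
```

A module would call it as support_file(__file__, "data.txt").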
Determining the value of __file__
---------------------------------
In PEP 302, the logic to determine the value of __file__ is internal to the
load_module() method. Should this be exposed so that, e.g., runpy.run_module
can use it?
Handling sys.argv[0]
--------------------
Should new attributes be added to sys to separate out argv[0] from the command
line arguments? For example, sys.mainfile (== sys.argv[0]) and sys.args (==
sys.argv[1:]).
This has compatibility implications for code that _sets_ sys.argv, and expects
other code to see the changes.
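A small sketch of the split and of the compatibility concern. split_argv is a hypothetical helper; sys.mainfile and sys.args do not exist, so plain local names are used instead.

```python
def split_argv(argv):
    """Return (mainfile, args) as the proposed attributes would hold them."""
    if not argv:
        return "", []
    # args is a new list, so it is detached from the argv it came from.
    return argv[0], list(argv[1:])


argv = ["script.py", "-v", "input.txt"]
mainfile, args = split_argv(argv)

# Code that rewrites argv in place (as some code does with sys.argv)
# leaves the separate attributes untouched.
argv[1:] = ["--quiet"]
```

After the in-place update, args still holds the original arguments, which is the kind of divergence that code setting sys.argv would run into.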
Determining the value of sys.path[0]
------------------------------------
sys.path[0] is set by the interpreter, depending on how the interpreter was
started.
If the main module is executed by filename, then sys.path[0] is set to the
directory containing that file. If the main module is inside a package, all of
the modules in that package have an aliasing problem (reachable as both
top-level modules and by their full name).
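The aliasing problem can be reproduced by putting both a package directory and its parent on sys.path, which is effectively what happens when a file inside a package is run directly. The sketch below uses the modern importlib and tempfile APIs for convenience; the aliasdemo_* names are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def demo_aliasing():
    """Import one file under two names and report what happened.

    Returns (same_file, same_module, same_state).
    """
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "aliasdemo_pkg")
        os.mkdir(pkg_dir)
        files = (("__init__.py", ""), ("aliasdemo_mod.py", "state = []\n"))
        for name, body in files:
            with open(os.path.join(pkg_dir, name), "w") as f:
                f.write(body)
        # pkg_dir plays the role of sys.path[0]; root is where the
        # package itself is found.
        added = [pkg_dir, root]
        sys.path[:0] = added
        try:
            importlib.invalidate_caches()
            top = importlib.import_module("aliasdemo_mod")
            full = importlib.import_module("aliasdemo_pkg.aliasdemo_mod")
            same_file = os.path.samefile(top.__file__, full.__file__)
            return same_file, top is full, top.state is full.state
        finally:
            for entry in added:
                sys.path.remove(entry)
            for name in ("aliasdemo_mod", "aliasdemo_pkg.aliasdemo_mod",
                         "aliasdemo_pkg"):
                sys.modules.pop(name, None)
```

The two imports come from the same file but produce two distinct module objects, each with its own copy of the module-level state.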
All other means of invocation leave sys.path[0] set to '', to indicate
"current working directory". Should this actually read the name of the current
working directory from the OS when the interpreter starts, or should it
continue to reflect changes to the working directory over the course of execution?
Should there be a command line switch to set sys.path[0] directly (or avoid
having it set at all)? This would make it possible to avoid the aliasing
problem with running modules from inside package directories, as well as
allowing -m execution to be used for a module or package that is not in the
current directory, but isn't on PYTHONPATH or in site-packages, either. (The
latter would be a convenience for testing purposes, rather than something an
installed Python application should rely on.)
Handling relative imports
-------------------------
Currently the import system has to look at __name__, and then check if
__path__ is present, in order to decide how to handle a relative import - the
handling is different depending on whether the current module is a package or not.
Defining a new special variable __pkg_name__ would allow the import system to
use consistent logic for both packages and normal modules. This would also
mean that relative imports would work correctly even when __name__ is set to
something like "__main__".
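To make the difference concrete, here is a sketch of the two steps. package_of() mimics the current __name__/__path__ check, and resolve_relative() shows the uniform logic that would be possible if the package name were available directly. Both function names are hypothetical, and `level` is assumed to count leading dots (1 meaning the current package).

```python
def package_of(name, has_path):
    """Current approach: derive the package from __name__ and __path__."""
    if has_path:
        return name                     # the module is itself a package
    return name.rpartition(".")[0]      # strip the module's own name


def resolve_relative(pkg_name, level, target):
    """Resolve a relative import against an explicit package name."""
    if not pkg_name:
        raise ImportError("relative import outside of a package")
    parts = pkg_name.split(".")
    if level - 1 >= len(parts):
        raise ImportError("relative import beyond top-level package")
    base = parts[:len(parts) - (level - 1)]
    return ".".join(base + ([target] if target else []))
```

package_of("__main__", False) yields an empty string, which is why relative imports break in a module run as the main module, whereas resolve_relative("a.b", 1, "c") gives "a.b.c" no matter what __name__ happens to be.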
Revisiting PEP 299
------------------
The general consensus recently has been that the "if __name__ == '__main__':"
idiom for modules that can be both support modules and main modules is both
ugly and unintuitive.
PEP 299 (__main__ functions) was rejected for the 2.x series due to backward
compatibility concerns (in particular, with modules that include the line
"import __main__"). Py3k provides an opportunity to revisit that decision.
If it is taken as a given that the idiom needs to change, then the question is
whether the major change proposed by PEP 299 is a better option than a simpler
change such as a new builtin boolean variable that can be tested via something
like "if is_main:".
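The two options can be emulated against a plain namespace dict to compare them. This is only a sketch: it assumes the PEP 299 entry point is a module-level function named __main__ that receives the argument list, and both helper names are hypothetical.

```python
def call_main_function(namespace, argv):
    """Call a module-level __main__(argv) if one exists; return its result."""
    entry = namespace.get("__main__")
    if not callable(entry):
        return None
    return entry(argv)


def is_main(namespace):
    """The value a builtin "is_main" flag might take for this namespace."""
    return namespace.get("__name__") == "__main__"


# A fake module namespace standing in for a script run as the main module.
fake_module = {
    "__name__": "__main__",
    "__main__": lambda argv: len(argv) - 1,
}
```

With the function approach the interpreter does the calling; with the flag approach the module still contains an explicit test, just a more readable one.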
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org