[Python-3000] Import system questions to be considered for Py3k

Brett Cannon brett at python.org
Sun Jul 16 07:53:47 CEST 2006


On 7/15/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> Taking the "import system" to be the overall interaction between the Python
> module namespace and the file system of the underlying computer, I thought
> I'd start compiling a list of the questions we'll want to consider for
> Py3k. The answers to some of them may be "the status quo is fine" but we
> should still ask the questions.
>
> I'll eventually capture the discussion in a Py3k PEP (although I believe
> many of the questions could actually be addressed for 2.6).
>
> The list I've got so far (including some thoughts about possible solutions):
>
> Change to hybrid implementation
> -------------------------------
> This idea would try to reduce the amount of code in import.c, pushing more
> of the logic into Python code. An advantage of this is that much of the
> PEP 302 structure for the standard import mechanisms already exists in
> pkgutil (since PJE consolidated the various emulations that had been added
> to the standard library). Additionally, import logic written in Python
> would automatically benefit from the full Unicode filename support of the
> builtin open() function.
>
> The various string manipulation operations involved would also be
> significantly easier to handle.
>
> There would be some bootstrapping issues, but I think it would be better to
> try to solve them, rather than continuing to maintain the partial file
> system access API reimplementation that import.c currently uses (that, for
> example, doesn't provide full Unicode filename support on Windows).
>
> Even if most of the logic stays in C code, it would be good to find a way
> to use the full filesystem API, rather than the current import-only subset.


There is a strong possibility I will be rewriting the import machinery in
Python in order to make my sandboxing life easier.  Plus I realize it just
needs to be done.

And the bootstrapping issue is not a problem.  I think I have a solution for
that one.  As long as the part of importing that handles modules compiled
into the interpreter is written in C (so that you can get access to posix &
friends, along with sys, without having to import any other code), then most
of it can be written in Python.
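
To make the bootstrapping point concrete, here is a minimal sketch (an
illustration only, not the planned design). It uses sys.builtin_module_names,
which lists the modules compiled into the interpreter, to separate what a C
bootstrap step would have to provide from what Python-level import code could
handle. The helper names are made up for this example, and the code is written
to run on a current Python 3 interpreter.

```python
import sys


def is_compiled_in(name):
    """Return True if `name` is a module compiled into the interpreter."""
    return name in sys.builtin_module_names


def split_by_bootstrap(names):
    """Partition module names into (compiled-in, everything else).

    The first group is what a C-level bootstrap importer would need to
    handle; the second could be left to import code written in Python.
    """
    compiled_in = [n for n in names if is_compiled_in(n)]
    other = [n for n in names if not is_compiled_in(n)]
    return compiled_in, other
```

Which platform modules (posix, nt, ...) end up in the first group depends on
how the interpreter was built.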

-Brett

> Use smarter data structures
> ---------------------------
> Currently, the individual handlers to load a fully identified module are
> exposed to Python code in a way that reflects the C-style data structures
> used in the current implementation.
>
> Simply switching to more powerful data structures for the file type
> handlers (i.e. use a PyTuple for filedescr values, a PyList for
> _PyImport_FileTab, and a PyDict instead of a switch statement to go from
> filedescr values to module loading/initialisation functions) and
> manipulating them all as normal Python objects could make the code in
> import.c much easier to follow.
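
A rough Python-level picture of the tuple/list/dict layout described above
might look like the following. This is only a sketch: the type tags, the
placeholder loaders, and the names FILE_TABLE and LOADERS are assumptions made
for illustration, not what import.c actually contains.

```python
from collections import namedtuple

# Illustrative stand-in for a C filedescr struct: (suffix, mode, type).
FileDescr = namedtuple("FileDescr", ["suffix", "mode", "type"])

# Hypothetical type tags; only their distinctness matters here.
PY_SOURCE, PY_COMPILED, C_EXTENSION = "source", "compiled", "extension"

# A plain list playing the role of _PyImport_FileTab.
FILE_TABLE = [
    FileDescr(".py", "r", PY_SOURCE),
    FileDescr(".pyc", "rb", PY_COMPILED),
    FileDescr(".so", "rb", C_EXTENSION),
]


# Placeholder loaders: they only report what a real loader would do.
def load_source(path):
    return ("compile and run", path)


def load_compiled(path):
    return ("unmarshal and run", path)


def load_extension(path):
    return ("dlopen and init", path)


# A dict replaces the switch statement from type tag to loader.
LOADERS = {
    PY_SOURCE: load_source,
    PY_COMPILED: load_compiled,
    C_EXTENSION: load_extension,
}


def find_descr(path):
    """Return the first table entry whose suffix matches `path`, or None."""
    for descr in FILE_TABLE:
        if path.endswith(descr.suffix):
            return descr
    return None


def load(path):
    """Dispatch `path` to the loader registered for its file type."""
    descr = find_descr(path)
    if descr is None:
        raise ImportError("no handler for %r" % path)
    return LOADERS[descr.type](path)
```

Adding a new file type then means appending one entry to the list and one to
the dict, with no change to the dispatch code.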
>
> Extensible file type handling
> -----------------------------
> If the file type handlers are stored in normal Python data structures as
> described above, it becomes feasible to make the import system extensible
> to different file types as well as to different file locations.
>
> This could be handled on a per-package basis, e.g. via a __file_types__
> special attribute in packages.
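
One way a per-package __file_types__ attribute could be consulted is sketched
below. The attribute is only a proposal in this thread, so its shape here (a
dict from suffix to handler name) and the DEFAULT_FILE_TYPES table are
assumptions for illustration.

```python
import types

# Assumed interpreter-wide defaults, keyed by file suffix.
DEFAULT_FILE_TYPES = {".py": "source", ".pyc": "bytecode"}


def file_types_for(package):
    """Return the defaults extended by the package's own __file_types__.

    Packages that do not define the (hypothetical) attribute simply get
    a copy of the defaults.
    """
    merged = dict(DEFAULT_FILE_TYPES)
    merged.update(getattr(package, "__file_types__", {}))
    return merged


# A stand-in package object that registers an extra file type.
pkg = types.ModuleType("templates")
pkg.__file_types__ = {".tmpl": "template"}
```

Copying the defaults before merging keeps one package's extensions from
leaking into every other package.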
>
> Locating support files
> ----------------------
> Currently, locating support files is difficult because __loader__ isn't
> defined for standard modules, and __file__ may not be defined properly if
> the module isn't being executed via load_module(). This needs to be changed
> so that there is an obvious way to locate support files located in the same
> directory as the current module.
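
For reference, the usual workaround today goes through __file__ and looks
roughly like the sketch below; support_file_path is a hypothetical helper
name. It shows the fragility being described: the lookup only works when
__file__ is set, and it assumes the module lives on a real file system.

```python
import os
import types


def support_file_path(module, filename):
    """Return the path of `filename` next to `module`'s own file.

    Raises LookupError when the module has no usable __file__, which is
    exactly the case that makes this approach unreliable.
    """
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        raise LookupError("%s has no __file__" % module.__name__)
    directory = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(directory, filename)
```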
>
> Determining the value of __file__
> ---------------------------------
> In PEP 302, the logic to determine the value of __file__ is internal to the
> load_module() method. Should this be exposed so that, e.g.,
> runpy.run_module can use it?
>
> Handling sys.argv[0]
> --------------------
> Should new attributes be added to sys to separate out argv[0] from the
> command line arguments? For example, sys.mainfile (== sys.argv[0]) and
> sys.args (== sys.argv[1:]).
>
> This has compatibility implications for code that _sets_ sys.argv, and
> expects other code to see the changes.
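
The compatibility concern can be shown with plain lists. In this sketch,
mainfile and args stand in for the proposed (and so far hypothetical)
sys.mainfile and sys.args, computed once from an argv list; a later change to
argv is then not visible through them.

```python
def split_argv(argv):
    """Split an argv-style list into (mainfile, args) as separate objects."""
    return argv[0], list(argv[1:])


argv = ["script.py", "-v", "input.txt"]
mainfile, args = split_argv(argv)

# Code that rewrites argv in place, as some programs do with sys.argv.
argv[1:] = ["--quiet"]

# `args` was copied at split time, so it still holds the old arguments.
stale = args != argv[1:]
```

Avoiding that staleness would mean computing the new attributes on access
rather than storing them, which is a design choice the proposal leaves open.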
>
>
> Determining the value of sys.path[0]
> ------------------------------------
> sys.path[0] is set by the interpreter, depending on how the interpreter was
> started.
>
> If the main module is executed by filename, then sys.path[0] is set to the
> directory containing that file. If the main module is inside a package, all
> of the modules in that package have an aliasing problem (reachable as both
> top-level modules and by their full name).
>
> All other means of invocation leave sys.path[0] set to '', to indicate
> "current working directory". Should this actually read the name of the
> current working directory from the OS when the interpreter starts, or
> should it continue to reflect changes to the working directory over the
> course of execution?
>
> Should there be a command line switch to set sys.path[0] directly (or avoid
> having it set at all)? This would make it possible to avoid the aliasing
> problem with running modules from inside package directories, as well as
> allowing -m execution to be used for a module or package that is not in the
> current directory, but isn't on PYTHONPATH or in site-packages, either.
> (The latter would be a convenience for testing purposes, rather than
> something an installed Python application should be reliant on)
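
The difference between the two readings of '' (a snapshot taken at startup
versus a live lookup) can be seen in a small experiment. This is a sketch
only: resolve_path_entry is a hypothetical helper that mimics how an empty
sys.path entry is interpreted, not the interpreter's own code.

```python
import os
import tempfile


def resolve_path_entry(entry):
    """Interpret '' as the working directory at the moment of the call."""
    return os.getcwd() if entry == "" else entry


def demo():
    """Return (startup snapshot, live value after a chdir)."""
    original = os.getcwd()
    snapshot = resolve_path_entry("")  # what "read it once" would store
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            live = resolve_path_entry("")  # what '' means right now
        finally:
            os.chdir(original)  # always restore the working directory
    return snapshot, live
```

After the chdir the two values differ, so the choice decides whether imports
keep resolving against the startup directory or follow the process around.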
>
> Handling relative imports
> -------------------------
> Currently the import system has to look at __name__, and then check if
> __path__ is present, in order to decide how to handle a relative import -
> the handling is different depending on whether the current module is a
> package or not.
>
> Defining a new special variable __pkg_name__ would allow the import system
> to use consistent logic for both packages and normal modules. This would
> also mean that relative imports would work correctly even when __name__ is
> set to something like "__main__".
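
A sketch of the two approaches, under the assumption that __pkg_name__ would
hold the dotted name of the containing package: package_of mimics the current
__name__/__path__ derivation, and resolve_relative shows how simple resolution
becomes once the package name is given directly. Both function names are
invented for this example.

```python
def package_of(module_name, is_package):
    """Derive the package name the current way, from __name__.

    `is_package` stands for "the module has a __path__ attribute".
    """
    if is_package:
        return module_name
    return module_name.rpartition(".")[0]


def resolve_relative(name, level, pkg_name):
    """Resolve a relative import against an explicit package name.

    `level` is the number of leading dots (1 means the current package).
    """
    parts = pkg_name.split(".") if pkg_name else []
    if level < 1 or level > len(parts):
        raise ValueError("relative import beyond top-level package")
    base = ".".join(parts[: len(parts) - (level - 1)])
    return "%s.%s" % (base, name) if name else base
```

With __name__ set to "__main__", package_of returns an empty string and the
relative import cannot be resolved, whereas a stored package name such as
"pkg.sub" still resolves "sibling" at level 1 to "pkg.sub.sibling".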
>
> Revisiting PEP 299
> ------------------
> The general consensus recently has been that the
> "if __name__ == '__main__':" idiom for modules that can be both support
> modules and main modules is both ugly and unintuitive.
>
> PEP 299 (__main__ functions) was rejected for the 2.x series due to
> backward compatibility concerns (in particular, with modules that include
> the line "import __main__"). Py3k provides an opportunity to revisit that
> decision.
>
> If it is taken as a given that the idiom needs to change, then the question
> is whether the major change proposed by PEP 299 is a better option than a
> simpler change such as a new builtin boolean variable that can be tested
> via something like "if is_main:".
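
For comparison, the "simpler change" can be emulated in today's Python. There
is no is_main builtin, so this sketch defines it by hand from __name__; a real
builtin would presumably be set by the interpreter, and how it would be scoped
per module is an open question.

```python
import sys


def main(argv):
    """Entry point used when the module is run as a script."""
    return "ran with %d argument(s)" % len(argv)


# Hand-rolled stand-in for the proposed builtin boolean.
is_main = (__name__ == "__main__")

if is_main:
    print(main(sys.argv[1:]))
```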
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://www.boredomandlaziness.org