[Import-SIG] PEP proposal: Per-Module Import Path
Eric Snow
ericsnowcurrently at gmail.com
Tue Jul 30 05:11:59 CEST 2013
Sorry for the delay, all. I totally forgot about an upcoming trip and just
got back. It'll be a day or two before I have a minute to respond.
-eric
On Thu, Jul 18, 2013 at 4:10 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> Hi,
>
> Nick talked me into writing this PEP, so blame him for the idea. <wink> I
> haven't had a chance to polish it up, but the content communicates the
> proposal well enough to post here. Let me know what you think. Once some
> consensus is reached I'll commit the PEP and post to python-dev. I have a
> rough implementation that I'll put online when I get a chance.
>
> If Guido is not interested maybe Brett would like to be BDFL-Delegate. :)
>
> -eric
>
>
> PEP: 4XX
> Title: Per-Module Import Path
> Version: $Revision$
> Last-Modified: $Date$
> Author: Eric Snow <ericsnowcurrently at gmail.com>,
>         Nick Coghlan <ncoghlan at gmail.com>
> BDFL-Delegate: ???
> Discussions-To: import-sig at python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 17-Jul-2013
> Python-Version: 3.4
> Post-History: 18-Jul-2013
> Resolution:
>
>
> Abstract
> ========
>
> Path-based import of a module or package involves traversing ``sys.path``
> or a package path to locate the appropriate file or directory(s).
> Redirecting from there to other locations is useful for packaging and
> for virtual environments. However, in practice such redirection is
> currently either `limited or fragile <Existing Alternatives>`_.
>
> This proposal provides a simple filesystem-based method to redirect from
> the normal module search path to other locations recognized by the
> import system. This involves one change to path-based imports, adds one
> import-related file type, and introduces a new module attribute. One
> consequence of this PEP is the deprecation of ``.pth`` files.
>
>
> Motivation
> ==========
>
> One of the problems with virtual environments is that you are likely to
> end up with duplicate installations of lots of common packages, and
> keeping them up to date can be a pain.
>
> One of the problems with archive-based distribution is that it can be
> tricky to register the archive as a Python path entry when needed
> without polluting the path of applications that don't need it.
>
> One of the problems with working directly from a source checkout is
> getting the relevant source directories onto the Python path, especially
> when you have multiple namespace package fragments spread across several
> subdirectories of a large repository.
>
> The `current solutions <Existing Alternatives>`_ all have their flaws.
> Reference files are intended to address those deficiencies.
>
>
> Specification
> =============
>
> Change to the Import System
> -----------------------------
>
> Currently, during `path-based import` of a module, the following happens
> for each `path entry` of `sys.path` or of the `__path__` of the module's
> parent:
>
> 1. look for `<path entry>/<name>/__init__.py` (and other supported
>    suffixes),
>    * return `loader`;
> 2. look for `<path entry>/<name>.py` (and other supported suffixes),
>    * return `loader`;
> 3. look for `<path entry>/<name>/`,
>    * extend namespace portions path.
>
> Once the path is exhausted, if no `loader` was found and the `namespace
> portions` path is non-empty, then a `NamespaceLoader` is returned with that
> path.
>
> This proposal inserts a step before step 1 for each `path entry`:
>
> 0. look for `<path entry>/<name>.ref`
>    a. get loader for `<fullname>` (absolute module name) using the path
>       found in the `.ref` file (see below) via `the normal mechanism`
>       [link to language reference],
>       * stop processing the path entry if the `.ref` file is empty;
>    b. check for `NamespaceLoader`,
>       * extend namespace portions path;
>    c. otherwise, return loader.
>
> Note the following consequences:
>
> * if a ref file is found, it takes precedence over module files and
> package directories under the same path entry (see `Empty Ref Files as
> Markers`_);
> * that holds for empty ref files also;
> * the loader for a ref file, if any, comes from the full import system
> (i.e. `sys.meta_path`) rather than just the path-based import system;
> * `.ref` files can indirectly provide fragments for namespace packages.
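As a rough illustration of the precedence described above, the following sketch classifies what a single path entry offers for a given name. It is a simulation against the filesystem only, not the proposed implementation: `check_entry` and `SUFFIXES` are hypothetical names, only the `.py` suffix is considered, and the sketch stops at finding the ref file rather than following it back through the import system.

```python
import os

# Assumption for illustration: the real import system supports more suffixes.
SUFFIXES = (".py",)


def check_entry(entry, name):
    """Classify what one path entry offers for module `name`.

    Returns a (kind, location) pair, or None if the entry has nothing.
    """
    base = os.path.join(entry, name)
    # Step 0 (proposed): a ref file takes precedence within this entry,
    # even when it is empty.
    if os.path.isfile(base + ".ref"):
        return ("ref", base + ".ref")
    # Step 1: a regular package directory with an __init__ file.
    for suffix in SUFFIXES:
        init = os.path.join(base, "__init__" + suffix)
        if os.path.isfile(init):
            return ("package", init)
    # Step 2: a plain module file.
    for suffix in SUFFIXES:
        if os.path.isfile(base + suffix):
            return ("module", base + suffix)
    # Step 3: a bare directory contributes a namespace portion.
    if os.path.isdir(base):
        return ("namespace portion", base)
    return None
```

With both `spam.py` and `spam.ref` in the same directory, this sketch reports the ref file, which mirrors the first consequence listed above.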
>
> Reference Files
> ---------------
>
> A new kind of file will live alongside package directories and module
> source files: reference files. These files have the following
> characteristics:
>
> * named `<module name>.ref` in contrast to `<module name>.py` (etc.) or
> `<module name>/`;
> * placed under `sys.path` entries or package path (just like modules and
> packages).
>
> Reference File Format
> ----------------------
>
> The contents of a reference file will conform to the following format:
>
> * contain zero or more path entries, just like sys.path;
> * one path entry per line;
> * path entry order is preserved;
> * may contain comment lines starting with "#", which are ignored;
> * may contain blank lines, which are ignored;
> * must be UTF-8 encoded.
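A minimal sketch of a parser for this format might look like the following. The function names are hypothetical, and two details are assumptions the draft does not settle: surrounding whitespace is stripped from each line, and a "#" counts as a comment marker only at the start of the stripped line.

```python
def parse_ref_text(text):
    """Return the path entries listed in the text of a ref file, in order."""
    entries = []
    for line in text.splitlines():
        # Assumption: leading and trailing whitespace is not significant.
        line = line.strip()
        # Blank lines and comment lines starting with "#" are ignored.
        if not line or line.startswith("#"):
            continue
        entries.append(line)
    return entries


def read_ref_file(filename):
    """Read a ref file, which must be UTF-8 encoded, and parse it."""
    with open(filename, encoding="utf-8") as f:
        return parse_ref_text(f.read())
```

An empty file, or one holding only comments and blank lines, yields an empty list, which corresponds to the "empty ref file" case discussed in this proposal.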
>
> Directory Path Entries
> ----------------------
>
> Directory names are by far the most common type of path entry. Here is
> how they are constrained in reference files:
>
> * may be absolute or relative;
> * must be forward slash separated regardless of platform;
> * each must be the parent directory where the module will be looked for.
>
> To be clear, reference files (just like `sys.path`) deliberately reference
> the *parent* directory to be searched (rather than the module or package
> directory). So they work transparently with `__pycache__` and allow
> searching for `.dist-info <PEP 376>`_ directories through them.
>
> Relative directory names will be resolved based on the directory
> containing the ref file, rather than the current working directory.
> Supporting relative directory names lets you include sensible ref files
> in a source repo.
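The resolution rule for directory entries could be sketched as below. `resolve_entry` is a hypothetical helper, and it handles directory entries only; other kinds of path entry recognized by the import system are outside its scope.

```python
import os


def resolve_entry(entry, ref_filename):
    """Resolve one directory entry from a ref file to a native path."""
    # Relative entries are anchored at the directory holding the ref file,
    # not at the current working directory.
    ref_dir = os.path.dirname(os.path.abspath(ref_filename))
    # Entries are forward-slash separated regardless of platform.
    native = entry.replace("/", os.sep)
    # os.path.join discards ref_dir when the entry is already absolute.
    return os.path.normpath(os.path.join(ref_dir, native))
```

For a `tests.ref` that contains `../`, this resolves to the parent of the directory containing the ref file, which is where the search for `tests` would then take place.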
>
> Empty Ref Files as Markers
> -----------------------------
>
> Handling `.ref` files first allows for the use of empty ref files as
> markers to indicate "this is not the module you are looking for". Here are
> two situations where that helps.
>
> First, an empty ref file helps resolve conflicts between script names and
> package names. When the interpreter is started with a filename, the
> directory of that script is added to the front of `sys.path`. This may be
> a problem for later imports where the intended module or package is on a
> regular path entry.
>
> If an import references the script's name, the file will get run again by
> the import system as a module (only `__main__` was added to `sys.modules`
> earlier) [PEP 395]_. This is a further problem if you meant to import a
> module or package in another path entry.
>
> The presence of an empty ref file in the script's directory would
> essentially render it invisible to the import system. This problem and
> solution apply for all of the files or directories in the script's
> directory.
>
> Second, the namespace package mechanism has a side-effect: a directory
> without an `__init__.py` may be incorrectly treated as a namespace package
> fragment. The presence of an empty ref file indicates such a directory
> should be ignored.
>
> A Module Attribute to Expose Contributing Ref Files
> ---------------------------------------------------
>
> Knowing the origin of a module is important when tracking down problems,
> particularly import-related ones. Currently, that entails looking at
> `<module>.__file__` and `<module.__package__>.__path__` (or `sys.path`).
>
> With this PEP there can be a chain of ref files in between the currently
> available path and a module's __file__. Having access to that list of ref
> files is important in order to determine why one file was selected over
> another as the origin for the module. When an unexpected file gets used
> for one of your imports, you'll care about this!
>
> To facilitate that, modules will have a new attribute:
> `__indirect__`. It will be a tuple containing the chain of ref files, in
> order, used to locate the module's `__file__`. An empty tuple or a tuple
> with one item will be the most common case. An empty tuple indicates that
> no ref files were used to locate the module.
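For debugging, a small helper along these lines could report the chain. This is only a sketch: `describe_origin` is a hypothetical name, and because `__indirect__` exists only in this proposal, the helper falls back to an empty tuple when the attribute is missing.

```python
def describe_origin(module):
    """Return a short report of where a module came from."""
    # __indirect__ is the attribute proposed in this draft; when it is
    # absent, treat the module as found without any ref files.
    indirect = getattr(module, "__indirect__", ())
    lines = [
        "module: " + module.__name__,
        "file:   " + str(getattr(module, "__file__", None)),
    ]
    # List the ref files in the order they were followed.
    for i, ref in enumerate(indirect, 1):
        lines.append("ref {}: {}".format(i, ref))
    return "\n".join(lines)
```

Calling `print(describe_origin(spam))` after an unexpected import would show each ref file that led to the module's `__file__`.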
>
> Examples
> --------
>
> XXX are these useful?
>
> Top-level module (`import spam`)::
>
>     ~/venvs/ham/python/site-packages/
>         spam.ref
>
>     spam.ref:
>         # use the system installed module
>         /python/site-packages
>
>     /python/site-packages:
>         spam.py
>
>     spam.__file__:
>         "/python/site-packages/spam.py"
>
>     spam.__indirect__:
>         ("~/venvs/ham/python/site-packages/spam.ref",)
>
> Submodule (`python -m myproject.tests`)::
>
>     ~/myproject/
>         setup.py
>         tests/
>             __init__.py
>             __main__.py
>         myproject/
>             __init__.py
>             tests.ref
>
>     tests.ref:
>         ../
>
>     myproject.__indirect__:
>         ()
>
>     myproject.tests.__file__:
>         "~/myproject/tests/__init__.py"
>
>     myproject.tests.__indirect__:
>         ("~/myproject/myproject/tests.ref",)
>
> Multiple Path Entries::
>
>     myproj/
>         __init__.py
>         mod.ref
>
>     mod.ref:
>         # fall back to the old one
>         /python/site-packages/mod-new/
>         /python/site-packages/mod-old/
>
>     /python/site-packages/
>         mod-old/
>             mod.py
>
>     myproj.mod.__file__:
>         "/python/site-packages/mod-old/mod.py"
>
>     myproj.mod.__indirect__:
>         ("myproj/mod.ref",)
>
> Chained Ref Files::
>
>     venvs/ham/python/site-packages/
>         spam.ref
>
>     venvs/ham/python/site-packages/spam.ref:
>         # use the system installed module
>         /python/site-packages
>
>     /python/site-packages/
>         spam.ref
>
>     /python/site-packages/spam.ref:
>         # use the clone
>         ~/clones/myproj/
>
>     ~/clones/myproj/
>         spam.py
>
>     spam.__file__:
>         "~/clones/myproj/spam.py"
>
>     spam.__indirect__:
>         ("venvs/ham/python/site-packages/spam.ref",
>          "/python/site-packages/spam.ref")
>
> Reference Implementation
> ------------------------
>
> A reference implementation is available at <TBD>.
>
> XXX double-check zipimport support
>
>
> Deprecation of .pth Files
> =============================
>
> The `site` module facilitates the composition of `sys.path`. As part of
> that, `.pth` files are processed and entries added to `sys.path`. Ref
> files are intended as a replacement.
>
> XXX also deprecate .pkg files (see pkgutil.extend_path())?
>
> Consequently, `.pth` files will be deprecated.
>
> Deprecation Schedule
> -------------------------
>
> 1. documented: 3.4,
> 2. warnings: 3.5 and 3.6,
> 3. removal: 3.7
>
> XXX Deprecate sooner?
>
>
> Existing Alternatives
> =====================
>
> .pth Files
> ----------
>
> `*.pth` files have the problem that they're global: if you add them to
> `site-packages`, they will be processed at startup by *every* Python
> application run using that Python installation. This is an undesirable side
> effect of the way `*.pth` processing is defined, but can't be changed due
> to backwards compatibility issues.
>
> Furthermore, `*.pth` files are processed at interpreter startup...
>
> .egg-link Files
> ---------------
>
> `*.egg-link` files are much closer to the proposed `*.ref` files. The
> difference is that `*.egg-link` files are designed to work with
> `pkg_resources` and `distribution names`, while `*.ref` files are designed
> to work with package and module names as an automatic part of the import
> system.
>
> Symlinks
> ---------
>
> Actual symlinks have the problem that they aren't really practical on
> Windows, and also that they don't support non-versioned references to
> versioned `dist-info` directories.
>
> Design Alternatives
> ===================
>
> Ignore Empty Ref Files
> ----------------------
>
> An empty ref file would be ignored rather than effectively stopping the
> processing of the path entry. This loses the benefits outlined above of
> empty ref files as markers.
>
> ImportError for Empty Ref Files
> -------------------------------
>
> An empty ref file would result in an ImportError. The only benefit to
> this would be to disallow empty ref files and make it clear when they are
> encountered.
>
> Handle Ref Files After Namespace Packages
> -----------------------------------------
>
> Rather than handling ref files first, they could be handled last. Thus
> they would have lower priority than namespace package fragments. This
> would be only marginally more backward compatible. However, as with
> ignoring empty ref files, handling them last would prevent their use as
> markers for ignoring a path entry.
>
> Send Ref File Path Through Path Import System Only
> --------------------------------------------------
>
> As indicated above, the path entries in a ref file are passed back through
> the metapath finders to get the loader. Instead we could use just the
> path-based import system. This would prevent metapath finders from having
> a chance to handle the module under a different path.
>
> Restrict Ref File Path Entries to Directories
> ---------------------------------------------
>
> Rather than allowing anything for the path entries in a ref file, they
> could be restricted to just directories. This is by far the most common case.
> However, it would add complexity without any justification for not
> allowing metapath importers a chance at the module under a new path.
>
> Restrict Directories in Ref File Path Entries to Absolute
> ---------------------------------------------------------
>
> Directory path entries in ref files can be relative or absolute. Limiting
> to just absolute directory names would be an artificial change to existing
> constraints on path entries without any justification. Furthermore, it
> would prevent simple use of ref files in code bases relative to project
> roots.
>
>
> Future Extensions
> =================
>
> Longer term, we should also allow *versioned* `*.ref` files that can be
> used to reference modules and packages that aren't available for ordinary
> import (since they don't follow the "name.ref" format), but are available
> to tools like `pkg_resources` to handle parallel installs of different
> versions.
>
>
> References
> ==========
>
> .. [0] ...
> ()
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
> ^L
> ..
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:
>