From ericsnowcurrently at gmail.com  Fri Jul 19 00:10:21 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 18 Jul 2013 16:10:21 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
Message-ID: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>

Hi,

Nick talked me into writing this PEP, so blame him for the idea. <wink>  I
haven't had a chance to polish it up, but the content communicates the
proposal well enough to post here.  Let me know what you think.  Once some
consensus is reached I'll commit the PEP and post to python-dev.  I have a
rough implementation that I'll put online when I get a chance.

If Guido is not interested maybe Brett would like to be BDFL-Delegate. :)

-eric


PEP: 4XX
Title: Per-Module Import Path
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
        Nick Coghlan <ncoghlan at gmail.com>
BDFL-Delegate: ???
Discussions-To: import-sig at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 17-Jul-2013
Python-Version: 3.4
Post-History: 18-Jul-2013
Resolution:


Abstract
========

Path-based import of a module or package involves traversing ``sys.path``
or a package path to locate the appropriate file or directory(s).
Redirecting from there to other locations is useful for packaging and
for virtual environments.  However, in practice such redirection is
currently either `limited or fragile <Existing Alternatives>`_.

This proposal provides a simple filesystem-based method to redirect from
the normal module search path to other locations recognized by the
import system.  This involves one change to path-based imports, adds one
import-related file type, and introduces a new module attribute.  One
consequence of this PEP is the deprecation of ``.pth`` files.


Motivation
==========

One of the problems with virtual environments is that you are likely to
end up with duplicate installations of lots of common packages, and
keeping them up to date can be a pain.

One of the problems with archive-based distribution is that it can be
tricky to register the archive as a Python path entry when needed
without polluting the path of applications that don't need it.

One of the problems with working directly from a source checkout is
getting the relevant source directories onto the Python path, especially
when you have multiple namespace package fragments spread across several
subdirectories of a large repository.

The `current solutions <Existing Alternatives>`_ all have their flaws.
Reference files are intended to address those deficiencies.


Specification
=============

Change to the Import System
-----------------------------

Currently, during `path-based import` of a module, the following happens
for each `path entry` of `sys.path` or of the `__path__` of the module's
parent:

1. look for `<path entry>/<name>/__init__.py` (and other supported
   suffixes),

   * return `loader`;

2. look for `<path entry>/<name>.py` (and other supported suffixes),

   * return loader;

3. look for `<path entry>/<name>/`,

   * extend namespace portions path.

Once the path is exhausted, if no `loader` was found and the `namespace
portions` path is non-empty, then a `NamespaceLoader` is returned with that
path.

This proposal inserts a step before step 1 for each `path entry`:

0. look for `<path entry>/<name>.ref`

   a. get loader for `<fullname>` (absolute module name) using path found
      in `.ref` file (see below) using `the normal mechanism`[link to
      language reference],

      * stop processing the path entry if `.ref` file is empty;

   b. check for `NamespaceLoader`,

      * extend namespace portions path;

   c. otherwise, return loader.

Note the following consequences:

* if a ref file is found, it takes precedence over module files and package
  directories under the same path entry (see `Empty Ref Files as Markers`_);
* that holds for empty ref files also;
* the loader for a ref file, if any, comes from the full import system
  (i.e. `sys.meta_path`) rather than just the path-based import system;
* `.ref` files can indirectly provide fragments for namespace packages.
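
The revised per-entry lookup order can be sketched roughly as follows.  This
is an illustration only, not the reference implementation: `classify_entry`
is a hypothetical helper, it considers only the `.py` suffix, it reports a
label instead of returning a loader, and it assumes that a ref file holding
nothing but comments or blank lines counts as empty.

```python
import os
import tempfile


def classify_entry(path_entry, name):
    """Report what one path entry offers for module `name`.

    Hypothetical helper mirroring steps 0-3; returns (label, ref_entries).
    """
    ref = os.path.join(path_entry, name + ".ref")
    if os.path.isfile(ref):
        # Step 0: a ref file wins over anything else in this path entry.
        with open(ref, encoding="utf-8") as f:
            lines = [line.strip() for line in f]
        entries = [ln for ln in lines if ln and not ln.startswith("#")]
        # An empty ref file stops processing of this path entry.
        return ("ref", entries) if entries else ("blocked", [])
    if os.path.isfile(os.path.join(path_entry, name, "__init__.py")):
        return ("package", [])      # step 1
    if os.path.isfile(os.path.join(path_entry, name + ".py")):
        return ("module", [])       # step 2
    if os.path.isdir(os.path.join(path_entry, name)):
        return ("namespace", [])    # step 3
    return ("missing", [])


with tempfile.TemporaryDirectory() as entry:
    open(os.path.join(entry, "spam.py"), "w").close()
    print(classify_entry(entry, "spam"))   # ('module', [])
    open(os.path.join(entry, "spam.ref"), "w").close()
    print(classify_entry(entry, "spam"))   # ('blocked', [])
```

The second call shows the marker behaviour: once an empty `spam.ref` exists,
`spam.py` in the same path entry is no longer considered.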

Reference Files
---------------

A new kind of file will live alongside package directories and module
source files: reference files.  These files have the following
characteristics:

* named `<module name>.ref` in contrast to `<module name>.py` (etc.) or
  `<module name>/`;
* placed under `sys.path` entries or package path (just like modules and
  packages).

Reference File Format
----------------------

The contents of a reference file will conform to the following format:

* contain zero or more path entries, just like sys.path;
* one path entry per line;
* path entry order is preserved;
* may contain comment lines starting with "#", which are ignored;
* may contain blank lines, which are ignored;
* must be UTF-8 encoded.
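
A minimal sketch of a parser for this format is shown below.  The function
name `parse_ref` is made up for illustration, and stripping surrounding
whitespace from each line is an assumption; the format above does not say
how leading or trailing whitespace is handled.

```python
def parse_ref(data):
    """Return the path entries listed in the raw bytes of a ref file."""
    text = data.decode("utf-8")  # the format requires UTF-8
    entries = []
    for line in text.splitlines():
        line = line.strip()  # assumption: surrounding whitespace is dropped
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        entries.append(line)  # order of entries is preserved
    return entries


print(parse_ref(b"# use the system installed module\n/python/site-packages\n"))
# ['/python/site-packages']
```

An empty file, or one with only comments, yields an empty list.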

Directory Path Entries
----------------------

Directory names are by far the most common type of path entry.  Here is how
they are constrained in reference files:

* may be absolute or relative;
* must be forward slash separated regardless of platform;
* each must be the parent directory where the module will be looked for.

To be clear, reference files (just like `sys.path`) deliberately reference
the *parent* directory to be searched (rather than the module or package
directory).  So they work transparently with `__pycache__` and allow
searching for `.dist-info <PEP 376>`_ directories through them.

Relative directory names will be resolved based on the directory containing
the ref file, rather than the current working directory.  Allowing relative
directory names allows you to include sensible ref files in a source repo.
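
Resolution of a directory entry might look like the sketch below.  The helper
name `resolve_entry` is hypothetical.  Because entries are always forward
slash separated, the sketch uses `posixpath`; converting the result to the
platform's native separators is left out.

```python
import posixpath


def resolve_entry(ref_file, entry):
    """Resolve one ref-file entry against the directory holding the ref file."""
    if posixpath.isabs(entry):
        return posixpath.normpath(entry)
    # Relative entries are anchored at the ref file's directory,
    # not at the current working directory.
    base = posixpath.dirname(ref_file)
    return posixpath.normpath(posixpath.join(base, entry))


print(resolve_entry("/home/me/myproject/myproject/tests.ref", "../"))
# /home/me/myproject
```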

Empty Ref Files as Markers
-----------------------------

Handling `.ref` files first allows for the use of empty ref files as
markers to indicate "this is not the module you are looking for".  Here are
two situations where that helps.

First, an empty ref file helps resolve conflicts between script names and
package names.  When the interpreter is started with a filename, the
directory of that script is added to the front of `sys.path`.  This may be
a problem for later imports where the intended module or package is on a
regular path entry.

If an import references the script's name, the file will get run again by
the import system as a module (only `__main__` was added to `sys.modules`
earlier) [PEP 395]_.  This is a further problem if you meant to import a
module or package in another path entry.

The presence of an empty ref file in the script's directory would
essentially render it invisible to the import system.  This problem and
solution apply for all of the files or directories in the script's
directory.

Second, the namespace package mechanism has a side-effect: a directory
without an `__init__.py` may be incorrectly treated as a namespace package
fragment.  The presence of an empty ref file indicates such a directory
should be ignored.

A Module Attribute to Expose Contributing Ref Files
---------------------------------------------------

Knowing the origin of a module is important when tracking down problems,
particularly import-related ones.  Currently, that entails looking at
`<module>.__file__` and `<module.__package__>.__path__` (or `sys.path`).

With this PEP there can be a chain of ref files in between the currently
available path and a module's __file__.  Having access to that list of ref
files is important in order to determine why one file was selected over
another as the origin for the module.  When an unexpected file gets used
for one of your imports, you'll care about this!

In order to facilitate that, modules will have a new attribute:
`__indirect__`.  It will be a tuple composed of the chain of ref files, in
order, used to locate the module's `__file__`.  An empty tuple or a tuple
with one item will be the most common case.  An empty tuple indicates that
no ref files were used to locate the module.
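
How such a chain could be accumulated is sketched below against an in-memory
stand-in for the filesystem.  Everything here is hypothetical: `locate`, its
parameters, the `/home/me` paths, the guard against ref files that refer back
to each other, and the choice to move on to the next path entry when a
non-empty ref file leads nowhere are all assumptions that the text above does
not settle.  Only `.py` modules and absolute entries are handled.

```python
import posixpath


def locate(name, path, ref_files, module_files, chain=()):
    """Return (file, indirect) for `name`, or None if it is not found.

    `ref_files` maps a ref file's path to its list of path entries and
    `module_files` is a set of existing module files.
    """
    for entry in path:
        ref = posixpath.join(entry, name + ".ref")
        if ref in ref_files:
            if ref not in chain:  # assumed guard against cycles
                found = locate(name, ref_files[ref], ref_files,
                               module_files, chain + (ref,))
                if found is not None:
                    return found
            continue  # the ref file shadows the rest of this path entry
        candidate = posixpath.join(entry, name + ".py")
        if candidate in module_files:
            return candidate, chain
    return None


refs = {
    "/venvs/ham/python/site-packages/spam.ref": ["/python/site-packages"],
    "/python/site-packages/spam.ref": ["/home/me/clones/myproj/"],
}
modules = {"/home/me/clones/myproj/spam.py"}
print(locate("spam", ["/venvs/ham/python/site-packages"], refs, modules))
```

The second element of the result plays the role of `__indirect__`: it lists
both ref files, in the order they were followed.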

Examples
--------

XXX are these useful?

Top-level module (`import spam`)::

  ~/venvs/ham/python/site-packages/
      spam.ref

  spam.ref:
      # use the system installed module
      /python/site-packages

  /python/site-packages:
      spam.py

  spam.__file__:
      "/python/site-packages/spam.py"

  spam.__indirect__:
      ("~/venvs/ham/python/site-packages/spam.ref",)

Submodule (`python -m myproject.tests`)::

  ~/myproject/
      setup.py
      tests/
          __init__.py
          __main__.py
      myproject/
          __init__.py
          tests.ref

  tests.ref:
      ../

  myproject.__indirect__:
      ()

  myproject.tests.__file__:
      "~/myproject/tests/__init__.py"

  myproject.tests.__indirect__:
      ("~/myproject/myproject/tests.ref",)

Multiple Path Entries::

  myproj/
      __init__.py
      mod.ref

  mod.ref:
      # fall back to the old one
      /python/site-packages/mod-new/
      /python/site-packages/mod-old/

  /python/site-packages/
      mod-old/
          mod.py

  myproj.mod.__file__:
      "/python/site-packages/mod-old/mod.py"

  myproj.mod.__indirect__:
      ("myproj/mod.ref",)

Chained Ref Files::

  venvs/ham/python/site-packages/
      spam.ref

  venvs/ham/python/site-packages/spam.ref:
      # use the system installed module
      /python/site-packages

  /python/site-packages/
      spam.ref

  /python/site-packages/spam.ref:
      # use the clone
      ~/clones/myproj/

  ~/clones/myproj/
      spam.py

  spam.__file__:
      "~/clones/myproj/spam.py"

  spam.__indirect__:
      ("venvs/ham/python/site-packages/spam.ref",
       "/python/site-packages/spam.ref")

Reference Implementation
------------------------

A reference implementation is available at <TBD>.

XXX double-check zipimport support


Deprecation of .pth Files
=============================

The `site` module facilitates the composition of `sys.path`.  As part of
that, `.pth` files are processed and entries added to `sys.path`.  Ref
files are intended as a replacement.

XXX also deprecate .pkg files (see pkgutil.extend_path())?

Consequently, `.pth` files will be deprecated.

Deprecation Schedule
-------------------------

1. documented: 3.4,
2. warnings: 3.5 and 3.6,
3. removal: 3.7

XXX Deprecate sooner?


Existing Alternatives
=====================

.pth Files
----------

`*.pth` files have the problem that they're global: if you add them to
`site-packages`, they will be processed at startup by *every* Python
application run using that Python installation. This is an undesirable side
effect of the way `*.pth` processing is defined, but can't be changed due
to backwards compatibility issues.
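
The effect can be observed with the standard `site.addsitedir()` function,
which processes the `.pth` files it finds in a directory.  The snippet below
is only a demonstration; the temporary directory and the names `demo.pth`
and `extra` are made up.

```python
import os
import site
import sys
import tempfile

sitedir = tempfile.mkdtemp()
extra = os.path.join(sitedir, "extra")
os.mkdir(extra)

# A .pth file lists directories, one per line, to append to sys.path.
with open(os.path.join(sitedir, "demo.pth"), "w", encoding="utf-8") as f:
    f.write("extra\n")

site.addsitedir(sitedir)

# The listed directory is now on sys.path for the whole process,
# whether or not any code ever imports from it.
print(os.path.abspath(extra) in sys.path)
```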

Furthermore, `*.pth` files are processed at interpreter startup...

.egg-link files
---------------

`*.egg-link` files are much closer to the proposed `*.ref` files. The
difference is that `*.egg-link` files are designed to work with
`pkg_resources` and `distribution names`, while `*.ref` files are designed
to work with package and module names as an automatic part of the import
system.

Symlinks
---------

Actual symlinks have the problem that they aren't really practical on
Windows, and also that they don't support non-versioned references to
versioned `dist-info` directories.

Design Alternatives
===================

Ignore Empty Ref Files
----------------------

An empty ref file would be ignored rather than effectively stopping the
processing of the path entry.  This loses the benefits outlined above of
empty ref files as markers.

ImportError for Empty Ref Files
-------------------------------

An empty ref file would result in an ImportError.  The only benefit to this
would be to disallow empty ref files and make it clear when they are
encountered.

Handle Ref Files After Namespace Packages
-----------------------------------------

Rather than handling ref files first, they could be handled last.  Thus
they would have lower priority than namespace package fragments.  This
would be marginally more backward compatible.  However, as with
ignoring empty ref files, handling them last would prevent their use as
markers for ignoring a path entry.

Send Ref File Path Through Path Import System Only
--------------------------------------------------

As indicated above, the path entries in a ref file are passed back through
the metapath finders to get the loader.  Instead we could use just the
path-based import system.  This would prevent metapath finders from having
a chance to handle the module under a different path.

Restrict Ref File Path Entries to Directories
---------------------------------------------

Rather than allowing anything for the path entries in a ref file, they
could be restricted to just directories.  This is by far the most common
case.  However, it would add complexity without any justification for not
allowing metapath importers a chance at the module under a new path.

Restrict Directories in Ref File Path Entries to Absolute
---------------------------------------------------------

Directory path entries in ref files can be relative or absolute.  Limiting
to just absolute directory names would be an artificial change to existing
constraints on path entries without any justification.  Furthermore, it
would prevent simple use of ref files in code bases relative to project
roots.


Future Extensions
=================

Longer term, we should also allow *versioned* `*.ref` files that can be
used to reference modules and packages that aren't available for ordinary
import (since they don't follow the "name.ref" format), but are available
to tools like `pkg_resources` to handle parallel installs of different
versions.


References
==========

.. [0] ...
       ()


Copyright
=========

This document has been placed in the public domain.

^L
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ericsnowcurrently at gmail.com  Fri Jul 19 00:20:26 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 18 Jul 2013 16:20:26 -0600
Subject: [Import-SIG] deprecation of pkgutil.extend_path()?
Message-ID: <CALFfu7Ahbu4f1uiYyYw47SvnBmzGzXj2g9CF0wGRMin4vpCceA@mail.gmail.com>

Was there any discussion, relative to PEP 420, about deprecating
pkgutil.extend_path()?  The PEP talks about transitioning away from the
function, but does not talk about deprecation.

I ask because the PEP I just posted to this list fills the role played by
the `.pkg` files that extend_path() uses.  Since the two are tightly
coupled, it doesn't make sense to deprecate one without doing so for the
other.

Is there any reason not to put extend_path() on a (conservative)
deprecation schedule?  Considering that it's related and that I'm proposing
deprecation of `.pth` files, I can simply include such a schedule as part
of my PEP.

-eric

From eric at trueblade.com  Fri Jul 19 00:28:41 2013
From: eric at trueblade.com (Eric V. Smith)
Date: Thu, 18 Jul 2013 18:28:41 -0400
Subject: [Import-SIG] deprecation of pkgutil.extend_path()?
In-Reply-To: <CALFfu7Ahbu4f1uiYyYw47SvnBmzGzXj2g9CF0wGRMin4vpCceA@mail.gmail.com>
References: <CALFfu7Ahbu4f1uiYyYw47SvnBmzGzXj2g9CF0wGRMin4vpCceA@mail.gmail.com>
Message-ID: <51E86C19.80505@trueblade.com>

On 7/18/2013 6:20 PM, Eric Snow wrote:
> Was there any discussion, relative to PEP 420, about deprecating
> pkgutil.extend_path()?  The PEP talks about transitioning away from the
> function, but does not talk about deprecation.

I don't recall any such discussion. It just never came up.

> I ask because the PEP I just posted to this list fills the role played
> by the `.pkg` files that extend_path() uses.  Since the two are tightly
> coupled, it doesn't make sense to deprecate one without doing so for the
> other.
> 
> Is there any reason not to put extend_path() on a (conservative)
> deprecation schedule?  Considering that it's related and that I'm
> proposing deprecation of `.pth` files, I can simply include such a
> schedule as part of my PEP.

I haven't read the PEP yet, so I'll have to wait to comment. I'm not
opposed to removing it, though.

-- 
Eric.

From brett at python.org  Fri Jul 19 14:37:04 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 19 Jul 2013 08:37:04 -0400
Subject: [Import-SIG] deprecation of pkgutil.extend_path()?
In-Reply-To: <51E86C19.80505@trueblade.com>
References: <CALFfu7Ahbu4f1uiYyYw47SvnBmzGzXj2g9CF0wGRMin4vpCceA@mail.gmail.com>
	<51E86C19.80505@trueblade.com>
Message-ID: <CAP1=2W7qKaEd7vTmeZVSUj6bifeOwXY8hi5vXeHvuxhVzffbJA@mail.gmail.com>

On Thu, Jul 18, 2013 at 6:28 PM, Eric V. Smith <eric at trueblade.com> wrote:

> On 7/18/2013 6:20 PM, Eric Snow wrote:
> > Was there any discussion, relative to PEP 420, about deprecating
> > pkgutil.extend_path()?  The PEP talks about transitioning away from the
> > function, but does not talk about deprecation.
>
> I don't recall any such discussion. It just never came up.
>

The idea came up, but no one laid down a schedule since namespace packages
are so new.


>
> > I ask because the PEP I just posted to this list fills the role played
> > by the `.pkg` files that extend_path() uses.  Since the two are tightly
> > coupled, it doesn't make sense to deprecate one without doing so for the
> > other.
> >
> > Is there any reason not to put extend_path() on a (conservative)
> > deprecation schedule?  Considering that it's related and that I'm
> > proposing deprecation of `.pth` files, I can simply include such a
> > schedule as part of my PEP.
>
> I haven't read the PEP yet, so I'll have to wait to comment. I'm not
> opposed to removing it, though.
>

I think you can add a PendingDeprecationWarning for it (probably should
double-check whether anything else in pkgutil no longer makes sense as well).

From brett at python.org  Fri Jul 19 16:51:05 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 19 Jul 2013 10:51:05 -0400
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
Message-ID: <CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>

If this can lead to the deprecation of .pth files then I support the idea,
but I think there are technical issues in terms of implementation that have
not been thought through yet. This is going to require an implementation
(even if it isn't in importlib._bootstrap but as a subclass of
importlib.machinery.FileFinder or something) to see how you plan to make
all of this work before this PEP can move beyond this SIG.


On Thu, Jul 18, 2013 at 6:10 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> Hi,
>
> Nick talked me into writing this PEP, so blame him for the idea. <wink>  I
> haven't had a chance to polish it up, but the content communicates the
> proposal well enough to post here.  Let me know what you think.  Once some
> consensus is reached I'll commit the PEP and post to python-dev.  I have a
> rough implementation that I'll put online when I get a chance.
>
> If Guido is not interested maybe Brett would like to be BDFL-Delegate. :)
>
> -eric
>
>
> PEP: 4XX
> Title: Per-Module Import Path
> Version: $Revision$
> Last-Modified: $Date$
> Author: Eric Snow <ericsnowcurrently at gmail.com>
>         Nick Coghlan <ncoghlan at gmail.com>
> BDFL-Delegate: ???
> Discussions-To: import-sig at python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 17-Jul-2013
> Python-Version: 3.4
> Post-History: 18-Jul-2013
> Resolution:
>
>
> Abstract
> =======
>
> Path-based import of a module or package involves traversing ``sys.path``
> or a package path to locate the appropriate file or directory(s).
> Redirecting from there to other locations is useful for packaging and
> for virtual environments.  However, in practice such redirection is
> currently either `limited or fragile <Existing Alternatives>`_.
>
> This proposal provides a simple filesystem-based method to redirect from
> the normal module search path to other locations recognized by the
> import system.  This involves one change to path-based imports, adds one
> import-related file type, and introduces a new module attribute.  One
> consequence of this PEP is the deprecation of ``.pth`` files.
>
>
> Motivation
> =========
>
> One of the problems with virtual environments is that you are likely to
> end up with duplicate installations of lots of common packages, and
> keeping them up to date can be a pain.
>
> One of the problems with archive-based distribution is that it can be
>

You say "One of the problems" at the start of 3/4 of the paragraphs in this
section. Variety is the spice of life. =) Try "Another problem is that for
archive-based", etc.


> tricky to register the archive as a Python path entry when needed
> without polluting the path of applications that don't need it.
>

How is this unique to archive-based distributions compared to any other
scenario where all distributions are blindly added to sys.path?


>
> One of the problems with working directly from a source checkout is
> getting the relevant source directories onto the Python path, especially
> when you have multiple namespace package fragments spread across several
> subdirectories of a large repository.
>
>
E.g., a source checkout for the coverage.py project might be stored in the
directory ``coveragepy``, but the actual source code is stored in
``coveragepy/coverage``, requiring ``coveragepy`` to be on sys.path in
order to access the package.


> The `current solutions <Existing Alternatives>`_ all have their flaws.
> Reference files are intended to address those deficiencies.
>
>
> Specification
> ===========
>
> Change to the Import System
> -----------------------------
>
> Currently, during `path-based import` of a module, the following happens
> for each `path entry` of `sys.path` or of the `__path__` of the module's
> parent:
>
> 1. look for `<path entry>/<name>/__init__.py` (and other supported
> suffixes),
>   * return `loader`;
> 2. look for `<path entry>/<name>.py` (and other supported suffixes),
>   * return loader;
> 3. look for `<path entry>/<name>/`,
>   * extend namespace portions path.
>
>
Please capitalize the first letter of each bullet point (here and the rest
of the PEP). Reads better since they are each separate sentences.


> Once the path is exhausted, if no `loader` was found and the `namespace
> portions` path is non-empty, then a `NamespaceLoader` is returned with that
> path.
>
> This proposal inserts a step before step 1 for each `path entry`:
>
> 0. look for `<path entry>/<name>.ref`
>

Why .ref? Why not .path?


>   a. get loader for `<fullname>` (absolute module name) using path found
> in `.ref` file (see below) using `the normal mechanism`[link to language
> reference],
>     * stop processing the path entry if `.ref` file is empty;
>

You should clarify how you plan to "get loader". You will have to find the
proper finder as well in case the .ref file references a zip file or
something which requires a different finder than the one which came across
the .ref file.


>   b. check for `NamespaceLoader`,
>     * extend namespace portions path;
>   c. otherwise, return loader.
>
> Note the following consequences:
>
> * if a ref file is found, it takes precedence over module files and
> package directories under the same path entry (see `Empty Ref Files as
> Markers`_);
> * that holds for empty ref files also;
> * the loader for a ref file, if any, comes from the full import system
> (i.e. `sys.meta_path`) rather than just the path-based import system;
> * `.ref` files can indirectly provide fragments for namespace packages.
>

This ramification for namespace packages makes the changed semantic proposal
a bit trickier than you are suggesting since you are essentially doing
recursive path entry search. And is that possible? If I have a .ref file
that refers to a path which itself has a .ref will that then lead to
another search? I mean it seems like you are going to be doing ``return
importlib.find_loader(fullname, paths_found_in_ref) if paths_found_in_ref
else None, []`` from within a finder which finds a .ref file, which itself
would support a recursive search.

Everything below should come before the import changes. It's hard to follow
what is really being proposed for semantics without knowing e.g. that .ref
files can have 0 or more paths and not just a single path, etc.


> Reference Files
> ---------------
>
> A new kind of file will live alongside package directories and module
> source files: reference files.  These files have the following
> characteristics:
>
> * named `<module name>.ref` in contrast to `<module name>.py` (etc.) or
> `<module name>/`;
> * placed under `sys.path` entries or package path (just like modules and
> packages).
>
> Reference File Format
> ----------------------
>
> The contents of a reference file will conform to the following format:
>
> * contain zero or more path entries, just like sys.path;
> * one path entry per line;
> * path entry order is preserved;
> * may contain comment lines starting with "#", which are ignored;
> * may contain blank lines, which are ignored;
> * must be UTF-8 encoded.
>
> Directory Path Entries
> ----------------------
>
> Directory names are by far the most common type of path entry.  Here is
> how they are constrained in reference files:
>
> * may be absolute or relative;
> * must be forward slash separated regardless of platform;
> * each must be the parent directory where the module will be looked for.
>
> To be clear, reference files (just like `sys.path`) deliberately reference
> the *parent* directory to be searched (rather than the module or package
> directory).  So they work transparently with `__pycache__` and allow
> searching for `.dist-info <PEP 376>`_ directories through them.
>
> Relative directory names will be resolved based on the directory
> containing the ref file, rather than the current working directory.
> Allowing relative directory names allows you to include sensible ref files
> in a source repo.
>
> Empty Ref Files as Markers
> -----------------------------
>
> Handling `.ref` files first allows for the use of empty ref files as
> markers to indicate "this is not the module you are looking for".  Here are
> two situations where that helps.
>

"Here" -> "There"


>
> First, an empty ref file helps resolve conflicts between script names and
> package names.  When the interpreter is started with a filename, the
> directory of that script is added to the front of `sys.path`.  This may be
> a problem for later imports where the intended module or package is on a
> regular path entry.
>
> If an import references the script's name, the file will get run again by
> the import system as a module (only `__main__` was added to `sys.modules`
> earlier) [PEP 395]_.  This is a further problem if you meant to import a
> module or package in another path entry.
>
> The presence of an empty ref file in the script's directory would
> essentially render it invisible to the import system.  This problem and
> solution apply for all of the files or directories in the script's
> directory.
>
> Second, the namespace package mechanism has a side-effect: a directory
> without a __init__.py may be incorrectly treated as a namespace package
> fragment.  The presence of an empty ref file indicates such a directory
> should be ignored.
>
> A Module Attribute to Expose Contributing Ref Files
> ---------------------------------------------
>
> Knowing the origin of a module is important when tracking down problems,
> particularly import-related ones.  Currently, that entails looking at
> `<module>.__file__` and `<module.__package__>.__path__` (or `sys.path`).
>
> With this PEP there can be a chain of ref files in between the currently
> available path and a module's __file__.  Having access to that list of ref
> files is important in order to determine why one file was selected over
> another as the origin for the module.  When an unexpected file gets used
> for one of your imports, you'll care about this!
>
> In order to facilitate that, modules will have a new attribute:
> `__indirect__`.  It will be a tuple comprised of the chain of ref files, in
> order, used to locate the module's __file__.  An empty tuple or with one
> item will be the most common case.  An empty tuple indicates that no ref
> files were used to locate the module.
>

This complicates things even further. How are you going to pass this info
along a call chain through find_loader()? Are we going to have to add
find_loader3() to support this (nasty side-effect of using tuples instead
of types.SimpleNamespace for the return value)? Some magic second value or
type from find_loader() which flags the values in the iterable are from a
.ref file and not any other possible place? This requires an API change and
there isn't any mention of how that would look or work.


>
> Examples
> --------
>
> XXX are these useful?
>

Yes if you change this to pip or setuptools and also make it so it
shows how you could point to version-specific distributions.


>
> Top-level module (`import spam`)::
>
>   ~/venvs/ham/python/site-packages/
>       spam.ref
>
>   spam.ref:
>       # use the system installed module
>       /python/site-packages
>
>   /python/site-packages:
>       spam.py
>
>   spam.__file__:
>       "/python/site-packages/spam.py"
>
>   spam.__indirect__:
>       ("~/venvs/ham/python/site-packages/spam.ref",)
>
> Submodule (`python -m myproject.tests`)::
>
>   ~/myproject/
>       setup.py
>       tests/
>           __init__.py
>           __main__.py
>       myproject/
>           __init__.py
>           tests.ref
>
>   tests.ref:
>       ../
>
>   myproject.__indirect__:
>       ()
>
>   myproject.tests.__file__:
>       "~/myproject/tests/__init__.py"
>
>   myproject.tests.__indirect__:
>       ("~/myproject/myproject/tests.ref",)
>
> Multiple Path Entries::
>
>   myproj/
>       __init__.py
>       mod.ref
>
>   mod.ref:
>       # fall back to the old one
>       /python/site-packages/mod-new/
>       /python/site-packages/mod-old/
>
>   /python/site-packages/
>       mod-old/
>           mod.py
>
>   myproj.mod.__file__:
>       "/python/site-packages/mod-old/mod.py"
>
>   myproj.mod.__indirect__:
>       ("myproj/mod.ref",)
>
> Chained Ref Files::
>
>   venvs/ham/python/site-packages/
>       spam.ref
>
>   venvs/ham/python/site-packages/spam.ref:
>       # use the system installed module
>       /python/site-packages
>
>   /python/site-packages/
>       spam.ref
>
>   /python/site-packages/spam.ref:
>       # use the clone
>       ~/clones/myproj/
>
>   ~/clones/myproj/
>       spam.py
>
>   spam.__file__:
>       "~/clones/myproj/spam.py"
>
>   spam.__indirect__:
>       ("venvs/ham/python/site-packages/spam.ref",
> "/python/site-packages/spam.ref")
>
> Reference Implementation
> ------------------------
>
> A reference implementation is available at <TBD>.
>
> XXX double-check zipimport support
>
>
> Deprecation of .pth Files
> =============================
>
> The `site` module facilitates the composition of `sys.path`.  As part of
> that, `.pth` files are processed and entries added to `sys.path`.  Ref
> files are intended as a replacement.
>
> XXX also deprecate .pkg files (see pkgutil.extend_path())?
>
> Consequently, `.pth` files will be deprecated.
>

Link to "Existing Alternatives" discussion as to why this deprecation is
desired.


>
> Deprecation Schedule
> -------------------------
>
> 1. documented: 3.4,
> 2. warnings: 3.5 and 3.6,
> 3. removal: 3.7
>
> XXX Deprecate sooner?
>
>
> Existing Alternatives
> =====================
>
> .pth Files
> ----------
>
> `*.pth` files have the problem that they're global: if you add them to
> `site-packages`, they will be processed at startup by *every* Python
> application run using that Python installation.
>

"... thanks to them being processed by the site module instead of by the
import system and individual finders."


> This is an undesirable side effect of the way `*.pth` processing is
> defined, but can't be changed due to backwards compatibility issues.
>
> Furthermore, `*.pth` files are processed at interpreter startup...
>

That's a moot point; .ref files can be as well if they are triggered as
part of an import.

A bigger concern is that they execute arbitrary Python code which could be
viewed as an unexpected security risk. Some might complain about the
difficulty then of loading non-standard importers, but that really should
be the duty of the code  performing the import and not the distribution
itself; IOW I would argue that it is up to the user to get things in line
to use a distribution in the format they choose to use it instead of the
distribution dictating how it should be bundled.


>
> .egg-link files
> ---------------
>
> `*.egg-link` files are much closer to the proposed `*.ref` files. The
> difference is that `*.egg-link` files are designed to work with
> `pkg_resources` and `distribution names`, while `*.ref` files are designed
> to work with package and module names as an automatic part of the import
> system.
>
> Symlinks
> ---------
>
> Actual symlinks have the problem that they aren't really practical on
> Windows, and also that they don't support non-versioned references to
> versioned `dist-info` directories.
>
> Design Alternatives
> ===================
>
> Ignore Empty Ref Files
> ----------------------
>
> An empty ref file would be ignored rather than effectively stopping the
> processing of the path entry.  This loses the benefits outlined above of
> empty ref files as markers.
>
> ImportError for Empty Ref Files
> -------------------------------
>
> An empty ref file would result in an ImportError.  The only benefit to
> this would be to disallow empty ref files and make it clear when they are
> encountered.
>
> Handle Ref Files After Namespace Packages
> -----------------------------------------
>
> Rather than handling ref files first, they could be handled last.  Thus
> they would have lower priority than namespace package fragments.  This
> would be insignificantly more backward compatible.  However, as with
> ignoring empty ref files, handling them last would prevent their use as
> markers for ignoring a path entry.
>
> Send Ref File Path Through Path Import System Only
> --------------------------------------------------
>
> As indicated above, the path entries in a ref file are passed back through
> the metapath finders to get the loader.  Instead we could use just the
> path-based import system.  This would prevent metapath finders from having
> a chance to handle the module under a different path.
>
> Restrict Ref File Path Entries to Directories
> ---------------------------------------------
>
> Rather than allowing anything for the path entries in a ref file, they
> could be restricted to just directories.  This is by far the common case.
> However, it would add complexity without any justification for not
> allowing metapath importers a chance at the module under a new path.
>
> Restrict Directories in Ref File Path Entries to Absolute
> ---------------------------------------------------------
>
> Directory path entries in ref files can be relative or absolute.  Limiting
> to just absolute directory names would be an artificial change to existing
> constraints on path entries without any justification.  Furthermore, it
> would prevent simple use of ref files in code bases relative to project
> roots.
>
>
> Future Extensions
> =================
>
> Longer term, we should also allow *versioned* `*.ref` files that can be
> used to reference modules and packages that aren't available for ordinary
> import (since they don't follow the "name.ref" format), but are available
> to tools like `pkg_resources` to handle parallel installs of different
> versions.
>
>
> References
> ==========
>
> .. [0] ...
>        ()
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
> ^L
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
>
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig
>
>

From ncoghlan at gmail.com  Sat Jul 20 09:32:01 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 20 Jul 2013 17:32:01 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
Message-ID: <CADiSq7f8HNOW0hzBqYr=r1a46aD8r-SWFEbaW7UCkUkc-p-FJg@mail.gmail.com>

On 20 July 2013 00:51, Brett Cannon <brett at python.org> wrote:
>> tricky to register the archive as a Python path entry when needed
>> without polluting the path of applications that don't need it.
>
> How is this unique to archive-based distributions compared to any other
> scenario where all distributions are blindly added to sys.path?

Other packages and modules are just *on* an already existing sys.path
entry. They're available, but unless you import them, they just sit
there on disk and don't bother you.

*.pth files in site-packages are different: *every* application run on
that Python installation (without -S) will have *new* entries added to
sys.path :(

>> One of the problems with working directly from a source checkout is
>> getting the relevant source directories onto the Python path, especially
>> when you have multiple namespace package fragments spread across several
>> subdirectories of a large repository.
>>
>
> E.g., a source checkout for the coverage.py project might be stored in the
> directory ``coveragepy``, but the actual source code is stored in
> ``coveragepy/coverage``, requiring ``coveragepy`` to be on sys.path in order
> to access the package.

Yep, exactly. With the PEP, you should be able to just do "echo
coveragepy > coverage.ref" in the current directory and the
interpreter will be able to follow the reference when you do "import
coverage", but it won't actually be added to sys.path.
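
A minimal sketch of how such a ref file might be read, following the
file format given in the draft (one path entry per line, "#" comments
and blank lines ignored, UTF-8, forward slashes, relative entries
resolved against the directory containing the ref file). The function
name read_ref_file is hypothetical, and treating every entry as a
filesystem path is an assumption made for illustration; the draft
allows any kind of path entry.

```python
import os


def read_ref_file(ref_path):
    """Return the path entries listed in a ref file, in file order."""
    # Relative entries are resolved against the directory holding the ref
    # file, not the current working directory.
    base = os.path.dirname(os.path.abspath(ref_path))
    entries = []
    with open(ref_path, encoding="utf-8") as ref_file:
        for line in ref_file:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # blank lines and comments are ignored
            # Entries are written with forward slashes on every platform.
            entry = line.replace("/", os.sep)
            if not os.path.isabs(entry):
                entry = os.path.join(base, entry)
            entries.append(os.path.normpath(entry))
    return entries
```

An empty ref file yields an empty list, which is the "marker" case the
draft describes.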

A more complex example is the source layout for Beaker, where we have
separate Common, Client, Server, LabController and IntegrationTest
source directories that all contribute submodules to a shared "bkr"
namespace package. Getting those all on your path in a source checkout
currently means plenty of PYTHONPATH manipulation and various helper
scripts.

Because it's a namespace package, we can't do this with symlinks - a
"bkr" symlink could only reference one of the source directories, and
we have five. With ref files (and using Python 3.4+, which may happen
some day!), we'd be able to designate a recommended working directory
and put a single "bkr.ref" file under source control that included the
appropriate relative paths to make all the components available.

>> Once the path is exhausted, if no `loader` was found and the `namespace
>> portions` path is non-empty, then a `NamespaceLoader` is returned with that
>> path.
>>
>> This proposal inserts a step before step 1 for each `path entry`:
>>
>> 0. look for `<path entry>/<name>.ref`
>
>
> Why .ref? Why not .path?

I think of this proposal in general as "indirect imports", but the
notion of a "path reference" is what inspired the specific extension
(Eric just copied it from my original email).

I didn't actually consider other possible extensions (since I was
happy with .ref and it was the first one I thought of), but I think
.path in particular would be awfully confusing, since I (and others)
already refer to .pth files as "dot path" files.


>
> Everything below should come before the import changes. It's hard to follow
> what is really being proposed for semantics without knowing e.g. .ref files
> can have 0 or more paths and just a single path, etc.

The proposal is indeed for what is effectively full blown recursive
path search, starting again from the top with sys.meta_path :)

The only new trick we should need is something to track which path
entries we have already considered so we can prevent recursive loops.

I don't know how Eric handled it in his draft implementation, but
here's the kind of thing I would do:

1. Define an IndirectReference type that can be returned instead of a loader
2. FileImporter would be aware of the new type, and if it sees it
   instead of a loader:

   - extracts the name of the reference file (to append to __indirect__)
   - extracts the subpath (for the recursive descent)
   - removes any previously seen path segments (including any from the
     original path)
   - if there are paths left, triggers the recursive search for a loader
     (I was thinking at the sys.path_hooks level, but Eric suggested using
     sys.meta_path instead)
   - treats the result of the recursive search the same as it would a
     search of a single path segment
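
The loop-prevention part of the steps above can be sketched with a toy
model. Everything here is hypothetical: the function name resolve, the
dict standing in for ref files on disk, and the set standing in for
directories that contain the module. It is a plain recursive function
rather than the IndirectReference approach, and it assumes that a ref
file which leads nowhere simply causes the search to move on to the
next path entry (the draft does not spell that case out).

```python
def resolve(name, path, ref_files, modules, seen=None, chain=()):
    """Follow ref files for `name` along `path`, skipping seen entries.

    ref_files maps a directory to the path entries listed in its
    <name>.ref file; modules is the set of directories holding <name>.py.
    Returns (directory, chain_of_ref_files), or None if nothing is found.
    """
    if seen is None:
        seen = set()
    for entry in path:
        if entry in seen:
            continue  # already considered: this is what breaks loops
        seen.add(entry)
        if entry in ref_files:
            ref_name = f"{entry}/{name}.ref"
            found = resolve(name, ref_files[entry], ref_files, modules,
                            seen, chain + (ref_name,))
            if found is not None:
                return found
            # The ref file takes precedence over a module in this entry,
            # so do not fall through to the module check.
            continue
        if entry in modules:
            return entry, chain
    return None


# The "Chained Ref Files" example from the draft.
refs = {
    "venvs/ham/python/site-packages": ["/python/site-packages"],
    "/python/site-packages": ["~/clones/myproj"],
}
print(resolve("spam", ["venvs/ham/python/site-packages"], refs,
              {"~/clones/myproj"}))
```

The returned chain is the data that would end up in __indirect__, which
is why the same bookkeeping can serve both purposes.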

I'm not currently sure what I would do about __indirect__ if the
result was a namespace package. In that case, you really want it to be
a mapping from path entries to the indirections that located them (if
any), which suggests we may want it to be a mapping whenever
indirection occurs (from __file__ to the indirection chain for the
simple case), or None if no indirection was involved in loading the
module.

You couldn't have the loader handle the recursion itself, or you'd
have trouble getting state from the previous loader to the next one
down. Thus the recursive->iterative transformation through the
"IndirectReference" type.

(Again, though, this is just me thinking out loud - Eric's the one
that created an actual draft implementation)


>> In order to facilitate that, modules will have a new attribute:
>> `__indirect__`.  It will be a tuple comprised of the chain of ref files, in
>> order, used to locate the module's __file__.  An empty tuple or a tuple
>> with one item will be the most common case.  An empty tuple indicates that
>> no ref files were used to locate the module.
>
> This complicates things even further. How are you going to pass this info
> along a call chain through find_loader()? Are we going to have to add
> find_loader3() to support this (nasty side-effect of using tuples instead of
> types.SimpleNamespace for the return value)? Some magic second value or type
> from find_loader() which flags the values in the iterable are from a .ref
> file and not any other possible place? This requires an API change and there
> isn't any mention of how that would look or work.

Good question. This was a last minute addition just before Eric posted
the draft. I still think it's something we should try to offer, and I
suspect whatever mechanism is put in place to prevent recursive loops
should be able to handle propagating this information (as described in
my sketch above).

>> This is an undesirable side effect of the way `*.pth` processing is
>> defined, but can't be changed due to backwards compatibility issues.
>>
>> Furthermore, `*.pth` files are processed at interpreter startup...
>
>
> That's a moot point; .ref files can be as well if they are triggered as part
> of an import.

The difference is that ref files will only be triggered for modules
you actually import. *Every* .pth file in site-packages is processed
at interpreter startup, with startup time implications for all Python
code run on that system.

> A bigger concern is that they execute arbitrary Python code which could be
> viewed as an unexpected security risk. Some might complain about the
> difficulty then of loading non-standard importers, but that really should be
> the duty of the code  performing the import and not the distribution itself;
> IOW I would argue that it is up to the user to get things in line to use a
> distribution in the format they choose to use it instead of the distribution
> dictating how it should be bundled.

Agreed, we should mention this (it's one of the reasons Linux distros
aren't fond of *.pth files).

We should also note that, unlike *.pth files, *.ref files would work
even with the "-S" switch, since they don't rely on the site module
making additions to sys.path.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Sat Jul 20 15:55:00 2013
From: brett at python.org (Brett Cannon)
Date: Sat, 20 Jul 2013 09:55:00 -0400
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CADiSq7f8HNOW0hzBqYr=r1a46aD8r-SWFEbaW7UCkUkc-p-FJg@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
	<CADiSq7f8HNOW0hzBqYr=r1a46aD8r-SWFEbaW7UCkUkc-p-FJg@mail.gmail.com>
Message-ID: <CAP1=2W4f7aQRKz-JhSc+yxb19LqaWy36G3Sd3ygSc4ZtuOYfSA@mail.gmail.com>

On Sat, Jul 20, 2013 at 3:32 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 20 July 2013 00:51, Brett Cannon <brett at python.org> wrote:
> >> tricky to register the archive as a Python path entry when needed
> >> without polluting the path of applications that don't need it.
> >
> > How is this unique to archive-based distributions compared to any other
> > scenario where all distributions are blindly added to sys.path?
>
> Other packages and modules are just *on* an already existing sys.path
> entry. They're available, but unless you import them, they just sit
> there on disk and don't bother you.
>
> *.pth files in site-packages are different: *every* application run on
> that Python installation (without -S) will have *new* entries added to
> sys.path :(
>
> >> One of the problems with working directly from a source checkout is
> >> getting the relevant source directories onto the Python path, especially
> >> when you have multiple namespace package fragments spread across several
> >> subdirectories of a large repository.
> >>
> >
> > E.g., a source checkout for the coverage.py project might be stored in
> > the directory ``coveragepy``, but the actual source code is stored in
> > ``coveragepy/coverage``, requiring ``coveragepy`` to be on sys.path in
> > order to access the package.
>
> Yep, exactly. With the PEP, you should be able to just do "echo
> coveragepy > coverage.ref" in the current directory and the
> interpreter will be able to follow the reference when you do "import
> coverage", but it won't actually be added to sys.path.
>
> A more complex example is the source layout for Beaker, where we have
> separate Common, Client, Server, LabController and IntegrationTest
> source directories that all contribute submodules to a shared "bkr"
> namespace package. Getting those all on your path in a source checkout
> currently means plenty of PYTHONPATH manipulation and various helper
> scripts.
>
> Because it's a namespace package, we can't do this with symlinks - a
> "bkr" symlink could only reference one of the source directories, and
> we have five. With ref files (and using Python 3.4+, which may happen
> some day!), we'd be able to designate a recommended working directory
> and put a single "bkr.ref" file under source control that included the
> appropriate relative paths to make all the components available.
>
> >> Once the path is exhausted, if no `loader` was found and the `namespace
> >> portions` path is non-empty, then a `NamespaceLoader` is returned with
> >> that path.
> >>
> >> This proposal inserts a step before step 1 for each `path entry`:
> >>
> >> 0. look for `<path entry>/<name>.ref`
> >
> >
> > Why .ref? Why not .path?
>
> I think of this proposal in general as "indirect imports", but the
> notion of a "path reference" is what inspired the specific extension
> (Eric just copied it from my original email).
>
> I didn't actually consider other possible extensions (since I was
> happy with .ref and it was the first one I thought of), but I think
> .path in particular would be awfully confusing, since I (and others)
> already refer to .pth files as "dot path" files.
>
>
> >
> > Everything below should come before the import changes. It's hard to
> > follow what is really being proposed for semantics without knowing e.g.
> > .ref files can have 0 or more paths and just a single path, etc.
>
> The proposal is indeed for what is effectively full blown recursive
> path search, starting again from the top with sys.meta_path :)
>
> The only new trick we should need is something to track which path
> entries we have already considered so we can prevent recursive loops.
>
> I don't know how Eric handled it in his draft implementation, but
> here's the kind of thing I would do:
>
> 1. Define an IndirectReference type that can be returned instead of a
> loader
>

Type-dependent logic is always tough to get past python-dev. And
obviously this is special to sys.meta_path (ignoring __indirect__).


> 2. FileImporter would be aware of the new type, and if it sees it
> instead of a loader:
>
> - extracts the name of the reference file (to append to __indirect__)
> - extracts the subpath (for the recursive descent)
> - removes any previously seen path segments (including any from the
> original path)
> - if there are paths left, triggers the recursive search for a loader
> (I was thinking at the sys.path_hooks level, but Eric suggested using
> sys.meta_path instead)
>

I was thinking sys.path as well because otherwise it becomes much more
complicated. If you leave out the whole __indirect__ bit you are simply
using .ref files to call find_loader() and then either accumulating the
namespace packages that are returned or an actual loader. But with
sys.meta_path you drop out of namespace package world and are one level up
where the ability to handle that accumulation of paths goes away (thus your
IndirectReference idea).


> - treat the result of the recursive search the same as you would a
> search of a single path segment
>
> I'm not currently sure what I would do about __indirect__ if the
> result was a namespace package. In that case, you really want it to be
> a mapping from path entries to the indirections that located them (if
> any), which suggests we may want it to be a mapping whenever
> indirection occurs (from __file__ to the indirection chain for the
> simple case), or None if no indirection was involved in loading the
> module.
>

Could you do it post-import? I mean the process of finding the .ref files
is the same, so you could just look for the files, accumulate the list, and
then see which ones ended up on __path__ and see "this is where these came
from". That does away with any potentially nasty API changes to make
__indirect__ work.
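
A rough sketch of that post-import idea, under stated assumptions: the
function name indirections_for and its arguments are hypothetical, only
one level of ref file is handled (no chains), and paths are treated as
plain forward-slash strings. It matches the entries on a package's
__path__ against the parent directories listed in the ref files that
were found.

```python
import posixpath


def indirections_for(name, package_path, ref_entries):
    """Map each __path__ entry to the ref file that listed its parent.

    ref_entries maps a ref file name to the path entries it contains.
    Portions that did not come from a ref file map to None.
    """
    origin = {}
    for ref_file, entries in ref_entries.items():
        for entry in entries:
            # A ref file lists parent directories, so the portion that
            # ends up on __path__ is <entry>/<name>.
            origin.setdefault(posixpath.join(entry, name), ref_file)
    return {portion: origin.get(portion) for portion in package_path}
```

The result has the mapping shape discussed above for namespace
packages: path entries as keys, with the indirection (if any) that
located each one as the value.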


>
> You couldn't have the loader handle the recursion itself, or you'd
> have trouble getting state from the previous loader to the next one
> down. Thus the recursive->iterative transformation through the
> "IndirectReference" type.
>
> (Again, though, this is just me thinking out loud - Eric's the one
> that created an actual draft implementation)
>
>
> >> In order to facilitate that, modules will have a new attribute:
> >> `__indirect__`.  It will be a tuple comprised of the chain of ref
> >> files, in order, used to locate the module's __file__.  An empty tuple
> >> or a tuple with one item will be the most common case.  An empty tuple
> >> indicates that no ref files were used to locate the module.
> >
> > This complicates things even further. How are you going to pass this info
> > along a call chain through find_loader()? Are we going to have to add
> > find_loader3() to support this (nasty side-effect of using tuples
> > instead of types.SimpleNamespace for the return value)? Some magic second
> > value or type from find_loader() which flags the values in the iterable
> > are from a .ref file and not any other possible place? This requires an
> > API change and there isn't any mention of how that would look or work.
>
> Good question. This was a last minute addition just before Eric posted
> the draft. I still think it's something we should try to offer, and I
> suspect whatever mechanism is put in place to prevent recursive loops
> should be able to handle propagating this information (as described in
> my sketch above).
>

Sure, I'm not saying that it wouldn't be useful, I'm just wondering how it
would be pulled off without yet another change in the finder API (I'm
really coming to wish we returned a types.SimpleNamespace instead of a
tuple for find_loader()).

-Brett


>
> >> This is an undesirable side effect of the way `*.pth` processing is
> >> defined, but can't be changed due to backwards compatibility issues.
> >>
> >> Furthermore, `*.pth` files are processed at interpreter startup...
> >
> >
> > That's a moot point; .ref files can be as well if they are triggered as
> > part of an import.
>
> The difference is that ref files will only be triggered for modules
> you actually import. *Every* .pth file in site-packages is processed
> at interpreter startup, with startup time implications for all Python
> code run on that system.
>
> > A bigger concern is that they execute arbitrary Python code which could
> > be viewed as an unexpected security risk. Some might complain about the
> > difficulty then of loading non-standard importers, but that really
> > should be the duty of the code performing the import and not the
> > distribution itself; IOW I would argue that it is up to the user to get
> > things in line to use a distribution in the format they choose to use it
> > instead of the distribution dictating how it should be bundled.
>
> Agreed, we should mention this (it's one of the reasons Linux distros
> aren't fond of *.pth files).
>
> We should also note that, unlike *.pth files, *.ref files would work
> even with the "-S" switch, since they don't rely on the site module
> making additions to sys.path.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>

From ncoghlan at gmail.com  Sat Jul 20 17:38:39 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 21 Jul 2013 01:38:39 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CAP1=2W4f7aQRKz-JhSc+yxb19LqaWy36G3Sd3ygSc4ZtuOYfSA@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
	<CADiSq7f8HNOW0hzBqYr=r1a46aD8r-SWFEbaW7UCkUkc-p-FJg@mail.gmail.com>
	<CAP1=2W4f7aQRKz-JhSc+yxb19LqaWy36G3Sd3ygSc4ZtuOYfSA@mail.gmail.com>
Message-ID: <CADiSq7fWM6AgWVj3W=mnzqWM6o6Ukt+Gr1AXXNPZg1SGAcqO6w@mail.gmail.com>

On 20 July 2013 23:55, Brett Cannon <brett at python.org> wrote:
> On Sat, Jul 20, 2013 at 3:32 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I'm not currently sure what I would do about __indirect__ if the
>> result was a namespace package. In that case, you really want it to be
>> a mapping from path entries to the indirections that located them (if
>> any), which suggests we may want it to be a mapping whenever
>> indirection occurs (from __file__ to the indirection chain for the
>> simple case), or None if no indirection was involved in loading the
>> module.
>
> Could you do it post-import? I mean the process of finding the .ref files is
> the same, so you could just look for the files, accumulate the list, and
> then see which ones ended up on __path__ and see "this is where these came
> from". That does away with any potentially nasty API changes to make
> __indirect__ work.

I'm not going to think about it much more until Eric has a chance to
chime in with how he solved the reference loop detection problem for
his draft implementation. I still suspect that whatever mechanism
handles the loop detection will be able to accumulate the __indirect__
entries as it goes (regardless of the exact data structure we end up
using).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ericsnowcurrently at gmail.com  Tue Jul 30 05:11:59 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 29 Jul 2013 21:11:59 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
Message-ID: <CALFfu7AQo20kjw6j9gsrQr7mi1z0ukbDox5Fxt0OEJJtMhGyHg@mail.gmail.com>

Sorry for the delay, all.  I totally forgot about an upcoming trip and just
got back.  It'll be a day or two before I have a minute to respond.

-eric


On Thu, Jul 18, 2013 at 4:10 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> Hi,
>
> Nick talked me into writing this PEP, so blame him for the idea. <wink>  I
> haven't had a chance to polish it up, but the content communicates the
> proposal well enough to post here.  Let me know what you think.  Once some
> consensus is reached I'll commit the PEP and post to python-dev.  I have a
> rough implementation that I'll put online when I get a chance.
>
> If Guido is not interested maybe Brett would like to be BDFL-Delegate. :)
>
> -eric
>
>
> PEP: 4XX
> Title: Per-Module Import Path
> Version: $Revision$
> Last-Modified: $Date$
> Author: Eric Snow <ericsnowcurrently at gmail.com>
>         Nick Coghlan <ncoghlan at gmail.com>
> BDFL-Delegate: ???
> Discussions-To: import-sig at python.org
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 17-Jul-2013
> Python-Version: 3.4
> Post-History: 18-Jul-2013
> Resolution:
>
>
> Abstract
> ========
>
> Path-based import of a module or package involves traversing ``sys.path``
> or a package path to locate the appropriate file or directory(s).
> Redirecting from there to other locations is useful for packaging and
> for virtual environments.  However, in practice such redirection is
> currently either `limited or fragile <Existing Alternatives>`_.
>
> This proposal provides a simple filesystem-based method to redirect from
> the normal module search path to other locations recognized by the
> import system.  This involves one change to path-based imports, adds one
> import-related file type, and introduces a new module attribute.  One
> consequence of this PEP is the deprecation of ``.pth`` files.
>
>
> Motivation
> ==========
>
> One of the problems with virtual environments is that you are likely to
> end up with duplicate installations of lots of common packages, and
> keeping them up to date can be a pain.
>
> One of the problems with archive-based distribution is that it can be
> tricky to register the archive as a Python path entry when needed
> without polluting the path of applications that don't need it.
>
> One of the problems with working directly from a source checkout is
> getting the relevant source directories onto the Python path, especially
> when you have multiple namespace package fragments spread across several
> subdirectories of a large repository.
>
> The `current solutions <Existing Alternatives>`_ all have their flaws.
> Reference files are intended to address those deficiencies.
>
>
> Specification
> =============
>
> Change to the Import System
> -----------------------------
>
> Currently, during `path-based import` of a module, the following happens
> for each `path entry` of `sys.path` or of the `__path__` of the module's
> parent:
>
> 1. look for `<path entry>/<name>/__init__.py` (and other supported
>    suffixes),
>
>    * return `loader`;
>
> 2. look for `<path entry>/<name>.py` (and other supported suffixes),
>
>    * return `loader`;
>
> 3. look for `<path entry>/<name>/`,
>
>    * extend namespace portions path.
>
> Once the path is exhausted, if no `loader` was found and the `namespace
> portions` path is non-empty, then a `NamespaceLoader` is returned with that
> path.
>
> This proposal inserts a step before step 1 for each `path entry`:
>
> 0. look for `<path entry>/<name>.ref`
>
>    a. get loader for `<fullname>` (absolute module name) using path found
>       in `.ref` file (see below) using `the normal mechanism`[link to
>       language reference],
>
>       * stop processing the path entry if `.ref` file is empty;
>
>    b. check for `NamespaceLoader`,
>
>       * extend namespace portions path;
>
>    c. otherwise, return loader.
>
> Note the following consequences:
>
> * if a ref file is found, it takes precedence over module files and
>   package directories under the same path entry (see `Empty Ref Files as
>   Markers`_);
> * that holds for empty ref files also;
> * the loader for a ref file, if any, comes from the full import system
>   (i.e. `sys.meta_path`) rather than just the path-based import system;
> * `.ref` files can indirectly provide fragments for namespace packages.
>
> Reference Files
> ---------------
>
> A new kind of file will live alongside package directories and module
> source files: reference files.  These files have the following
> characteristics:
>
> * named `<module name>.ref` in contrast to `<module name>.py` (etc.) or
>   `<module name>/`;
> * placed under `sys.path` entries or package path (just like modules and
>   packages).
>
> Reference File Format
> ----------------------
>
> The contents of a reference file will conform to the following format:
>
> * contain zero or more path entries, just like sys.path;
> * one path entry per line;
> * path entry order is preserved;
> * may contain comment lines starting with "#", which are ignored;
> * may contain blank lines, which are ignored;
> * must be UTF-8 encoded.
>
> Directory Path Entries
> ----------------------
>
> Directory names are by far the most common type of path entry.  Here is
> how they are constrained in reference files:
>
> * may be absolute or relative;
> * must be forward slash separated regardless of platform;
> * each must be the parent directory where the module will be looked for.
>
> To be clear, reference files (just like `sys.path`) deliberately reference
> the *parent* directory to be searched (rather than the module or package
> directory).  So they work transparently with `__pycache__` and allow
> searching for `.dist-info <PEP 376>`_ directories through them.
>
> Relative directory names will be resolved based on the directory
> containing the ref file, rather than the current working directory.
> Allowing relative directory names lets you include sensible ref files
> in a source repo.
>
> Empty Ref Files as Markers
> -----------------------------
>
> Handling `.ref` files first allows for the use of empty ref files as
> markers to indicate "this is not the module you are looking for".  Here are
> two situations where that helps.
>
> First, an empty ref file helps resolve conflicts between script names and
> package names.  When the interpreter is started with a filename, the
> directory of that script is added to the front of `sys.path`.  This may be
> a problem for later imports where the intended module or package is on a
> regular path entry.
>
> If an import references the script's name, the file will get run again by
> the import system as a module (only `__main__` was added to `sys.modules`
> earlier) [PEP 395]_.  This is a further problem if you meant to import a
> module or package in another path entry.
>
> The presence of an empty ref file in the script's directory would
> essentially render it invisible to the import system.  This problem and
> solution apply for all of the files or directories in the script's
> directory.
>
> Second, the namespace package mechanism has a side-effect: a directory
> without an __init__.py may be incorrectly treated as a namespace package
> fragment.  The presence of an empty ref file indicates such a directory
> should be ignored.
>
> A Module Attribute to Expose Contributing Ref Files
> ---------------------------------------------------
>
> Knowing the origin of a module is important when tracking down problems,
> particularly import-related ones.  Currently, that entails looking at
> `<module>.__file__` and `<module.__package__>.__path__` (or `sys.path`).
>
> With this PEP there can be a chain of ref files in between the currently
> available path and a module's __file__.  Having access to that list of ref
> files is important in order to determine why one file was selected over
> another as the origin for the module.  When an unexpected file gets used
> for one of your imports, you'll care about this!
>
> In order to facilitate that, modules will have a new attribute:
> `__indirect__`.  It will be a tuple comprising the chain of ref files, in
> order, used to locate the module's __file__.  A tuple that is empty or has
> one item will be the most common case.  An empty tuple indicates that no
> ref files were used to locate the module.
>
> Examples
> --------
>
> XXX are these useful?
>
> Top-level module (`import spam`)::
>
>   ~/venvs/ham/python/site-packages/
>       spam.ref
>
>   spam.ref:
>       # use the system installed module
>       /python/site-packages
>
>   /python/site-packages:
>       spam.py
>
>   spam.__file__:
>       "/python/site-packages/spam.py"
>
>   spam.__indirect__:
>       ("~/venvs/ham/python/site-packages/spam.ref",)
>
> Submodule (`python -m myproject.tests`)::
>
>   ~/myproject/
>       setup.py
>       tests/
>           __init__.py
>           __main__.py
>       myproject/
>           __init__.py
>           tests.ref
>
>   tests.ref:
>       ../
>
>   myproject.__indirect__:
>       ()
>
>   myproject.tests.__file__:
>       "~/myproject/tests/__init__.py"
>
>   myproject.tests.__indirect__:
>       ("~/myproject/myproject/tests.ref",)
>
> Multiple Path Entries::
>
>   myproj/
>       __init__.py
>       mod.ref
>
>   mod.ref:
>       # fall back to the old one
>       /python/site-packages/mod-new/
>       /python/site-packages/mod-old/
>
>   /python/site-packages/
>       mod-old/
>           mod.py
>
>   myproj.mod.__file__:
>       "/python/site-packages/mod-old/mod.py"
>
>   myproj.mod.__indirect__:
>       ("myproj/mod.ref",)
>
> Chained Ref Files::
>
>   venvs/ham/python/site-packages/
>       spam.ref
>
>   venvs/ham/python/site-packages/spam.ref:
>       # use the system installed module
>       /python/site-packages
>
>   /python/site-packages/
>       spam.ref
>
>   /python/site-packages/spam.ref:
>       # use the clone
>       ~/clones/myproj/
>
>   ~/clones/myproj/
>       spam.py
>
>   spam.__file__:
>       "~/clones/myproj/spam.py"
>
>   spam.__indirect__:
>       ("venvs/ham/python/site-packages/spam.ref",
>        "/python/site-packages/spam.ref")
>
> Reference Implementation
> ------------------------
>
> A reference implementation is available at <TBD>.
>
> XXX double-check zipimport support
>
>
> Deprecation of .pth Files
> =============================
>
> The `site` module facilitates the composition of `sys.path`.  As part of
> that, `.pth` files are processed and entries added to `sys.path`.  Ref
> files are intended as a replacement.
>
> XXX also deprecate .pkg files (see pkgutil.extend_path())?
>
> Consequently, `.pth` files will be deprecated.
>
> Deprecation Schedule
> -------------------------
>
> 1. documented: 3.4,
> 2. warnings: 3.5 and 3.6,
> 3. removal: 3.7
>
> XXX Deprecate sooner?
>
>
> Existing Alternatives
> =====================
>
> .pth Files
> ----------
>
> `*.pth` files have the problem that they're global: if you add them to
> `site-packages`, they will be processed at startup by *every* Python
> application run using that Python installation. This is an undesirable side
> effect of the way `*.pth` processing is defined, but can't be changed due
> to backwards compatibility issues.
>
> Furthermore, `*.pth` files are processed at interpreter startup...
>
> .egg-link files
> ---------------
>
> `*.egg-link` files are much closer to the proposed `*.ref` files. The
> difference is that `*.egg-link` files are designed to work with
> `pkg_resources` and distribution names, while `*.ref` files are designed
> to work with package and module names as an automatic part of the import
> system.
>
> Symlinks
> ---------
>
> Actual symlinks have the problem that they aren't really practical on
> Windows, and also that they don't support non-versioned references to
> versioned `dist-info` directories.
>
> Design Alternatives
> ===================
>
> Ignore Empty Ref Files
> ----------------------
>
> An empty ref file would be ignored rather than effectively stopping the
> processing of the path entry.  This loses the benefits outlined above of
> empty ref files as markers.
>
> ImportError for Empty Ref Files
> -------------------------------
>
> An empty ref file would result in an ImportError.  The only benefit to
> this would be to disallow empty ref files and make it clear when they are
> encountered.
>
> Handle Ref Files After Namespace Packages
> -----------------------------------------
>
> Rather than handling ref files first, they could be handled last.  Thus
> they would have lower priority than namespace package fragments.  This
> would be marginally more backward compatible.  However, as with
> ignoring empty ref files, handling them last would prevent their use as
> markers for ignoring a path entry.
>
> Send Ref File Path Through Path Import System Only
> --------------------------------------------------
>
> As indicated above, the path entries in a ref file are passed back through
> the metapath finders to get the loader.  Instead we could use just the
> path-based import system.  This would prevent metapath finders from having
> a chance to handle the module under a different path.
>
> Restrict Ref File Path Entries to Directories
> ---------------------------------------------
>
> Rather than allowing anything for the path entries in a ref file, they
> could be restricted to just directories.  This is by far the most common
> case.  However, it would add complexity without any justification for not
> allowing metapath importers a chance at the module under a new path.
>
> Restrict Directories in Ref File Path Entries to Absolute
> ---------------------------------------------------------
>
> Directory path entries in ref files can be relative or absolute.  Limiting
> to just absolute directory names would be an artificial change to existing
> constraints on path entries without any justification.  Furthermore, it
> would prevent simple use of ref files in code bases relative to project
> roots.
>
>
> Future Extensions
> =================
>
> Longer term, we should also allow *versioned* `*.ref` files that can be
> used to reference modules and packages that aren't available for ordinary
> import (since they don't follow the "name.ref" format), but are available
> to tools like `pkg_resources` to handle parallel installs of different
> versions.
>
>
> References
> ==========
>
> .. [0] ...
>        ()
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
> ^L
> ..
>    Local Variables:
>    mode: indented-text
>    indent-tabs-mode: nil
>    sentence-end-double-space: t
>    fill-column: 70
>    coding: utf-8
>    End:
>

From ericsnowcurrently at gmail.com  Wed Jul 31 08:41:20 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 31 Jul 2013 00:41:20 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
Message-ID: <CALFfu7CJxRUH-Cc3c6oYeZFN9=R5txRoug1OUEnNApo5UhnvwQ@mail.gmail.com>

On Fri, Jul 19, 2013 at 8:51 AM, Brett Cannon <brett at python.org> wrote:

> If this can lead to the deprecation of .pth files then I support the idea,
> but I think there are technical issues in terms of implementation that have
> not been thought through yet. This is going to require an implementation
> (even if it isn't in importlib._bootstrap but as a subclass of
> importlib.machinery.FileFinder or something) to see how you plan to make
> all of this work before this PEP can move beyond this SIG.
>

I'll address any outstanding concerns separately and update the PEP
pursuant to outstanding recommendations.  In the meantime, below is a rough
draft of the implementation.  We can factor out any artificial complexity
I've introduced <wink/>, but it should reflect the approach that came to
mind first for me.  You'll notice that I call _find_module() directly (for
sys.meta_path traversal).

As already noted, the whole issue of how to populate __indirect__ is
tricky, both in how to feed it to the loader and how it should look for
namespace packages.  I just stuck a placeholder in there for the moment.
The rest of the implementation is pretty trivial in comparison.  However
it does reflect my approach to handling cycles and to aggregating the list
for __indirect__.

I'll make it more clear in the PEP that refpath resolution is
depth-first--a consequence of doing normal loader lookup.  This means that
in the face of a cycle the normal package/module/ns package handling
happens rather than acting like there was an empty .ref file (but only if
no other path entries in the .ref file pan out first).  Would it be better
to treat this case the same as an empty .ref file?

-eric

diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
--- a/Lib/importlib/_bootstrap.py
+++ b/Lib/importlib/_bootstrap.py
@@ -1338,7 +1338,9 @@
         sys.path_importer_cache."""
         if path is None:
             path = sys.path
-        loader, namespace_path = cls._get_loader(fullname, path)
+        loader, namespace_path, indirect = cls._get_loader(fullname, path)
+        # XXX What to do with indirect?
+        # XXX How to handle __indirect__ for namespace packages?
         if loader is not None:
             return loader
         else:
@@ -1372,6 +1374,7 @@
         self._path_mtime = -1
         self._path_cache = set()
         self._relaxed_path_cache = set()
+        self._visited_refnames = {}

     def invalidate_caches(self):
         """Invalidate the directory mtime."""
@@ -1379,6 +1382,25 @@

     find_module = _find_module_shim

+    def process_ref_file(self, fullname, refname):
+        """Return the path specified in a .ref file."""
+        path = []
+        with open(refname, encoding='UTF-8') as reffile:
+            for line in reffile:
+                # XXX Be more lenient on leading/trailing whitespace?
+                line = line.strip()
+                # ignore comments and blank lines
+                if line.startswith("#"):
+                    continue
+                if not line:
+                    continue
+                # resolve the target
+                if not line.startswith("/"):
+                    line = self.path + "/" + line
+                target = os.path.join(*line.split("/"))
+                path.append(target)
+        return path
+
     def find_loader(self, fullname):
         """Try to find a loader for the specified module, or the namespace
         package portions. Returns (loader, list-of-portions)."""
@@ -1398,6 +1420,29 @@
         else:
             cache = self._path_cache
             cache_module = tail_module
+        # Handle indirection files first.
+        if fullname not in self._visited_refnames:
+            indirect = []
+            visited = set()
+            self._visited_refnames[fullname] = (indirect, visited)
+        else:
+            indirect, visited = self._visited_refnames[fullname]
+        refname = _path_join(self.path, tail_module) + '.ref'
+        if refname not in visited and refname in cache:
+            visited.add(refname)
+            indirect.append(refname)
+            refpath = self.process_ref_file(fullname, refname)
+            if not refpath:
+                # An empty ref file is a marker for skipping this path entry.
+                indirect.pop()
+                return (None, [])
+            loader = _find_module(fullname, refpath)
+            if isinstance(loader, NamespaceLoader):
+                return (None, loader._path, indirect)
+            else:
+                return (loader, [], indirect)
+        if indirect and indirect[0] == refname:
+            del self._visited_refnames[fullname]
         # Check if the module is the name of a directory (and thus a package).
         if cache_module in cache:
             base_path = _path_join(self.path, tail_module)
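
To make the parsing step easier to try outside of importlib, here is a
minimal standalone sketch of what process_ref_file() is meant to do.  The
name read_ref_file() is made up for illustration and is not part of the
patch.  One deliberate difference: on POSIX the draft's
os.path.join(*line.split("/")) drops the leading separator of an absolute
entry, so the sketch converts separators with str.replace() instead.  How
absolute entries should be recognized on Windows is left open here.

```python
import os


def read_ref_file(refname):
    """Return the path entries listed in a .ref file, in file order."""
    # Relative entries resolve against the directory holding the ref file,
    # not against the current working directory.
    base = os.path.dirname(os.path.abspath(refname))
    entries = []
    with open(refname, encoding="utf-8") as reffile:
        for line in reffile:
            line = line.strip()
            # Blank lines and comment lines are ignored.
            if not line or line.startswith("#"):
                continue
            # Entries are forward-slash separated regardless of platform.
            is_absolute = line.startswith("/")  # assumption: POSIX-style test
            native = line.replace("/", os.sep)
            if not is_absolute:
                native = os.path.join(base, native)
            entries.append(os.path.normpath(native))
    return entries
```

An empty list corresponds to the "empty ref file" marker case.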

From brett at python.org  Wed Jul 31 14:29:37 2013
From: brett at python.org (Brett Cannon)
Date: Wed, 31 Jul 2013 08:29:37 -0400
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CALFfu7CJxRUH-Cc3c6oYeZFN9=R5txRoug1OUEnNApo5UhnvwQ@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
	<CALFfu7CJxRUH-Cc3c6oYeZFN9=R5txRoug1OUEnNApo5UhnvwQ@mail.gmail.com>
Message-ID: <CAP1=2W4BuwOe1OxRg8t7jZfTEiRHP204QoQ8tOaVo8OadXMu1g@mail.gmail.com>

On Wed, Jul 31, 2013 at 2:41 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:

>
>
>
> On Fri, Jul 19, 2013 at 8:51 AM, Brett Cannon <brett at python.org> wrote:
>
>> If this can lead to the deprecation of .pth files then I support the
>> idea, but I think there are technical issues in terms of implementation
>> that have not been thought through yet. This is going to require an
>> implementation (even if it isn't in importlib._bootstrap but as a subclass
>> of importlib.machinery.FileFinder or something) to see how you plan to make
>> all of this work before this PEP can move beyond this SIG.
>>
>
> I'll address any outstanding concerns separately and update the PEP
> pursuant to outstanding recommendations.  In the meantime, below is a rough
> draft of the implementation.  We can factor out any artificial complexity
> I've introduced <wink/>, but it should reflect the approach that came to
> mind first for me.  You'll notice that I call _find_module() directly (for
> sys.meta_path traversal).
>
> As already noted, the whole issue of how to populate __indirect__ is
> tricky, both in how to feed it to the loader and how it should look for
> namespace packages.  I just stuck a placeholder in there for the moment.
>

You also changed the return value signature of find_loader() which is
really cheating since you can't do that (I'm really kicking myself for not
thinking through the implications of returning a tuple for find_loader()).


> The rest of the implementation is pretty trivial in comparison.  However
> it does reflect my approach to handling cycles and to aggregating the list
> for __indirect__.
>
> I'll make it more clear in the PEP that refpath resolution is
> depth-first--a consequence of doing normal loader lookup.  This means that
> in the face of a cycle the normal package/module/ns package handling
> happens rather than acting like there was an empty .ref file (but only if
> no other path entries in the .ref file pan out first).  Would it be better
> to treat this case the same as an empty .ref file?
>

I would argue a cycle is an error and you should raise ImportError.

-Brett


>
> -eric
>
>
> diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
> --- a/Lib/importlib/_bootstrap.py
> +++ b/Lib/importlib/_bootstrap.py
> @@ -1338,7 +1338,9 @@
>          sys.path_importer_cache."""
>          if path is None:
>              path = sys.path
> -        loader, namespace_path = cls._get_loader(fullname, path)
> +        loader, namespace_path, indirect = cls._get_loader(fullname, path)
> +        # XXX What to do with indirect?
> +        # XXX How to handle __indirect__ for namespace packages?
>          if loader is not None:
>              return loader
>          else:
> @@ -1372,6 +1374,7 @@
>          self._path_mtime = -1
>          self._path_cache = set()
>          self._relaxed_path_cache = set()
> +        self._visited_refnames = {}
>
>      def invalidate_caches(self):
>          """Invalidate the directory mtime."""
> @@ -1379,6 +1382,25 @@
>
>      find_module = _find_module_shim
>
> +    def process_ref_file(self, fullname, refname):
> +        """Return the path specified in a .ref file."""
> +        path = []
> +        with open(refname, encoding='UTF-8') as reffile:
> +            for line in reffile:
> +                # XXX Be more lenient on leading/trailing whitespace?
> +                line = line.strip()
> +                # ignore comments and blank lines
> +                if line.startswith("#"):
> +                    continue
> +                if not line:
> +                    continue
> +                # resolve the target
> +                if not line.startswith("/"):
> +                    line = self.path + "/" + line
> +                target = os.path.join(*line.split("/"))
> +                path.append(target)
> +        return path
> +
>      def find_loader(self, fullname):
>          """Try to find a loader for the specified module, or the namespace
>          package portions. Returns (loader, list-of-portions)."""
> @@ -1398,6 +1420,29 @@
>          else:
>              cache = self._path_cache
>              cache_module = tail_module
> +        # Handle indirection files first.
> +        if fullname not in self._visited_refnames:
> +            indirect = []
> +            visited = set()
> +            self._visited_refnames[fullname] = (indirect, visited)
> +        else:
> +            indirect, visited = self._visited_refnames[fullname]
> +        refname = _path_join(self.path, tail_module) + '.ref'
> +        if refname not in visited and refname in cache:
> +            visited.add(refname)
> +            indirect.append(refname)
> +            refpath = self.process_ref_file(fullname, refname)
> +            if not refpath:
> +                # An empty ref file is a marker for skipping this path entry.
> +                indirect.pop()
> +                return (None, [])
> +            loader = _find_module(fullname, refpath)
> +            if isinstance(loader, NamespaceLoader):
> +                return (None, loader._path, indirect)
> +            else:
> +                return (loader, [], indirect)
> +        if indirect and indirect[0] == refname:
> +            del self._visited_refnames[fullname]
>          # Check if the module is the name of a directory (and thus a package).
>          if cache_module in cache:
>              base_path = _path_join(self.path, tail_module)
>

From ericsnowcurrently at gmail.com  Wed Jul 31 22:40:00 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 31 Jul 2013 14:40:00 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CAP1=2W4BuwOe1OxRg8t7jZfTEiRHP204QoQ8tOaVo8OadXMu1g@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
	<CALFfu7CJxRUH-Cc3c6oYeZFN9=R5txRoug1OUEnNApo5UhnvwQ@mail.gmail.com>
	<CAP1=2W4BuwOe1OxRg8t7jZfTEiRHP204QoQ8tOaVo8OadXMu1g@mail.gmail.com>
Message-ID: <CALFfu7BuCgM1b-AAo8NKaaGD8vWSTA_=c1Sgq4WRSh85vZ3Crg@mail.gmail.com>

On Wed, Jul 31, 2013 at 6:29 AM, Brett Cannon <brett at python.org> wrote:

> On Wed, Jul 31, 2013 at 2:41 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>
>> I'll make it more clear in the PEP that refpath resolution is
>> depth-first--a consequence of doing normal loader lookup.  This means that
>> in the face of a cycle the normal package/module/ns package handling
>> happens rather than acting like there was an empty .ref file (but only if
>> no other path entries in the .ref file pan out first).  Would it be better
>> to treat this case the same as an empty .ref file?
>>
>
> I would argue a cycle is an error and you should raise ImportError.
>

I'll make a point of this.  An ImportError makes sense.  Either way I'll
provide some rationale.
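
Depth-first resolution with a cycle treated as an error could look roughly
like the following.  This is a filesystem-only sketch, not the importlib
patch: locate() and _read_entries() are hypothetical helpers, only plain
<name>.py modules are considered, and the returned chain of ref files
stands in for what __indirect__ would hold.

```python
import os


def _read_entries(refname):
    """Return the normalized path entries listed in a ref file."""
    base = os.path.dirname(refname)
    with open(refname, encoding="utf-8") as reffile:
        lines = [line.strip() for line in reffile]
    return [os.path.normpath(os.path.join(base, line.replace("/", os.sep)))
            for line in lines if line and not line.startswith("#")]


def locate(name, path, chain=()):
    """Search path depth-first, following <name>.ref files.

    Returns (filename, chain_of_ref_files), or None if nothing was found.
    Raises ImportError if a ref file is reached a second time.
    """
    for entry in path:
        entry = os.path.normpath(entry)
        refname = os.path.join(entry, name + ".ref")
        if os.path.isfile(refname):
            if refname in chain:
                raise ImportError(
                    "cycle in ref files: " + " -> ".join(chain + (refname,)),
                    name=name)
            targets = _read_entries(refname)
            if targets:
                found = locate(name, targets, chain + (refname,))
                if found is not None:
                    return found
            # A ref file (empty or not) takes precedence over anything
            # else under this path entry, so move on to the next entry.
            continue
        candidate = os.path.join(entry, name + ".py")
        if os.path.isfile(candidate):
            return candidate, chain
    return None
```

Because the chain is threaded through the recursion, the same tuple serves
both for cycle detection and for reporting which ref files contributed.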

-eric

From ericsnowcurrently at gmail.com  Wed Jul 31 22:37:38 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 31 Jul 2013 14:37:38 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: <CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
References: <CALFfu7C6QoG6YEiV0C7sk_Hrc_yFZ3D=gL=9Y6DpF5PkumtHvg@mail.gmail.com>
	<CAP1=2W5duL+e+zdF9bnN2QtO0LsXhkDLz8VY-mMCad=Qqn13TA@mail.gmail.com>
Message-ID: <CALFfu7A6qYrEiwoNT9dLTR1CR+t7o7R8BgKGEt3OQMeGxYhjbA@mail.gmail.com>

On Fri, Jul 19, 2013 at 8:51 AM, Brett Cannon <brett at python.org> wrote:

> If this can lead to the deprecation of .pth files then I support the idea,
> but I think there are technical issues in terms of implementation that have
> not been thought through yet. This is going to require an implementation
> (even if it isn't in importlib._bootstrap but as a subclass of
> importlib.machinery.FileFinder or something) to see how you plan to make
> all of this work before this PEP can move beyond this SIG.
>
>
> On Thu, Jul 18, 2013 at 6:10 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>
>> Hi,
>>
>> Nick talked me into writing this PEP, so blame him for the idea. <wink>
>> I haven't had a chance to polish it up, but the content communicates the
>> proposal well enough to post here.  Let me know what you think.  Once some
>> consensus is reached I'll commit the PEP and post to python-dev.  I have a
>> rough implementation that I'll put online when I get a chance.
>>
>> If Guido is not interested maybe Brett would like to be BDFL-Delegate. :)
>>
>> -eric
>>
>>
>> PEP: 4XX
>> Title: Per-Module Import Path
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Eric Snow <ericsnowcurrently at gmail.com>
>>         Nick Coghlan <ncoghlan at gmail.com>
>> BDFL-Delegate: ???
>> Discussions-To: import-sig at python.org
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 17-Jul-2013
>> Python-Version: 3.4
>> Post-History: 18-Jul-2013
>> Resolution:
>>
>>
>> Abstract
>> ========
>>
>> Path-based import of a module or package involves traversing ``sys.path``
>> or a package path to locate the appropriate file or directory(s).
>> Redirecting from there to other locations is useful for packaging and
>> for virtual environments.  However, in practice such redirection is
>> currently either `limited or fragile <Existing Alternatives>`_.
>>
>> This proposal provides a simple filesystem-based method to redirect from
>> the normal module search path to other locations recognized by the
>> import system.  This involves one change to path-based imports, adds one
>> import-related file type, and introduces a new module attribute.  One
>> consequence of this PEP is the deprecation of ``.pth`` files.
>>
>>
>> Motivation
>> ==========
>>
>> One of the problems with virtual environments is that you are likely to
>> end up with duplicate installations of lots of common packages, and
>> keeping them up to date can be a pain.
>>
>> One of the problems with archive-based distribution is that it can be
>>
>
> You say "One of the problems" at the start of 3/4 of the paragraphs in
> this section. Variety is the spice of life. =) Try "Another problem is that
> for archive-based", etc.
>

This is basically a copy-and-paste from what Nick originally wrote, but I
kind of liked the enumeration feel of it.  However I'm not attached to it.
:)


>
>> tricky to register the archive as a Python path entry when needed
>> without polluting the path of applications that don't need it.
>>
>
> How is this unique to archive-based distributions compared to any other
> scenario where all distributions are blindly added to sys.path?
>

Thanks to Nick for explaining.  I'll update the PEP to be more clear on
this, since it's a pretty critical point of the proposal.


>
>
>>
>> One of the problems with working directly from a source checkout is
>> getting the relevant source directories onto the Python path, especially
>> when you have multiple namespace package fragments spread across several
>> subdirectories of a large repository.
>>
>>
> E.g., a source checkout for the coverage.py project might be stored in the
> directory ``coveragepy``, but the actual source code is stored in
> ``coveragepy/coverage``, requiring ``coveragepy`` to be on sys.path in
> order to access the package.
>

That's a good example.  I'll add it to the PEP.  I'm also considering at
least a variation on the Beaker example Nick gave.


>
>
>> The `current solutions <Existing Alternatives>`_ all have their flaws.
>> Reference files are intended to address those deficiencies.
>>
>>
>> Specification
>> =============
>>
>> Change to the Import System
>> -----------------------------
>>
>> Currently, during `path-based import` of a module, the following happens
>> for each `path entry` of `sys.path` or of the `__path__` of the module's
>> parent:
>>
>> 1. look for `<path entry>/<name>/__init__.py` (and other supported
>> suffixes),
>>   * return `loader`;
>> 2. look for `<path entry>/<name>.py` (and other supported suffixes),
>>   * return loader;
>> 3. look for `<path entry>/<name>/`,
>>   * extend namespace portions path.
>>
>>
> Please capitalize the first letter of each bullet point (here and the rest
> of the PEP). Reads better since they are each separate sentences.
>

I wrote it in the way that felt most natural structurally, but I'll admit
that I haven't brushed up on my MLA recommendations in many years. :)
I'll change it.


>
>> Once the path is exhausted, if no `loader` was found and the `namespace
>> portions` path is non-empty, then a `NamespaceLoader` is returned with that
>> path.
>>
>> This proposal inserts a step before step 1 for each `path entry`:
>>
>> 0. look for `<path entry>/<name>.ref`
>>
>
> Why .ref? Why not .path?
>

It doesn't matter to me too much, though I agree with Nick that .path is
too close to .pth.  Perhaps .pyp or .pypath would work.  (I believe .pyp
was also associated with another, albeit rejected, import-related PEP.)
Having the suffix clearly imply the purpose would be nice.  Regardless,
I'll be sure to add more rationale and considered alternatives for the
suffix.


>
>>   a. get loader for `<fullname>` (absolute module name) using path found
>> in `.ref` file (see below) using `the normal mechanism`[link to language
>> reference],
>>     * stop processing the path entry if `.ref` file is empty;
>>
>
> You should clarify how you plan to "get loader". You will have to find the
> proper finder as well in case the .ref file references a zip file or
> something which requires a different finder than the one which came across
> the .ref file.
>

Hopefully this part is more clear.  I plan on elaborating on the
relationship of the PEP with importlib and FileFinder specifically.  I will
explain there how we get the loader (importlib.find_loader) and why
we recurse against sys.meta_path rather than just sys.path.
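
As a rough illustration of recursing against sys.meta_path, the sketch
below asks every meta path finder for the module while restricting the
search to the entries taken from a ref file.  It uses find_spec(), the
finder API in current Python versions, rather than the
find_loader()/_find_module() calls in the draft patch, and
find_spec_via_meta_path() is a made-up name.

```python
import sys


def find_spec_via_meta_path(fullname, refpath):
    """Ask each sys.meta_path finder for fullname, searching only refpath."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            # Skip legacy finders that lack the find_spec() API.
            continue
        spec = find_spec(fullname, refpath)
        if spec is not None:
            return spec
    return None
```

Going through every finder, instead of only the path-based one, is what
gives non-filesystem importers a chance to handle the module under the
redirected path.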


>
>>   b. check for `NamespaceLoader`,
>>     * extend namespace portions path;
>>   c. otherwise, return loader.
>>
>> Note the following consequences:
>>
>> * if a ref file is found, it takes precedence over module files and
>> package directories under the same path entry (see `Empty Ref Files as
>> Markers`_);
>> * that holds for empty ref files also;
>> * the loader for a ref file, if any, comes from the full import system
>> (i.e. `sys.meta_path`) rather than just the path-based import system;
>> * `.ref` files can indirectly provide fragments for namespace packages.
>>
>
> This ramification for namespace packages makes the changed semantic
> proposal a bit trickier than you are suggesting since you are essentially
> doing recursive path entry search. And is that possible? If I have a .ref
> file that refers to a path which itself has a .ref will that then lead to
> another search? I mean it seems like you are going to be doing ``return
> importlib.find_loader(fullname, paths_found_in_ref) if paths_found_in_ref
> else None, []`` from within a finder which finds a .ref file, which itself
> would support a recursive search.
>

Hopefully my rough patch clarified this a bit.  As long as special-casing
NamespaceLoader and using its _path attribute are okay, namespace packages
should be fine.  I expect that there aren't any side effects from just
getting a NamespaceLoader instance from importlib.find_loader(), but I'll
double-check.  I'll probably also propose renaming NamespaceLoader's
"_path" attribute to "path" (the "public" version).

Regardless, I plan on having more explanation on nested .ref files, just so
there's no confusion.
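
On the NamespaceLoader special-casing mentioned above: as far as I can
tell, on recent CPython versions the same check can be made without
touching NamespaceLoader._path, because the spec for a namespace package
has no origin and carries its portions in submodule_search_locations.  A
small sketch relying on that observed behavior, with namespace_portions()
as a hypothetical helper:

```python
import importlib.machinery


def namespace_portions(spec):
    """Return the portions of a namespace package spec, else None."""
    if spec is None:
        return None
    # Assumption: a namespace package spec has origin None and a
    # non-None submodule_search_locations; regular packages have an
    # origin pointing at their __init__ file.
    if spec.origin is None and spec.submodule_search_locations is not None:
        return list(spec.submodule_search_locations)
    return None


def find_in_refpath(fullname, refpath):
    """Path-based lookup of fullname restricted to refpath (sketch)."""
    return importlib.machinery.PathFinder.find_spec(fullname, refpath)
```

A finder handling a ref file could extend its own list of portions with
the result instead of returning a loader.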


> Everything below should come before the import changes. It's hard to
> follow what is really being proposed for semantics without knowing e.g. .ref
> files can have 0 or more paths and not just a single path, etc.
>

Makes sense.  I'll fix this.


>
>
>> Reference Files
>> ---------------
>>
>> A new kind of file will live alongside package directories and module
>> source files: reference files.  These files have the following
>> characteristics:
>>
>> * named `<module name>.ref` in contrast to `<module name>.py` (etc.) or
>> `<module name>/`;
>> * placed under `sys.path` entries or package path (just like modules and
>> packages).
>>
>> Reference File Format
>> ----------------------
>>
>> The contents of a reference file will conform to the following format:
>>
>> * contain zero or more path entries, just like sys.path;
>> * one path entry per line;
>> * path entry order is preserved;
>> * may contain comment lines starting with "#", which are ignored;
>> * may contain blank lines, which are ignored;
>> * must be UTF-8 encoded.
>>
>> Directory Path Entries
>> ----------------------
>>
>> Directory names are by far the most common type of path entry.  Here is
>> how they are constrained in reference files:
>>
>> * may be absolute or relative;
>> * must be forward slash separated regardless of platform;
>> * each must be the parent directory where the module will be looked for.
>>
>> To be clear, reference files (just like `sys.path`) deliberately
>> reference the *parent* directory to be searched (rather than the module or
>> package directory).  So they work transparently with `__pycache__` and
>> allow searching for `.dist-info <PEP 376>`_ directories through them.
>>
>> Relative directory names will be resolved based on the directory
>> containing the ref file, rather than the current working directory.
>> Supporting relative directory names lets you include sensible ref files
>> in a source repo.
>>
>> Empty Ref Files as Markers
>> -----------------------------
>>
>> Handling `.ref` files first allows for the use of empty ref files as
>> markers to indicate "this is not the module you are looking for".  Here are
>> two situations where that helps.
>>
>
> "Here" -> "There"
>

Here.  There.  It's somewhere, right? <wink>


>
>
>>
>> First, an empty ref file helps resolve conflicts between script names and
>> package names.  When the interpreter is started with a filename, the
>> directory of that script is added to the front of `sys.path`.  This may be
>> a problem for later imports where the intended module or package is on a
>> regular path entry.
>>
>> If an import references the script's name, the file will get run again by
>> the import system as a module (only `__main__` was added to `sys.modules`
>> earlier) [PEP 395]_.  This is a further problem if you meant to import a
>> module or package in another path entry.
>>
>> The presence of an empty ref file in the script's directory would
>> essentially render it invisible to the import system.  This problem and
>> solution apply for all of the files or directories in the script's
>> directory.
>>
>> Second, the namespace package mechanism has a side-effect: a directory
>> without an `__init__.py` may be incorrectly treated as a namespace package
>> fragment.  The presence of an empty ref file indicates such a directory
>> should be ignored.
>>
>> A Module Attribute to Expose Contributing Ref Files
>> ---------------------------------------------------
>>
>> Knowing the origin of a module is important when tracking down problems,
>> particularly import-related ones.  Currently, that entails looking at
>> `<module>.__file__` and `<module.__package__>.__path__` (or `sys.path`).
>>
>> With this PEP there can be a chain of ref files in between the currently
>> available path and a module's __file__.  Having access to that list of ref
>> files is important in order to determine why one file was selected over
>> another as the origin for the module.  When an unexpected file gets used
>> for one of your imports, you'll care about this!
>>
>> In order to facilitate that, modules will have a new attribute:
>> `__indirect__`.  It will be a tuple comprising the chain of ref files, in
>> order, used to locate the module's `__file__`.  A tuple that is empty or
>> has one item will be the most common case.  An empty tuple indicates that
>> no ref files were used to locate the module.
>>
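
To make the chain concrete, here is a rough sketch of how a lookup could accumulate `__indirect__` while following ref files.  Everything here is illustrative: `find_module_file` and `_read_entries` are hypothetical names, only `.py` files and `__init__.py` packages are considered, and two behaviours are assumptions rather than settled parts of the proposal: a ref file that yields nothing causes the rest of that path entry to be skipped, and a ref file already in the chain is not followed again.

```python
import os
import tempfile


def _read_entries(ref_path):
    """Return the path entries of a ref file, resolved against its directory."""
    base = os.path.dirname(ref_path)
    with open(ref_path, encoding="utf-8") as f:
        lines = [line.strip() for line in f]
    return [os.path.normpath(os.path.join(base, line))
            for line in lines if line and not line.startswith("#")]


def find_module_file(name, path, indirect=()):
    """Return (filename, indirect) for `name`, or None if nothing is found."""
    for entry in path:
        ref = os.path.join(entry, name + ".ref")
        if os.path.isfile(ref):
            # Ref files are handled before anything else in the entry.
            if ref not in indirect:  # assumption: guard against cycles
                found = find_module_file(name, _read_entries(ref),
                                         indirect + (ref,))
                if found is not None:
                    return found
            # Assumption: an empty (or unproductive) ref file hides the
            # rest of this path entry, so move on to the next one.
            continue
        for candidate in (os.path.join(entry, name + ".py"),
                          os.path.join(entry, name, "__init__.py")):
            if os.path.isfile(candidate):
                return candidate, indirect
    return None
```

With the "Chained Ref Files" layout from the Examples section, this would return the clone's `spam.py` together with a two-item tuple naming both ref files in the order they were followed.
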
>
> This complicates things even further. How are you going to pass this info
> along a call chain through find_loader()? Are we going to have to add
> find_loader3() to support this (nasty side-effect of using tuples instead
> of types.SimpleNamespace for the return value)? Some magic second value or
> type from find_loader() which flags the values in the iterable are from a
> .ref file and not any other possible place? This requires an API change and
> there isn't any mention of how that would look or work.
>

This is the big open question in my mind.  I suppose having find_loader()
return a SimpleNamespace would help.  Then the indirect path we aggregate
in find_loader() could be passed as a new argument to loaders (when
instantiated in either FileFinder.find_loader() or
PathFinder.find_module()).

Here are the options I see, some more realistic than others:

1. Build __indirect__ after the fact (in init_module_attrs()?).
2. Change FileFinder.find_loader() to return a types.SimpleNamespace
instance.
3. Change FileFinder.find_loader() to return a namedtuple subclass with an
extra "loader" attribute.
4. Piggy-back the indirect path on the loader returned by
FileFinder.find_loader() in an "_indirect" attribute (or in the loader spot
in the case of namespace packages).
5. Something along the lines of Nick's IndirectReference.
6. Wrap the loader in a proxy that also sets __indirect__ when
load_module() is called.
7. Totally refactor the import system so that ModuleSpec objects are passed
to metapath finders rather than (name, path) and simply store the indirect
path on the spec (which is used directly to load the module rather than the
loader).

4 feels too much like a hack, particularly when we have other options.  7
would need a PEP of its own (forthcoming <wink>).

I see 2 as the best one.  Is it really too late to change the return type
of FileFinder.find_loader()? If we simply can't bear the backward
compatibility risk (no matter how small <wink>), I'd advocate for one of 1,
3, 5, or 6.

I haven't looked closely at what it would take, but I have a feeling that 1
would be more tricky than the others.  3, 5, and 6 would work, but at the
cost of increased complexity and all that it entails.
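
For what it's worth, option 2 might look something like the following.  This is only a sketch of the idea, not a proposed API: `make_find_result` and `apply_find_result` are hypothetical names, and the `indirect` field is the new piece of data under discussion.  The existing `(loader, portions)` pair is kept as attributes so a third value can ride along without another tuple slot.

```python
import types


def make_find_result(loader=None, portions=(), indirect=()):
    """Bundle what find_loader() reports into a namespace (hypothetical)."""
    return types.SimpleNamespace(
        loader=loader,              # same meaning as the first tuple item today
        portions=list(portions),    # same meaning as the second tuple item today
        indirect=tuple(indirect),   # new: chain of ref files that were followed
    )


def apply_find_result(module, result):
    """Copy the relevant fields onto a module object (hypothetical)."""
    module.__loader__ = result.loader
    module.__indirect__ = result.indirect
    return module
```

The cost is the one already noted: callers that unpack the result as a 2-tuple would break, since a `SimpleNamespace` is not iterable.
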


>
>>
>> Examples
>> --------
>>
>> XXX are these useful?
>>
>
> Yes if you change this to pip or setuptools and also make it so it
> shows how you could point to version-specific distributions.
>

Good point.  I'll change these to have a more practical focus.


>
>
>>
>> Top-level module (`import spam`)::
>>
>>   ~/venvs/ham/python/site-packages/
>>       spam.ref
>>
>>   spam.ref:
>>       # use the system installed module
>>       /python/site-packages
>>
>>   /python/site-packages:
>>       spam.py
>>
>>   spam.__file__:
>>       "/python/site-packages/spam.py"
>>
>>   spam.__indirect__:
>>       ("~/venvs/ham/python/site-packages/spam.ref",)
>>
>> Submodule (`python -m myproject.tests`)::
>>
>>   ~/myproject/
>>       setup.py
>>       tests/
>>           __init__.py
>>           __main__.py
>>       myproject/
>>           __init__.py
>>           tests.ref
>>
>>   tests.ref:
>>       ../
>>
>>   myproject.__indirect__:
>>       ()
>>
>>   myproject.tests.__file__:
>>       "~/myproject/tests/__init__.py"
>>
>>   myproject.tests.__indirect__:
>>       ("~/myproject/myproject/tests.ref",)
>>
>> Multiple Path Entries::
>>
>>   myproj/
>>       __init__.py
>>       mod.ref
>>
>>   mod.ref:
>>       # fall back to the old one
>>       /python/site-packages/mod-new/
>>       /python/site-packages/mod-old/
>>
>>   /python/site-packages/
>>       mod-old/
>>           mod.py
>>
>>   myproj.mod.__file__:
>>       "/python/site-packages/mod-old/mod.py"
>>
>>   myproj.mod.__indirect__:
>>       ("myproj/mod.ref",)
>>
>> Chained Ref Files::
>>
>>   venvs/ham/python/site-packages/
>>       spam.ref
>>
>>   venvs/ham/python/site-packages/spam.ref:
>>       # use the system installed module
>>       /python/site-packages
>>
>>   /python/site-packages/
>>       spam.ref
>>
>>   /python/site-packages/spam.ref:
>>       # use the clone
>>       ~/clones/myproj/
>>
>>   ~/clones/myproj/
>>       spam.py
>>
>>   spam.__file__:
>>       "~/clones/myproj/spam.py"
>>
>>   spam.__indirect__:
>>       ("venvs/ham/python/site-packages/spam.ref",
>> "/python/site-packages/spam.ref")
>>
>> Reference Implementation
>> ------------------------
>>
>> A reference implementation is available at <TBD>.
>>
>> XXX double-check zipimport support
>>
>>
>> Deprecation of .pth Files
>> =============================
>>
>> The `site` module facilitates the composition of `sys.path`.  As part of
>> that, `.pth` files are processed and entries added to `sys.path`.  Ref
>> files are intended as a replacement.
>>
>> XXX also deprecate .pkg files (see pkgutil.extend_path())?
>>
>> Consequently, `.pth` files will be deprecated.
>>
>
> Link to "Existing Alternatives" discussion as to why this deprecation is
> desired.
>

Sounds good.


>
>
>>
>> Deprecation Schedule
>> -------------------------
>>
>> 1. documented: 3.4,
>> 2. warnings: 3.5 and 3.6,
>> 3. removal: 3.7
>>
>> XXX Deprecate sooner?
>>
>>
>> Existing Alternatives
>> =====================
>>
>> .pth Files
>> ----------
>>
>> `*.pth` files have the problem that they're global: if you add them to
>> `site-packages`, they will be processed at startup by *every* Python
>> application run using that Python installation.
>>
>
> "... thanks to them being processed by the site module instead of by the
> import system and individual finders."
>

Okay.


>
>
>> This is an undesirable side effect of the way `*.pth` processing is
>> defined, but can't be changed due to backwards compatibility issues.
>>
>> Furthermore, `*.pth` files are processed at interpreter startup...
>>
>
> That's a moot point; .ref files can be as well if they are triggered as
> part of an import.
>

Nick covered this pretty well.  I will update the PEP to make this more
clear.


>
> A bigger concern is that they execute arbitrary Python code which could be
> viewed as an unexpected security risk. Some might complain about the
> difficulty then of loading non-standard importers, but that really should
> be the duty of the code  performing the import and not the distribution
> itself; IOW I would argue that it is up to the user to get things in line
> to use a distribution in the format they choose to use it instead of the
> distribution dictating how it should be bundled.
>

Yeah, this is definitely a big selling point for getting rid of .pth.


>
>
>>
>> .egg-link files
>> ---------------
>>
>> `*.egg-link` files are much closer to the proposed `*.ref` files. The
>> difference is that `*.egg-link` files are designed to work with
>> `pkg_resources` and `distribution names`, while `*.ref` files are designed
>> to work with package and module names as an automatic part of the import
>> system.
>>
>> Symlinks
>> ---------
>>
>> Actual symlinks have the problem that they aren't really practical on
>> Windows, and also that they don't support non-versioned references to
>> versioned `dist-info` directories.
>>
>> Design Alternatives
>> ===================
>>
>> Ignore Empty Ref Files
>> ----------------------
>>
>> An empty ref file would be ignored rather than effectively stopping the
>> processing of the path entry.  This loses the benefits outlined above of
>> empty ref files as markers.
>>
>> ImportError for Empty Ref Files
>> -------------------------------
>>
>> An empty ref file would result in an ImportError.  The only benefit to
>> this would be to disallow empty ref files and make it clear when they are
>> encountered.
>>
>> Handle Ref Files After Namespace Packages
>> -----------------------------------------
>>
>> Rather than handling ref files first, they could be handled last.  Thus
>> they would have lower priority than namespace package fragments.  This
>> would be insignificantly more backward compatible.  However, as with
>> ignoring empty ref files, handling them last would prevent their use as
>> markers for ignoring a path entry.
>>
>> Send Ref File Path Through Path Import System Only
>> --------------------------------------------------
>>
>> As indicated above, the path entries in a ref file are passed back
>> through the metapath finders to get the loader.  Instead we could use just
>> the path-based import system.  This would prevent metapath finders from
>> having a chance to handle the module under a different path.
>>
>> Restrict Ref File Path Entries to Directories
>> ---------------------------------------------
>>
>> Rather than allowing anything for the path entries in a ref file, they
>> could be restricted to just directories.  This is by far the most common
>> case.  However, it would add complexity without any justification for not
>> allowing metapath importers a chance at the module under a new path.
>>
>> Restrict Directories in Ref File Path Entries to Absolute
>> ---------------------------------------------------------
>>
>> Directory path entries in ref files can be relative or absolute.
>> Limiting to just absolute directory names would be an artificial change to
>> existing constraints on path entries without any justification.
>> Furthermore, it would prevent simple use of ref files in code bases
>> relative to project roots.
>>
>>
>> Future Extensions
>> =================
>>
>> Longer term, we should also allow *versioned* `*.ref` files that can be
>> used to reference modules and packages that aren't available for ordinary
>> import (since they don't follow the "name.ref" format), but are available
>> to tools like `pkg_resources` to handle parallel installs of different
>> versions.
>>
>>
>> References
>> ==========
>>
>> .. [0] ...
>>        ()
>>
>>
>> Copyright
>> =========
>>
>> This document has been placed in the public domain.
>>
>> ^L
>> ..
>>    Local Variables:
>>    mode: indented-text
>>    indent-tabs-mode: nil
>>    sentence-end-double-space: t
>>    fill-column: 70
>>    coding: utf-8
>>    End:
>>
>

On Sat, Jul 20, 2013 at 1:32 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> We should also note that, unlike *.pth files, *.ref files would work
> even with the "-S" switch, since they don't rely on the site module
> making additions to sys.path.

Good point.  I'll add a note.

-eric