[Import-SIG] PEP proposal: Per-Module Import Path

Wed Jul 31 08:41:20 CEST 2013

On Fri, Jul 19, 2013 at 8:51 AM, Brett Cannon <brett at python.org> wrote:

> If this can lead to the deprecation of .pth files then I support the idea,
> but I think there are technical issues in terms of implementation that have
> not been throught through yet. This is going to require an implementation
> (even if it isn't in importlib._bootstrap but as a subclass of
> importlib.machinery.FileFinder or something) to see how you plan to make
> all of this work before this PEP can move beyond this SIG.
>

I'll address any outstanding concerns separately and update the PEP
pursuant to outstanding recommendations.  In the meantime, below is a rough
draft of the implementation.  We can factor out any artificial complexity
I've introduced <wink/>, but it should reflect the approach that came to
mind first for me.  You'll notice that I call _find_module() directly (for
sys.meta_path traversal).

As already noted, the whole issue of how to populate __indirect__ is
tricky, both in how to feed it to the loader and how it should look for
namespace packages.  I just stuck a placeholder in there for the moment.
 The rest of the implementation is pretty trivial in comparison.  However
it does reflect my approach to handling cycles and to aggregating the list
for __indirect__.

I'll make it more clear in the PEP that refpath resolution is
depth-first--a consequence of doing normal loader lookup.  This means that
in the face of a cycle the normal package/module/ns package handling
happens rather than acting like there was an empty .ref file (but only if
no other path entries in the .ref file pan out first).  Would it be better
to treat this case the same as an empty .ref file?

-eric

diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
--- a/Lib/importlib/_bootstrap.py
+++ b/Lib/importlib/_bootstrap.py
@@ -1338,7 +1338,9 @@
         sys.path_importer_cache."""
         if path is None:
             path = sys.path
-        loader, namespace_path = cls._get_loader(fullname, path)
+        loader, namespace_path, indirect = cls._get_loader(fullname, path)
+        # XXX What to do with indirect?
+        # XXX How to handle __indirect__ for namespace packages?
         if loader is not None:
             return loader
         else:
@@ -1372,6 +1374,7 @@
         self._path_mtime = -1
         self._path_cache = set()
         self._relaxed_path_cache = set()
+        self._visited_refnames = {}

     def invalidate_caches(self):
         """Invalidate the directory mtime."""
@@ -1379,6 +1382,25 @@

     find_module = _find_module_shim

+    def process_ref_file(self, fullname, refname):
+        """Return the path specified in a .ref file."""
+        path = []
+        with open(refname, encoding='UTF-8') as reffile:
+            for line in reffile:
+                # XXX Be more lenient on leading/trailing whitespace?
+                line = line.strip()
+                # ignore comments and blank lines
+                if line.startswith("#"):
+                    continue
+                if not line:
+                    continue
+                # resolve the target
+                if not line.startswith("/"):
+                    line = self.path + "/" + line
+                target = os.path.join(*line.split("/"))
+                path.append(target)
+        return path
+
     def find_loader(self, fullname):
         """Try to find a loader for the specified module, or the namespace
         package portions. Returns (loader, list-of-portions)."""
@@ -1398,6 +1420,29 @@
         else:
             cache = self._path_cache
             cache_module = tail_module
+        # Handle indirection files first.
+        if fullname not in self._visited_refnames:
+            indirect = []
+            visited = set()
+            self._visited_refnames[fullname] = (indirect, visited)
+        else:
+            indirect, visited = self._visited_refnames[fullname]
+        refname = _path_join(self.path, tail_module) + '.ref'
+        if refname not in visited and refname in cache:
+            visited.add(refname)
+            indirect.append(refname)
+            refpath = self.process_ref_file(fullname, refname)
+            if not refpath:
+                # An empty ref file is a marker for skipping this path
entry.
+                indirect.pop()
+                return (None, [])
+            loader = _find_module(fullname, refpath)
+            if isinstance(loader, NamespaceLoader):
+                return (None, loader._path, indirect)
+            else:
+                return (loader, [], indirect)
+        if indirect and indirect[0] == refname:
+            del self._visited_refnames[fullname]
         # Check if the module is the name of a directory (and thus a
package).
         if cache_module in cache:
             base_path = _path_join(self.path, tail_module)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/import-sig/attachments/20130731/ac5963b5/attachment.html>