[Import-SIG] Do we really need to read .pth files in PEP 382?

P.J. Eby pje at telecommunity.com
Fri Jun 24 18:29:41 CEST 2011


Do we really need to read .pth files in PEP 382?  If so, why?

In a common use case for PEP 382 (large namespace packages like 
zope.*), there will be a long list of .pth files present, each of 
which contains only a '*', but which still must be opened and read by 
the implementation, while adding no new information.

However, if we had instead .ns files or .nspkg files or something 
like that, their mere *existence* could be construed as implying 
namespace-ness, and require no actual opens or reads.

If we separate the "this is a namespace" functionality from the "here are 
paths" functionality, ISTM that the "here are paths" functionality is 
already adequately met by the existing .pth machinery.  (I'm not 
aware of any real-world use cases for pkgutil's .pkg files -- but 
perhaps someone can enlighten me on that?)

Another consequence of this change is that it would simplify the PEP 
302 extension: instead of asking importers or loaders for a path, one 
could simply ask the importer whether a namespace exists, e.g.:

    finder.namespace_exists(fullname)

This would return either a subpath to put in __path__, or None if the 
named package is not a namespace.
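
As a rough illustration of what such a method might look like for a 
plain directory, here is a minimal sketch.  The DirectoryFinder class 
and the ".ns" marker extension are assumptions made up for this example 
(the extension is only one of the spellings floated above); the point 
is just that a single directory listing answers the question, with no 
marker file ever opened or read.

```python
import os
import tempfile


class DirectoryFinder:
    """Hypothetical finder for a single path entry (illustration only)."""

    def __init__(self, path_entry, marker_ext=".ns"):
        self.path_entry = path_entry
        self.marker_ext = marker_ext  # assumed marker extension

    def namespace_exists(self, fullname):
        # The last dotted component names the directory below this entry.
        subpath = os.path.join(self.path_entry, fullname.rpartition(".")[2])
        try:
            names = os.listdir(subpath)  # one listing, no open() or read()
        except OSError:
            return None  # no such directory under this entry
        if any(name.endswith(self.marker_ext) for name in names):
            return subpath
        return None


# Small demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "zope"))
    finder = DirectoryFinder(root)
    before = finder.namespace_exists("zope")  # no marker file yet
    open(os.path.join(root, "zope", "zope.interface.ns"), "w").close()
    after = finder.namespace_exists("zope")   # marker file now present
    expected = os.path.join(root, "zope")
```

Here "before" is None and "after" is the zope subdirectory, which is 
exactly the value that would be placed in __path__.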

If a regular package is found before a namespace, the normal protocol 
operates.  If a namespace is found, walk all remaining finders, 
adding any non-None path entries returned by namespace_exists(), and 
also invoking the first loader returned by a namespace-supplying finder.

Something like:

     path_iter = iter(current_path) # sys.path or a package.__path__

     for path_entry in path_iter:
         finder = get_importer(path_entry)

         # This 'if' block is the only addition to the existing loop:
         if hasattr(finder, 'namespace_exists'):
             subpath = finder.namespace_exists(fullname)
             if subpath is not None:
                 break  # go handle the nspkg case

         loader = finder.find_module(fullname)
         if loader is not None:
             return loader.load_module(fullname)
     else:
         raise ImportError

     # Ok, we have a namespace package, so handle it:
     # types.ModuleType replaces new.module, which is gone in Python 3
     module = sys.modules[fullname] = types.ModuleType(fullname)
     sys.namespace_packages.add(fullname)
     module.__path__ = [subpath]
     loader = finder.find_module(fullname)
     if loader is not None:
         loader.load_module(fullname)

     for path_entry in path_iter:  # resume iteration
         finder = get_importer(path_entry)
         if hasattr(finder, 'namespace_exists'):
             subpath = finder.namespace_exists(fullname)
             if subpath is not None:
                 if subpath not in module.__path__:
                     module.__path__.append(subpath)
                 if loader is None:
                     loader = finder.find_module(fullname)
                     if loader is not None:
                         loader.load_module(fullname)

There are some variations possible in this algorithm; you could for 
example roll the two loops into one, by using 'loader' and 'module' 
as flags.  But the modifications needed to PEP 302 loaders are 
minimal, almost trivial.
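
For what it's worth, the single-loop variation might look roughly like 
the sketch below.  It is only an illustration under some assumptions: 
it takes a ready-made list of finders instead of calling get_importer(), 
a module-level set stands in for the proposed sys.namespace_packages, 
and StubFinder is a made-up class that exists only to exercise the loop.

```python
import sys
import types

# Stand-in for the proposed sys.namespace_packages (assumption).
namespace_packages = set()


def import_with_namespaces(fullname, finders):
    """Single-loop form of the algorithm, using module/loader as flags."""
    module = None  # set once the first namespace portion is found
    loader = None  # set once a namespace-supplying finder yields a loader
    for finder in finders:
        subpath = None
        if hasattr(finder, "namespace_exists"):
            subpath = finder.namespace_exists(fullname)
        if module is None and subpath is None:
            # No namespace seen so far: the normal PEP 302 protocol applies.
            found = finder.find_module(fullname)
            if found is not None:
                return found.load_module(fullname)
            continue
        if subpath is None:
            continue  # namespace already found; skip non-namespace entries
        if module is None:
            module = sys.modules[fullname] = types.ModuleType(fullname)
            namespace_packages.add(fullname)
            module.__path__ = []
        if subpath not in module.__path__:
            module.__path__.append(subpath)
        if loader is None:
            loader = finder.find_module(fullname)
            if loader is not None:
                loader.load_module(fullname)
    if module is None:
        raise ImportError(fullname)
    return module


class StubFinder:
    """Hypothetical finder: reports a fixed subpath and has no loader."""

    def __init__(self, subpath=None):
        self.subpath = subpath

    def namespace_exists(self, fullname):
        return self.subpath

    def find_module(self, fullname):
        return None


finders = [StubFinder(), StubFinder("/a/zope"),
           StubFinder("/b/zope"), StubFinder("/a/zope")]
mod = import_with_namespaces("zope_demo", finders)
print(mod.__path__)
```

The duplicate "/a/zope" entry is added only once, so the resulting 
__path__ holds the two distinct portions in discovery order.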

By comparison, the current proposal seems a bit overweight, 
considering that PEP 382 does not provide any use-case rationale for 
supporting anything besides '*' in .pkg files.  In fact, there 
doesn't seem to be any reason to put the '*' in the __path__ -- 
sys.namespace_packages suffices to indicate namespace-ness.  Code 
that wishes to extend existing namespace packages (e.g. setuptools) 
can simply perform the equivalent of the second loop above on any new 
path entries, for all entries in sys.namespace_packages.  (Well, not 
*all* entries, but all those that recursively yield reachable 
namespace additions.)
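
To make the "recursively reachable" part a little more concrete, here 
is one way the extension step could be sketched.  Everything in it is 
hypothetical: extend_namespaces, the get_finder callback, and the 
table-driven TableFinder are invented for the example, and the set and 
dict passed in stand for sys.namespace_packages and sys.modules.

```python
import types


def extend_namespaces(path_entry, parent, get_finder,
                      namespace_packages, modules):
    """Add portions found under path_entry to namespaces directly
    below parent ('' for top level), then recurse into what was found."""
    finder = get_finder(path_entry)
    if not hasattr(finder, "namespace_exists"):
        return
    for fullname in sorted(namespace_packages):
        if fullname.rpartition(".")[0] != parent:
            continue  # not a direct child of the namespace being extended
        module = modules.get(fullname)
        if module is None:
            continue
        subpath = finder.namespace_exists(fullname)
        if subpath is None:
            continue  # the new entry contributes nothing to this namespace
        if subpath not in module.__path__:
            module.__path__.append(subpath)
        # Only namespaces reachable through the new subpath are revisited.
        extend_namespaces(subpath, fullname, get_finder,
                          namespace_packages, modules)


class TableFinder:
    """Hypothetical finder backed by a dict of fullname -> subpath."""

    def __init__(self, table):
        self.table = table

    def namespace_exists(self, fullname):
        return self.table.get(fullname)


# Made-up layout: what each path entry would report as a namespace.
layout = {
    "/new": {"zope": "/new/zope"},
    "/new/zope": {"zope.app": "/new/zope/app"},
}


def get_finder(path_entry):
    return TableFinder(layout.get(path_entry, {}))


def make_ns(name, path):
    module = types.ModuleType(name)
    module.__path__ = list(path)
    return module


modules = {
    "zope": make_ns("zope", ["/old/zope"]),
    "zope.app": make_ns("zope.app", ["/old/zope/app"]),
    "other": make_ns("other", ["/old/other"]),
}
registered = {"zope", "zope.app", "other"}

extend_namespaces("/new", "", get_finder, registered, modules)
```

After the call, "zope" and "zope.app" each gain the portion under 
"/new", while "other" is untouched because the new entry reports 
nothing for it.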

Thoughts, anyone?


