[Import-SIG] Explicit membership in namespace packages

P.J. Eby pje at telecommunity.com
Mon Jun 27 22:13:34 CEST 2011


While reviewing the PEP 382 tests, I realized that there's something 
I've been misreading about the spec as written, because I had a 
different assumption about how things would/should work.

As written, the spec allows a namespace package to be declared in one 
place, and then files in any other matching directory are 
automatically included in that namespace.

But, in order to operate in a standalone fashion, any partial 
namespace package MUST contain its own .ns file to indicate that it 
is part of that namespace.  Otherwise, it could only be used on a 
sys.path where such a declaration already existed.

For consistency, therefore, I propose that we explicitly *require* a 
directory to include a .ns file in order to participate in a 
namespace, regardless of whether it also contains an __init__.py.

In other words, if a namespace package directory appears first on 
sys.path, then all and ONLY those sys.path subdirectories marked as 
namespace package directories will contribute modules to that package.
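
To make the "all and ONLY" rule concrete, here is a rough sketch of 
how the merged __path__ could be computed for a top-level package. 
The helper names (is_namespace_dir, namespace_path) are made up for 
illustration and are not part of PEP 382 or of my meta-importer; the 
sketch also assumes plain filesystem directories and ignores 
extension modules and compiled files.

```python
import os


def is_namespace_dir(path):
    """True if `path` is a directory holding at least one *.ns flag file."""
    if not os.path.isdir(path):
        return False
    return any(name.endswith(".ns") for name in os.listdir(path))


def namespace_path(name, search_path):
    """Return the merged __path__ for top-level package `name`, or None.

    Hypothetical helper: the first match on `search_path` decides whether
    `name` is treated as a namespace package at all.
    """
    for index, entry in enumerate(search_path):
        candidate = os.path.join(entry, name)
        if is_namespace_dir(candidate):
            # First match is a namespace package directory: merge it with
            # every LATER directory that is also marked with a .ns file.
            # Unmarked packages or modules of the same name are skipped.
            later = (os.path.join(e, name) for e in search_path[index + 1:])
            return [candidate] + [d for d in later if is_namespace_dir(d)]
        if (os.path.isfile(os.path.join(candidate, "__init__.py"))
                or os.path.isfile(candidate + ".py")):
            # A normal package or module came first: import as usual.
            return None
    return None
```

A directory that carries both a .ns file and an __init__.py is treated 
as a namespace package directory here, since the .ns check runs first.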

As it happens, I actually already wrote my meta-importer this way, 
because I was assuming that people would always NEED to mark their 
namespace package directories anyway (because they could otherwise 
not stand alone).

Anyway, I think that this, in combination with the importer protocol 
simplification and the flag file approach, makes the resulting 
implementation and application much easier to understand.  Basically 
the rules are:

* Directories containing one or more .ns files are "namespace package 
directories".  In this version of the spec, the files must be empty 
or contain only whitespace characters; future versions may specify 
optional additional information.

* Each namespace package directory should contain only unique 
filenames for that namespace, such that combining every namespace 
package directory with a given name results in no filename 
collisions.  This implies that modules, data files, AND .ns files 
must be given unique names.  (And generally, .ns files should be 
given a name based on their project's distribution name, to identify 
the source of the files.)

* When a namespace package directory is the first match for a desired 
import on sys.path (or within a parent package __path__), that 
namespace package directory's contents are effectively merged with 
those of all subsequent namespace package directories within the 
path, to form the common package contents.  Normal package 
directories or modules with the same name are ignored.

* If a module, or a normal (non-namespace) package directory with an 
__init__.py and no .ns files, is encountered first, importing 
proceeds normally.

* PEP 302 importers wishing to support namespace package directories 
should implement a 'namespace_subpath(fullname)' method that returns 
either a __path__ entry to be used for the named package, or None if 
the package does not have a namespace directory present.

* The import machinery calls namespace_subpath() on an importer prior 
to calling find_module(), and then handles creating a namespace 
package module and loading the first __init__ submodule into it, with 
__path__ pre-initialized.

* For implementation efficiency, an importer is allowed to cache 
information (such as whether a directory exists and whether an 
__init__ module is present in it) between the invocation of a 
namespace_subpath() call and an immediately-subsequent find_module() 
call for the same name.  It should, however, avoid retaining such 
cached information for any longer than the next method call, and 
verify that the request is in fact for the same module/package name.
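
As an illustration of the last three rules, an importer for a single 
path entry might look roughly like the following. This is only a 
sketch under stated assumptions: the class name DirectoryImporter and 
the _cached attribute are invented for the example, only package 
lookups are handled (plain modules are left out), and find_module() 
returns the importer itself as a stand-in for a real loader.

```python
import os


class DirectoryImporter:
    """Hypothetical importer for one path entry (lookups only)."""

    def __init__(self, path_entry):
        self.path_entry = path_entry
        # (fullname, has_init) remembered from the preceding call only.
        self._cached = None

    def _directory(self, fullname):
        # Only the last dotted component names the directory to look for.
        return os.path.join(self.path_entry, fullname.rpartition(".")[2])

    def namespace_subpath(self, fullname):
        """Return a __path__ entry for `fullname`, or None if unmarked."""
        directory = self._directory(fullname)
        names = os.listdir(directory) if os.path.isdir(directory) else []
        # Remember what was seen so the next find_module() can reuse it.
        self._cached = (fullname, "__init__.py" in names)
        if any(name.endswith(".ns") for name in names):
            return directory
        return None

    def find_module(self, fullname):
        """Return a loader (here: self) if a package __init__ exists."""
        # The cache is consumed here, so it never outlives one call.
        cached, self._cached = self._cached, None
        if cached is not None and cached[0] == fullname:
            has_init = cached[1]  # same name as the preceding call
        else:
            init = os.path.join(self._directory(fullname), "__init__.py")
            has_init = os.path.isfile(init)
        return self if has_init else None
```

The import machinery would then call namespace_subpath(fullname) 
first and find_module(fullname) second on each importer, so the 
directory listing is read only once per name.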

Thoughts?


