[Import-SIG] New draft revision for PEP 382

P.J. Eby pje at telecommunity.com
Sat Jul 9 02:01:28 CEST 2011


At 06:31 PM 7/8/2011 -0400, Barry Warsaw wrote:
>On Jul 08, 2011, at 03:51 PM, P.J. Eby wrote:
>
> >The following is my attempt at an updated draft of PEP 382, based on the
> >recently-discussed changes.
>
>Thanks!  I've been trying to catch up on the mailing list traffic today, and
>grabbed your prototype code.  I plan on committing it to MvL's pep382 hg
>branch so we have a place to play with it.

You should probably start from this version instead:

   http://pastebin.com/Wv77WYyb

It's got some work on other things like iter_modules, extend_namespaces, etc.


> >Portion
> >     A set of files in a single directory (possibly inside a zip file
> >     or other storage mechanism) that contribute modules or subpackages
> >     to a namespace package.  The contents of each portion ``sys.path``
>
>This one got cut off.

Oops.  A bad edit; ignore that sentence fragment, it was replaced by 
language in the definition that followed it.


> >Motivation
> >==========
> >
> >.. epigraph::
> >
> >     "Most packages are like modules.  Their contents are highly
> >     interdependent and can't be pulled apart.  [However,] some
> >     packages exist to provide a separate namespace. ...  It should
> >     be possible to distribute sub-packages or submodules of these
> >     [namespace packages] independently."
> >
> >     -- Jim Fulton, shortly before the release of Python 2.3 [1]_
>
>Nice find!

That was where Jim coined the term in the first place.  I went back 
looking because I remembered at least Jim, Guido and I hashing this 
out back then on a zope related mailing list.  Took a few minutes to 
find, but I think it was worth it.


>Do you need to explain a little more why __path__ is significant, and why the
>registration function is required?

Revsed paragraph:

====
In current Python versions, however, a registration function (such as
``pkgutil.extend_path()`` or ``pkg_resources.declare_namespace()``)
must be explicitly invoked in order to set up the package's
``__path__``.  (By default, a package's ``__path__`` lists only one
directory, so to allow imports from more than one directory, the
``__path__`` must be explicitly extended in code.)
====




> >Vendor packages typically must not provide overlapping files, and an
> >attempt to install a vendor package that has a file already on disk
> >will fail or cause unpredictable behavior.  As vendors might choose to
> >package distributions such that they will end up all in a single
> >directory for the namespace package, all portions would contribute
> >conflicting ``__init__.py`` files.
>
>I might word this a little differently.  Perhaps:
>
>Vendor packaging standards require every file on disk to be owned by exactly
>one vendor package.  But because each portion of a namespace package may be
>contained in a separate vendor package, multiple vendor packages would have to
>own the namespace package's __init__.py file.  For example, would the
>``zope.interface`` vendor package own ``zope/__init__.py`` or would the
>``zope.component`` vendor package own it?  Different vendors handle this
>conflict differently, and in fact, different packaging tools from the same
>vendor can handle this differently, which can cause consistency problems.

I took the original wording as directly as practical from MvLs, but I 
agree yours is clearer.  OTOH, I think the "fail or cause 
unpredictable behavior is a much stronger motivator than, "it's 
nonstandard and confusing".  ;-)

Did you have a specific rationale for your choice?  I mean, what did 
you want to gain or avoid by the change?


> >This support would work by adding a new way to desginate a directory
>
>s/desginate/designate/

Got it, thanks for the careful read!


> >as containing a namespace package portion: by including one or more
> >``*.ns`` files in it.
> >
> >This approach removes the need for an ``__init__`` module to be
> >duplicated across namespace package portions.  Instead, each portion
> >can simply include a uniquely-named ``*.ns`` file, thereby avoiding
> >filename clashes in vendor packages.
>
>I think a concrete example would really help here.  E.g.:
>
>For example, the ``zope.interface`` portion would include a
>``zope/zope.interface.ns`` file, while the ``zope.component`` portion would
>include a ``zope/zope.component.ns`` file.  The very presence of any ``.ns``
>files inside the ``zope`` directory is enough to designate ``zope`` as a
>namespace package.  No conflicting ``zope/__init__.py`` file is necessary.

The problem with this example is that it gives the impression that 
.ns files are named for packages, instead of being named for 
distributions.  So, I went with a more detailed and explict 
example.  Here's my revised version:

====

For example, if two distributions, ``Importing`` and ``ProxyTypes``
wish to contribute the modules ``peak.util.imports`` and
``peak.util.proxies`` to the ``peak.util`` namespace package, then
their source distribution directory layouts would look like this::

     ProxyTypes-0.9.tgz:
         peak/
             ProxyTypes.ns   <- 'peak' is a namespace package
             util/
                 ProxyTypes.ns   <- 'peak.util' is a namespace package
                 proxies.py

     Importing-1.10.tgz:
         peak/
             Importing.ns   <- 'peak' is a namespace package
             util/
                 Importing.ns   <- 'peak.util' is a namespace package
                 imports.py

If installed separately (e.g. one via system package, another via
a user's home directory), then the ``__path__`` of the ``peak``
main package will include both ``peak`` subdirectories, and the
``__path__`` of the ``peak.util`` namespace package will include both
``peak/util`` subdirectories.  Thus, both ``peak.util.proxies``
and ``peak.util.imports`` will be importable, despite the physical
separation of the modules.

On the other hand, if these portions are both installed to the *same*
directory, the layout will look like this::

     site-packages/   (or wherever)
         peak/
             Importing.ns
             ProxyTypes.ns   <- both portions' .ns files appear
             util/
                 Importing.ns   <- at both levels
                 ProxyTypes.ns
                 imports.py
                 proxies.py

And the ``__path__`` of the ``peak`` and ``peak.util`` packages will
only contain a single directory each.  (Assuming these are the only
contributions to ``peak`` and ``peak.util`` on ``sys.path``, of
course!)

Either way, the mere presence of the ``.ns`` files tells the import
machinery that the directory is a namespace package portion and is
importable; there is no need for any ``__init__.py`` files that would
cause installation conflicts, when both portions are installed to the
same target location.

In addition to detecting namespace portions and adding them to the
package's ``__path__``, the import machinery will also add any
imported namespace packages to ``sys.namespace_packages`` (initially
an empty set), so that namespace packages can be identified or
iterated over.

====

I think this also gets more of the clarity about __path__ that you 
asked for, too.


> >This new method is called just before the importer's ``find_module()``
> >is normally invoked.  If the importer determines that `fullname` is
> >a namespace package portion under its jurisdiction, then the importer
> >returns an importer-specific path to that namespace portion.
>
>Please define exactly what ``fullname`` is.

Ugh.  Do I have to?  ;-)

Will it work if I just change that to "just before the importer's 
``find_module(fullname)`` is normally invoked", so it's more clearly implied?


> >(Note that this implies that any non-namespace packages with the same
> >name are skipped, and not included in the resulting package's
> >``__path__``.  In other words, a namespace package's initial
> >``__path__`` only includes namespace portions, never non-namespace
> >package directories.)
>
>Would you expect this to be common?  Did you have any examples in mind, or was
>it just covering-the-bases?

Just covering the bases.


> >Standard Library Changes/Additions
> >----------------------------------
> >
> >The ``pkgutil`` module should be updated to handle this
> >specification appropriately, including any necessary changes to
> >``extend_path()``, ``iter_modules()``, etc.  A new generic API for
> >calling ``namespace_subpath()`` on importers should be added as well.
>
>Is there any reason not to put extend_path() on the road to deprecation?

I don't know.  Is there?  As I said, I considered that an open question.



> >Specifically the proposed changes and additions are:
> >
> >* A new ``namespace_subpath(importer, fullname)`` generic, allowing
> >   implementations to be registered for existing importers.
>
>Is this the registration mechanism?

Registration for what?  I meant that this is analogous to other 
pkgutil generic functions that let you call a PEP 302 extension 
protocol on an importer, whether or not the importer directly 
implements that protocol.  For example, 
pkgutil.iter_importer_modules() is a generic function that lets you 
ask an importer to iterate over available modules, whether it 
actually implements its own "iter_modules()" method or not.  The 
pkgutil.namespace_subpath() function would do the same for the 
(possibly-absent) namespace_subpath() method on existing importers, 
and allow third parties to register namespace support for custom 
importers that can't be directly modified to support namespace packages.

Any thoughts on how better to word that bit, without necessarily 
going into that much detail?  ;-)


>s/packagess/packages/

Got it.


> >* ``*.ns`` files must be empty or contain only ASCII whitespace
> >   characters.  This leaves open the possibility for future extension
> >   to the format.
>
>Getting back to our previous discussion on this, I might also add a comment
>format, e.g. lines starting with `#`.  Almost any extension we can come up
>with will probably need to include comments, so we might as well add them here
>now.  This will also allow folks to add copyright, or other textual
>information into .ns files as their coding conventions may dictate.
>
>Do you expect to ignore everything else, or throw an exception?  Let's be
>explicit about that.

We won't be opening the files at all, so the contents will be ignored.


>I'd be a little more forceful; the PEP should strongly recommend against
>including namespace package __init__.py files.

As I said, it's controversial.  Some people really want those 
__init__ modules, and setuptools sort-of supports them now.  I can 
make it a bit more forceful, though.


>You've done a really excellent job at both simplifying the specification, and
>providing a clear explanation of the issues and mechanisms involved.  Kudos!
>I really like this a lot, and wholeheartedly support its adoption.  I hope MvL
>will agree.

Thanks.



More information about the Import-SIG mailing list