[Import-SIG] New draft revision for PEP 382
P.J. Eby
pje at telecommunity.com
Sat Jul 9 23:49:49 CEST 2011
At 02:42 AM 7/9/2011 -0600, Eric Snow wrote:
>And if there were a "zope_part1" and a "zope_part2" directory, both
>with a zope.ns file in them, that namespace_subpath("zope") call would
>return ["/usr/lib/site-packages/zope_part1",
>"/usr/lib/site-packages/zope_part2"], right? And if both also had a
>foo.ns file in them, the same would be returned for
>namespace_subpath("foo").
No; the directory is always named for the package, just like
now. We're just saying that we replace looking for __init__.py with
looking for *.ns.
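As a rough illustration of that rule (a sketch only, not the PEP's reference implementation), the check could look like the following. The function here takes a plain path string rather than living on an importer object, and the file name ``example.ns`` is made up for the demonstration.

```python
import os
import tempfile


def namespace_subpath(path_entry, fullname):
    """Return the package directory under path_entry if it holds a *.ns file.

    Hypothetical stand-in for the importer method discussed above; the real
    method would live on an importer object, not take a path string.
    """
    # The directory is always named for the last component of the package.
    name = fullname.rpartition(".")[2]
    candidate = os.path.join(path_entry, name)
    if not os.path.isdir(candidate):
        return None
    # Any file ending in ".ns" marks the directory as a namespace portion,
    # in the same way __init__.py marks an ordinary package.
    if any(entry.endswith(".ns") for entry in os.listdir(candidate)):
        return candidate
    return None


# Usage: "zope_part1" is never considered for "zope", whatever it contains.
with tempfile.TemporaryDirectory() as site:
    os.makedirs(os.path.join(site, "zope"))
    os.makedirs(os.path.join(site, "zope_part1"))
    open(os.path.join(site, "zope", "example.ns"), "w").close()
    open(os.path.join(site, "zope_part1", "zope.ns"), "w").close()
    found = namespace_subpath(site, "zope")
    found_name = os.path.basename(found)
    missing = namespace_subpath(site, "foo")
```

Here ``found`` points at the ``zope`` directory only, and ``missing`` is ``None`` because no ``foo`` directory exists.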
> > Finally, the ``__init__`` module code for the package (if it exists)
> > is located and executed in the new module's namespace.
> >
> > Each importer that returns a ``namespace_subpath()`` for the package
> > is asked to perform a standard ``find_module()`` for the package.
> > Since by the normal import rules, a directory containing an
> > ``__init__`` module is a package, this call should succeed if the
> > namespace package portion contains an ``__init__`` module, and the
> > importing can proceed normally from that point.
> >
>
>Is this last paragraph part of the finally?
Yes.
> If so, what does calling
>find_module at this point accomplish? Do you mean load_module is also
>called for each that is found?
I'm adding this sentence to the end of that paragraph for clarification:
"""(That is, with a ``load_module()`` call to execute the first
``__init__`` module found on the package's ``__path__``.)"""
Does that make it clearer?
>Will it be too easy (or conversely
>very likely) to have __init__.py collisions?
__init__ collisions (and the recommendation to not use __init__
modules at all) are addressed later below in the implementation notes.
> > There is one caveat, however. The importers currently distributed
> > with Python expect that *they* will be the ones to initialize the
> > ``__path__`` attribute, which means that they must be changed to
> > either recognize that ``__path__`` has already been set and not
> > change it, or to handle namespace packages specially (e.g., via an
> > internal flag or checking ``sys.namespace_packages``).
> >
> > Similarly, any third-party importers wishing to support namespace
> > packages must make similar changes.
> >
>
>Seems like the caveat is dependent on the above algorithm. If the
>module's __path__ were set with the namespace_subpath() results after
>the namespace package's import was all over, would it still be an
>issue?
No, but then we couldn't support __init__ modules executing with the
correct __path__ value; notably, this would prevent __init__ modules
from manipulating their own __path__.
Honestly, throwing out __init__ support entirely would make a LOT of
things easier and simpler here, especially in the 2.x version. But
there was a vocal contingent of support for them in the original
Python-Dev discussion.
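For readers unfamiliar with what "manipulating their own __path__" looks like, here is a small example using today's ordinary (non-namespace) package machinery. The package name ``demo_pkg_path_example`` and the ``extra`` directory are invented for the demonstration; nothing here is specific to the PEP.

```python
import importlib
import os
import sys
import tempfile

PKG = "demo_pkg_path_example"  # hypothetical package name

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, PKG)
    extra_dir = os.path.join(root, "extra")
    os.makedirs(pkg_dir)
    os.makedirs(extra_dir)
    # The __init__ module extends its own __path__ while it executes, so
    # __path__ must already be set correctly before __init__ runs.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("__path__.append(%r)\n" % extra_dir)
    # A submodule that lives outside the package directory.
    with open(os.path.join(extra_dir, "plugin.py"), "w") as f:
        f.write("VALUE = 42\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        plugin = importlib.import_module(PKG + ".plugin")
        value = plugin.VALUE
    finally:
        # Undo the global side effects of the demonstration.
        sys.path.remove(root)
        sys.modules.pop(PKG + ".plugin", None)
        sys.modules.pop(PKG, None)
```

The submodule is found only because the package's ``__init__`` added ``extra`` to ``__path__`` during import.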
> > Specifically the proposed changes and additions are:
>
>Maybe, "Specifically the proposed changes and additions to pkgutil
>are:", to clarify the context?
Ok.
> >
> > * A new ``namespace_subpath(importer, fullname)`` generic, allowing
> > implementations to be registered for existing importers.
> >
> > * A new ``extend_namespaces(path_entry)`` function, to extend existing
> > and already-imported namespace packages' ``__path__`` attributes to
> > include any portions found in a new ``sys.path`` entry. This
> > function should be called by applications extending ``sys.path``
> > at runtime, e.g. to include a plugin directory or add an egg to the
> > path.
> >
> > The implementation of this function does a simple breadth-first walk
> > of ``sys.namespace_packages``, and performs any necessary
> > ``namespace_subpath()`` calls to identify what path entries need to
> > be added to each package's ``__path__``, given that `path_entry`
> > has been added to ``sys.path``.
> >
>
>Does the same apply to namespace sub-packages where their parent
>package has an updated __path__? So a recursion would take place in
>some cases.
Yes, that's what "breadth-first" meant here; i.e., first top-level
namespaces, then second-level namespaces, and so on. Strictly speaking,
though, I erred by saying breadth-first: what I actually meant is a
"pre-order traversal", i.e., parent nodes are touched before their
children. I'll tweak that to "top-down traversal" instead of
"breadth-first walk", and add:
"""(Or, in the case of sub-packages, adding a derived subpath entry,
based on their parent namespace's ``__path__``.)"""
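The top-down traversal can be sketched roughly as follows. This is an assumption-laden illustration, not the pkgutil implementation: ``namespace_paths`` is a plain dict standing in for ``sys.namespace_packages`` plus each module's ``__path__``, and ``subpath`` is a caller-supplied function playing the role of ``namespace_subpath()``.

```python
def extend_namespaces(path_entry, namespace_paths, subpath):
    """Add portions found via a new path entry to known namespace packages.

    namespace_paths: dict of package name -> __path__ list (hypothetical
    stand-in for sys.namespace_packages and the modules' __path__ values).
    subpath: callable (entry, fullname) -> portion path or None.
    """
    # Sorting on the split name visits every parent before its children.
    for fullname in sorted(namespace_paths, key=lambda n: n.split(".")):
        parent = fullname.rpartition(".")[0]
        # Top-level packages look only at the new entry; sub-packages look at
        # each entry of the parent's (already updated) __path__.
        search = namespace_paths.get(parent, []) if parent else [path_entry]
        pkg_path = namespace_paths[fullname]
        for entry in list(search):
            portion = subpath(entry, fullname)
            if portion is not None and portion not in pkg_path:
                pkg_path.append(portion)


# Usage with a fake file system: these directories "exist".
existing = {"/site/zope", "/site/zope/app", "/plugins/zope", "/plugins/zope/app"}


def fake_subpath(entry, fullname):
    candidate = entry + "/" + fullname.rpartition(".")[2]
    return candidate if candidate in existing else None


paths = {"zope.app": ["/site/zope/app"], "zope": ["/site/zope"]}
extend_namespaces("/plugins", paths, fake_subpath)
```

Because ``zope`` is updated before ``zope.app``, the sub-package picks up ``/plugins/zope/app`` through its parent's freshly extended ``__path__``.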
>The iter_modules() method isn't part of PEP 302, is it? Where can I
>find out more about it?
See pkgutil; it's something I added in Python 2.5 to help tools like
pydoc better support zipfiles and other exotic importers.
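A quick way to see what it does is to call ``pkgutil.iter_modules()`` on a directory. The example below uses Python 3 syntax and a throwaway directory; the names ``alpha`` and ``beta`` are arbitrary.

```python
import os
import pkgutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    # One plain module and one ordinary package.
    open(os.path.join(root, "alpha.py"), "w").close()
    os.makedirs(os.path.join(root, "beta"))
    open(os.path.join(root, "beta", "__init__.py"), "w").close()

    # Each item is a (finder, name, ispkg) tuple for one module found
    # directly under the given path entries.
    listing = sorted(
        (name, ispkg) for _finder, name, ispkg in pkgutil.iter_modules([root])
    )
```

``listing`` then holds one ``(name, ispkg)`` pair per discovered module, without importing any of them.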
> > * "Meta" importers (i.e., importers placed on ``sys.meta_path``) do
> > not need to implement ``namespace_subpath()``, because the method
> > is only called on importers corresponding to ``sys.path`` entries.
>
>And parent.__path__ for namespace submodules?
Yes. Fixed.
> > If a meta importer wishes to support namespace packages, it must
> > do so entirely within its ``find_module()`` implementation.
> >
> > Unfortunately, it is unlikely that any such implementation will be
> > able to merge its namespace portions with those of other meta
> > importers or ``sys.path`` importers, so the meaning of "supporting
> > namespace packages" for a meta importer is currently undefined.
> >
>
>While I'm not sure meta importers need to be left out, I suppose it
>isn't critical since the work-around isn't that hard, nor widely
>needed. Thus the message here is that this PEP only applies to the
>use of sys.path_hooks and sys.path_importer_cache. It would be nice
>for that to be clear up front.
Ok, I added this:
"""(Note: the import machinery will NOT invoke this method for importers
on ``sys.meta_path``, because there is no path string associated with
such importers, and so the idea of a "subpath" makes no sense in that
case.)"""
just after this bit:
"""The Python import machinery will call this method on each importer
corresponding to a path entry in ``sys.path`` (for top-level imports)
or in a parent package ``__path__`` (for subpackage imports)."""
in the PEP 302 protocol description.
> > Further, there is an immense body of existing code (including the
> > distutils and many other packaging tools) that expect a package
> > directory's name to be the same as the package name.
>
>Correct me if I'm wrong, but I have understood that for namespace
>packages in the PEP, the directory name does not have to be the
>package name.
Consider yourself corrected. ;-)
>Back to namespace subpackages, it's unclear how they should work.
>Either a namespace package is at the top level of a sys.path entry, or
>it's a module of a parent package, namespace or otherwise. The
>top-level case is pretty clear. However, the subpackage case is not.
>I don't see namespace subpackages as being too practical with
>non-namespace parent packages, but I'm probably missing something.
They aren't practical at all, no. ;-) I'll add an implementation
note explaining that even though the spec doesn't require a namespace
package's parent to also be a namespace, there isn't any practical
use in giving it a non-namespace parent, as the child __path__ is a
collection of subpaths derived from the parent __path__, and thus it
wouldn't combine with any other contributions that weren't installed
to the same location.
Here's the text:
* In general, a namespace subpackage (e.g. ``peak.util``, ``zope.app``,
etc.) should be a child of another namespace package (e.g. ``peak``,
``zope``, etc.). This is not required by the spec or enforced by
the implementation, but in practice, it is useless to put a
namespace package inside a non-namespace package, as the child
package's ``__path__`` will be derived solely from the parent's.
In other words, it will only work correctly if all the contributions
to that namespace package are installed to the same physical
location. So, if you intend to use a namespace subpackage, you
should always make its parent package a namespace as well.
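The effect can be shown with a toy model. The helper ``derive_child_path`` and the ``/site-a`` and ``/site-b`` locations are hypothetical, and the model assumes an ordinary package's ``__path__`` has a single entry; it only mimics how a child's ``__path__`` is built from subdirectories of the parent's ``__path__`` entries.

```python
def derive_child_path(parent_path, child_name, existing_dirs):
    """Hypothetical helper: keep the child directories that actually exist
    under each entry of the parent's __path__."""
    candidates = [entry + "/" + child_name for entry in parent_path]
    return [c for c in candidates if c in existing_dirs]


# Two separately installed contributions to the "peak.util" namespace.
installed = {"/site-a/peak/util", "/site-b/peak/util"}

# A namespace parent spans both install locations ...
namespace_parent = ["/site-a/peak", "/site-b/peak"]
# ... while an ordinary parent package has exactly one directory.
ordinary_parent = ["/site-a/peak"]

merged = derive_child_path(namespace_parent, "util", installed)
partial = derive_child_path(ordinary_parent, "util", installed)
```

With the ordinary parent, the contribution installed under ``/site-b`` never becomes visible, which is the failure the note warns about.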