[Python-Dev] Implementing PEP 382, Namespace Packages

Brett Cannon brett at python.org
Mon May 31 22:10:45 CEST 2010


On Mon, May 31, 2010 at 00:53, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> For finders, their search algorithm is changed in a couple of ways.
>> One is that modules are given priority over packages (is that
>> intentional, Martin, or just an oversight?).
>
> That's an oversight. Notice, however, that it's really not the case that
> currently directories have precedence over modules, either: if a directory
> is later on the __path__ than a module, it's still the module that gets
> imported. So the precedence takes place only when a module and a directory
> exist in the same directory.
>
> In any case, I have now fixed it.
>
>> Two, the package search
>> requires checking for a .pth file on top of an __init__.py. This will
>> change finders that could before simply do an existence check on an
>> __init__ "file"
>
> You are reading something into the PEP that isn't there yet. PEP 302
> currently isn't considered, and the question of this discussion is precisely
> how the loaders API should be changed.
>
>> (or whatever the storage back-end happened to be) and
>> make it into a list-and-search which one would hope wasn't costly, but
>> in some cases might be if the paths to files are not stored in a
>> hierarchical fashion (e.g. zip files list entire file paths in their
>> TOC or a sqlite3 DB which uses a path for keys will have to list
>> **all** keys, sort them to just the relevant directory, and then look
>> for .pth or some such approach).
>
> First, I think it's up to the specific loader mechanism whether
> PEP 382 should be supported at all. It should be possible to implement
> it if desired, but if it's not feasible (e.g. for URL loaders), pth
> files just may not get considered. The loader may well provide a
> different mechanism to support namespace packages.
>
>> Are we worried about possible performance implications of this
>> search?
>
> For the specific case of zip files, I'm not. I don't think performance will
> suffer at all.
>
>> And then the search for the __init__.py begins on the newly modified
>> __path__, which I assume ends with the first __init__ found on
>> __path__, but if no file is found it's okay and essentially an empty
>> module with just module-specific attributes is used?
>
> Correct.
>
>> In other words,
>> can a .pth file replace an __init__ file in delineating a package?
>
> That's what it means by '''a directory is considered a package if it either
> contains a file named __init__.py, or a file whose name ends with ".pth".'''
>
>> Or
>> is it purely additive? I assume the latter for compatibility reasons,
>> but the PEP says "a directory is considered a package if it **either**
>> contains a file named __init__.py, **or** a file whose name ends with
>> ".pth"" (emphasis mine).
>
> Why do you think this causes an incompatibility?

It's just that if a project has no __init__, older finders won't
process it, that's all. But it looks like they are going to have to
change somewhat anyway, so that's not an issue.
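
A minimal sketch of the difference, assuming a plain file system back-end.
The helper names (old_is_package, pep382_is_package) and the directory name
"ns" are made up for illustration; they are not part of the PEP or of any
existing finder. It only shows why a finder that does the old existence
check would skip a directory that holds nothing but a .pth file:

```python
import os
import tempfile


def old_is_package(directory):
    # Pre-PEP 382 check: a single existence test on __init__.py.
    return os.path.isfile(os.path.join(directory, "__init__.py"))


def pep382_is_package(directory):
    # Draft rule as quoted in this thread: __init__.py OR any *.pth file.
    # Note that the second test needs a directory listing, not just an
    # existence check.
    if old_is_package(directory):
        return True
    return any(name.endswith(".pth") for name in os.listdir(directory))


# Hypothetical layout: a package directory containing only a .pth file.
with tempfile.TemporaryDirectory() as root:
    portion = os.path.join(root, "ns")
    os.mkdir(portion)
    open(os.path.join(portion, "ns.pth"), "w").close()
    old_result = old_is_package(portion)
    new_result = pep382_is_package(portion)
```

Here old_result is False and new_result is True, which is the whole of the
compatibility concern.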

>
>> Otherwise I assume that the search will be
>> done simply with ``os.path.isdir(os.path.join(sys_path_entry,
>> top_level_package_name))`` and all existing paths will be added to
>> __path__. Will they come before or after the directory where the *.pth
>> was found? And will any subsequent *.pth files found in other
>> directories also be executed?
>
> I may misremember, but from reading the text, it seems to say "no".
> It should work like the current pth mechanism (plus *, minus import).
>
>> As for how "*" works, is this limited to top-level packages, or will
>> sub-packages participate as well? I assume the former, but it is not
>> directly stated in the PEP.
>
> And indeed, the latter is intended. You should be able to create namespace
> packages on all levels.
>
>> If the latter, is a dotted package name
>> changed to ``os.path.join(sys_path_entry, package_name.replace(".",
>> os.sep))``?
>
> No. Instead, the parent package's __path__ is being searched for
> directories; sys.path is not considered anymore. I have fixed the text.
>
>> For sys.path_hooks, I am assuming import will simply skip over passing
>> "*" to the hooks, as it is a marker that __path__ represents a namespace
>> package and not in any way functional. Although with sys.namespace_packages,
>> is leaving the "*" in __path__ truly necessary?
>
> It would help with efficiency, no?

Not sure how having "*" in __path__ helps with efficiency. What are
you thinking it will help with specifically?

>
>> For the search of paths to use to extend, are we limiting ourselves to
>> actual file system entries on sys.path (as pkgutil does), or do we
>> want to support other storage back-ends? To do the latter I would
>> suggest having a successful path discovery be when a finder can be
>> created for the hypothetical directory from sys.path_hooks.
>
> Again: PEP 302 isn't really considered yet. Proposals are welcome.
>
>> The PEP (seems to) ask finders to look for .pth file(s), calculate
>> __path__, and then get a loader for the __init__. You could have
>> finders grow a find_namespace method which returns the contents of the
>> requisite .pth file(s).
>
> I must be misunderstanding the concept of finders. Why is it that it would
> be their function to process the pth files, and not the function of the
> loader?

I'm thinking from the perspective of finding an __init__ module that
exists somewhere other than where the .pth file was discovered. Someone
has to find the .pth files and provide their contents. Someone else
needs to find the __init__ module for the package (if it exists). Then
someone needs to load a namespace package, potentially from the
__init__ module. It's that second step -- find the __init__ module --
that makes me think the finder is more involved. It doesn't have to be
by definition, but seeing the word "find" just makes me think
"finder".
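
To make the three steps concrete, here is a rough sketch for a file system
back-end. Only the name find_namespace comes from the suggestion quoted
above; compute_path, find_init, load_namespace, the "a"/"b" layout, and the
treatment of "*" as "every existing <entry>/<name> directory on the parent
path" are assumptions for illustration, not what the PEP specifies. Nothing
is registered in sys.modules.

```python
import os
import tempfile
import types


def find_namespace(parent_path, name):
    # Step 1: collect the non-empty lines of every *.pth file found in
    # <entry>/<name> for each entry on the parent path.
    lines = []
    for entry in parent_path:
        pkg_dir = os.path.join(entry, name)
        if not os.path.isdir(pkg_dir):
            continue
        for filename in sorted(os.listdir(pkg_dir)):
            if filename.endswith(".pth"):
                with open(os.path.join(pkg_dir, filename)) as f:
                    lines.extend(ln.strip() for ln in f if ln.strip())
    return lines


def compute_path(parent_path, name, pth_lines):
    # Assumption: "*" expands to all existing <entry>/<name> directories;
    # any other line is taken literally as a path.
    path = []
    for line in pth_lines:
        if line == "*":
            candidates = [os.path.join(e, name) for e in parent_path]
            candidates = [c for c in candidates if os.path.isdir(c)]
        else:
            candidates = [line]
        for candidate in candidates:
            if candidate not in path:
                path.append(candidate)
    return path


def find_init(path):
    # Step 2: the first __init__.py on the computed __path__, or None.
    for directory in path:
        candidate = os.path.join(directory, "__init__.py")
        if os.path.isfile(candidate):
            return candidate
    return None


def load_namespace(name, path, init_file):
    # Step 3: build the module; with no __init__ it stays essentially empty.
    module = types.ModuleType(name)
    module.__path__ = path
    if init_file is not None:
        with open(init_file) as f:
            exec(compile(f.read(), init_file, "exec"), module.__dict__)
    return module


# Hypothetical layout: the .pth file lives under "a", the __init__ under "b".
with tempfile.TemporaryDirectory() as root:
    entries = [os.path.join(root, "a"), os.path.join(root, "b")]
    for entry in entries:
        os.makedirs(os.path.join(entry, "ns"))
    with open(os.path.join(entries[0], "ns", "ns.pth"), "w") as f:
        f.write("*\n")
    with open(os.path.join(entries[1], "ns", "__init__.py"), "w") as f:
        f.write("origin = 'b'\n")

    pth_lines = find_namespace(entries, "ns")
    ns_path = compute_path(entries, "ns", pth_lines)
    init_file = find_init(ns_path)
    module = load_namespace("ns", ns_path, init_file)
```

In this layout the .pth file is found in one directory and the __init__ in
another, so whoever performs step 2 has to search the freshly computed
__path__ rather than the directory where the .pth file turned up.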

