
On Sat, May 29, 2010 at 15:56, P.J. Eby <pje@telecommunity.com> wrote:
At 09:29 PM 5/29/2010 +0200, Martin v. Löwis wrote:
Am 29.05.2010 21:06, schrieb P.J. Eby:
At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
In it he says that PEP 382 is being deferred until it can address PEP 302 loaders. I can't find any follow-up to this. I don't see any discussion in PEP 382 about PEP 302 loaders, so I assume this issue was never resolved. Does it need to be before PEP 382 is implemented? Are we wasting our time by designing and (eventually) coding before this issue is resolved?
Yes, and yes.
Is there anything we can do to help regarding that?
You could comment on the proposal I made back then, or propose a different solution.
Looking at that proposal, I don't follow how changing *loaders* (vs. importers) would help. If an importer's find_module doesn't natively support PEP 382, then there's no way to get a loader for the package in the first place. Today, namespace packages work fine with PEP 302 loaders, because the namespace-ness is really only about setting up the __path__, and detecting that you need to do this in the first place.
In the PEP 302 scheme, then, it's either importers that have to change, or the process that invokes them. Being able to ask an importer the equivalents of os.path.join, listdir, and get_data would suffice to make an import process that could do the trick.
Essentially, you'd ask each importer to first attempt to find the module, and then ask it (or the loader, if the find worked) whether any packagename/*.pth files exist, and then process their contents.
I don't think there's a need to have a special method for executing a package __init__, since what you'd do in the case where there are .pth files but no __init__ is simply to continue the search to the end of sys.path (or the parent package __path__), and *then* create the module with an appropriate __path__.
If at any point the find_module() call succeeds, then subsequent importers will just be asked for .pth files, which can then be processed into the __path__ of the now-loaded module.
IOW, something like this (very rough draft):
    pth_contents = []
    module = None

    for pathitem in syspath_or_parent__path__:

        importer = pkgutil.get_importer(pathitem)
        if importer is None:
            continue

        if module is None:
            try:
                loader = importer.find_module(fullname)
            except ImportError:
                pass
            else:
                # errors here should propagate
                module = loader.load_module(fullname)
                if not hasattr(module, '__path__'):
                    # found, but not a package
                    return module

        pc = get_pth_contents(importer)
        if pc is not None:
            subpath = os.path.join(pathitem, modulebasename)
            pth_contents.append(subpath)
            pth_contents.extend(pc)
            if '*' not in pth_contents:
                # got a package, but not a namespace
                break

    if pth_contents:
        if module is None:
            # No __init__, but we have paths, so make an empty package
            module = # new module object w/empty __path__
        modify__path__(module, pth_contents)

    return module
Is it wise to modify __path__ post-import? Today people can make sure that __path__ is set to what they want before potentially reading it in their __init__ module by making the pkgutil.extend_path() call first. This would actually defer to after the import and thus not allow any __init__ code to rely on what __path__ eventually becomes.
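For reference, the current pattern being described looks roughly like the sketch below. The package name demo_ns, the PATH_SEEN_BY_INIT variable, and the temp-directory scaffolding are made up for illustration; only pkgutil.extend_path() is the real API. It shows that code in __init__ placed after the extend_path() call already sees the extended __path__.

```python
import importlib
import os
import sys
import tempfile

# Source of each portion's __init__.py: extend __path__ first, then read it.
INIT_SOURCE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
    "PATH_SEEN_BY_INIT = list(__path__)\n"
)


def make_portion(pkgname):
    """Create a fresh sys.path root containing <pkgname>/__init__.py."""
    root = tempfile.mkdtemp()
    pkgdir = os.path.join(root, pkgname)
    os.mkdir(pkgdir)
    with open(os.path.join(pkgdir, "__init__.py"), "w") as f:
        f.write(INIT_SOURCE)
    return root


# Two separate roots, each holding a portion of the same package.
roots = [make_portion("demo_ns"), make_portion("demo_ns")]
sys.path[:0] = roots
importlib.invalidate_caches()

demo_ns = importlib.import_module("demo_ns")
# By the time the third line of __init__ ran, both portions were on __path__.
print(demo_ns.PATH_SEEN_BY_INIT)
```

If __path__ were only adjusted after the import finished, the list captured inside __init__ would hold just the first portion.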
Obviously, the details are all in the 'get_pth_contents()', and 'modify__path__()' functions, and the above process would do extra work in the case where an individual importer implements PEP 382 on its own (although why would it?).
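To make the shape of modify__path__() a little more concrete, here is one possible sketch. The behavior is an assumption, not something settled in this thread: it treats '*' purely as a namespace marker, treats every other entry as a directory path, and skips duplicates.

```python
import types


def modify__path__(module, pth_contents):
    """Append the collected entries to module.__path__ (hypothetical semantics)."""
    path = list(getattr(module, "__path__", []))
    for entry in pth_contents:
        # Assumption: '*' only marks the package as a namespace; it is not a path.
        if entry == "*" or entry in path:
            continue
        path.append(entry)
    module.__path__ = path
    return module


# Usage: an "empty package" with no __init__, built only from collected paths.
ns = types.ModuleType("ns")
ns.__path__ = []
modify__path__(ns, ["/a/ns", "*", "/b/ns", "*"])
```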
It's also the case that this algorithm will be slow to fail imports when implemented as a meta_path hook, since it will be doing an extra pass over sys.path or the parent __path__, in addition to the one that's done by the normal __import__ machinery. (Though that's not an issue for Python 3.x, since this can be built into the core __import__).
(Technically, the 3.x version should probably ask meta_path hooks for their .pth files as well, but I'm not entirely sure that that's a meaningful thing to ask.)
The PEP 302 questions all boil down to how get_pth_contents() is implemented, and whether 'subpath' really should be created with os.path.join. One option is simply to add a get_pth_contents() method to the importer protocol (that returns None or a list of lines), and maybe a get_subpath(modulename) method that returns the path string that should be used for a subdirectory importer (i.e. __path__ entry), or None if no such subpath exists.
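As a rough illustration of what those two methods might look like for a plain directory importer, something like the following could work. DirImporter is a made-up class, neither method exists in PEP 302 today, and the modulename argument to get_pth_contents() is my own assumption (the draft above passes only the importer).

```python
import os


class DirImporter:
    """Hypothetical directory-backed importer; only the proposed methods are sketched."""

    def __init__(self, path):
        self.path = path

    def get_subpath(self, modulename):
        # Path string to use as a __path__ entry, or None if no such subdirectory.
        candidate = os.path.join(self.path, modulename)
        return candidate if os.path.isdir(candidate) else None

    def get_pth_contents(self, modulename):
        # Non-blank lines of every <modulename>/*.pth file, or None if there are none.
        subpath = self.get_subpath(modulename)
        if subpath is None:
            return None
        names = sorted(n for n in os.listdir(subpath) if n.endswith(".pth"))
        if not names:
            return None
        lines = []
        for name in names:
            with open(os.path.join(subpath, name)) as f:
                lines.extend(line.strip() for line in f if line.strip())
        return lines
```

A zipfile or database importer would answer the same two questions from its own storage, which is the point of not hard-coding os.path.join and listdir in the import process.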
Code already out there uses os.path.join() to extend __path__ (e.g. Django), so I would stick with that unless we want to start transitioning to '/' only.