[Python-Dev] Re: [Zope3-dev] Zip import and sys.path manipulation (was Re: directory hierarchy proposal)
Guido van Rossum
guido@python.org
Mon, 16 Dec 2002 11:42:34 -0500
> > Time's running short (we're close to releasing Python 2.3 alpha 1);
> > we should decide on a name and a specific policy ASAP. I propose the
> > following:
> >
> > from pkgutil import extended_path
> > __path__ = extended_path(__path__, __name__)
> >
> > Where pkgutil.py could be something like this:
> >
> > import os
> > import sys
> >
> > def extended_path(path, name):
(BTW I now think extend_path() is a better and simpler name for this.)
> >     """Extend a package's path. (XXX more.)"""
> >     path = path[:]
> >     for dir in sys.path:
> >         if isinstance(dir, (str, unicode)):
> >             dir = os.path.join(dir, name)
> >             if dir not in path and os.path.isdir(dir):
> >                 path.append(dir)
> >     return path
>
> Are you proposing this for the standard library?
Yes.
> It certainly sounds reasonable, but the documentation should make
> clear that it still fails for certain types of sys.path entry.
>
> Specifically,
>
> - string subclasses will get converted into basic strings
I find string subclasses an ugly hack that should not be used. The
meta hook (or whatever it's called now) should do this.
> - strings which don't have a path structure will break
We can fix that by skipping anything for which os.path.isdir(dir) [for
the original value of dir] isn't true.
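A minimal sketch of that fix, written for a current Python (so plain str stands in for the (str, unicode) pair above). The name entry and the early continue are illustrative choices, not part of the proposal; the only change in behavior is that a sys.path item is ignored unless it is a string naming an existing directory.

```python
import os
import sys


def extend_path(path, name):
    """Return a copy of path extended with name/ subdirectories found on sys.path."""
    path = list(path)  # work on a copy; the caller's list is left alone
    for entry in sys.path:
        # Skip non-strings, and strings whose *original* value is not a directory.
        if not isinstance(entry, str) or not os.path.isdir(entry):
            continue
        candidate = os.path.join(entry, name)
        if candidate not in path and os.path.isdir(candidate):
            path.append(candidate)
    return path
```

With this check, an entry such as a URL-like string or an importer object is passed over silently instead of being fed to os.path.join().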
> (and of course, objects other than string subclasses don't get
> "extended").
We can use some kind of extension hook to deal with those. The
important point is that the idiom placed in __init__.py can remain the
same (the two lines quoted above) while the implementation of
pkgutil.extend_path() can evolve or be modified (even monkey-patched
if you're desperate) as the import hooks evolve.
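To illustrate the idiom, here is a small self-contained demo using pkgutil.extend_path() as it exists in today's standard library. The package name split_pkg_demo, the module names alpha and beta, and the helper functions are all made up for the example; the two temporary directories stand in for two separate sys.path entries that each hold part of the same package.

```python
import importlib
import os
import sys
import tempfile

# The two-line idiom placed in every copy of the package's __init__.py.
IDIOM = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_portion(root, pkg, module, body):
    """Create root/pkg/__init__.py (with the idiom) and root/pkg/module.py."""
    pkg_dir = os.path.join(root, pkg)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(IDIOM)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)


def demo():
    """Import two submodules that live under different sys.path entries."""
    first = tempfile.mkdtemp()
    second = tempfile.mkdtemp()
    make_portion(first, "split_pkg_demo", "alpha", "VALUE = 'alpha'\n")
    make_portion(second, "split_pkg_demo", "beta", "VALUE = 'beta'\n")
    sys.path[:0] = [first, second]
    try:
        importlib.invalidate_caches()
        alpha = importlib.import_module("split_pkg_demo.alpha")
        beta = importlib.import_module("split_pkg_demo.beta")
        return alpha.VALUE, beta.VALUE
    finally:
        sys.path.remove(first)
        sys.path.remove(second)
```

Only the first directory's __init__.py is executed, yet beta is found because extend_path() has added the second directory's package folder to __path__.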
The main reason for putting this in the stdlib is to avoid the need
for boilerplate code in everybody's __init__.py.
> I don't think that either of these is even remotely a
> showstopper, but they do need to be documented - specifically
> because as a module, this code can be imported by anybody, and so
> the constraints on the user should be made clear.
Of course. Though with the current state of documentation, I give
less priority to documenting every little detail of this code than to
getting it in the code base (with *some* documentation, for sure).
> In the long term, I think the two most likely things to appear on
> sys.path, apart from directory names, are
>
> - "Filesystem namespace extensions" like the zipfile support.
> That is, things which act like virtual directories.
> - raw importer objects
>
> Frankly, just dumping a raw importer onto sys.path (or a package
> path) is so much easier than implementing a string naming
> scheme, that unless there's an overwhelmingly natural string
> representation (virtual directory or URL are the only two that
> come to mind) it's simpler not to bother.
OTOH I expect that most things you'd want to place on sys.path *do*
have a natural string representation. After all, if something doesn't
have a name, it's hard to talk about. And if it has a name, it can be
placed in PYTHONPATH (with the minor quibble that URLs use ':' which
-- at least on Unix -- is also used as the separator in *PATH
variables. :-( )
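The quibble is easy to see with a toy splitter. The function split_search_path() and the example.org URL are hypothetical; the split on ':' mirrors how a Unix *PATH variable is conventionally read.

```python
def split_search_path(value, sep=":"):
    """Split a PATH-style string on the separator, dropping empty parts."""
    return [part for part in value.split(sep) if part]


entries = split_search_path("/usr/lib/python:http://example.org/pkgs")
# The URL is torn apart at its own ':' because nothing marks it as one entry.
print(entries)  # ['/usr/lib/python', 'http', '//example.org/pkgs']
```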
> So saying that module paths should only contain pathlike strings,
> or importer objects, is probably reasonable. I wouldn't recommend
> enforcing this, just documenting it as the normal convention.
> Modules can document that they don't work with applications that
> violate this convention, and relentless experimentalists still
> have the freedom to break the recommendation (at their own risk!)
> if they want to see what happens...
Maybe importer objects should have a standard API that lets you ask
for an extended object.
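One way such an API might look, purely as a sketch: the DirImporter class, its extended() method, and the explicit search_path argument are all invented here for illustration and do not correspond to any existing import-hook interface.

```python
import os


class DirImporter:
    """Hypothetical importer object wrapping a directory."""

    def __init__(self, root):
        self.root = root

    def extended(self, name):
        """Return an importer for the name/ subdirectory, or None if absent."""
        candidate = os.path.join(self.root, name)
        return DirImporter(candidate) if os.path.isdir(candidate) else None

    def __eq__(self, other):
        return isinstance(other, DirImporter) and other.root == self.root

    def __hash__(self):
        return hash(self.root)


def extend_path(path, name, search_path):
    """Extend path using directory strings and objects that offer extended()."""
    path = list(path)
    for entry in search_path:
        if isinstance(entry, str):
            if not os.path.isdir(entry):
                continue
            extra = os.path.join(entry, name)
            if not os.path.isdir(extra):
                continue
        elif hasattr(entry, "extended"):
            extra = entry.extended(name)  # ask the importer for its extension
            if extra is None:
                continue
        else:
            continue  # unknown kind of entry: leave it alone
        if extra not in path:
            path.append(extra)
    return path
```

The two lines in __init__.py would stay exactly as they are; only the body of extend_path() learns about the new kind of entry.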
--Guido van Rossum (home page: http://www.python.org/~guido/)