[Python-Dev] Re: [Zope3-dev] Zip import and sys.path manipulation (was Re: directory hierarchy proposal)

Guido van Rossum guido@python.org
Mon, 16 Dec 2002 11:42:34 -0500


> > Time's running short (we're close to releasing Python 2.3 alpha 1);
> > we should decide on a name and a specific policy ASAP.  I propose the
> > following:
> > 
> >   from pkgutil import extended_path
> >   __path__ = extended_path(__path__, __name__)
> > 
> > Where pkgutil.py could be something like this:
> > 
> >   import os, sys
> > 
> >   def extended_path(path, name):

(BTW I now think extend_path() is a better and simpler name for this.)

> >       """Extend a package's path.  (XXX more.)"""
> >       path = path[:]
> >       for dir in sys.path:
> >           if isinstance(dir, (str, unicode)):
> >               dir = os.path.join(dir, name)
> >               if dir not in path and os.path.isdir(dir):
> >                   path.append(dir)
> >       return path
> 
> Are you proposing this for the standard library?

Yes.

> It certainly sounds reasonable, but the documentation should make
> clear that it still fails for certain types of sys.path entry.
> 
> Specifically,
> 
>   - string subclasses will get converted into basic strings

I find string subclasses an ugly hack that should not be used.  The
meta hook (or whatever it's called now) should do this.

>   - strings which don't have a path structure will break

We can fix that by skipping anything for which os.path.isdir(dir) [for
the original value of dir] isn't true.
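
A minimal sketch of that fix, written against a modern Python where str
covers what (str, unicode) covered here. The variable names entry and
candidate are mine, and this is an illustration of the idea rather than
the final pkgutil code:

```python
import os
import sys


def extend_path(path, name):
    """Return a copy of path extended with name subdirectories found on sys.path."""
    path = list(path)  # copy, so the caller's list is not mutated
    for entry in sys.path:
        # Skip non-string entries (e.g. importer objects) and strings that
        # do not name an existing directory (URLs, zip files, typos, ...).
        if not isinstance(entry, str) or not os.path.isdir(entry):
            continue
        candidate = os.path.join(entry, name)
        if candidate not in path and os.path.isdir(candidate):
            path.append(candidate)
    return path
```

With this check, a sys.path entry that is not an existing directory is
simply ignored instead of being fed to os.path.join().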

> (and of course, objects other than string subclasses don't get
> "extended").

We can use some kind of extension hook to deal with those.  The
important point is that the idiom placed in __init__.py can remain the
same (the two lines quoted above) while the implementation of
pkgutil.extend_path() can evolve or be modified (even monkey-patched
if you're desperate) as the import hooks evolve.

The main reason for putting this in the stdlib is to avoid the need
for boilerplate code in everybody's __init__.py.

> I don't think that either of these is even remotely a
> showstopper, but they do need to be documented - specifically
> because as a module, this code can be imported by anybody, and so
> the constraints on the user should be made clear.

Of course.  Though with the current state of documentation, I give
less priority to documenting every little detail of this code than to
getting it in the code base (with *some* documentation, for sure).

> In the long term, I think the two most likely things to appear on
> sys.path, apart from directory names, are
> 
>   - "Filesystem namespace extensions" like the zipfile support.
>     That is, things which act like virtual directories.
>   - raw importer objects
> 
> Frankly, just dumping a raw importer onto sys.path (or a package
> path) is so much easier than implementing a string naming
> scheme, that unless there's an overwhelmingly natural string
> representation (virtual directory or URL are the only two that
> come to mind) it's simpler not to bother.

OTOH I expect that most things you'd want to place on sys.path *do*
have a natural string representation.  After all, if something doesn't
have a name, it's hard to talk about.  And if it has a name, it can be
placed in PYTHONPATH (with the minor quibble that URLs use ':' which
-- at least on Unix -- is also used as the separator in *PATH
variables. :-( )
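
To make the quibble concrete, here is a small sketch. The helper
split_search_path() and the example.org URL are made up for
illustration, and ':' is passed explicitly so the result does not
depend on the platform's os.pathsep:

```python
import os


def split_search_path(value, sep=os.pathsep):
    """Split a PYTHONPATH-style string into entries, dropping empty ones."""
    return [part for part in value.split(sep) if part]


# With the Unix separator, the URL is cut at the ':' after its scheme.
entries = split_search_path("/usr/lib/python:http://example.org/modules", sep=":")
# entries == ['/usr/lib/python', 'http', '//example.org/modules']
```

So a URL entry would need some quoting or escaping convention, or a
different separator, before it could be passed through PYTHONPATH on
Unix.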

> So saying that module paths should only contain pathlike strings,
> or importer objects, is probably reasonable. I wouldn't recommend
> enforcing this, just documenting it as the normal convention.
> Modules can document that they don't work with applications that
> violate this convention, and relentless experimentalists still
> have the freedom to break the recommendation (at their own risk!)
> if they want to see what happens...

Maybe importer objects should have a standard API that lets you ask
for an extended object.
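
Purely as a sketch of what I mean: the method name extend(), the class
DirImporter and the helper extend_entry() below are all hypothetical,
and no such API exists today.

```python
import os


class DirImporter:
    """Hypothetical importer object wrapping a directory-like location."""

    def __init__(self, location):
        self.location = location

    def extend(self, name):
        """Hypothetical hook: return an importer for sub-location name, or None."""
        candidate = os.path.join(self.location, name)
        return DirImporter(candidate) if os.path.isdir(candidate) else None


def extend_entry(entry, name):
    """Return the extended form of one sys.path entry, or None if there is none."""
    if isinstance(entry, str):
        candidate = os.path.join(entry, name)
        return candidate if os.path.isdir(candidate) else None
    # Non-string entries are asked via the (hypothetical) extend() hook.
    hook = getattr(entry, "extend", None)
    return hook(name) if callable(hook) else None
```

A path-extending helper could then treat strings and importer objects
uniformly, and objects without the hook would just be skipped.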

--Guido van Rossum (home page: http://www.python.org/~guido/)