[Python-Dev] zipimport & import hooks

Just van Rossum just@letterror.com
Fri, 6 Dec 2002 12:01:23 +0100


I think we can separate the zipimport issue from the import hooks issue, but
only to an extent. *Some* hook needs to be installed in import.c for zipimports
to work, but it doesn't necessarily mean we must add a general new import hook
scheme *now*. But then again, it might be a *very* small step from enabling
zipimports to exposing the needed hook and having a clean and general solution.

It seems the majority doesn't like non-strings on sys.path, which to me is
understandable, especially if you take PYTHONPATH into consideration. Let's try
to focus on that.

Here's a minimal implementation idea that will allow .zip filenames on sys.path:

    zipimporters = {}  # a cache, private to import.c
    for p in sys.path:
        if p.endswith(".zip"):
            z = zipimporters.get(p)
            if z is None:
                z = zipimporters[p] = zipimporter(p)
            loader = z.find_module(name)
            ...etc...
        else:
            ...builtin import...
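
To make the goal concrete, here is a small sketch of the user-visible effect,
assuming an interpreter in which .zip entries on sys.path are supported. The
archive name and the module name zipdemo_mod are made up for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding a single module (names are hypothetical).
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "demo_archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipdemo_mod.py", "ANSWER = 42\n")

# The archive's filename goes on sys.path as an ordinary string.
sys.path.insert(0, archive)
importlib.invalidate_caches()

import zipdemo_mod  # found inside the archive, not on the file system
```

Note that sys.path still contains only strings, which is the point of the
approach.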

This gets necessarily more complex for packages, as now we also have to cater
for __path__ items of this form: "/path/to/my/archive.zip/packagedir". In this
case we need to create a "subimporter" of "/path/to/my/archive.zip" if only
because we don't want to reread the zip archive's file index. By now the
machinery needed can be written like so:

    zipimporters = {}  # a cache, private to import.c

    def get_zipimporter(p):
        if p.endswith(".zip"):
            return zipimporter(p)
        if not os.path.exists(p):
            pos = p.rfind(".zip")
            if pos > 0:
                archive = p[:pos + 4]
                subpath = p[pos + 5:]  # skip initial sep
                z = zipimporters[archive]
                return z.subimporter(subpath)

    for p in sys.path:
        importer = zipimporters.get(p, -1)
        if importer == -1:
            importer = zipimporters[p] = get_zipimporter(p)
        if importer is not None:
            loader = importer.find_module(name)
            ...etc...
        else:
            ...builtin import...
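
The path splitting done in get_zipimporter() can be tried on its own. This is
only a sketch: split_zip_path is a hypothetical helper name, it leaves out the
os.path.exists() test, and unlike the pseudocode above it also checks that a
separator really follows ".zip".

```python
import os


def split_zip_path(p):
    """Split "/a/b.zip/pkg" into ("/a/b.zip", "pkg").

    Hypothetical helper; returns None when p has no usable ".zip" component.
    """
    if p.endswith(".zip"):
        return p, ""
    pos = p.rfind(".zip")
    # Require a separator right after ".zip" so "b.zipper" is not mistaken
    # for an archive (an extra check, not in the pseudocode).
    if pos > 0 and p[pos + 4] in ("/", os.sep):
        archive = p[:pos + 4]
        subpath = p[pos + 5:]  # skip the separator after ".zip"
        return archive, subpath
    return None


parts = split_zip_path("/path/to/my/archive.zip/packagedir")
```

Here parts holds the archive path, which serves as the cache key, and the
subpath that would be handed to subimporter().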

And *this* can easily be rewritten in a slightly more general form:

    path_importers = {}  # exposed in sys
    import_hooks = [get_zipimporter]  # exposed in sys

    # get_zipimporter() implementation left out for brevity

    def get_importer(p):
        for hook in import_hooks:
            importer = hook(p)
            if importer is not None:
                return importer
        return None

    for p in sys.path:
        importer = path_importers.get(p, -1)
        if importer == -1:
            importer = path_importers[p] = get_importer(p)
        if importer is not None:
            loader = importer.find_module(name)
            ...etc...
        else:
            ...builtin import...
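
For anyone who wants to play with the idea outside import.c, the same lookup
can be sketched as runnable Python. Everything beyond path_importers,
import_hooks and get_importer is an assumption made for the demo: DictImporter,
demo_hook, find_loader and the "demo:" path entry are hypothetical, and a
private sentinel object stands in for the -1 used above.

```python
path_importers = {}  # cache: path entry -> importer, or None if no hook wants it
import_hooks = []    # callables taking a path entry, returning an importer or None

_MISSING = object()  # sentinel meaning "path entry not looked up yet"


class DictImporter:
    """Hypothetical importer that 'finds' the module names given to it."""

    def __init__(self, names):
        self.names = set(names)

    def find_module(self, name):
        # Return a loader-like object (here simply self) or None.
        return self if name in self.names else None


def get_importer(p):
    # Ask each hook in turn; the first one that recognizes p wins.
    for hook in import_hooks:
        importer = hook(p)
        if importer is not None:
            return importer
    return None


def find_loader(name, path):
    for p in path:
        importer = path_importers.get(p, _MISSING)
        if importer is _MISSING:
            importer = path_importers[p] = get_importer(p)
        if importer is not None:
            loader = importer.find_module(name)
            if loader is not None:
                return loader
        # else: this is where the builtin import would be tried
    return None


def demo_hook(p):
    # Hypothetical hook: only claims the made-up path entry "demo:".
    return DictImporter(["spam"]) if p == "demo:" else None


import_hooks.append(demo_hook)
loader = find_loader("spam", ["/nonexistent", "demo:"])
```

After the call, path_importers maps "/nonexistent" to None, so the hooks are
not consulted again for that entry, and "demo:" to its importer.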
            
Which, an observant reader might notice, is a stripped down version of Gordon's
iu.py, with s/shadowpath/path_importers/ and s/ownertypes/import_hooks/ and
without his metapath. (The metapath is a great idea, but I don't think it's
possible to do without a *major* rewrite of import.c. The rest is relatively
easy to do; in fact, I've got it mostly working already.)

Just