[Python-Dev] New and Improved Import Hooks

M.-A. Lemburg mal@lemburg.com
Thu, 05 Dec 2002 12:26:36 +0100


Just van Rossum wrote:
> M.-A. Lemburg wrote:
> 
> 
>>Here's a sketch:
>>
>>1. User programs register import hooks based on REs which are
>>    used to match the entries in sys.path, e.g. ".*\.zip" for
>>    ZIP importers (caching could help in improving the mapping
>>    performance).
>>
>>2. When Python sees an import request, it scans sys.path and
>>    creates hook objects for each entry which it then calls
>>    to say "go look and check whether you have module X" until
>>    one of the hooks succeeds.
>>
>>3. Python then uses the hook object to complete the import
>>    in much the same way as e.g. SAX parsers call out to
>>    event handlers.
> 
> 
> Nice; the hooks can then be cached in a dict, as in iu.py, with the path entry
> as the key.

Right.

> This makes bootstrapping a bit harder, though, as now we also need re/sre/_sre
> to be available before hooked imports can work...

Right again. We need to provide a basic set of built-in
solutions which work without requiring any .py modules to be
loaded... or maybe trim down the registration logic to just
look for suffixes in sys.path, e.g. '.zip' is for the ZIP importer
hook, '.exe' for the resource importer hook, etc.
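
The suffix-only variant could look roughly like this. It is a sketch under
assumptions: SUFFIX_HOOKS and hook_name_for are made-up names, and the table
values are plain labels standing in for real hook objects. The point is that
a plain string comparison needs no re/sre/_sre at bootstrap time.

```python
# Hypothetical suffix table; the values are placeholders for hook objects.
SUFFIX_HOOKS = {
    ".zip": "zip importer",
    ".exe": "resource importer",
}


def hook_name_for(path_entry):
    """Pick a hook by the suffix of a sys.path entry, without using regexes."""
    for suffix, hook_name in SUFFIX_HOOKS.items():
        if path_entry.endswith(suffix):
            return hook_name
    # Ordinary directories fall through to the default import logic.
    return None
```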

>>The idea is to reuse as much of the existing import machinery
>>as possible -- writing these hooks in C wouldn't be too hard
>>either.
> 
> That's exactly what my patch already does: it leaves most of import.c intact
> and adds no duplication.

That's good :-). I haven't followed the thread too closely, but since
this debate has been going on for years, I thought I'd just drop
in a few lines ;-)

> I'd argue that *implementation*-wise it's simpler to just allow the sys.path
> entry to handle the request. I also don't see a problem with it design-wise,
> apart from b/w compatibility issues (which I think are non-issues if we use str
> subclasses).
> 
> Are people against the whole *idea* of having non-strings on sys.path, or is
> it "only" a b/w compatibility concern?

Since sys.path is used by quite a few applications directly,
the backward compatibility argument is a strong one. I also
think that adding too much magic (like having importers be
subclasses of str) will only result in people not using the
new techniques.

From a user POV, I would like to add the path to a ZIP archive
to sys.path and be done with it. From a software vendor POV,
I may also be interested in adding support for signed/encrypted PYC
files within those ZIP archives, so appropriate hooks to
be able to add that support would be nice as well.
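
As an illustration of the user-facing behaviour described above, the sketch
below relies on the ZIP import support found in the standard library of later
Python releases (zipimport), which did not exist when this was written. The
helper name import_from_zip, the archive name bundle.zip, and the module name
are made up for the example; signed or encrypted PYC handling is not shown.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write source into a fresh ZIP archive, put it on sys.path, import it."""
    tmpdir = tempfile.mkdtemp()
    archive = os.path.join(tmpdir, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    # From the user's point of view this one line is the whole setup.
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


mod = import_from_zip("zipped_demo_module", "ANSWER = 42\n")
```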

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/