Re: [Python-Dev] Import redesign [LONG]

Dec. 7, 1999

      On Mon, 6 Dec 1999, James C. Ahlstrom wrote:
...
Greg Stein wrote:
...
...
I am not following this. What/where is the "single dictionary of module
names"? Are you referring to a cache? Or is this about building an
archive?
An archive would look just like we have now: map a name to a module. It
would not need multiple dictionaries.
The "single dictionary of names" is in the single archive importer
instance and has nothing to do with creating the archive.  It
is currently programmed this way.
Ah. There is the problem. In Guido's suggestion for the "next path of
inquiry" :-), there is no "single dictionary of names". Instead, you have
Importer instances as items in sys.path. Each instance maintains its
dictionary, and they are not (necessarily) combined.

If we were to combine them, then we would need to maintain the ordering
requirements implied by sys.path. However, this would be problematic if
sys.path changed -- we would have to detect the situation and rebuild a
merged dict.
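
To make the ordering problem concrete, here is a minimal sketch. It assumes toy importers that each hold a plain name-to-source dict; DictImporter, find_module and merge_dicts are hypothetical names for illustration and are not imputil's API. Walking the path always reflects the current order, while a merged dict goes stale as soon as the path changes.

```python
# Hypothetical sketch: these classes and functions are illustrative only,
# not part of imputil.

class DictImporter:
    """Toy importer that owns its own name -> source mapping."""

    def __init__(self, label, modules):
        self.label = label
        self.modules = dict(modules)

    def get_code(self, fqname):
        # Return (label, source) if this importer knows the module, else None.
        if fqname in self.modules:
            return self.label, self.modules[fqname]
        return None


def find_module(path, fqname):
    """Walk the path in order; the first importer that answers wins."""
    for importer in path:
        found = importer.get_code(fqname)
        if found is not None:
            return found
    return None


def merge_dicts(path):
    """Build one merged dict that honours path order (earlier entries win)."""
    merged = {}
    # Insert later entries first so that earlier ones overwrite them.
    for importer in reversed(path):
        for name, source in importer.modules.items():
            merged[name] = (importer.label, source)
    return merged


a = DictImporter("a.pyl", {"foo": "x = 1", "bar": "y = 1"})
b = DictImporter("b.pyl", {"foo": "x = 2"})
path = [a, b]

merged = merge_dicts(path)   # snapshot taken while a.pyl comes first
path.reverse()               # the path changes after the merge was built

stale = merged["foo"]             # still answers from a.pyl
fresh = find_module(path, "foo")  # answers from b.pyl, the new first entry
```

The merged dict only stays correct if something notices the path change and calls merge_dicts() again, which is the detection-and-rebuild cost described above.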
...
Suppose the user specifies by name 12 archive files to be searched.
That is, the user hacks site.py to add archive names to the importer.
The "single dictionary" means that the archive importer takes the 12
dictionaries in the 12 files and merges them together into one dictionary
in order to speed up the search for a name.  The good news is you can
always just call the archive importer to get a module.  The bad news is
you can't do that for each entry on sys.path because there is no
necessary identity between archive files and sys.path.  The user
specified the archive files by name, and they may or may not be on
sys.path, and the user may or may not have specified them in the
same order as sys.path even if they are.
The importer must be inserted into sys.path to establish a precedence. If
the user wants to add 12 libraries... fine. But *all* of those modules
will fall under a precedence defined by the Importer's position on
sys.path.
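
A rough sketch of that precedence rule, under the assumption that an archive importer merges its libraries into one dict and occupies a single slot on the path. TableImporter, ArchiveImporter and find_module are hypothetical names, not imputil's API. Every module from every merged library inherits the one position the importer holds.

```python
# Hypothetical sketch; none of these names come from imputil.

class TableImporter:
    """Stand-in for any other path entry: answers from one name table."""

    def __init__(self, label, table):
        self.label = label
        self.table = dict(table)

    def get_code(self, fqname):
        if fqname in self.table:
            return self.label, self.table[fqname]
        return None


class ArchiveImporter:
    """Merges several archives' tables into a single dictionary."""

    def __init__(self, archives):
        # archives: (archive_name, {module_name: source}) pairs in the order
        # the user listed them; the first archive to define a name wins.
        self.names = {}
        for archive_name, table in archives:
            for modname, source in table.items():
                self.names.setdefault(modname, (archive_name, source))

    def get_code(self, fqname):
        return self.names.get(fqname)


def find_module(path, fqname):
    """Walk the path in order; the first importer that answers wins."""
    for importer in path:
        found = importer.get_code(fqname)
        if found is not None:
            return found
    return None


site_dir = TableImporter("site-dir", {"foo": "x = 'dir'"})
archives = ArchiveImporter([
    ("one.pyl", {"foo": "x = 'one'", "bar": "y = 'one'"}),
    ("two.pyl", {"bar": "y = 'two'", "baz": "z = 'two'"}),
])

# With the archive importer second, the directory shadows its "foo".
before = find_module([site_dir, archives], "foo")
# Moving the importer to the front moves all of its libraries with it.
after = find_module([archives, site_dir], "foo")
```

Ordering among the merged libraries is purely internal to the importer; the path only sees one entry.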
...
Suppose archive files must lie on sys.path and are processed in order.
Then to find them you must know their name.  But IMHO you want to
avoid doing a readdir() on each element of sys.path and looking for
files *.pyl.
I do not believe that we will arbitrarily locate and open library files.
They must be specified explicitly.
...
Suppose archive file names in general are the known name "lib.pyl"
for the Python library, plus the names "package.pyl" where "package"
can be the name of a Python package as a single archive file.  Then
if the user tries to import foo, imputil will search along sys.path
looking for foo.pyc, foo.pyl, etc.  If it finds foo.pyl, the archive
importer will add it to its list of known archive files.  But it must
not add it to its single dictionary, because that would destroy the
information about its position along sys.path.  Instead, it must keep
a separate dictionary for each element of sys.path and search the
separate dictionaries under control of imputil.  That is, get_code()
needs a new argument for the element of sys.path being searched.
Alternatively, you could create a new importer instance for each
archive file found, but then you still have multiple dictionaries.
They are in the multiple instances.
If the user installs ".pyl" as a recognized extension (i.e. installs into
the PathImporter), then the above scenario is possible. In my
in-head-design, I had not imagined any state being retained for
extension-recognizer hooks. Of course, state can be retained simply by
using a bound-method for the hook function.

get_code() would not need to change. The foo.pyl would be consulted at the
appropriate time based on where it is found in sys.path. Note that
file-extension hooks would definitely have a complete path to the target file.
Those are not Importers, however (although they will closely follow the
get_code() hook since the extension hook is called from get_code()).
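
A minimal sketch of the extension-hook idea, with loud assumptions: PathImporter here is a toy stand-in for one directory on the path, register_extension() is a made-up registration method, and ArchiveHook.load() only reads the file as text instead of opening a real archive. It shows the two points above: the hook receives a complete pathname, and a bound method is enough to retain state between calls.

```python
# Hypothetical sketch: PathImporter, register_extension and ArchiveHook are
# invented for illustration and do not mirror imputil's real interfaces.
import os
import tempfile


class PathImporter:
    """Toy importer bound to one directory, with per-extension hooks."""

    def __init__(self, directory):
        self.directory = directory
        self.hooks = []  # (extension, hook) pairs, tried in order

    def register_extension(self, ext, hook):
        self.hooks.append((ext, hook))

    def get_code(self, fqname):
        # The hook is called from get_code() with the full path to the file.
        for ext, hook in self.hooks:
            pathname = os.path.join(self.directory, fqname + ext)
            if os.path.isfile(pathname):
                return hook(pathname, fqname)
        return None


class ArchiveHook:
    """Keeps state across calls simply by being used as a bound method."""

    def __init__(self):
        self.seen = []  # every archive path this hook has been handed

    def load(self, pathname, fqname):
        self.seen.append(pathname)
        with open(pathname) as f:
            return f.read()  # placeholder for real archive handling


def find_module(path, fqname):
    """Walk the path in order; the first importer that answers wins."""
    for importer in path:
        found = importer.get_code(fqname)
        if found is not None:
            return found
    return None


hook = ArchiveHook()
with tempfile.TemporaryDirectory() as first, \
        tempfile.TemporaryDirectory() as second:
    expected = os.path.join(second, "foo.pyl")
    with open(expected, "w") as f:
        f.write("x = 1\n")

    path = [PathImporter(first), PathImporter(second)]
    for importer in path:
        importer.register_extension(".pyl", hook.load)

    # foo.pyl is found only when the walk reaches the second directory.
    code = find_module(path, "foo")
    missing = find_module(path, "nothing")
```

Because the position of foo.pyl is decided by which directory entry finds it, get_code() needs no extra argument for the path element being searched.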