Greg Stein wrote:
On Mon, 27 Sep 1999, Gordon McMillan wrote:
M.-A. Lemburg wrote: [msg 1]
Currently, the imputil approach uses a simple chaining technique. Unfortunately, it doesn't allow inspecting the chain for already loaded hooks, so the same type of hook could be loaded more than once.
You're associating the hook with the strategy. That's the old style. The imputil style is to associate the hook with the actual stuff being managed. The strategy is a property of the hook.
Quite true. The chaining is simply an artifact of what has been installed as the import hook. I've always envisioned the potential for an "Importer Manager" that installs just like any other hook, but provides higher-level functions for importers to install themselves. The manager simply delegates the get_code() function to the sub-importers. Of course, the manager could use whatever technique it likes to improve the speed of Importer selection.
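For illustration only, a minimal sketch of such a manager. Everything here is an assumption rather than imputil's actual API: the ImportManager and DictImporter names, the install() method, and a simplified one-argument get_code(fqname) that returns a code object or None.

```python
class ImportManager:
    """Hypothetical manager: holds sub-importers and delegates get_code()."""

    def __init__(self):
        self._importers = []

    def install(self, importer):
        # Inspect the list so the same type of hook is not loaded twice.
        if any(type(i) is type(importer) for i in self._importers):
            return False
        self._importers.append(importer)
        return True

    def get_code(self, fqname):
        # Ask each sub-importer in turn; None means "not mine".
        for importer in self._importers:
            result = importer.get_code(fqname)
            if result is not None:
                return result
        return None


class DictImporter:
    """Toy importer that serves source text from an in-memory dict."""

    def __init__(self, sources):
        self._sources = sources

    def get_code(self, fqname):
        source = self._sources.get(fqname)
        if source is None:
            return None
        return compile(source, "<%s>" % fqname, "exec")
```

A plain list keeps the installed importers inspectable; a manager could just as well index them by name prefix if selection speed matters.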
With respect to speed, I think the main point is to realize that the imputil technique is not inherently slow. It just depends on how you design your Importer subclasses -- do you install one or a hundred Importers?
The imputil scheme is more about simplifying how people hook into the process (implement get_code() rather than a load/import combo). It also provides a simple capability (chaining) to allow *multiple* hooks to be installed.
As I wrote in my reply to Gordon, this setup has some drawbacks which an "Import Manager" could easily solve, e.g. by using a list of importers.
Also, there are at least two types of hooks:
hooks that redirect the import to some other data source
hooks that modify the way modules are searched
Just one way -- your second is a variant of the first. "other data source" is a functional superset which includes searching. Importers don't simply alter searching -- they must perform the actual import (from wherever).
Yes, I was just arguing for two types of functionality, not the old scheme. E.g. the Import Manager could provide a set of filters which implement signature checks or know how to un-gzip code, plus a set of lookup functions for scanning directories or zip archives.
I would like the importers to take advantage of such functionality. Of course, all of this could be implemented in form of classes which the importers then use as mixin classes.
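A rough sketch of the mixin idea, under stated assumptions: the class names, the filter() and lookup() methods, and the one-argument get_code(fqname) are all hypothetical, and an in-memory dict stands in for a directory or zip archive scan.

```python
import gzip


class GzipFilterMixin:
    """Filter step: transparently un-gzip module data."""

    def filter(self, data):
        # gzip streams start with the magic bytes 1f 8b.
        if data[:2] == b"\x1f\x8b":
            return gzip.decompress(data)
        return data


class DictLookupMixin:
    """Lookup step: a dict stands in for a directory or archive scan."""

    def lookup(self, fqname):
        return self.store.get(fqname)


class FilteringImporter(GzipFilterMixin, DictLookupMixin):
    def __init__(self, store):
        self.store = store

    def get_code(self, fqname):
        data = self.lookup(fqname)
        if data is None:
            return None  # not mine
        return compile(self.filter(data), "<%s>" % fqname, "exec")
```

The importer itself only glues the two steps together, so a different lookup or filter can be swapped in by changing the base classes.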
This is the big change in mindset from the "ihooks" method -- find it and import it on the spot. The net effect is an Importer either imports a module or it doesn't (and the system can fall back to try another Importer).
[ one of the examples that people always liked to cite was importing via URL, which was actually quite difficult to do in the old scheme -- how do you separate an HTTP GET into a find/load step? Effectively, you had to double-fetch, or you had to place the whole module (which you retrieved during the find step) into your context for passing to the load. The other issue was that the distinct semantics also implied that you could separate the functions -- I believe that to be quite unnecessary functionality. ]
.get_code() is fine for these kinds of tasks, but there are some other areas (such as lazy imports) which work better using the split setup. This is pretty easy to implement btw: just have the Import Manager check whether the importer provides .get_code() and fall back to using .find_module() and .load_module() if it doesn't.
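A minimal sketch of that check, with loud assumptions: get_code_via() and both toy importers are hypothetical, and the signatures are simplified (the real find/load signatures in ihooks differ). Strings stand in for code objects.

```python
def get_code_via(importer, fqname):
    """Prefer one-step get_code(); otherwise fall back to find/load."""
    get_code = getattr(importer, "get_code", None)
    if callable(get_code):
        return get_code(fqname)
    found = importer.find_module(fqname)
    if found is None:
        return None  # not found: let the next importer try
    return importer.load_module(fqname, found)


class OneStepImporter:
    """Toy importer using the single get_code() entry point."""

    def get_code(self, fqname):
        return "one-step:" + fqname


class LazyTwoStepImporter:
    """Toy split importer: find is cheap, load does the real work."""

    known = {"spam"}

    def find_module(self, fqname):
        return fqname if fqname in self.known else None

    def load_module(self, fqname, found):
        return "two-step:" + found
```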
The more I think about it, the more I like the idea of an Import Manager instead of the chaining approach.
Since the first variant may well also be suited to be used by the second, the simple chaining method probably won't be powerful enough to handle it.
The top level question is "is it mine to import?". Greg provides a framework that makes it easy to use alternate data sources and alternate ways of finding things, but that's not really the key thing. You're a "good" importer if you can (when appropriate) say "no it's not mine" efficiently.
Another quirk that I think needs fixing:
When I issue an import:
the whole import is handled by the importer installed at the start of the import. It is not possible to install a different importer e.g. in mx/__init__.py to handle the rest of the import (in this case the import of subpackage DateTime). I think that the importer should honor the __importer__ function (this is set by imputil) if present to let it continue the import of subsequent elements in the dotted name.
Sure you can. Your first importer is the "mx" importer. It has a dict of sub-importers. When mx/DateTime/__init__.py runs, it puts itself into that dict. The importer chain is now a tree.
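To make the "chain becomes a tree" picture concrete, here is a small sketch. The TreeImporter class and its register() and resolve() methods are hypothetical; it only shows how a dotted name could be routed to the deepest registered sub-importer, not how the actual import is carried out.

```python
class TreeImporter:
    """Hypothetical importer node holding a dict of sub-importers."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # sub-importers keyed by the next name component

    def register(self, child):
        # A subpackage's __init__.py could call this to add its own importer.
        self.children[child.name] = child

    def resolve(self, fqname):
        """Return the deepest importer responsible for fqname, or None."""
        parts = fqname.split(".")
        if parts[0] != self.name:
            return None  # not mine
        node = self
        for part in parts[1:]:
            child = node.children.get(part)
            if child is None:
                break  # no more specific importer; this node handles the rest
            node = child
        return node
```

For example, after mx.register(TreeImporter("DateTime")), a lookup of "mx.DateTime" lands on the DateTime node, while "mx.Tools" stays with the top-level "mx" importer.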
Gordon's on top of it here... :-) Yes, it is simply a matter of perspective on the import process. An importer does not have to be a static entity. It also can be much more than a way to search a path... it can be highly dynamic and flexible. Whatever you like. Just implement get_code() to map a module "mx.DateTime" to a code/module object. There are a bazillion ways to do that :-)
Except that they don't work due to the fact that the builtin importer is not recursively using __import__ for the imports. An Import Manager would help with this too :-)
This means, I think, that a "general" relative-path importer (ie, one that uses the default PYTHONPATH strategy), should be careful to install itself as the penultimate importer in the chain, (ie, the last before __builtin__.imp). But putting a relative-path search strategy into the "mx" importer is fine if it can quickly determine that the target is / is not a valid name in the "mx" namespace.
Part of the Importer work was done to satisfy importing modules from the COM+ namespace. I wanted to be able to say "import COM.foo.bar". The importer would handle all "COM." imports and delegate the "foo.bar" to the underlying Python/COM framework.
In other words... yes, the Importer scheme should work *very* well for the "whatever...." type of module namespace.
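A sketch of an importer that owns a single namespace and says "not mine" cheaply. The PrefixImporter class and its delegate callback are assumptions for illustration; the delegate stands in for whatever underlying framework actually produces the module.

```python
class PrefixImporter:
    """Hypothetical importer that handles every name under one prefix."""

    def __init__(self, prefix, delegate):
        self.prefix = prefix
        self._dotted = prefix + "."
        self.delegate = delegate  # callable receiving the rest of the name

    def get_code(self, fqname):
        # One string comparison is enough to decline a foreign name.
        if not fqname.startswith(self._dotted):
            return None
        return self.delegate(fqname[len(self._dotted):])
```

With prefix "COM", a request for "COM.foo.bar" hands "foo.bar" to the delegate, and anything outside the namespace is declined without further work.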