Just van Rossum wrote:
At 10:47 AM -0800 1/31/99, Greg Stein wrote:
By "loader", I will presume that you mean an instance of an Importer subclass that is defining get_code(). Here is the method as defined by imputil.Importer:
def get_code(self, parent, modname, fqname):
I see in your code that you build a chain of import hooks that basically work like a list of loaders. Why don't you use sys.path for this? (Current implementation details, probably?)
Yah. I'm not sure what exactly would happen if I tried that. Much easier to throw out the concepts by skipping that aspect for now :-)
I'd suggest a loader object should be a callable, ie. get_code() should be called __call__()... Changing your pseudo code from the distutils-sig to:
for pathentry in sys.path: if type(pathentry) == StringType: module = old_import(pathentry, modname) else: module = pathentry(modname) # <-- if module: return module else: raise ImportError, modname + " not found."
It's the only public method. Makes more sense to me. It also seems that it would make it easier to implement loaders (and the above loop) in C.
get_code() is *NOT* a public method. It is for subclasses to override. The "public method" is the _import_hook method, but it isn't really public, as it gets installed into the import hook. However, it wouldn't be hard to add "__call__ = _import_hook" to the Importer class. That would provide the appropriate callable interface.
That method is the sole interface used by Importer subclasses. To define a custom import mechanism, you would just derive from imputil.Importer and override that one method.
I'm not sure if that answers your question, however. Please let me know if something is unclear so that I can correct the docstring.
The only thing I find unclear is when the parent argument should be used or not. Is it only for importing submodules?
The superclass calls it with the right arguments. Users of the Importer class do *not* call get_code (maybe I should rename it to _get_code, but I wanted it shown as "public" since it needs to be overridden). The only public method on Importer is the install() method, which should be called after the instance has been created and configured. So the real question isn't "what do I pass for the parent argument?", but "how do I use the parent argument in my response?" "parent" will be None, or a module object. If it is None, then "modname" should be looked for in a non-package context. If it is a module (implying it represents a package), then "modname" should be looked for within that module (package). The method should not attempt to look in both a package and a non-package context. The Importer subclass may call get_code() multiple times, looking for the right context.
...
Internally, packages are labelled with a module-level name: __ispkg__. That is set to 0 or 1 accordingly.
Right now a package is labeled with a __path__ variable. If it's there, it's a package. Is it neccesary to define something different?
__path__ and __file__ do not make sense for many custom import mechanisms. So, yes, there really should be an explicit marker. For example, when my small distribution system imports a module, it isn't loading it from a file. It's pulling it out of py15.pyl. __file__ just doesn't make sense.
The Importer that actually imports a module places itself into the module with "__importer__ = self". This latter variable is to allow Importers to only work on modules that they imported themselves, and helps with identifying the context of an importer (whether an import is being performed by a module within a package, or not).
Ok, so __importer__ is there instead of __path__. That makes sense. The current __path__ variable makes it look like it can have its own sys.path-like thingy, but that's not quite what it is (?). It is even abused: if it is a string, it is supposed to be the (sub)package's full name and means that any submodule is to be located like a normal module, but with the full name (eg. mypkg.sub1.foo). This is crucial for freezing and handy in other cases (it allows submodules to be builtin!). While this is cool and handy, it smells really funny. As a case study, could you show how this stuff should work with the new import mechanism?
The freeze issue is simply another aspect of improper reliance on __file__ and/or __path__. The "test" package fails in my small distribution because it uses __file__. In addition, the following standard library modules use __file__, which means they won't work when they've been extracted from archives: ihooks.py (this sets/uses them, which is probably okay) knee.py (this sets/uses them, which is probably okay) copy.py (actually, this is in a test function) test/regrtest.py (needed to locate the test directory) test/test_imageop.py (some kind of file search algorithm) test/test_imgfile.py (locating the script/test dir) test/test_support.py (kind of a dup of the function in test_imageop) test/test_zlib.py (locating the script/test dir) I recall having a bitch of a time with the win32com package because it also does some magic with the __path__ variable. It meant that I couldn't shove the files into an archive easily. I think it may have been fixed, though, because Mark ran into the same issue when he tried to freeze the package, and so I believe he fixed it. In any case, the two variables could be supported quite easily by DirectoryImporter (which is a near-clone of the builtin behavior). IMO, __file__ should be used VERY sparingly, if at all, and __path__ should just disappear (people should use a custom Importer if they want funny import behavior). As far as a case study? I'm not sure what you mean, beyond my rudimentary response in the preceeding paragraph. Hmm. Or do you mean, how does freezing work? Frozen modules simply use a FreezeImporter instance (theoretical class :-). Functionally, it would be very similar to the SimpleArchive class in the site.py in my small distro (but the TOC would be replaced by C structures, and the file ops would be replaced by memory lookups and mem-based unmarshalling). In general, I submit that my Importer mechanism also fulfills Mark's and Jack's original impetus in this matter: it can clean up Python's import mechanism, and can provide a way for each platform to introduce platform-specific mechanisms for importing. Cheers, -g -- Greg Stein, http://www.lyra.org/