[Python-Dev] difficulty of implementing phase 2 of PEP 302 in Python source

Phillip J. Eby pje at telecommunity.com
Thu Sep 28 00:31:34 CEST 2006

At 02:11 PM 9/27/2006 -0700, Brett Cannon wrote:
>But it has been suggested here that the import machinery be rewritten in 
>Python.  Now I have never touched the import code since it has always had 
>the reputation of being less than friendly to work with.  I am asking for 
>opinions from people who have worked with the import machinery before if 
>it is so bad that it is worth trying to re-implement the import semantics 
>in pure Python or if in the name of time to just work with the C 
>code.  Basically I will end up breaking up built-in, .py, .pyc, and 
>extension modules into individual importers and then have a chaining class 
>to act as a combined .pyc/.py combination importer (this will also make 
>writing out to .pyc files an optional step of the .py import).

The problem you would run into here would be supporting zip imports.  It 
would probably be more useful to have a mapping of file types to "format 
handlers", because a filesystem importer or zip importer would then be 
able to work with any .py/.pyc/.pyo/whatever formats, along with any new 
ones that are invented, without reinventing the wheel.

Thus, whether it's file import, zip import, web import, or whatever, the 
same handlers would be reusable, and when people invent new extensions like 
.ptl, .kid, etc., they can just register format handlers instead.

Format handlers could of course be based on the PEP 302 protocol, and 
simply accept a "parent importer" with a get_data() method.  So, let's say 
you have a PyImporter:

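     class PyImporter:
         """Format handler for .py source; all I/O goes through the parent."""

         def __init__(self, parent_importer):
             self.parent = parent_importer

         def find_module(self, fullname):
             # Only the last dotted component names the file itself.
             path = fullname.split('.')[-1]+'.py'
             try:
                 source = self.parent.get_data(path)
             except IOError:
                 return None    # the parent has no such file
             return PySourceLoader(source)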

See what I mean?  The importers and loaders thus don't have to do direct 
filesystem operations.
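
A minimal sketch of the "mapping of file types to format handlers" idea, 
under stated assumptions: format_handlers, SourceLoader and DictImporter 
are hypothetical names invented for illustration, and the in-memory dict 
merely stands in for a filesystem or zip archive.  Only get_data(), 
find_module() and load_module() come from the PEP 302 protocol.

```python
import types


class SourceLoader:
    """Hypothetical loader that executes source text it was handed up front."""

    def __init__(self, source, filename):
        self.source = source
        self.filename = filename

    def load_module(self, fullname):
        module = types.ModuleType(fullname)
        module.__file__ = self.filename
        exec(compile(self.source, self.filename, "exec"), module.__dict__)
        return module


# Hypothetical registry: file extension -> callable(data, filename) -> loader.
# Supporting a new format would mean adding one entry here.
format_handlers = {".py": SourceLoader}


class DictImporter:
    """Stand-in parent importer.  A filesystem or zip importer would only
    need to supply its own get_data() to reuse the same handlers."""

    def __init__(self, files):
        self.files = files

    def get_data(self, path):
        try:
            return self.files[path]
        except KeyError:
            raise IOError(path)

    def find_module(self, fullname):
        base = fullname.split(".")[-1]
        for ext, make_loader in format_handlers.items():
            try:
                data = self.get_data(base + ext)
            except IOError:
                continue  # no file with this extension; try the next format
            return make_loader(data, base + ext)
        return None


importer = DictImporter({"hello.py": "GREETING = 'hi'\n"})
module = importer.find_module("hello").load_module("hello")
```

The handler never touches storage directly, so swapping DictImporter for 
another get_data() provider would leave SourceLoader unchanged.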

Of course, to fully support .pyc timestamp checking and writeback, you'd 
need some sort of "stat" or "getmtime" feature on the parent importer, as 
well as perhaps an optional "save_data" method.  These would be extensions 
to PEP 302, but welcome ones.
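
To make the timestamp check concrete, here is a rough sketch assuming a 
parent importer that offers getmtime() and an optional save_data().  Both 
method signatures, the MemoryStore class and the get_code() helper are 
assumptions made for illustration, and the cache layout used (an 8-byte 
mtime followed by marshalled code) is deliberately simplified; it is not 
the real .pyc header.

```python
import marshal
import struct


class MemoryStore:
    """Hypothetical parent importer: maps path -> (mtime, bytes)."""

    def __init__(self):
        self.files = {}

    def get_data(self, path):
        try:
            return self.files[path][1]
        except KeyError:
            raise IOError(path)

    def getmtime(self, path):
        try:
            return self.files[path][0]
        except KeyError:
            raise IOError(path)

    def save_data(self, path, data, mtime=0):
        self.files[path] = (mtime, data)


def get_code(parent, base):
    """Return a code object for base + '.py', reusing a fresh cached compile."""
    source_path = base + ".py"
    cache_path = base + ".pyc"
    source_mtime = int(parent.getmtime(source_path))
    try:
        cached = parent.get_data(cache_path)
        (cached_mtime,) = struct.unpack("<q", cached[:8])
        if cached_mtime == source_mtime:
            return marshal.loads(cached[8:])
    except (IOError, struct.error, EOFError, ValueError, TypeError):
        pass  # missing or unreadable cache: fall through and recompile
    code = compile(parent.get_data(source_path), source_path, "exec")
    save = getattr(parent, "save_data", None)  # writeback is optional
    if save is not None:
        save(cache_path, struct.pack("<q", source_mtime) + marshal.dumps(code))
    return code


store = MemoryStore()
store.save_data("mod.py", b"X = 1\n", mtime=100)
code = get_code(store, "mod")  # compiles and writes mod.pyc back
```

Because the writeback is looked up with getattr(), a read-only parent (a 
zip archive, say) could simply omit save_data() and still import source.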

Anyway, based on my previous work with pkg_resources, pkgutil, zipimport, 
import.c, etc., I would say this is how I'd want to structure a 
reimplementation of the core system.  And if it were for Py3K, I'd probably 
treat sys.path and all the import hooks associated with it as a single 
meta-importer on sys.meta_path -- listed after a meta-importer for handling 
frozen and built-in modules.  (I.e., the meta-importer that uses sys.path 
and its path hooks would be last on sys.meta_path.)

In other words, sys.meta_path is really the only critical import hook from 
the raw interpreter's point of view.  sys.path, however (along with 
sys.path_hooks and sys.path_importer_cache), is critical from the 
perspective of users, applications, etc., as there has to be some way to 
get things onto Python's path in the first place.
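
The layering described above can be modelled in a few lines.  This is a 
self-contained toy, not the interpreter's actual machinery: it does not 
touch the real sys.meta_path, and BuiltinMetaImporter, PathMetaImporter, 
find_loader and make_hook are hypothetical names.  Loaders are represented 
by plain tuples so the lookup order is easy to see.

```python
class BuiltinMetaImporter:
    """Stand-in for the frozen/built-in meta-importer: knows a fixed name set."""

    def __init__(self, names):
        self.names = set(names)

    def find_module(self, fullname, path=None):
        return ("builtin", fullname) if fullname in self.names else None


class PathMetaImporter:
    """Wraps a path list, its path hooks and an importer cache as one unit."""

    def __init__(self, path, path_hooks):
        self.path = path
        self.path_hooks = path_hooks
        self.importer_cache = {}

    def _importer_for(self, entry):
        if entry not in self.importer_cache:
            importer = None
            for hook in self.path_hooks:
                try:
                    importer = hook(entry)
                    break
                except ImportError:
                    continue  # this hook does not handle the entry
            self.importer_cache[entry] = importer
        return self.importer_cache[entry]

    def find_module(self, fullname, path=None):
        for entry in (self.path if path is None else path):
            importer = self._importer_for(entry)
            if importer is not None:
                loader = importer.find_module(fullname)
                if loader is not None:
                    return loader
        return None


def find_loader(meta_path, fullname):
    """Ask each meta-importer in order; the first non-None answer wins."""
    for meta_importer in meta_path:
        loader = meta_importer.find_module(fullname)
        if loader is not None:
            return loader
    return None


def make_hook(tree):
    """Build a path hook from a dict of path entry -> set of module names."""
    class Importer:
        def __init__(self, entry):
            if entry not in tree:
                raise ImportError(entry)
            self.entry = entry

        def find_module(self, fullname):
            if fullname in tree[self.entry]:
                return ("path", self.entry, fullname)
            return None
    return Importer


meta_path = [
    BuiltinMetaImporter({"sys"}),
    PathMetaImporter(["/a", "/b"], [make_hook({"/b": {"spam", "sys"}})]),
]
```

With the path-based meta-importer listed last, a name known to the 
built-in importer is resolved there even if a path entry also offers it.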
