[Python-ideas] Refactoring the import system to be object-oriented at the top level

Nick Coghlan ncoghlan at gmail.com
Mon Nov 9 13:36:18 CET 2009

Brett Cannon wrote:
> On Sun, Nov 8, 2009 at 13:20, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Brett Cannon wrote:
>>>> The underlying import machinery would migrate to instance methods of the
>>>> new class.
>>> Do you really mean methods or just instance attributes? I personally
>>> don't care, but methods do require more of an API design than plain
>>> attributes would.
>> I did mean methods, but I also realise how much work would be involved
>> in actually following up on this idea. (As the saying goes, real
>> innovation is 1% inspiration, 99% perspiration!)
>> If you don't move the machinery itself into instance methods then you
>> just end up having to pass the storage object around to various
>> functions. Might as well make that parameter 'self' and use methods.
> I don't quite follow. What difference does it make if they are
> instance attributes compared to methods? The data still needs to be
> stored somewhere that is unique per instance to get the semantics you
> want.
> The other thing you could do with this is provide import_module() on
> the object so it is a fully self-contained object that can do an
> entire import on its own without having to touch anything else (heck,
> you could even go so far as to have their own module cache, but that
> might be too far as all loaders currently are expected to work with
> sys.modules).

Slight miscommunication there: by "underlying import machinery" I meant
the functions that currently do the heavy lifting for imports (i.e. most
of the code in import.c), along with their equivalents in importlib. The
sys attribute equivalents would indeed just be normal attributes on the
as-yet-hypothetical ImportEngine instances.

I suspect you're right that there would be problems with the PEP 302
design currently encouraging loader and importer implementations to work
with the sys attributes directly - backwards compatibility on that front
is one of the big issues I was handwaving away in the original post.

A PEP 3115-inspired thought: it may make sense to allow loaders to
split load_module() into two distinct steps (prepare_module() and
exec_module()) and leave the sys.modules manipulation to the import engine.

That is (using the sample load_module() implementation from PEP 302),
something along the lines of:

  def prepare_module(self, fullname, mod=None):
    if mod is None:
      mod = imp.new_module(fullname)
    mod.__file__ = "<%s>" % self.__class__.__name__
    mod.__loader__ = self
    if self._is_package(fullname):
      mod.__path__ = []
    return mod

  def exec_module(self, fullname, mod):
    exec self._get_code(fullname) in mod.__dict__

The key difference here is that module caching becomes entirely the
responsibility of the import engine rather than relying on each loader
to do it correctly.
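
As a rough illustration of engine-owned caching, here is a minimal
Python 3 sketch. The ImportEngine and DictLoader classes, the
import_module() method and the 'modules' attribute are hypothetical names
chosen for this example, and types.ModuleType stands in for
imp.new_module(). It is one possible shape for the idea, not a worked-out
design.

```python
import types


class DictLoader:
    """Hypothetical loader using the two-step protocol (not a real API)."""

    def __init__(self, sources):
        self._sources = sources  # maps module name -> source text

    def prepare_module(self, fullname, mod=None):
        if mod is None:
            mod = types.ModuleType(fullname)  # stands in for imp.new_module()
        mod.__file__ = "<%s>" % self.__class__.__name__
        mod.__loader__ = self
        return mod

    def exec_module(self, fullname, mod):
        code = compile(self._sources[fullname], mod.__file__, "exec")
        exec(code, mod.__dict__)


class ImportEngine:
    """Hypothetical engine that owns the module cache."""

    def __init__(self, loader):
        self.loader = loader
        self.modules = {}  # per-engine equivalent of sys.modules

    def import_module(self, fullname):
        try:
            return self.modules[fullname]
        except KeyError:
            pass
        mod = self.loader.prepare_module(fullname)
        # Cache before executing (as PEP 302 requires of load_module()) so
        # that recursive imports see the partially initialised module.
        self.modules[fullname] = mod
        try:
            self.loader.exec_module(fullname, mod)
        except BaseException:
            # Do not leave a half-initialised module in the cache.
            del self.modules[fullname]
            raise
        return mod


engine = ImportEngine(DictLoader({"spam": "answer = 6 * 7", "broken": "1/0"}))
spam = engine.import_module("spam")
```

Note that the loader never touches a module cache here: the lookup, the
insertion and the cleanup on failure all live in import_module().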

It would also give the import engine a chance to monkey with the module
globals before the module code is executed (e.g. ensuring __package__ is
set, setting a new __import_engine__ variable, overriding __import__ to
play nicely with the current import engine).
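
A small Python 3 sketch of that pre-execution step, covering only the
__package__ and __import_engine__ cases (overriding __import__ is left
out). SourceLoader, ImportEngine and import_module() are hypothetical
names for illustration, and the rule used to derive __package__ is a
simplifying assumption.

```python
import types


class SourceLoader:
    """Hypothetical two-step loader; source text is held in a dict."""

    def __init__(self, sources):
        self._sources = sources

    def prepare_module(self, fullname, mod=None):
        return mod if mod is not None else types.ModuleType(fullname)

    def exec_module(self, fullname, mod):
        exec(self._sources[fullname], mod.__dict__)


class ImportEngine:
    """Hypothetical engine that adjusts module globals before execution."""

    def __init__(self, loader):
        self.loader = loader
        self.modules = {}

    def import_module(self, fullname):
        if fullname in self.modules:
            return self.modules[fullname]
        mod = self.loader.prepare_module(fullname)
        # The engine, not the loader, fills in these globals before any
        # module code runs.
        if getattr(mod, "__package__", None) is None:
            # Assumption: treat everything before the last dot as the package.
            mod.__package__ = fullname.rpartition(".")[0]
        mod.__import_engine__ = self  # variable name suggested in the post
        self.modules[fullname] = mod
        self.loader.exec_module(fullname, mod)
        return mod


# The module body can already see the engine while it is executing.
engine = ImportEngine(SourceLoader({"pkg.mod": "seen = __import_engine__"}))
mod = engine.import_module("pkg.mod")
```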

If a non-global import system adopted such an alternate loader protocol
it could easily avoid invoking standard loaders that directly
manipulated the sys attributes.

>>>> The standard import engine instance would be stored in a new sys module
>>>> attribute (e.g. 'sys.import_engine'). For backwards compatibility, the
>>>> existing sys attributes would remain as references to the relevant
>>>> instance attributes of the standard engine.
>>> How would that work? Because they are module attributes there is no
>>> way to use a property to have them return what the current
>>> sys.import_engine uses.
>> Yes, I eventually realised it would be better to turn the dependency
>> around the other way (i.e. have an engine subclass that used properties
>> to refer to the sys module attributes).
> Yeah, you could have it default to the attributes on the sys module if
> no instance attributes are set.

I was actually thinking of a SysImportEngine subclass that turned them
all into properties that referenced the appropriate objects in sys.
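
Something like the following Python 3 sketch, where the base ImportEngine
class and its attribute names are assumptions made for the example (they
simply mirror the corresponding sys attributes), and the _sys_attr()
helper is likewise hypothetical:

```python
import sys


class ImportEngine:
    """Hypothetical base engine holding its own import state."""

    def __init__(self):
        self.modules = {}
        self.path = []
        self.meta_path = []
        self.path_hooks = []
        self.path_importer_cache = {}


def _sys_attr(name):
    """Build a property that reads and rebinds the named sys attribute."""
    return property(
        lambda self: getattr(sys, name),
        lambda self, value: setattr(sys, name, value),
    )


class SysImportEngine(ImportEngine):
    """Engine whose state is whatever the sys module currently holds."""

    def __init__(self):
        # Deliberately skip ImportEngine.__init__: running it would rebind
        # sys.modules, sys.path, etc. to empty containers via the properties.
        pass

    modules = _sys_attr("modules")
    path = _sys_attr("path")
    meta_path = _sys_attr("meta_path")
    path_hooks = _sys_attr("path_hooks")
    path_importer_cache = _sys_attr("path_importer_cache")


standard_engine = SysImportEngine()
isolated_engine = ImportEngine()
```

Because each property looks the attribute up on every access, the
standard engine keeps tracking sys even when code rebinds sys.path to a
new list, while isolated_engine stays independent of the global state.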

I'm starting to convince myself that I should *find* the time to
experiment with this in the sandbox... then again, I wouldn't be
entirely surprised if Guido deemed all this outright abuse of the import
system :)


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
