[Import-sig] RFC: an import "concentrator" __init__.py

Peter Funk pf@artcom-gmbh.de
Mon, 13 Mar 2000 09:14:24 +0100 (MET)


Hi!

Last weekend I played around with an idea stolen from Greg McFarlane's
Pmw (Python Mega Widgets) package:

The basic idea is as follows: you have a package delivering a huge bunch 
of partly unrelated features, which are implemented in different sub 
modules of the package.  But the user of the package is not interested 
in the particular internal structure of the package and only wants to 
use all the classes and features exported by the package directly, as
if it were a single module.  

For example a class 'NoteBook' is implemented in submodule 'Pmw.PmwNoteBook'. 
   [footnote: this is still an over-simplification of the original Pmw 
    design, which also includes a sophisticated version numbering scheme]
Now the user can write
   >>> import Pmw
   >>> a_notebook = Pmw.NoteBook(....)
instead of 
   >>> a_notebook = Pmw.PmwNoteBook.NoteBook(....)

So from the namespace POV this is the same as if the package
contained an __init__.py with the following:
   from submodule1 import *
   from submodule2 import *
   ...
   from submoduleN import *

But this has the disadvantage that *ALL* submodules have to be imported
and initialized during application startup.  To avoid this, __init__.py 
contains a dynamic loader, which replaces the original package module in 
'sys.modules'.  This loader is controlled by tables (dictionaries) mapping 
the names of exported features (classes and functions) to the names of the 
submodules which implement them.  In 'Pmw' the file containing this list 
of features is called 'Pmw.def'.
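
The table-driven lookup described above can be sketched as follows.  This
is only a minimal illustration in current Python 3 syntax, not the loader
from Pmw or the one listed further down: the class name 'LazyNamespace' is
made up, and the standard library modules 'math' and 'json' stand in for
the submodules of a package.

```python
import importlib


class LazyNamespace:
    """Resolve exported names to objects in other modules on first use."""

    def __init__(self, features):
        # 'features' maps an exported name to the module that defines it.
        self._features = dict(features)

    def __getattr__(self, name):
        # Only called when normal attribute lookup has failed.
        features = self.__dict__.get('_features', {})
        if name not in features:
            raise AttributeError(name)
        module = importlib.import_module(features[name])
        value = getattr(module, name)
        # Cache the object so that __getattr__ is skipped next time.
        self.__dict__[name] = value
        return value


# Stand-in table: stdlib modules play the role of package submodules.
ns = LazyNamespace({'sqrt': 'math', 'dumps': 'json'})
print(ns.sqrt(9.0))         # 'math' is imported on this first access
print('sqrt' in vars(ns))   # the feature is now cached on the instance
```

Only the module behind a feature that is actually used gets imported; in
this sketch 'json' is never imported through the namespace, because
'dumps' is never requested.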

Since I wanted to use a similar mechanism for a package of my own, 
I've ripped out the version management mechanism from the PmwLoader
(this saves some indirection and makes the code shorter and IMHO easier
to understand), generalized the approach and added a small utility 
script called 'build_exports.py', which creates the 'exports.def' file 
read by my version of the loader in '__init__.py'.
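
To give an impression of what such a definition file contains and how a
loader can read it, here is a small sketch in current Python 3 syntax.
The feature and module names are invented for illustration; only the two
table names '_features' and '_modules' are taken from the listings below.

```python
# Hypothetical contents of an 'exports.def' file.  All feature and module
# names here are made up for this example.
EXPORTS_DEF = '''
_features = {
    'NoteBook': 'notebook',
    'ScrolledText': 'scrolledtext',
    'aboutversion': 'about',
}

_modules = ('about', 'notebook', 'scrolledtext')
'''

# The file is plain Python, so it can be executed in a scratch namespace
# and the two tables picked out of that namespace afterwards.
namespace = {}
exec(EXPORTS_DEF, namespace)
features = namespace['_features']
modules = namespace['_modules']

print(features['NoteBook'])   # name of the submodule providing 'NoteBook'
print(modules)
```

Because the file is executed rather than parsed, it should only ever be
generated and shipped together with the package itself.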

I think this hacked version might be generic enough to be useful for other
packages too.  Since both modules are relatively short I include them below.

Some random notes:
  * This approach has advantages if you have a module containing a collection
    of rather independent classes and features sharing only a rather small
    common base, and this module should be broken into several pieces without
    having to change the whole application using features from it.
  * It may also be used for applications using a plugin directory, which
    may be populated by a large number of plugins, from which only a few
    are needed.
  * Replacing a module object in sys.modules with a class instance has 
    the disadvantage that the 'from package import ....' syntax is
    inhibited: users of the package are forced to use 'import package'
    instead.  From my POV this is a feature rather than a flaw.
    I think using a bunch of 'from module import ...' hurts readability and
    maintenance anyway.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)
---- 8< ---- 8< ---- __init__.py ------ 8< ---- ------- ---- 8< ---- ------- --
#!/usr/bin/env python
## vim:ts=4:et:nowrap
# [Emacs: -*- python -*-]
"""__init__ : This file is executed when the package is imported.  It creates
a lazy importer/dynamic loader for the package and replaces the package 
module with it.  This is a very simplified version of the loader supplied 
with Pmw.  All the package version management has been stripped off."""

import sys, os, types

_EXP_DEF = 'exports.def'       # export definition file
_BASEMODULE = 'base'           # Name of Base module for the package

class _Loader:
    """An instance of this class will replace the module in sys.modules"""

    def __init__(self, path, package):
        self._path, self._package = path, package
        self._initialised = 0
        
    def _getmodule(self, modpath):
        __import__(modpath)
        mod = sys.modules[modpath]
        return mod

    def _initialise(self):
        # Create attributes for the Base classes and functions.
        basemodule = self._getmodule('_'+self._package+'.'+_BASEMODULE)
        for k,v in basemodule.__dict__.items():
            if k[0] != '_' and type(v) != types.ModuleType:
                self.__dict__[k] = v
        # Set the package definitions from the exports.def file.
        dict = {
            '_features'     : {},
            '_modules'      : {},
        }
        # Execute exports.def once; it defines '_features' and '_modules'.
        d = {}
        execfile(os.path.join(self._path, _EXP_DEF), d)
        for k,v in d.items():
            if dict.has_key(k):
                if type(v) == types.TupleType:
                    # A tuple of module names: map each name to itself.
                    for item in v:
                        dict[k][item] = item
                elif type(v) == types.DictionaryType:
                    # A dictionary from feature names to module names.
                    for k1, v1 in v.items():
                        dict[k][k1] = v1
        self.__dict__.update(dict)
        self._initialised = 1

    def __getattr__(self, name):
        """This resolves references to features which were not yet used"""
        if not self._initialised:
            self._initialise()
            # Beware: _initialise may have defined 'name'
            if self.__dict__.has_key(name):
                return self.__dict__[name]
        # The requested feature is not yet set. Look it up in the
        # tables set by exports.def, import the appropriate module and
        # set the attribute so that it will be found next time.
        if self._features.has_key(name):
            # The attribute is a feature from one of the modules.
            modname = self._features[name]
            mod  = self._getmodule('_'+self._package+'.'+modname)
            feature = getattr(mod, name)
            self.__dict__[name] = feature
            return feature
        elif self._modules.has_key(name):
            # The attribute is a module
            mod = self._getmodule('_'+self._package+'.'+name)
            self.__dict__[name] = mod
            return mod
        else:
            # The attribute is not known by the package, report an error.
            raise AttributeError, name

# Retrieve the name of the package:
_package = os.path.split(__path__[0])[1]
# Rename (hide) the original package for later perusal:
sys.modules['_'+_package] = sys.modules[_package]
# Create the dynamic loader and install it into sys.modules:
sys.modules[_package] = _Loader(__path__[0], _package)
---- 8< ---- 8< ---- build_exports.py - 8< ---- ------- ---- 8< ---- ------- --
#!/usr/bin/env python
## vim:ts=4:et:nowrap                    
# [Emacs: -*- python -*-]
"""build_exports.py -- create 'exports.def' helper file for lazy importer"""
import sys, types, pprint

modules = []
features = {}

multiple_defined = {}

template = '''## vim:ts=4:et:nowrap
# [Emacs: -*- python -*-]
"""exports.def --- This is an exports definition file ---

this was automatically created by %(prog)s

It is read by the dynamic import loader in __init__.py.

    _features     : dictionary from feature names to module names.
    _modules      : tuple of module names
"""
#

_features = %(features)s

_modules = %(modules)s

'''

def spewout(stream=sys.stdout, modules=(), features={}):
    pp = pprint.PrettyPrinter(indent=4)
    d = { 'prog': sys.argv[0], 
          'modules': pp.pformat(tuple(modules)), 
          'features': pp.pformat(features),
    }
    stream.write(template % d)

def inspect(modulename, modules, features, multiple_defined):
    if modulename[-3:] == ".py": modulename = modulename[:-3]
    __import__(modulename)
    mod = sys.modules[modulename]
    for symbol in dir(mod):
        if symbol[:1] != '_' or (symbol == '_' and modulename == 'base'):
            obj = mod.__dict__[symbol]
            if type(obj) == types.ModuleType or symbol == "Pmw":
                if not symbol in modules:
                    modules.append(symbol)
            else:
                if features.has_key(symbol):
                    if multiple_defined.has_key(symbol):
                        multiple_defined[symbol] = multiple_defined[symbol] + \
                            " " + features[symbol]
                    else:
                        multiple_defined[symbol] = features[symbol]
                features[symbol] = modulename

if __name__ == "__main__":
    sys.path.insert(0, '.')
    if len(sys.argv) > 1:
        for arg in sys.argv[1:]:
            inspect(arg, modules, features, multiple_defined)
        outfile = sys.stdout
    else:
        import glob
        l = glob.glob("[a-z]*.py")
        print l
        for module in l:
            inspect(module, modules, features, multiple_defined)
        if multiple_defined == {}:
            outfile = open("exports.def", "w")
    if multiple_defined == {}:
        spewout(outfile, modules,  features)
    else:
        for k, v in multiple_defined.items():
            print k, "has multiple definitions in:", v, features[k]
---- 8< ---- 8< ---- ----- ----- ------ 8< ---- ------- ---- 8< ---- ------- --