[Web-SIG] Standardized configuration

Fri Jul 22 22:38:07 CEST 2005

I've had a stab at creating a simple WSGI deployment implementation.
I use the term "WSGI component" in here as shorthand to indicate all
types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the
configuration of an instance of a WSGI component, preferably within a
declarative configuration file.  A secondary deployment concern is to
create a way to "wire up" components together into a specific
deployable "pipeline".  

A strawman implementation that solves both issues is the
"configurator", which would be presumed to live in "wsgiref". Currently
it lives in a package named "wsgiconfig" on my laptop.  This module
follows.

    """ Configurator for establishing a WSGI pipeline """

    from ConfigParser import ConfigParser
    import types

    def configure(path):
        config = ConfigParser()
        if isinstance(path, types.StringTypes):
            config.readfp(open(path))
        else:
            config.readfp(path)

        appsections = []

        for name in config.sections():
            if name.startswith('application:'):
                appsections.append(name)
            elif name == 'pipeline':
                pass
            else:
                raise ValueError, ('%s is not a valid section name'
                                   % name)

        app_defs = {}

        for appsection in appsections:
            app_name = appsection.split('application:')[1]
            # config.get() raises NoOptionError rather than returning
            # None, so check for the options explicitly.
            if not config.has_option(appsection, 'config'):
                raise ValueError, ('application section %s requires a '
                                   '"config" option' % appsection)
            if not config.has_option(appsection, 'factory'):
                raise ValueError, ('application section %s requires a '
                                   '"factory" option' % appsection)
            app_config_file = config.get(appsection, 'config')
            app_factory_name = config.get(appsection, 'factory')
            app_defs[app_name] = {'config':app_config_file,
                                  'factory':app_factory_name}

        if not config.has_section('pipeline'):
            raise ValueError, 'must have a "pipeline" section in config'

        if not config.has_option('pipeline', 'apps'):
            raise ValueError, ('must have an "apps" definition in the '
                               'pipeline section')
        pipeline_str = config.get('pipeline', 'apps')

        pipeline_def = pipeline_str.split()

        next = None

        while pipeline_def:
            app_name = pipeline_def.pop()
            app_def = app_defs.get(app_name)
            if app_def is None:
                raise ValueError, ('appname %s is defined in pipeline '
                                   '%s but no application is defined '
                                   'with that name'
                                   % (app_name, pipeline_str))
            factory_name = app_def['factory']
            factory = import_by_name(factory_name)
            config_file = app_def['config']
            app_factory = factory(config_file)
            app = app_factory(next)
            next = app

        if not next:
            raise ValueError, 'no apps defined in pipeline'
        return next

    def import_by_name(name):
        if not "." in name:
            raise ValueError("unloadable name: " + `name`)
        components = name.split('.')
        start = components[0]
        g = globals()
        package = __import__(start, g, g)
        modulenames = [start]
        for component in components[1:]:
            modulenames.append(component)
            try:
                package = getattr(package, component)
            except AttributeError:
                n = '.'.join(modulenames)
                package = __import__(n, g, g, component)
        return package
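
For anyone trying this on a current Python, the dotted-name lookup in
import_by_name can probably be done more simply with importlib.  What
follows is a rough Python 3 sketch, not the module above.  It assumes
that the longest importable prefix of the name is the module and that
whatever remains is a chain of attributes.  It also catches ImportError
broadly, which would hide an import failure inside the target module
itself.

```python
import importlib


def import_by_name(name):
    """Resolve a dotted name such as 'package.module.attr' to an object."""
    if "." not in name:
        raise ValueError("unloadable name: %r" % (name,))
    components = name.split(".")
    # Try the longest candidate module name first, then progressively
    # shorter prefixes; the leftover components are looked up with getattr.
    for split in range(len(components), 0, -1):
        module_name = ".".join(components[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            # Assumption: an ImportError means "this prefix is not a module".
            continue
        for attr in components[split:]:
            obj = getattr(obj, attr)
        return obj
    raise ValueError("unloadable name: %r" % (name,))
```

With this, import_by_name("os.path.join") returns the same function
object as os.path.join.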

  We configure a pipeline based on a config file, which
  creates and chains two "sample" WSGI applications together.

  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

    [application:sample1]
    config = sample1.conf
    factory = wsgiconfig.tests.sample_components.factory1

    [application:sample2]
    config = sample2.conf
    factory = wsgiconfig.tests.sample_components.factory2

    [pipeline]
    apps = sample1 sample2

  The configurator exposes a function named "configure" that accepts a
  single argument: the path to the config file (or an open file object).

    >>> from wsgiconfig.configurator import configure
    >>> appchain = configure('myapplication.conf')

  The "sample_components" module referred to in the
  'myapplication.conf' file application definitions might look like
  this::

      class sample1:
          """ middleware """
          def __init__(self, app):
              self.app = app
          def __call__(self, environ, start_response):
              environ['sample1'] = True
              return self.app(environ, start_response)

      class sample2:
          """ end-point app """
          def __init__(self, app):
              self.app = app

          def __call__(self, environ, start_response):
              environ['sample2'] = True
              return ['return value 2']

      def factory1(filename):
          # this app requires no configuration, but if it did, we would
          # parse the file represented by filename and do some config
          return sample1

      def factory2(filename):
          # this app requires no configuration, but if it did, we would
          # parse the file represented by filename and do some config
          return sample2

  The appchain represents an automatically constructed pipeline of
  WSGI components.  Each application in the chain is constructed from
  a factory.

    >>> appchain.__class__.__name__ # sample1 (middleware)
    'sample1'
    >>> appchain.app.__class__.__name__  # sample2 (application)
    'sample2'

  Calling the "appchain" in this example results in the keys "sample1"
  and "sample2" being available in the environment, and what is
  returned is the result of the application, which is the list
  ['return value 2'].
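
The wiring described above can be tried without any files on disk.
The sketch below is Python 3 and is only an approximation of the
configurator: the FACTORIES dict is a hypothetical stand-in for the
dotted-name import, the config is read from a string, and the
end-point calls start_response and returns bytes, as WSGI on Python 3
requires.

```python
import configparser

PIPELINE_CONF = """
[application:sample1]
config = sample1.conf
factory = factory1

[application:sample2]
config = sample2.conf
factory = factory2

[pipeline]
apps = sample1 sample2
"""


class Sample1:
    """Middleware: marks the environ, then delegates downstream."""
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ['sample1'] = True
        return self.app(environ, start_response)


class Sample2:
    """End-point application: there is nothing downstream of it."""
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ['sample2'] = True
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'return value 2']


# Hypothetical registry standing in for import_by_name(); each factory
# takes a config filename and returns a callable that accepts "next".
FACTORIES = {
    'factory1': lambda config_file: Sample1,
    'factory2': lambda config_file: Sample2,
}


def build_pipeline(conf_text):
    config = configparser.ConfigParser()
    config.read_string(conf_text)
    app = None
    # Walk the pipeline right to left so that each component is handed
    # the already-built component to its right.
    for app_name in reversed(config.get('pipeline', 'apps').split()):
        section = 'application:' + app_name
        factory = FACTORIES[config.get(section, 'factory')]
        make_app = factory(config.get(section, 'config'))
        app = make_app(app)
    if app is None:
        raise ValueError('no apps defined in pipeline')
    return app


environ = {}
statuses = []
appchain = build_pipeline(PIPELINE_CONF)
body = appchain(environ, lambda status, headers: statuses.append(status))
```

The right-to-left walk is the only subtle part: the last name in
"apps" is built first and receives None as its "next" application.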

Potential points of contention

 - The WSGI configurator assumes that you are willing to write WSGI
   component factories which accept a filename as a config file.  This
   factory returns *another* factory (typically a class) that accepts
   "the next" application in the pipeline chain and returns a WSGI
   application instance.  This pattern is necessary to support
   argument currying across a declaratively configured pipeline,
   because the WSGI spec doesn't allow for it.  This is more contract
   than currently exists in the WSGI specification but it would be
   trivial to change existing WSGI components to adapt to this
   pattern.  Or we could adopt a pattern/convention that removed one
   of the factories, passing both the "next" application and the
   config file into a single factory function.  Whatever.  In any
   case, in order to do declarative pipeline configuration, some
   convention will need to be adopted.  The convention I'm advocating
   above seems to already have been adopted by the current crop of
   middleware components (using a factory which accepts the
   application as the first argument).

 - Pipeline deployment configuration should be used only to configure
   essential information about the pipeline and individual pipeline
   components.  Where complex service data configuration is necessary,
   the component which implements a service should provide its own
   external configuration mechanism.  For example, if an XSL service
   is implemented as a WSGI component, and it needs configuration
   knobs of some kind, these knobs should not live within the WSGI
   pipeline deployment file.  Instead, each component should have its
   own configuration file.  This is the purpose (undemonstrated above)
   of allowing an [application] section to specify a config filename.

 - Some people seem to be arguing that there should be a single
   configuration format across all WSGI applications and gateways to
   configure everything about those components.  I don't think this is
   workable.  I think the only thing that is workable is to recommend
   to WSGI component authors that they make their components
   configurable using some configuration file or other type of path
   (URL, perhaps).  The composition, storage, and format of all other
   configuration data for the component should be chosen by the
   author.

 - Threads which discussed this earlier on the web-sig list included
   the idea that a server or gateway should be able to "find" an
   end-point application based on a lookup of source file/module +
   attrname specified in the server's configuration.  I'm suggesting
   instead that the mapping between servers, gateways, and
   applications be a pipeline and that the pipeline itself have a
   configuration definition that may live outside of any particular
   server, gateway, or application.  The pipeline definition(s) would
   wire up the servers, gateways, and applications itself.  The
   pipeline definition *could* be kept amongst the files representing a
   particular server instance on the filesystem (and this might be the
   default), but it wouldn't necessarily have to be.  This might just
   be semantics.

 - There were a few mentions of being able to configure/create a WSGI
   application at request time by passing name/value string pairs
   "through the pipeline" that would ostensibly be used to create a
   new application instance (thereby dynamically extending or
   modifying the pipeline).  I think it's fine if a particular
   component does this, but I'm suggesting that a canonization of the
   mechanism used to do this is not necessary and that it's useful to
   have the ability to define static pipelines for deployment.

 - If elements in the pipeline depend on "services" (ala
   Paste-as-not-a-chain-of-middleware-components), it may be
   advantageous to create a "service manager" instead of deploying
   each service as middleware.  The "service manager" idea is not a
   part of the deployment spec.  The service manager would itself
   likely be implemented as a piece of middleware or perhaps just a
   library.
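
To make the first two points above a bit more concrete, here is a
sketch of the alternative convention in which a single factory
receives both the "next" application and the component's own config
file.  It is Python 3, and everything in it is made up for
illustration: the header_middleware_factory name, the [header]
section, and the X-Sample header are not part of any proposal.

```python
import configparser
import io


def header_middleware_factory(next_app, config_file):
    """Hypothetical single-stage factory: next app and config together."""
    # The component parses its own config file; the pipeline file only
    # needs to know where that file lives.
    config = configparser.ConfigParser()
    config.read_file(config_file)
    header = (config.get('header', 'name'), config.get('header', 'value'))

    def middleware(environ, start_response):
        def add_header(status, headers, exc_info=None):
            # Append the configured header before passing the call on.
            return start_response(status, list(headers) + [header], exc_info)
        return next_app(environ, add_header)

    return middleware


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


# An in-memory file stands in for the component's config file.
component_conf = io.StringIO("[header]\nname = X-Sample\nvalue = configured\n")
app = header_middleware_factory(hello_app, component_conf)

captured = {}


def record(status, headers, exc_info=None):
    captured['status'] = status
    captured['headers'] = headers


body = app({}, record)
```

The trade-off against the two-stage form is that the configurator
must call the factory with two arguments, so existing middleware
that takes only the application would need a small wrapper.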

On Wed, 2005-07-20 at 02:08 +0800, ChunWei Ho wrote:
> Hi, I have been looking at WSGI for only a few weeks, but had some
> ideas similar (I hope) to what is being discussed that I'll put down
> here. I'm new to this so I beg your indulgence if this is heading down
> the wrong track or wildly offtopic :)
> 
> It seems to me that a major drawback of WSGI middleware that is
> preventing flexible configuration/chain paths is that the application
> to be run has to be determined at init time. It is much more flexible if we
> were able to specify what application to run and configuration
> information at call time - the middleware would be able to approximate
> a service of sorts.

....