[Web-SIG] Standardized configuration
ianb at colorstudy.com
Tue Jul 19 19:15:00 CEST 2005
Chris McDonough wrote:
> On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
>>In addition to the examples I gave in response to Graham, I wrote a
>>document on this a while ago:
>>The hard part about this is configuration; it's easy to configure a
>>non-branching chain of middleware. Once it branches the configuration
>>becomes hard (like programming-hard; which isn't *hard*, but it quickly
>>stops feeling like configuration).
> Yep. I think I'm getting it. For example, I see that Paste's URLParser
> seems to *construct* applications if they don't already exist based on
> the URL. And I assume that these applications could themselves be
> middleware. I don't think that is configurable declaratively if you
> want to decide which app to use based on arbitrary request parameters.
> But if we already had the config for each app "instance" that URLParser
> wanted to consult laying around as files on disk, wouldn't it be just as
> easy to construct these app objects "eagerly" at startup time? Then your
> URLParser could choose an already-configured app based on some sort of
> configuration file in the URLParser component itself. The "apps"
> themselves may be pipelines, too, I realize that, but that is still
> configurable without coding.
That's what paste.urlmap is for:
(I haven't actually tried using it much for practical things, so it's
quite possible it has design mistakes in it)
The idea being that you do:
urlmap['/myapp'] = MyApp()
But additionally (in PathProxyURLMap):
urlmap['/myapp'] = 'myapp.conf'
And it builds the application from the configuration file.
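
A rough sketch of that idea, and not paste.urlmap's actual code: the class name SimpleURLMap and the loader callable (which turns a configuration filename into a WSGI application) are assumptions made up for illustration.

```python
class SimpleURLMap:
    """Hypothetical URL map: dispatches on the longest matching path prefix."""

    def __init__(self, loader=None):
        self.apps = {}
        # loader is an assumed callable: config filename -> WSGI application
        self.loader = loader

    def __setitem__(self, prefix, app):
        if isinstance(app, str):
            # A string is treated as a configuration file to build the app from
            if self.loader is None:
                raise ValueError("no loader configured for %r" % app)
            app = self.loader(app)
        self.apps[prefix.rstrip('/')] = app

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        # Try longer prefixes first so '/myapp/admin' wins over '/myapp'
        for prefix in sorted(self.apps, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + '/'):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path[len(prefix):]
                return self.apps[prefix](environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']


def hello(environ, start_response):
    """Tiny example application that echoes how the URL was split."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    body = '%s|%s' % (environ['SCRIPT_NAME'], environ['PATH_INFO'])
    return [body.encode('utf-8')]
```

With this, urlmap['/myapp'] = hello installs an application object directly, while urlmap['/myapp'] = 'myapp.conf' hands the filename to whatever loader was supplied.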
> Maybe there'd be some concern about needing to stop the process in order
> to add new applications. That's a use case I hadn't really considered.
> I suspect this could be done with a signal handler, though, which could
> tell the URLParser to reload its config file instead of potentially
> locating and creating a new application within every request.
> This would make URLParser a kind of "decision" middleware, but it would
> choose from a static set of existing applications (or pipelines) for the
> lifetime of the process as opposed to constructing them lazily.
URLParser itself is just one parsing implementation, though maybe named
too generically. I don't think that particular code needs to grow many
more features, but there's also room for many other parsers. And it's
also fairly easy to wrestle control from URLParser if that gets put in
the stack (for instance, putting an application function in __init__.py
will basically take over URL parsing for that directory).
>>>OTOH, I'm not sure that I want my framework to "find" an app for me.
>>>I'd like to be able to define pipelines that include my app, but I'd
>>>typically just want to statically declare it as the end point of a
>>>pipeline composed of service middleware. I should look at Paste a
>>>little more to see if it has the same philosophy or if I'm
>>Mostly I wanted to avoid lots of magical incantations for the simple
>>case. If you are used to Webware, well it has a very straight-forward
>>way of finding your application -- you give it a directory name. If
>>Quixote or CherryPy, you give it a root object. Maybe Zope would take a
>>ZEO connection string, and so on.
> I think I understand now.
> In general, I think I'd rather create "instance" locations of WSGI
> applications (which would essentially consist of a config file on disk
> plus any state info required by the app), configure and construct Python
> objects out of those instances eagerly at "startup time" and just choose
> between already-constructed apps in "decision middleware" that has
> its own declarative configuration if decisions need to be made about
> which app to use.
I think this is a laudable goal. Right now, when I'm deploying
applications written for Paste, I am reluctant to intermingle them in
the same process and configuration... but that's because Paste is young,
not because that's a bad idea. But as a result I haven't tried it, and
I only have a moderate concept of what it would mean in practice.
A neat feature would be to configure fairly seamlessly across process
boundaries. E.g., add a "fork=True" parameter to an application's
configuration, and the server would fork a process (or delegate to an
already forked worker process) for that application. That's the sort of
thing that could move Python into PHP-style hosting situations.
> This is mostly because I want the configuration info to live within the
> application/middleware instance and have some other "starter" import
> those configurations from application/middleware instance locations on
> the filesystem. The "starter" would construct required instances as
> Python objects, and chain them together arbitrarily based on some other
> "pipeline configuration" file that lives with the "starter". The first
> part of that (construct required instances) is described in a post I
> made to this list yesterday.
> This is probably because I'd like there to be one well-understood way to
> declaratively configure pipelines as opposed to each piece of middleware
> potentially needing to manage app construction and having its own
> configuration to do so.
> I don't know if this is reasonable for simpler requirements. This is
> more of a "formal deployment spec" idea and of course is likely flawed
> in some subtle way I don't understand yet.
I think there's probably some room for separation. In practice I end up
with multiple configuration files for my projects -- one that's generic
to the application, and one that's specific to the deployment. But it's
very hard to determine ahead of time what stuff goes where. For
instance, server options mostly go in the deployment configuration.
Until I start building conventions about configuration information on
the servers, at which time I expect configuration will migrate into
common locations in the form of configuration-loading options. E.g.,
where I now do:
server = 'scgi_threaded'
port = 4010
In the future I might do:
port = port_map.find_port(app_name)
Where port_map is some global module where I keep the entire server's
list of port mappings. And being able to do stuff like this is what
makes Python-syntax imperative configuration so nice... it's crude and
annoying, but configuration that is more declarative becomes even worse
when you try to build these kinds of features into it.
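
One way such a port_map could look, purely as a sketch: the PortMap class, its starting port, and the application names are all assumptions for illustration, and a real version would presumably live in a shared module and persist its assignments somewhere.

```python
class PortMap:
    """Hypothetical server-wide registry of application name -> port."""

    def __init__(self, ports, base_port=4010):
        self.ports = dict(ports)
        self.base_port = base_port

    def find_port(self, app_name):
        # Unknown applications get the lowest port not already taken
        if app_name not in self.ports:
            used = set(self.ports.values())
            port = self.base_port
            while port in used:
                port += 1
            self.ports[app_name] = port
        return self.ports[app_name]


# Stand-in for the global module holding the whole server's port mappings
port_map = PortMap({'wiki': 4010, 'blog': 4011})

# The configuration itself stays plain Python
app_name = 'wiki'
server = 'scgi_threaded'
port = port_map.find_port(app_name)
```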
But I digress... the deployment configuration as I currently use it is
usually something that overwrites the generic application configuration.
They aren't two distinct things. And the configuration doesn't belong
to one or the other. Is the location of session information server
specific, application specific, profile specific? It depends on your
situation. I might have a standard convention for the location of
development machine I override that because I'm doing development on one
of those libraries. There's all sorts of specific cases, and in
declarative or well-partitioned configurations the configuration
language has to include lots and lots of features. Or you end up with
configuration file generation or other nonsense.
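
A minimal sketch of the "deployment overwrites application" layering, assuming each configuration is Python source executed into one shared namespace. The load_config helper and the setting values are made up for illustration; real code would read the sources from files.

```python
def load_config(*sources):
    """Execute Python-syntax config sources in order; later ones win."""
    namespace = {}
    for source in sources:
        # Each source sees (and may overwrite) names set by earlier ones
        exec(source, namespace)
    # exec() adds __builtins__ to the namespace; it is not a setting
    namespace.pop('__builtins__', None)
    return namespace


# Generic application configuration (hypothetical values)
app_conf = """
server = 'scgi_threaded'
port = 4010
session_dir = '/var/lib/myapp/sessions'
"""

# Deployment configuration: only states what differs on this machine
deploy_conf = """
port = port + 10
session_dir = '/tmp/dev-sessions'
"""

config = load_config(app_conf, deploy_conf)
```

Because the second source runs in the same namespace, it can compute values from the first (port + 10 here) instead of just replacing them, which is the kind of thing that is hard to express declaratively.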
In the end, I think I have more faith in the general applicability of
Python as a way to describe structures, combined with strong
configuration-specific conventions and style guides. Otherwise it feels
like this embeds policy into the configuration-loading code, and I hate
policy in code.
>>>I'm pretty sure you're not advocating it, but in case you are, I'm not
>>>sure it adds as much value as it removes to be able to have a "dynamic"
>>>middleware chain whereby new middleware elements can be added "on the
>>>fly" to a pipeline after a request has begun. That is *very* "late
>>>binding" to me and it's impossible to configure declaratively.
>>I'm comfortable with a little of both. I don't even know *how* I'd stop
>>dynamic middleware. For instance, one of the methods I added to Wareweb
>>recently allows any servlet to forward to any WSGI application; but from
>>the outside the servlet looks like a normal WSGI application just like
> It's obviously fine if applications themselves want to do this. I'm not
> sure that it would be possible to create a "deployment spec" that
> canonized *how* to do it because as you mentioned it's not really a
> configuration task, it's a programming task.
>>>I agree! I'm a bit confused because one of the canonical examples of
>>>how WSGI middleware is useful seems to be the example of implementing a
>>>framework-agnostic sessioning service. And for that sessioning service
>>>to be useful, your application has to be able to depend on its
>>>availability so it can't be "oblivious".
>>This is where I'd like additional (incrementally agreed upon) standards.
>> For instance, a standard for the interface of 'webapp01.session'.
>>It's a requirement, certainly, but the requirement is merely "there must
>>be a webapp01-compliant session installed".
> Yes... I think the best way to describe this sort of thing is through
> interfaces (at least notional, documented ones, if not formal ones that
> can be introspected at runtime). But that will need to be fleshed out
> on a service-by-service basis, obviously.
> FWIW, I'm also finding myself agreeing with Phillip's idea of allowing
> applications to have a context object which can help them find
> services, as opposed to implementing each service entirely as
> Instead of obtaining the sessioning service via
> "environ['webapp01.session']" in an application's __call__ , you might
> do "self.context.get_service('session')"... or maybe even
> "environ['services'].get_service('session')". The latter would be
> easier to add because we'd be using an existing PEP 333 protocol. We'd
> consume a single key within the environ namespace, but there would need
> to be no change to the WSGI spec.
I have to read over PJE's email some more. It doesn't really remove the
need for middleware, it's more like it could consolidate many services
into one generic service middleware. For instance, the session service
still needs access to the response, and the only general way to access
the response is through middleware. The request, at least, can be
generally accessed as the environment dictionary; but replacing
middleware with contracts on what you must return from your application
is a non-starter. E.g., if an auth service requires something like:
auth = get_service('auth')
if not auth.allowed(app_context):
    forbidden = auth.forbidden()
Well... that's not very nice, is it? And it's totally infeasible once
your code is in the bowels of some framework. You could do it with an
exception (with some middleware that catches the exception). You could
do the session service with some middleware that collects extra headers
and other response information.
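
The exception approach might look something like this sketch. The Forbidden exception, the CatchForbidden middleware, and the REMOTE_USER check are assumptions for illustration, not any existing library's API.

```python
import sys


class Forbidden(Exception):
    """Hypothetical exception an auth service would raise from deep in a framework."""


class CatchForbidden:
    """Middleware that turns a Forbidden exception into a 403 response."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Forbidden as exc:
            body = str(exc).encode('utf-8') or b'Forbidden'
            headers = [('Content-Type', 'text/plain'),
                       ('Content-Length', str(len(body)))]
            # exc_info lets the server re-raise if headers were already sent
            start_response('403 Forbidden', headers, sys.exc_info())
            return [body]


def app(environ, start_response):
    """Example application: no return-value contract, it just raises."""
    if environ.get('REMOTE_USER') != 'ian':
        raise Forbidden('no access')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']
```

Note that this only catches exceptions raised while the application is being called; one raised later, while the server iterates over the response, would need a wrapper around the iterable as well.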
And now that I'm thinking through an implementation, I realize it's
something I've thought of before -- in my mind it was about
lighter-weight filters and simpler configuration, but the implementation
would be similar.
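
As a sketch of what one generic service middleware could look like: the environ['services'] key and get_service() come from the discussion above, while the Services, ServiceMiddleware, and CookieSession classes and the extra_headers list are assumptions invented for illustration.

```python
class Services:
    """Per-request service lookup, stored under environ['services']."""

    def __init__(self, factories, environ):
        self.factories = factories
        self.environ = environ
        self.extra_headers = []   # response headers contributed by services
        self._cache = {}

    def get_service(self, name):
        # Build each service at most once per request
        if name not in self._cache:
            self._cache[name] = self.factories[name](self)
        return self._cache[name]


class ServiceMiddleware:
    """The one piece of middleware that gives services access to the response."""

    def __init__(self, app, factories):
        self.app = app
        self.factories = factories

    def __call__(self, environ, start_response):
        services = environ['services'] = Services(self.factories, environ)

        def replacement_start_response(status, headers, exc_info=None):
            # Append whatever the services collected before this point
            headers = list(headers) + services.extra_headers
            return start_response(status, headers, exc_info)

        return self.app(environ, replacement_start_response)


class CookieSession:
    """Deliberately crude session service: it only knows how to set an id cookie."""

    def __init__(self, services):
        self.services = services

    def set_id(self, session_id):
        self.services.extra_headers.append(
            ('Set-Cookie', 'sid=%s' % session_id))
```

An application would call environ['services'].get_service('session').set_id(...) before start_response; anything a service adds after start_response has been called is silently lost in this sketch.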
My only concern is if it confuses the order of filters. If there's one
generic service middleware, it's probably going to be invoked before
some other middleware and after others. But the services would
communicate with that service middleware outside of the WSGI band (using
callbacks or shared structures or something). This makes it difficult
for transforming middleware to be certain that it has full control to
> This would be pretty straightforward and a separate services framework
> could be implemented outside WSGI entirely perhaps taking some cues from
> PEAK and/or Zope 3 ( or even [gasp] *code!*, god knows this problem has
> already been solved many times over ;-) -- for implementing service
> registration and lookup. It could form the basis for a "WSGI services"
> spec without muddying the waters for PEP 333.
> That said, if you're not interested in that because you think
> implementing services as middleware is "good enough" and you'd rather
> not implement another framework, I'd totally understand that. At that
> point I probably wouldn't be interested either because you're the
> de facto champion of WSGI middleware as a lingua franca and the only
> reason to do any of this is for the sake of collaboration and code
> sharing. But I do think it would be cleaner.
Well, I'm a fan of working code. If services are a better way of doing
some of this stuff, and they supersede code I've written or imagined,
that's not that big a deal. At this point I'd be interested to see how
a Really Lame Implementation of Sessions (for instance) would be
implemented with services.
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org