[Web-SIG] WSGI deployment use case

Tue Jul 26 03:29:34 CEST 2005

Phillip J. Eby wrote:
> At 06:40 PM 7/25/2005 -0500, Ian Bicking wrote:
> 
>> But configuration and composition of multiple independent applications
>> into a single process isn't.  I don't think we can solve these
>> separately, because the Hard Problem is how to handle configuration
>> alongside composition.  How can I apply configuration to a set of
>> applications?  How can I make exceptions?  How can an application
>> consume configuration as well as delegate configuration to a
>> subapplication?  The pipeline is often more like a tree, so the logic is
>> a little complex.  Or, rather, there's actual *logic* in how
>> configuration is applied, almost all of which are viable.
> 
> 
> We probably need something like a "site map" configuration, that can 
> handle tree structure, and can specify pipelines on a per location 
> basis, including the ability to specify pipeline components to be 
> applied above everything under a certain URL pattern.  This is more or 
> less the same as my "container API" concept, but we are a little closer 
> to being able to think about such a thing.

It could also be something based on general matching rules, with some 
notion of precedence and how the rule affects SCRIPT_NAME/PATH_INFO.  Or 
something like that.
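
To make that concrete, here is a minimal sketch of prefix-based matching 
rules, assuming "longest prefix wins" as the precedence rule.  The name 
make_url_map and the rule format are hypothetical, not part of any spec or 
of Paste; the only thing taken from WSGI is that the matched prefix moves 
from PATH_INFO onto SCRIPT_NAME.

```python
def make_url_map(rules, default=None):
    """Build a WSGI dispatcher from (prefix, app) rules (hypothetical API).

    Prefixes are written without a trailing slash, e.g. "/blog".
    Precedence (an assumption): the longest matching prefix wins.
    """
    ordered = sorted(rules, key=lambda rule: len(rule[0]), reverse=True)

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            # Match whole path segments only, so "/blog" does not
            # accidentally capture "/blogger".
            if path == prefix or path.startswith(prefix + "/"):
                environ = dict(environ)
                # Shift the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        if default is not None:
            return default(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    return dispatcher


def echo_app(environ, start_response):
    # Tiny demo application: reports how the path was split.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [(environ["SCRIPT_NAME"] + "|" + environ["PATH_INFO"]).encode()]
```

With rules for both "/blog" and "/blog/admin", a request for 
"/blog/admin/users" goes to the second rule, and the application sees 
SCRIPT_NAME "/blog/admin" and PATH_INFO "/users".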

> Of course, I still think it's something that can be added *after* having 
> a basic deployment spec.

I feel a very strong need that this be resolved before settling on 
anything deployment related.  Not necessarily as a standard, but 
possibly as a set of practices.  Even a realistic and concrete use case 
might be enough.

>> I can figure out a bunch of ad hoc and formal ways of accomplishing this
>> in Paste; most of it is already possible, and entry points alone clean
>> up a lot of what's there (encouraging a separation between how an
>> application is invoked generally, and install-specific configuration).
>> But with a more limited and declarative configuration it is harder.
> 
> 
> But the tradeoff is greater ability to build tools that operate on the 
> configuration to do something -- like James Gardner's ideas about 
> backup/restore and documentation tools.

I can see that.  But I know my way works, which is a bit of a bonus. 
And really it's entirely possible to inspect it as well.

>> Also when configuration is pushed into factories as keyword arguments,
>> instead of being pulled out of a dictionary, it is much harder -- the
>> configuration becomes unhackable.
> 
> 
> But a **kw argument *is* a dictionary, so I don't understand what you 
> mean here.

It's about how configuration is delegated to contained applications and 
middleware, and what that configuration is expected to look like.  I 
think components that don't take **kw will be hard to work with.
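
A small illustration of the difference, with made-up factory names and 
configuration keys (neither factory is from Paste or any proposed spec): 
a factory with a fixed signature fails as soon as it is handed a 
dictionary containing keys meant for someone else, while one that accepts 
**kw can take what it needs and ignore the rest.

```python
def strict_factory(app, smtp_server):
    # Accepts only the keys it names; any extra key raises TypeError.
    return app


def tolerant_factory(app, smtp_server="localhost", **kw):
    # Takes what it needs and ignores the rest of the dictionary.
    return app


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# A shared, flat configuration with keys for several consumers.
config = {"smtp_server": "mail.example.com", "base_dir": "/srv/app"}

tolerant_factory(demo_app, **config)  # fine: base_dir lands in **kw

try:
    strict_factory(demo_app, **config)
    strict_accepted = True
except TypeError:
    # base_dir is not a parameter of strict_factory
    strict_accepted = False
```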

Right now Paste hands around a fairly flat dictionary.  This dictionary 
is passed around in full (as part of the WSGI environment) to every 
piece of middleware, and actually to everything (via an import and 
threadlocal storage).  It gets used all over the place, and the ability 
to draw in configuration without passing it around is very important.  I 
know it seems like heavy coupling, but in practice it causes unstable 
APIs if it is passed around explicitly, and as long as you keep clever 
dynamic values out of the configuration it isn't a problem.
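
Roughly, the pattern looks like the sketch below.  This is not Paste's 
actual code: the environ key "example.config", the ConfigMiddleware class 
and current_config() are hypothetical names chosen for illustration.

```python
import threading

_local = threading.local()


def current_config():
    """Return the configuration for the request being handled, or {}."""
    return getattr(_local, "config", None) or {}


class ConfigMiddleware:
    """Expose one flat dict both in the WSGI environ and thread-locally."""

    def __init__(self, app, config):
        self.app = app
        self.config = config

    def __call__(self, environ, start_response):
        environ["example.config"] = self.config  # hypothetical key
        previous = getattr(_local, "config", None)
        _local.config = self.config
        try:
            # Caveat: if the application returns a lazy iterable, the
            # thread-local is restored before the body is consumed.
            return self.app(environ, start_response)
        finally:
            _local.config = previous


def mail_app(environ, start_response):
    # Deep inside the application, nothing was passed in explicitly.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [current_config()["smtp_server"].encode()]
```

Code far down the call stack calls current_config() instead of having the 
dictionary threaded through every function signature.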

Anyway, every piece gets the full dictionary, so if any piece expected a 
constrained set of keys it would break.  Even ignoring that there are 
multiple consumers with different keys that they pull out, it is common 
to create intermediate configuration values to make the configuration 
more abstract.  E.g., I set a "base_dir", then derive "publish_dir" and 
"template_dir" from that.  Apache configuration is a good anti-example 
here; its lack of variables hurts me daily.  While some variables could 
be declared "abstract" somehow, that adds complexity that the 
unconstrained model avoids.
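
For example, the derivation could be as simple as the following sketch. 
The key names come from the paragraph above; the derive() helper and the 
"publish" and "templates" subdirectory names are assumptions.

```python
import posixpath


def derive(config):
    """Fill in publish_dir and template_dir from base_dir if unset."""
    config = dict(config)  # leave the caller's dictionary untouched
    base = config["base_dir"]
    # setdefault keeps any value the installation set explicitly.
    config.setdefault("publish_dir", posixpath.join(base, "publish"))
    config.setdefault("template_dir", posixpath.join(base, "templates"))
    return config


site = derive({"base_dir": "/srv/app", "template_dir": "/srv/shared/tmpl"})
```

Here base_dir exists only to make the other values easier to express; no 
component needs to consume it directly, and none is harmed by seeing it.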

When one piece delegates to another, it passes the entire dictionary 
through (by convention, and by the fact it gets passed around 
implicitly).  It is certainly possible in some circumstances that a 
filtered version of the configuration should be passed in; that hasn't 
happened to me yet, but I can certainly imagine it being necessary 
(especially when a larger amount of more diverse software is running in 
the same process).

One downside of this is that there's no protection from name conflicts, 
though name conflicts can go both ways.  The Happy Coincidence is when 
two pieces use the same name for the same purpose (e.g., it's highly 
likely "smtp_server" would be the subject of a Happy Coincidence).  An 
Unhappy Coincidence is when two pieces use the same name for different 
purposes ("publish_dir" perhaps).  An Expected Coincidence is when the 
same code, invoked in two separate call stacks, consumes the same value. 
Of course, I allow configuration to be overwritten depending on the 
request, so high-collision names (like publish_dir) are unlikely to be a 
problem in practice.
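
One way such per-request overriding could work is sketched below, 
assuming overrides are keyed by URL prefix and that more specific prefixes 
win.  Both the keying scheme and the config_for_request() name are 
assumptions made for illustration.

```python
def config_for_request(base_config, overrides, path_info):
    """Return a copy of base_config with matching overrides applied.

    overrides maps a URL prefix (no trailing slash) to a dict of values.
    Shorter prefixes are applied first, so longer ones win.
    """
    config = dict(base_config)
    for prefix in sorted(overrides, key=len):
        if path_info == prefix or path_info.startswith(prefix + "/"):
            config.update(overrides[prefix])
    return config


base = {"publish_dir": "/srv/site", "smtp_server": "localhost"}
overrides = {
    "/blog": {"publish_dir": "/srv/blog"},
    "/blog/admin": {"publish_dir": "/srv/admin"},
}
blog_config = config_for_request(base, overrides, "/blog/post")
```

Two applications that both read publish_dir then see different values 
depending on where in the URL space they are mounted, while smtp_server 
stays shared.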

The upside over anything that expects structure in the configuration 
(e.g., that configuration be targeted at a specific component) is that 
I can hide implementation.  This is extremely important to me, because I 
have lots of pieces.  Some of them are clearly different components from 
the inside, some are vague and the distinction would be based entirely 
on my mood.  For instance an application-specific middleware that could 
plausibly be used more widely -- does it consume the application 
configuration, or does it take its own configuration?  But even 
excluding those ambiguous situations, the way my middleware is factored 
is an internal implementation detail, and I don't feel comfortable 
pushing that structure into the configuration.

So that's the issue I'm concerned about.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org