[Web-SIG] wsgiconfig design

Sun Jul 8 22:18:04 CEST 2007

Jim Fulton wrote:
> 
> On Jul 7, 2007, at 3:01 PM, Ian Bicking wrote:
> 
>> Jim Fulton wrote:
> ..
>>> I do have one potential complaint about the entry-point APIs.  The 
>>> applications my company builds have configurations that are too 
>>> complex to fit in a single config-parser section.  To handle these 
>>> configurations, I'd need to be able to read multiple sections, or to 
>>> refer to an external configuration.  I think the latter is the current 
>>> recommended approach for Paste Deploy.  If you want to keep that 
>>> approach, then the existing entry-point APIs are fine.  (I 
>>> personally, want to be able to put all of my configuration in a 
>>> single file, but zc.buildout lets me do that, so I don't need Paste 
>>> Deploy to do that for me.)
>>
>> As I've been coding, I've actually been thinking about passing a
>> complete config dictionary in instead of global_conf.  So it would be
>> like {section: {option: value, ...}}.
> 
> BTW, I've become a big fan of the ConfigParser module and format, mainly 
> because of its simplicity. The simplicity of a dictionary of 
> dictionaries is very powerful, IMO.  Of course, the ConfigParser API is 
> pretty horrible.  Fortunately, it's trivial to convert a parser object 
> to a mapping of mappings and that's generally one of the first things I 
> do after I've parsed a file.
> 
>> This lets you look into other
>> applications' sections, which maybe isn't ideal.
> 
> Why? Are you afraid that a handler will look at something it shouldn't?  
> Who cares? Relax. This isn't the Spanish Inquisition. :)
> 
> Embrace the simplicity of a mapping of mappings.

The problem as I see it with looking into other people's sections isn't 
so much that it's dangerous as that it expands the number of possible 
sources of a configuration bug.  That is, if an application is acting in 
an unexpected way, you have to worry about any configuration anywhere in 
the file.  You can't feel certain that the configuration problem is 
limited to some particular set of sections.
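
For reference, the mapping-of-mappings conversion mentioned above might 
look roughly like this.  It is only a sketch against the current standard 
library configparser module; parser_to_mapping and the sample text are 
made-up names for illustration, not part of Paste Deploy or wsgiconfig.

```python
import configparser

# Sample config reusing the section names from this thread.
SAMPLE = """\
[/]
use = egg:MyPackage
greeting = Hello

[config:/ email]
smtp_server = localhost
"""


def parser_to_mapping(parser):
    """Return {section: {option: value}} for every non-default section."""
    return {
        section: dict(parser.items(section))
        for section in parser.sections()
    }


# RawConfigParser is used so values are returned without interpolation.
parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)
config = parser_to_mapping(parser)

# A loader that wants to limit where a bug can hide could hand each
# application only its own section instead of the whole mapping.
root_only = config["/"]
```

Passing root_only rather than config is what keeps the set of sections 
that can affect an application small.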

>> Another option that occurs to me now might be something like
>> [config:app_name section_name], and then pass in section_name={dict of
>> options} as a keyword argument.  I.e.:
>>
>>   [/]
>>   use = egg:MyPackage
>>   greeting = Hello
>>
>>   [config:/ email]
>>   smtp_server = localhost
>>   email = bob at example.com
>>
>> That leads to mypackage.wsgi_app(global_conf, greeting="Hello",
>> email={'smtp_server': 'localhost', 'email': 'bob at example.com'})
>>
>> I think I prefer the latter.
> 
> I prefer the simpler model.  For one thing, it lets you share data among 
> multiple sections. Maybe this isn't important for Paste Deploy. Having 
> said that, I think your suggestion is fine and is also less verbose than 
> the simpler approach, because, in the simpler approach, the root section 
> will tend to have options saying what other sections to read.
> 
> If you are going to do something like this, then, IMO, you might also 
> consider:
> 
>   [/]
>   use = egg:MyPackage
>   greeting = Hello
>   email =
>       smtp_server = localhost
>       email = bob at example.com

This will work fine already, and I use it sometimes in my applications.  
Of course, the application has to parse those assignments itself.  But 
that's easy enough, and not every indented block of text is going to be 
a set of sub-assignments.
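
A rough sketch of that application-side parsing, assuming the current 
standard library configparser (which keeps indented continuation lines 
as part of the option's value).  parse_sub_assignments is a hypothetical 
helper name, not something Paste Deploy provides.

```python
import configparser

SAMPLE = """\
[/]
use = egg:MyPackage
greeting = Hello
email =
    smtp_server = localhost
    email = bob at example.com
"""


def parse_sub_assignments(text):
    """Parse a multi-line option value into a dict of name -> value."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the empty first line after "email ="
        name, sep, value = line.partition("=")
        if not sep:
            # Not every indented block is a set of sub-assignments.
            raise ValueError("not an assignment: %r" % line)
        result[name.strip()] = value.strip()
    return result


parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)
email_conf = parse_sub_assignments(parser.get("/", "email"))
```

Only the application knows which options hold sub-assignments, so the 
helper is called per option rather than applied to every value.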

>>>> options is a flat dictionary of options that are passed in, which the
>>>> config loader can use at its discretion.  A common way to use it would
>>>> be for variable substitution.  This allows for things like
>>>> "paster serve config.ini var1=value var2=value", for ad hoc
>>>> customization.  It returns a single application.  (This does mean
>>>> that a feature from Paste Deploy is lost, where you could do
>>>> "paster serve config.ini#section_name" to load a specific application
>>>> from several defined in a file -- but you are less likely to get dead
>>>> sections or confusing config file factorings).
>>>>
>>>> object_type is the kind of object we want to get out.  Here I'll only
>>>> specify 'wsgi.application'.  'wsgi.server' will probably also be
>>>> implemented, but that's all I plan for now.  'wsgi.appserver' or
>>>> something might be possible, for the process manager that runs an
>>>> entire application.
>>> I don't really follow this. Maybe an example would help.
>>
>> Well, let's say you have a configuration like:
>>
>>   [/]
>>   use = egg:MyApp
>>
>>   [middleware:/]
>>   use = egg:Paste#profile
>>
>>   [server:main] # or maybe just server?
>>   use = egg:Paste#http
>>   host = 127.0.0.1:${port}
>>
>> Then you start it up with "serve config.ini port=8090".  That's the idea
>> of the options dictionary, it holds {'port': '8090'}.
> 
> Ah, so command-line options.
> 
> In the example, a port is something you'd want to make available in a 
> server section isn't it?  Why do you want the loader to get command-line 
> options?

Users have requested this.  In part, it makes it easier to ship a config 
file and give people getting-started instructions with a few options.  
In the implementation I just set variables in parser.defaults with these 
values.  If you put "port = 8080" in [DEFAULT], a command-line option 
overrides it, but the config still works without the option.
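
As a sketch of that behavior with the current standard library: the 
${port} syntax is handled here by configparser's ExtendedInterpolation, 
which is an assumption about how the substitution could be done today, 
not a description of the actual wsgiconfig code.  parse_cli_options and 
load_with_options are hypothetical names.

```python
import configparser

SAMPLE = """\
[DEFAULT]
port = 8080

[server:main]
use = egg:Paste#http
host = 127.0.0.1:${port}
"""


def parse_cli_options(args):
    """Turn ['port=8090', ...] into {'port': '8090', ...}."""
    return dict(arg.split("=", 1) for arg in args)


def load_with_options(text, options):
    """Parse text, then let the given options override [DEFAULT]."""
    parser = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation())
    parser.read_string(text)
    # defaults() returns the live [DEFAULT] mapping, so updating it
    # changes what ${port} resolves to in every section.
    parser.defaults().update(options)
    return parser


# "serve config.ini" with no extra options falls back to [DEFAULT].
default_host = load_with_options(SAMPLE, {}).get("server:main", "host")

# "serve config.ini port=8090" overrides it.
options = parse_cli_options(["port=8090"])
custom_host = load_with_options(SAMPLE, options).get("server:main", "host")
```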

>>>> Unlike Paste Deploy, section names will not be arbitrary.  A section
>>>> name has a prefix and name.  The prefix, as in Paste Deploy, says what
>>>> you are describing.  The default prefix is "app:"; you can also give
>>>> "middleware:".  Prefixes not recognized will be ignored.  A possible
>>>> prefix might be "logging:" for logging, which if I don't implement it
>>>> will be initially ignored (but someone else could handle it).
>>> IMO, it would be nice *not* to reinvent yet another logging 
>>> configuration handler. The standard library already defines one. If 
>>> we don't like it, we should make it better.
>>
>> I don't like it, but I don't feel like improving it either ;)
> 
> I hope you don't consider that a reason to reinvent it.  I would hope 
> that, in the future, when someone gets that itch, they'll resist and 
> improve the standard one instead.
> 
> We invented ZConfig which has its own logging configuration "schema".  
> The result?  It hasn't remained up to date with the logging package and 
> people who use it don't have access to some useful loggers without 
> screwing with ZConfig schemas (which isn't fun).  Bad bad bad.
> 
>>   Anyway,
>> this is basically just a convention to group all the sections together
>> for logging based on that prefix.  The logging module's configuration
>> handler can handle it,
> 
> It can?  I think it looks for specific un-prefixed section names.
> 
>> or we could wrap it slightly (if you loaded
>> logging you wouldn't actually be returning anything, you'd be updating
>> the global logging configuration, which may or may not be what we want).
> 
> I'm not sure what you mean here.  In theory, if you simply let people 
> use the sections defined by the logging module, you could point the 
> standard logging module at your config and be done.  You could even 
> condition this on whether the defined sections are present.  
> Unfortunately, I don't speak from experience because the applications I 
> routinely use use ZConfig.

Right now Paste Deploy ignores any sections that aren't specifically 
asked for.  So you can have sections with any name, and even app: 
sections that just don't work, as long as you don't try to use them.  So 
having the logging module load config from the same file is easy, and I 
added that automatic parsing recently.

I think you are right, it doesn't like prefixes, though I've never 
tried.  OTOH, it must be possible to give the logging module a 
dict-of-dicts or some fake ConfigParser that pulls only from sections 
with a given prefix.  This isn't reinventing the logging module's config 
handler, it's just giving the logging module a view onto the config file.
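
One way such a view could be sketched today: copy the prefixed sections 
into a second parser with the prefix stripped, and hand that parser to 
logging.config.fileConfig (which accepts a parser instance in current 
Python versions).  The "logging:" prefix and the logging_view name are 
assumptions for illustration only.

```python
import configparser
import logging
import logging.config

SAMPLE = """\
[/]
use = egg:MyApp

[logging:loggers]
keys = root

[logging:handlers]
keys = console

[logging:formatters]
keys = plain

[logging:logger_root]
level = INFO
handlers = console

[logging:handler_console]
class = StreamHandler
level = NOTSET
formatter = plain
args = (sys.stderr,)

[logging:formatter_plain]
format = %(levelname)s %(message)s
"""


def logging_view(parser, prefix="logging:"):
    """Return a parser holding only the prefixed sections, unprefixed."""
    view = configparser.RawConfigParser()
    for name in parser.sections():
        if name.startswith(prefix):
            view.read_dict({name[len(prefix):]: dict(parser.items(name))})
    return view


# RawConfigParser leaves the %(...)s format strings untouched.
parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)
view = logging_view(parser)

# This updates the global logging configuration; nothing is returned.
logging.config.fileConfig(view, disable_existing_loggers=False)
```

The standard handler still does all of the interpretation; the view 
only decides which sections it gets to see.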

>>>>   Similarly
>>>> the server as with Paste Deploy can be defined with "server:".  For now
>>>> all we're concerned with is applications, middleware, and composites.
>>> Maybe I'm missing your point, but I thought the value of Paste Deploy 
>>> was to be able to have a way to define an end-to-end configuration 
>>> of applications, middleware and server.
>>
>> The server is a little bit of an outlier.  The applications and
>> middleware can be composed directly and fairly opaquely, but the 
>> server needs to be connected to the application more explicitly and 
>> outside of wsgiconfig.  OTOH, it's real handy to be able to put the 
>> server section in the same config file.
> 
> IMO, it's very important to put the server in the config.  Why make the 
> program using the config do that?
> 
> I really want to be able to at least do all of the WSGI configuration 
> in one place.
> 
> Note that, traditionally, Zope has allowed multiple servers to exist in 
> a single process.  For smaller applications that can be handled by a 
> single process, this is a significant win.  Selfishly, this isn't so 
> important to me as the applications ZC deals with are large scale and 
> have many processes so having a single server per process is the norm 
> for us. Others may perceive the loss though.

I'm not arguing that the server shouldn't go into the config.  It's just 
a separate part of the system, where some program gets a config file, 
pulls out the application and pulls out the server, and then invokes 
them together (what paster serve does now).  It's much simpler, since a 
server is always defined in just a single section -- no middleware, and 
usually not many options.

I'd like to support multiple servers too, but for now the API of the 
paste.server_runner entry point isn't really rich enough to do that.  So 
I'm just going to stick with the current functionality Paste Deploy 
provides.  People don't complain about it much; usually they are just 
happy enough to have easy server configuration.

>>>> The applications and middleware are grouped together using the names.
>>>> That is, if you have an application "/" and a middleware
>>>> "middleware:/", then the middleware wraps that application.
>>>> Middleware sections can have trailing numbers to indicate ordering
>>>> and keep section names unique.  Thus "middleware:/ 1",
>>>> "middleware:/ 2", etc.  Negative numbers and floats are allowed.
>>>> Anything but trailing numbers is considered part of the name; thus
>>>> names can have parameters or other structure.
>>> Hm.  Sounds a bit too magic to me.  Maybe an example will make it 
>>> look better. :)
>>
>> Well, what we are trying to create is a basic 
>> middleware1(middleware2(app)) composition, where the app is required 
>> and the middleware is not.
>>
>> We group these together by name, with urlmap that name is a path.  So 
>> / is the main app, /blog is the app mounted at /blog, etc.  Then we 
>> need an ordered list of the middleware to apply.  There needs to be 
>> some way to distinguish a middleware section from an application 
>> section, hence middleware:.  And then a way of ordering them.  We 
>> could use the section ordering, except duplicate section names are no 
>> good anyway, even if we did keep track of the order they were defined 
>> in.  So I'm proposing a trailing number.
> 
> Personally, I much prefer explicit composition sections, as I think you 
> have now.  Then you simply have an option that names the nodes to be 
> composed in order.

This is basically what Paste Deploy does now.  I've found it somewhat 
confusing for people, in part because the section names don't really 
mean anything.  Probably the worse problem is the middleware 
composition, which is solved by using shared names with the number 
suffixes, and that's more important to me.
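
To make the number-suffix grouping concrete, here is a rough sketch.  
Two things are assumptions, since the proposal doesn't pin them down: a 
section without a number sorts as 0, and the lowest number ends up as 
the outermost wrapper.  The function names are made up.

```python
import re

# Name, optionally followed by whitespace and a trailing int or float.
_ORDER = re.compile(r"^(?P<name>.*?)(?:\s+(?P<order>-?\d+(?:\.\d+)?))?$")


def middleware_by_app(sections):
    """Group 'middleware:NAME [N]' section names by NAME, sorted by N."""
    grouped = {}
    for section in sections:
        if not section.startswith("middleware:"):
            continue  # applications, server:, unknown prefixes
        match = _ORDER.match(section[len("middleware:"):].strip())
        order = float(match.group("order") or 0)  # assumption: default 0
        grouped.setdefault(match.group("name"), []).append((order, section))
    return {
        name: [section for _, section in sorted(found)]
        for name, found in grouped.items()
    }


def compose(app, factories):
    """Build factories[0](factories[1](...(app))) from an ordered list."""
    for factory in reversed(factories):
        app = factory(app)
    return app


sections = [
    "/", "middleware:/ 2", "middleware:/ 1", "middleware:/ -0.5",
    "/blog", "middleware:/blog", "server:main",
]
ordering = middleware_by_app(sections)

# Stand-ins for real middleware factories, to show the nesting only.
nested = compose("app", [lambda a: "m1(%s)" % a, lambda a: "m2(%s)" % a])
```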

OTOH, I almost always use urlmap if I'm doing any kind of application 
composition.  So I'm happy using that as a default, and just having 
people effectively replace "main" with "/" as the base application. 
I'll probably even switch the urlmap constructor to return the 
application itself when it gets {'/': something}, as it has no work to do.

>>>> All the applications in the server are put in a single dictionary, and
>>>> that is passed to the composer.  The composer by default is urlmap
>>>> (which also includes optional host-based dispatch).  You can specify
>>>> another composer with a global option "composer = (specifier)"
>>> I'm not sure how I feel about that.
>>
>> What would be the problem?
> 
> I'd prefer the composers be more explicitly part of the configuration.  
> That is, a composer is defined with a section, like everything else.

Yeah, that would be fine, i.e., "[composer] use = foo" instead of 
"composer = foo".

>>   That is, you have to be able to say, "for this application, get the 
>> application from this other file".  That could simply be:
>>
>>   [/blog]
>>   use = egg:WSGIConfig#load_config
>>   config_file = blog.ini
>>
>> But that's kind of awkward, so I think it would be better if there was 
>> a clearer construct.  blog.ini might itself have internal structure 
>> and multiple applications, so we can't just use config file inlining 
>> to accomplish this.
> 
> I mainly think this is a different concept and should have a separate 
> option name, whatever syntax is used.

Yeah, maybe "from_file = filename".  Probably the [DEFAULT] from the 
first file would be passed in as options to the second file...?  This 
might be confusing, I'm not sure.  An explain mode might make it more 
obvious.

>> If you are restrictive in your entry point and don't include **kw, 
>> then you can raise errors about misspelled configuration.
> 
> Sure, but don't options put in DEFAULT appear everywhere?  Won't that 
> make it impossible to avoid **kw and to complain about unrecognized 
> options?

global_conf currently contains stuff from [DEFAULT]; I don't pass it in 
as **kw.  I basically filter out all options where parser.defaults 
contains the same value.  Using INITools it actually keeps them fully 
separate, so that's not a problem.
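
The filtering approach can be sketched like this with the standard 
library configparser (INITools is not used here).  split_conf is a 
hypothetical name, and the sketch has the same limitation as the 
approach it illustrates: an option explicitly set to the same value as 
[DEFAULT] is indistinguishable from an inherited one.

```python
import configparser

SAMPLE = """\
[DEFAULT]
debug = false
admin_email = bob at example.com

[/]
use = egg:MyPackage
greeting = Hello
debug = false
"""


def split_conf(parser, section):
    """Return (global_conf, local_conf) for one section."""
    global_conf = dict(parser.defaults())
    # parser.items() merges [DEFAULT] into the section, so drop every
    # option whose value is identical to the default.
    local_conf = {
        option: value
        for option, value in parser.items(section)
        if global_conf.get(option) != value
    }
    return global_conf, local_conf


parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)
global_conf, local_conf = split_conf(parser, "/")
```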

The point of global_conf is specifically to give a place for 
unconstrained settings that aren't necessarily intended for every 
application.

One issue, though, is that frameworks like Pylons set up their handlers 
with **kw, because it's a lot easier to tell your users that they simply 
have to add a setting in their config file and it'll be available.  You 
can't do access tracking either, because that setting might only be used 
in one controller in the application.  So they lose option checking. 
OTOH, if someone wants to add polish to their application, it's easy 
enough to do a grep for settings and edit your wsgiapp.py file to have a 
specific signature.
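
A small illustration of the trade-off, using two made-up factories: one 
with a specific signature and one that accepts **kw.  Neither is taken 
from Pylons or Paste; they only show where a misspelled option gets 
caught.

```python
def strict_app_factory(global_conf, greeting="Hello"):
    """Specific signature: unknown options raise TypeError at load time."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return app


def lenient_app_factory(global_conf, **kw):
    """Accepts anything: convenient, but typos pass silently."""
    greeting = kw.get("greeting", "Hello")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return app


# "greting" is misspelled on purpose.
local_conf = {"greting": "Hi"}

try:
    strict_app_factory({}, **local_conf)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)

# The lenient factory loads fine and quietly uses the default greeting.
lenient_app = lenient_app_factory({}, **local_conf)
lenient_body = lenient_app({}, lambda status, headers: None)
```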

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers