[Web-SIG] WSGI adoption

Ian Bicking ianb at colorstudy.com
Mon Nov 29 23:15:15 CET 2004


Phillip J. Eby wrote:
>> 3. Even if we did use ConfigParser, it still doesn't solve the lack of 
>> encoding support.
> 
> 
> True, but entirely manageable for any 8-bit encoding that doesn't 
> require escaping for the characters (such as #, ; , [, =, ], :, and 
> whitespace) that ConfigParser uses for syntax.  IOW, the various Latin 
> codings and UTF-8 are all fine.

Well, it returns text as 8-bit strings, not as unicode strings.  I think 
Alan wants unicode.  I imagine it would be easy enough to add -- maybe 
enough just to open the file with an encoding specified (though being 
able to detect the encoding would be better).  But since I never use 
unicode strings, I seldom know what the Right Thing is for unicode.

Or, if applied as a wrapper, you could decode all the strings after 
they've been loaded.  Maybe that's what you were thinking?
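
A minimal sketch of the first option (reading the file with an encoding 
specified), assuming Python 3's configparser module, the renamed 
successor of ConfigParser.  The sample data and the load_config name 
are hypothetical.  In Python 3 the values already come back as unicode 
strings, so the "decode after loading" wrapper is not needed there.

```python
import configparser
import io

# Hypothetical config bytes as they might sit on disk, encoded as UTF-8.
RAW = "[site]\ntitle = Caf\u00e9 M\u00fcller\n".encode("utf-8")


def load_config(raw_bytes, encoding="utf-8"):
    """Decode while reading, so every value comes back as a unicode string."""
    parser = configparser.ConfigParser(interpolation=None)
    # Wrap the byte stream so the parser sees decoded text lines.
    text_stream = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding=encoding)
    parser.read_file(text_stream)
    return parser


config = load_config(RAW)
title = config.get("site", "title")
```

The encoding is still supplied by the caller here; nothing in this 
sketch detects it from the file.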

>> 4. Other people are having exactly the same problems deciding how best 
>> to approach configuration.
> 
> 
> Actually, they aren't.  The ConfigParserShootout has expanded scope 
> tremendously over the original Python-Dev discussion, which was much 
> more about API than format.  The needs for a WSGI deployment format are 
> much more straightforward.

Well, it's not the ConfigParserShootout's fault that more persistent, 
extended discussion moved there.  That was the idea of the wiki page. 
It might not be representative of everyone's feelings, but it's 
representative of something.

> The format MUST be:
> 
> * Easy for non-programmer users to read, write, and edit (which implies 
> a variety of more detailed requirements, such as case-insensitivity for 
> configuration keys, and a lack of excessive quoting or escaping)
> * Extensible, such that programs can ignore parts that are not intended 
> for them

Extensibility also requires (IMHO) the layering of multiple 
configuration files.  Maybe ConfigParser's model of just overwriting 
old values with every read() is sufficient; though at least it should 
have a .copy() method if it's going to destructively read values.
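
The layering described above might look like the following sketch.  It 
assumes Python 3's configparser; the layered and snapshot helpers and 
the sample settings are hypothetical.  The copy is made with 
read_dict(), which the module documents as usable for copying state 
between parsers.

```python
import configparser

# Hypothetical base and site-specific layers.
BASE = """
[server]
host = localhost
port = 8080
"""

OVERRIDE = """
[server]
port = 9090
"""


def layered(*sources):
    """Read each source in order; later values overwrite earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    for text in sources:
        parser.read_string(text)
    return parser


def snapshot(parser):
    """Return an independent copy, so further reads don't clobber it."""
    clone = configparser.ConfigParser(interpolation=None)
    clone.read_dict(parser)
    return clone


base = layered(BASE)
saved = snapshot(base)          # copy taken before the override is read
base.read_string(OVERRIDE)      # destructively overwrites "port" in base
```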

> * Able to represent filenames, strings, numbers, and boolean flags.

I think it's easiest that it simply have string values, without any 
"native" data types.  This is how ConfigParser works, though it also 
happens to include convenience methods that do conversion (in contrast 
to YAML, where data types are built into the config file format).
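
As a small illustration of "string values plus convenience methods", 
assuming Python 3's configparser and made-up settings: the stored value 
is always a string, and the caller decides which conversion to apply.

```python
import configparser

# Hypothetical settings; nothing in the file marks a value's type.
SAMPLE = """
[server]
port = 8080
debug = yes
root = /var/www
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

raw_port = parser.get("server", "port")        # plain string, as stored
port = parser.getint("server", "port")         # caller asks for an int
debug = parser.getboolean("server", "debug")   # caller asks for a bool
```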

> The format SHOULD be:
> 
> * Easy for a GUI or other tool to edit or generate

ConfigParser does poorly at this.  To do it with ConfigParser you'd 
really end up doing some sort of funny thing: make a separate parser 
that looks for the point in the file where you want to add a key and 
adds it there, then modify ConfigParser so it keeps track of the changes 
that were made to it.  ConfigParser itself doesn't facilitate this at 
all.
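
A rough sketch of what such a separate, layout-preserving editor could 
look like.  The set_key function is hypothetical and is not part of 
ConfigParser; it only understands "key = value" lines and ignores 
continuation lines and the ":" delimiter.

```python
def set_key(text, section, key, value):
    """Return text with key set in [section], leaving other lines alone."""
    lines = text.splitlines()
    header = "[%s]" % section
    new_line = "%s = %s" % (key, value)
    in_section = False
    insert_at = None  # index where a new key would be inserted
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_section:
                break  # reached the next section without finding the key
            in_section = stripped == header
            if in_section:
                insert_at = i + 1
            continue
        if not in_section:
            continue
        if not stripped or stripped[0] in "#;":
            continue  # blank lines and comments are kept as they are
        name = stripped.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            lines[i] = new_line  # replace the existing option in place
            return "\n".join(lines) + "\n"
        insert_at = i + 1  # remember the spot after the last option seen
    if insert_at is None:
        lines.extend(["", header, new_line])  # section missing: append it
    else:
        lines.insert(insert_at, new_line)
    return "\n".join(lines) + "\n"


# Hypothetical file contents with comments that should survive the edit.
ORIGINAL = "# main settings\n[server]\nhost = localhost\n; note\n\n[other]\nx = 1\n"
added = set_key(ORIGINAL, "server", "port", "8080")
changed = set_key(ORIGINAL, "server", "host", "example.org")
```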

> To me, the .ini syntax's only failing in these requirements so far is 
> that an encoding would need to be specified for strings.
> 
> Whether the ConfigParser library itself should be used or not, I don't 
> know.  Its advantages are:
> 
> * It's been in the standard library a long time, meaning it's available 
> on the platforms of interest for WSGI

Because no one is proposing (or would propose) any alternative other 
than a plain-Python module, it's not a big deal to distribute it 
separately.  Especially if there are version issues with old versions of 
Python and ConfigParser.  (I'm not sure if there are, but if we want to 
enhance ConfigParser even slightly then there will be.)

> * It handles booleans in a user-friendly fashion, at least for 
> English-speaking users.

That's fairly trivial functionality.  At least it's trivial when you 
don't have native types, i.e., you know ahead of time (or at runtime) 
what values are to be interpreted as booleans.
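
To show how little code this takes when the program already knows which 
keys are booleans, here is a sketch.  The to_bool and convert helpers 
and the BOOLEAN_KEYS set are hypothetical; the accepted words mirror the 
ones ConfigParser's getboolean() recognizes.

```python
TRUE_WORDS = {"1", "yes", "true", "on"}
FALSE_WORDS = {"0", "no", "false", "off"}

# Hypothetical: the application declares which keys are flags.
BOOLEAN_KEYS = {"debug", "verbose"}


def to_bool(text):
    """Interpret a config string as a boolean, case-insensitively."""
    word = text.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError("not a boolean: %r" % (text,))


def convert(settings):
    """Convert only the known boolean keys; leave other strings alone."""
    return {
        key: to_bool(value) if key in BOOLEAN_KEYS else value
        for key, value in settings.items()
    }


converted = convert({"debug": "Yes", "verbose": "off", "name": "on"})
```

Note that "name" keeps its string value "on", because the file format 
itself carries no type information.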

> * It allows string interpolation for the hackerly types who don't like 
> repeating themselves

People seem unhappy with this feature.  At least there were several 
people on python-dev who felt this way.  The fact that there's a 
"more sane" version of this (SafeConfigParser) makes it seem like a 
questionable feature.  It's also useful, so I'm not entirely sure what 
to make of it.  But then if we go down that path, conditionals 
(#ifdef-style) are also very useful, but soon we need a whole 
programming language.

Substitutions can also be applied to values after they are loaded.
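
One way to apply substitutions after loading, sketched with the standard 
string.Template class rather than ConfigParser's own interpolation.  The 
substitute helper and the sample values are hypothetical, and this does 
a single pass only, so references to values that themselves contain 
references are not expanded further.

```python
from string import Template


def substitute(settings):
    """Expand ${name} references against the raw loaded values (one pass)."""
    return {
        key: Template(value).safe_substitute(settings)
        for key, value in settings.items()
    }


# Hypothetical values as they might come out of a parsed section.
loaded = {
    "root": "/srv/app",
    "logs": "${root}/logs",
    "other": "${missing}/x",   # unknown names are left untouched
}
expanded = substitute(loaded)
```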

> Its disadvantages are:
> 
> * Implementation has changed a lot over its history
> 
> * Documentation is accused of being "handwavy"

Maybe because it has a few too many options and a couple different 
interfaces (the different classes).  Mostly I think it's an ordering 
problem; RawConfigParser should be completely explained (and shouldn't 
be too hard to explain), and the other parsers

> * Format is not rigorously defined

It's not rigorous, but is it ambiguous or otherwise problematic?  When I 
reimplemented the parser (wsgikit.config.iniparser) I didn't notice any 
ambiguities, except maybe in terms of error conditions.

Well, there's maybe a question about how continuation lines should be 
interpreted.  Should all leading whitespace be chopped off?  Should 
newlines remain?
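
For comparison, this sketch shows how Python 3's configparser answers 
those two questions (an assumption about a later version of the module, 
not about the ConfigParser discussed in this thread): leading whitespace 
on continuation lines is stripped and the newlines are kept.

```python
import configparser

# Hypothetical option whose value continues over indented lines.
SAMPLE = """
[app]
paths = /usr/lib
    /usr/local/lib
        /opt/lib
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

value = parser.get("app", "paths")
parts = value.split("\n")   # one entry per physical line of the value
```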

> Of course, I would be fine with us rigorously defining a format that met 
> the requirements.
> 
> One other possibility I can see, would be the Java properties file 
> format, or something similar to it.

That's just an XML format, right?  A painfully verbose format if I 
remember, even for XML.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

