Re: [Python-checkins] CVS: python/dist/src/Lib ConfigParser.py,1.16,1.17
You may as well remove the entire "vi" concept from ConfigParser. Since "vi" can be *only* a '=' or ':', then you aren't truly checking anything in the "if" statement. Further, "vi" is used nowhere else, so that variable and the corresponding regex group can be nuked altogether.

IMO, I'm not sure why the ";" comment form was initially restricted to just one option format in the first place.

Cheers,
-g

On Fri, 3 Mar 2000, Jeremy Hylton wrote:
Update of /projects/cvsroot/python/dist/src/Lib In directory bitdiddle:/home/jhylton/python/src/Lib
Modified Files: ConfigParser.py Log Message: allow comments beginning with ; in key: value as well as key = value
Index: ConfigParser.py
===================================================================
RCS file: /projects/cvsroot/python/dist/src/Lib/ConfigParser.py,v
retrieving revision 1.16
retrieving revision 1.17
diff -C2 -r1.16 -r1.17
*** ConfigParser.py 2000/02/28 23:23:55 1.16
--- ConfigParser.py 2000/03/03 20:43:57 1.17
***************
*** 359,363 ****
              optname, vi, optval = mo.group('option', 'vi', 'value')
              optname = string.lower(optname)
!             if vi == '=' and ';' in optval:
                  # ';' is a comment delimiter only if it follows
                  # a spacing character
--- 359,363 ----
              optname, vi, optval = mo.group('option', 'vi', 'value')
              optname = string.lower(optname)
!             if vi in ('=', ':') and ';' in optval:
                  # ';' is a comment delimiter only if it follows
                  # a spacing character
-- Greg Stein, http://www.lyra.org/
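The simplification suggested above can be sketched as follows. This is a minimal illustration in modern Python, not the module's actual code: `OPTION_RE` is a simplified stand-in for ConfigParser's option pattern and `parse_option` is a hypothetical helper. Because the delimiter can only ever be ':' or '=', the pattern needs no named "vi" group and the comment check needs no delimiter test.

```python
import re

# Simplified stand-in for ConfigParser's option regex (an assumption, not the
# module's exact pattern). The delimiter is matched but not captured, since
# it can only ever be ':' or '='.
OPTION_RE = re.compile(
    r'(?P<option>[^:=\s][^:=]*?)'   # option name, matched lazily
    r'\s*[:=]\s*'                   # delimiter, no "vi" group needed
    r'(?P<value>.*)$'               # rest of the line
)


def parse_option(line):
    """Hypothetical helper: return (name, value), or None if no match."""
    mo = OPTION_RE.match(line)
    if mo is None:
        return None
    optname, optval = mo.group('option', 'value')
    optname = optname.lower()
    # ';' is a comment delimiter only if it follows a spacing character.
    pos = optval.find(';')
    if pos > 0 and optval[pos - 1].isspace():
        optval = optval[:pos]
    return optname, optval.strip()
```

With this sketch, `parse_option("Key = value ; comment")` and `parse_option("host: example.org ; primary")` both drop the trailing comment, while `parse_option("key: a;b")` keeps the value intact because the ';' does not follow whitespace.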
Thanks for catching that. I didn't look at the context. I'm going to wait, though, until I talk to Fred to mess with the code any more. General question for python-dev readers: What are your experiences with ConfigParser? I just used it to build a simple config parser for IDLE and found it hard to use for several reasons. The biggest problem was that the file format is undocumented. I also found it clumsy to have to specify section and option arguments. I ended up writing a proxy that specializes on section so that get takes only an option argument. It sounds like ConfigParser code and docs could use a general cleanup. Are there any other issues to take care of as part of that cleanup? Jeremy
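A proxy of the kind described above might look like the sketch below. It is an illustration only: it uses the Python 3 `configparser` module rather than the ConfigParser.py under discussion, and the class name `SectionView` and the sample section and option names are hypothetical.

```python
import configparser


class SectionView:
    """Hypothetical proxy bound to one section of a parser."""

    def __init__(self, parser, section):
        self._parser = parser
        self._section = section

    def get(self, option):
        # The caller supplies only the option; the section is fixed.
        return self._parser.get(self._section, option)


parser = configparser.ConfigParser()
parser.read_string("[EditorWindow]\nwidth = 80\nheight = 24\n")

editor = SectionView(parser, "EditorWindow")
width = editor.get("width")
```

The proxy keeps the two-argument API underneath, so code that needs several sections can still use the parser directly.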
On Fri, 3 Mar 2000, Jeremy Hylton wrote:
Thanks for catching that. I didn't look at the context. I'm going to wait, though, until I talk to Fred to mess with the code any more.
Not a problem. I'm glad that diffs are now posted to -checkins. :-)
General question for python-dev readers: What are your experiences with ConfigParser?
Love it!
I just used it to build a simple config parser for IDLE and found it hard to use for several reasons. The biggest problem was that the file format is undocumented.
In my most complex use of ConfigParser, I had to override SECTCRE to allow periods in the section name. Of course, that was quite interesting since the variable is __SECTCRE in 1.5.2 (i.e. I had to compensate for the munging). I also changed OPTCRE to allow a few more characters ("@" in particular, which even the update doesn't do). Not a problem nowadays since those are public.

My subclass also defines a set() method and a delsection() method. These are used because I write the resulting changes back out to a file. It might be nice to have a method which writes out a config file (with an "AUTOGENERATED BY ConfigParser.py -- DO NOT EDIT BY HAND"; or maybe "... BY <appname> ...").
I also found it clumsy to have to specify section and option arguments.
I found these were critical in my application. I also take advantage of the sections in my "edna" application for logical organization.
I ended up writing a proxy that specializes on section so that get takes only an option argument.
It sounds like ConfigParser code and docs could use a general cleanup. Are there any other issues to take care of as part of that cleanup?
A set() method and a writefile() type of method would be nice. Cheers, -g -- Greg Stein, http://www.lyra.org/
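For illustration, the subclassing approach and the requested set()/write operations can be sketched with the Python 3 `configparser` module, which is an assumption here since the thread concerns the earlier ConfigParser.py. The class name `DottedSectionParser`, the section pattern, and the banner text are hypothetical; `set()`, `remove_section()` and `write()` are the Python 3 counterparts of the methods wished for above.

```python
import configparser
import io
import re


class DottedSectionParser(configparser.ConfigParser):
    # Assumed pattern: section names limited to word characters and periods.
    # The named group must be called "header", which is what the Python 3
    # parser looks up after a match.
    SECTCRE = re.compile(r"\[(?P<header>[\w.]+)\]")


parser = DottedSectionParser()
parser.read_string("[server.main]\nport = 80\n")

# Change a value, then add and remove a section before writing back out.
parser.set("server.main", "port", "8080")
parser.add_section("server.backup")
parser.set("server.backup", "port", "8081")
parser.remove_section("server.backup")

buffer = io.StringIO()
buffer.write("# AUTOGENERATED -- DO NOT EDIT BY HAND\n")
parser.write(buffer)
written = buffer.getvalue()
```

Writing the banner line before calling `write()` is one simple way to mark the file as generated; the parser itself does not emit such a header.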
On Fri, 3 Mar 2000, Jeremy Hylton wrote:
It sounds like ConfigParser code and docs could use a general cleanup. Are there any other issues to take care of as part of that cleanup?
One thing that bothered me once: I want to be able to have something like:

[section]
tag = 1
tag = 2

And be able to retrieve ("section", "tag") -> ["1", "2"]. Can be awfully useful for things that make sense several times. Perhaps there should be two functions, one that reads a single-tag and one that reads a multi-tag?

File format: I'm sure I'm going to get yelled at, but why don't we make it XML? Hard to edit, yadda, yadda, but you can easily write a special purpose widget to edit XConfig (that's what we'll call the DTD) files.

hopefull-yet-not-naive-ly y'rs, Z.
-- Moshe Zadka <mzadka@geocities.com>. http://www.oreilly.com/news/prescod_0300.html
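The multi-valued lookup asked for above could be sketched as below. This is a standalone toy reader, not a ConfigParser feature; the function names `read_multi`, `get_multi` and `get_single` are hypothetical, and only the '=' delimiter is handled.

```python
def read_multi(text):
    """Parse text into {section: {option: [values...]}}, keeping repeats."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue                      # blank line or full-line comment
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            continue
        if current is None or "=" not in line:
            raise ValueError("cannot parse line: %r" % raw)
        name, value = line.split("=", 1)
        current.setdefault(name.strip().lower(), []).append(value.strip())
    return sections


def get_multi(sections, section, option):
    """Return every value given for the option, in file order."""
    return list(sections[section][option])


def get_single(sections, section, option):
    """Return one value; here the last occurrence wins (a design choice)."""
    return sections[section][option][-1]


config = read_multi("[section]\ntag = 1\ntag = 2\n")
```

Having two accessors mirrors the "single-tag" and "multi-tag" split suggested in the message; whether the single form should return the first value, the last, or raise on duplicates is left open here.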
On Sat, 4 Mar 2000, Moshe Zadka wrote:
On Fri, 3 Mar 2000, Jeremy Hylton wrote:
It sounds like ConfigParser code and docs could use a general cleanup. Are there any other issues to take care of as part of that cleanup?
One thing that bothered me once:
I want to be able to have something like:
[section]
tag = 1
tag = 2
And be able to retrieve ("section", "tag") -> ["1", "2"]. Can be awfully useful for things that make sense several times. Perhaps there should be two functions, one that reads a single-tag and one that reads a multi-tag?
Structured values would be nice. Several times, I've needed to decompose the right hand side into lists.
File format: I'm sure I'm going to get yelled at, but why don't we make it XML? Hard to edit, yadda, yadda, but you can easily write a special purpose widget to edit XConfig (that's what we'll call the DTD) files.
Write a whole new module. ConfigParser is for files that look like the above. There isn't a reason to NOT use XML, but it shouldn't go into ConfigParser. <IMO> I find the above style much easier for *humans*, than an XML file, to specify options. XML is good for computers; not so good for humans. </IMO> Cheers, -g -- Greg Stein, http://www.lyra.org/
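To make the comparison concrete, here is the same small configuration in both styles, each loaded into a plain dictionary. This is only a sketch: the XML layout is invented for illustration (it is not the proposed "XConfig" DTD), the helper names are hypothetical, and the Python 3 `configparser` and `xml.etree.ElementTree` modules stand in for whatever a real implementation would use.

```python
import configparser
import xml.etree.ElementTree as ET

INI_TEXT = "[server]\nhost = example.org\nport = 8080\n"

# Hypothetical XML layout carrying the same information.
XML_TEXT = """<config>
  <section name="server">
    <option name="host">example.org</option>
    <option name="port">8080</option>
  </section>
</config>"""


def load_ini(text):
    """Load .ini-style text into {section: {option: value}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}


def load_xml(text):
    """Load the hypothetical XML layout into the same structure."""
    root = ET.fromstring(text)
    return {
        section.get("name"): {
            option.get("name"): option.text
            for option in section.findall("option")
        }
        for section in root.findall("section")
    }
```

Both loaders yield the same dictionary, so the difference between the formats lies in what a person has to type and read, which is the point being argued.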
On Sat, 4 Mar 2000, Greg Stein wrote:
Write a whole new module. ConfigParser is for files that look like the above.
Gotcha. One problem: two configuration modules might cause the classic "which should I use?" confusion.
<IMO> I find the above style much easier for *humans*, than an XML file, to specify options. XML is good for computers; not so good for humans. </IMO>
Of course: what human could delimit his text with <tag> and </tag>? oh-no-another-c.l.py-bot-ly y'rs, Z. -- Moshe Zadka <mzadka@geocities.com>. http://www.oreilly.com/news/prescod_0300.html
On Sat, 4 Mar 2000, Moshe Zadka wrote:
On Sat, 4 Mar 2000, Greg Stein wrote:
Write a whole new module. ConfigParser is for files that look like the above.
Gotcha.
One problem: two configuration modules might cause the classic "which should I use?" confusion.
Nah. They wouldn't *both* be called ConfigParser. And besides, I see the XML format more as a persistence mechanism rather than a configuration mechanism. I'd call the module something like "XMLPersist".
<IMO> I find the above style much easier for *humans*, than an XML file, to specify options. XML is good for computers; not so good for humans. </IMO>
Of course: what human could delimit his text with <tag> and </tag>?
Feh. As a communication mechanism, dropping in that stuff... it's easy.

<appository>But</appository><comma/><noun>I</noun> <verb><tense>would<modifier>not</modifier></tense>want</verb> ... bleck.

I wouldn't want to use XML for configuration stuff. It just gets ugly.

Cheers,
-g
-- Greg Stein, http://www.lyra.org/
"MZ" == Moshe Zadka <moshez@math.huji.ac.il> writes:
MZ> On Sat, 4 Mar 2000, Greg Stein wrote:
Write a whole new module. ConfigParser is for files that look like the above.
MZ> Gotcha.
MZ> One problem: two configuration modules might cause the classic
MZ> "which should I use?" confusion.

I don't think this is a hard decision to make. ConfigParser is good for simple config files that are going to be maintained by humans with a text editor. An XML-based configuration file is probably the right solution when humans aren't going to maintain the config files by hand. Perhaps XML will eventually be the right solution in both cases, but only if XML editors are widely available.
<IMO> I find the above style much easier for *humans*, than an XML file, to specify options. XML is good for computers; not so good for humans. </IMO>
MZ> Of course: what human could delimit his text with <tag> and
MZ> </tag>?

Could? I'm sure there are more ways on Linux and Windows to mark up text than are dreamt of in your philosophy, Moshe <wink>. The question is what is easiest to read and understand?

Jeremy
Jeremy Hylton writes:
Thanks for catching that. I didn't look at the context. I'm going to wait, though, until I talk to Fred to mess with the code any more.
I did it that way since the .ini format allows comments after values (the ';' comments after a '=' vi; '#' comments are a ConfigParser thing), but there's no equivalent concept for RFC822 parsing, other than '(...)' in addresses. The code was trying to allow what was expected from the .ini crowd without breaking the "native" use of ConfigParser.
General question for python-dev readers: What are your experiences with ConfigParser? I just used it to build a simple config parser for IDLE and found it hard to use for several reasons. The biggest problem was that the file format is undocumented. I also found it clumsy to have to specify section and option arguments. I ended up writing a proxy that specializes on section so that get takes only an option argument.
It sounds like ConfigParser code and docs could use a general cleanup. Are there any other issues to take care of as part of that cleanup?
I agree that the API to ConfigParser sucks, and I think also that the use of it as a general solution is a big mistake. It's a messy bit of code that doesn't need to be, supports a really nasty mix of syntaxes, and can easily bite users who think they're getting something .ini-like (the magic names and interpolation is a bad idea!). While it suited the original application well enough, something with .ini syntax and interpolation from a subclass would have been *much* better. I think we should create a new module, inilib, that implements exactly .ini syntax in a base class that can be intelligently extended. ConfigParser should be deprecated. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> Corporation for National Research Initiatives
[Fred]
I agree that the API to ConfigParser sucks, and I think also that the use of it as a general solution is a big mistake. It's a messy bit of code that doesn't need to be, supports a really nasty mix of syntaxes, and can easily bite users who think they're getting something .ini-like (the magic names and interpolation is a bad idea!). While it suited the original application well enough, something with .ini syntax and interpolation from a subclass would have been *much* better. I think we should create a new module, inilib, that implements exactly .ini syntax in a base class that can be intelligently extended. ConfigParser should be deprecated.
Amen. Some thoughts:

- You could put it all in ConfigParser.py but with new classnames. (Not sure though, since the ConfigParser class, which is really a kind of weird variant, will be assumed to be the main class because its name is that of the module.)

- Variants on the syntax could be given through some kind of option system rather than through subclassing -- they should be combinable independently. Some possible options (maybe I'm going overboard here) could be:

  - comment characters: ('#', ';', both, others?)
  - comments after variables allowed? on sections?
  - variable characters: (':', '=', both, others?)
  - quoting of values with "..." allowed?
  - backslashes in "..." allowed?
  - does backslash-newline mean a continuation?
  - case sensitivity for section names (default on)
  - case sensitivity for option names (default off)
  - variables allowed before first section name?
  - first section name? (default "main")
  - character set allowed in section names
  - character set allowed in variable names
  - %(...) substitution?

(Well maybe the whole substitution thing should really be done through a subclass -- it's too weird for normal use.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
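As an illustration of independently combinable options, the sketch below uses the Python 3 `configparser` module, which postdates this thread and is assumed here purely for demonstration. Its constructor accepts keyword arguments covering several of the items listed above: `delimiters`, `comment_prefixes`, `inline_comment_prefixes`, and `interpolation`. Case sensitivity of option names is controlled by replacing `optionxform`. The sample section and option names are made up.

```python
import configparser

parser = configparser.ConfigParser(
    delimiters=("=",),               # variable characters: '=' only
    comment_prefixes=("#",),         # full-line comment character
    inline_comment_prefixes=(";",),  # comments allowed after values
    interpolation=None,              # no %(...) substitution
)
parser.optionxform = str             # keep option names case-sensitive

parser.read_string(
    "[main]\n"
    "# full-line comment\n"
    "Name = value ; trailing note\n"
    "Pattern = %(raw)s\n"
)

name = parser.get("main", "Name")
pattern = parser.get("main", "Pattern")
```

Each setting can be changed without touching the others, which is the property asked for in the message; the cost, raised later in the thread, is that every application may end up with its own dialect.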
Guido van Rossum writes:
- You could put it all in ConfigParser.py but with new classnames. (Not sure though, since the ConfigParser class, which is really a kind of weird variant, will be assumed to be the main class because its name is that of the module.)
The ConfigParser class could be clearly marked as deprecated both in the source/docstring and in the documentation. But the class itself should not be used in any way.
- Variants on the syntax could be given through some kind of option system rather than through subclassing -- they should be combinable independently. Some possible options (maybe I'm going overboard here) could be:
Yes, you are going overboard. It should contain exactly what's right for .ini files, and that's it.

There are really three aspects to the beast: reading, using, and writing. I think there should be a class which does the right thing for using the information in the file, and reading & writing can be handled through functions or helper classes. That separates the parsing issues from the use issues, and alternate syntaxes will be easy enough to implement by subclassing the helper or writing a new function. An "editable" version that allows loading & saving without throwing away comments, ordering, etc. would require a largely separate implementation of all three aspects (or at least the reader and writer).
(Well maybe the whole substitution thing should really be done through a subclass -- it's too weird for normal use.)
That and the ad hoc syntax are my biggest beefs with ConfigParser. But it can easily be added by a subclass as long as the method to override is clearly specified in the documentation (it should only require one!).

-Fred
-- Fred L. Drake, Jr. <fdrake at acm.org> Corporation for National Research Initiatives
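The separation described in this message (a class for using the data, with reading and writing handled by functions, and substitution added by overriding a single method) might be sketched like this. Every name here (`IniData`, `InterpolatingIniData`, `transform`, `read_ini`, `write_ini`) is hypothetical, and the reader handles only a minimal subset of .ini syntax.

```python
import io


class IniData:
    """Holds the values for use; knows nothing about file syntax."""

    def __init__(self):
        self.sections = {}

    def set(self, section, option, value):
        self.sections.setdefault(section, {})[option] = value

    def get(self, section, option):
        return self.transform(section, self.sections[section][option])

    def transform(self, section, value):
        # The single hook a subclass overrides; the base returns raw text.
        return value


class InterpolatingIniData(IniData):
    def transform(self, section, value):
        # %(name)s substitution from options in the same section.
        return value % self.sections[section]


def read_ini(stream, data):
    """Reader: fill `data` from a file-like object."""
    section = None
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            data.sections.setdefault(section, {})
        elif "=" in line and section is not None:
            name, value = line.split("=", 1)
            data.set(section, name.strip(), value.strip())
        else:
            raise ValueError("cannot parse line: %r" % raw)
    return data


def write_ini(data, stream):
    """Writer: emit raw (untransformed) values."""
    for section, options in data.sections.items():
        stream.write("[%s]\n" % section)
        for name, value in options.items():
            stream.write("%s = %s\n" % (name, value))
        stream.write("\n")


TEXT = "[paths]\nroot = /srv\nlogs = %(root)s/logs\n"
plain = read_ini(io.StringIO(TEXT), IniData())
interpolated = read_ini(io.StringIO(TEXT), InterpolatingIniData())
```

Because the writer emits raw values, a file read into either class writes back out with its `%(root)s` reference intact. This sketch discards comments and ordering details, so it is not the "editable" variant mentioned above.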
On 05 March 2000, Guido van Rossum said:
- Variants on the syntax could be given through some kind of option system rather than through subclassing -- they should be combinable independently. Some possible options (maybe I'm going overboard here) could be:
- comment characters: ('#', ';', both, others?)
- comments after variables allowed? on sections?
- variable characters: (':', '=', both, others?)
- quoting of values with "..." allowed?
- backslashes in "..." allowed?
- does backslash-newline mean a continuation?
- case sensitivity for section names (default on)
- case sensitivity for option names (default off)
- variables allowed before first section name?
- first section name? (default "main")
- character set allowed in section names
- character set allowed in variable names
- %(...) substitution?
I agree with Fred that this level of flexibility is probably overkill for a config file parser; you don't want every application author who uses the module to have to explain his particular variant of the syntax.

However, if you're interested in a class that *does* provide some of the above flexibility, I have written such a beast. It's currently used to parse the Distutils MANIFEST.in file, and I've considered using it for the mythical Distutils config files. (And it also gets heavy use in my day job.) It's really a class for reading a file in preparation for "text processing the Unix way", though: it doesn't say anything about syntax, it just worries about blank lines, comments, continuations, and a few other things. Here's the class docstring:

class TextFile:

    """Provides a file-like object that takes care of all the things you
       commonly want to do when processing a text file that has some
       line-by-line syntax: strip comments (as long as "#" is your comment
       character), skip blank lines, join adjacent lines by escaping the
       newline (ie. backslash at end of line), strip leading and/or
       trailing whitespace, and collapse internal whitespace. All of these
       are optional and independently controllable.

       Provides a 'warn()' method so you can generate warning messages that
       report physical line number, even if the logical line in question
       spans multiple physical lines. Also provides 'unreadline()' for
       implementing line-at-a-time lookahead.

       Constructor is called as:

           TextFile (filename=None, file=None, **options)

       It bombs (RuntimeError) if both 'filename' and 'file' are None;
       'filename' should be a string, and 'file' a file object (or
       something that provides 'readline()' and 'close()' methods). It is
       recommended that you supply at least 'filename', so that TextFile
       can include it in warning messages. If 'file' is not supplied,
       TextFile creates its own using the 'open()' builtin.

       The options are all boolean, and affect the value returned by
       'readline()':
         strip_comments [default: true]
           strip from "#" to end-of-line, as well as any whitespace
           leading up to the "#" -- unless it is escaped by a backslash
         lstrip_ws [default: false]
           strip leading whitespace from each line before returning it
         rstrip_ws [default: true]
           strip trailing whitespace (including line terminator!) from
           each line before returning it
         skip_blanks [default: true]
           skip lines that are empty *after* stripping comments and
           whitespace. (If both lstrip_ws and rstrip_ws are false,
           then some lines may consist of solely whitespace: these will
           *not* be skipped, even if 'skip_blanks' is true.)
         join_lines [default: false]
           if a backslash is the last non-newline character on a line
           after stripping comments and whitespace, join the following line
           to it to form one "logical line"; if N consecutive lines end
           with a backslash, then N+1 physical lines will be joined to
           form one logical line.
         collapse_ws [default: false]
           after stripping comments and whitespace and joining physical
           lines into logical lines, all internal whitespace (strings of
           whitespace surrounded by non-whitespace characters, and not at
           the beginning or end of the logical line) will be collapsed
           to a single space.

       Note that since 'rstrip_ws' can strip the trailing newline, the
       semantics of 'readline()' must differ from those of the builtin file
       object's 'readline()' method! In particular, 'readline()' returns
       None for end-of-file: an empty string might just be a blank line (or
       an all-whitespace line), if 'rstrip_ws' is true but 'skip_blanks' is
       not."""

Interested in having something like this in the core? Adding more options is possible, but the code is already on the hairy side to support all of these. And I'm not a big fan of the subtle difference in semantics with file objects, but honestly couldn't think of a better way at the time.
If you're interested, you can download it from http://www.mems-exchange.org/exchange/software/python/text_file/ or just use the version in the Distutils CVS tree. Greg
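The behaviour the docstring describes can be approximated with a small generator. The sketch below is not the TextFile implementation: `logical_lines` is a hypothetical function that covers only comment stripping, trailing-whitespace stripping, blank-line skipping and backslash continuation, always drops the line terminator, and does not handle an escaped "#".

```python
def logical_lines(lines, strip_comments=True, rstrip_ws=True,
                  skip_blanks=True, join_lines=False):
    """Yield (first_physical_line_number, text) for each logical line."""
    pending = None   # text accumulated from lines that ended in a backslash
    start = 0        # physical line number where the logical line began
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if strip_comments:
            pos = line.find("#")
            if pos != -1:
                # Drop the comment and the whitespace leading up to it.
                line = line[:pos].rstrip()
        if rstrip_ws:
            line = line.rstrip()
        if pending is not None:
            line = pending + line.lstrip()
        else:
            start = number
        if join_lines and line.endswith("\\"):
            pending = line[:-1]      # keep joining with the next line
            continue
        pending = None
        if skip_blanks and not line:
            continue
        yield start, line
    if pending is not None:
        yield start, pending         # file ended on a continuation


SAMPLE = [
    "# header\n",
    "\n",
    "name = a \\\n",
    "   b  # trailing\n",
    "other = 1\n",
]
result = list(logical_lines(SAMPLE, join_lines=True))
```

Yielding the starting physical line number alongside each logical line serves the same purpose as the 'warn()' method described in the docstring: a caller can report where a multi-line entry began.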
participants (6)
- Fred L. Drake, Jr.
- Greg Stein
- Greg Ward
- Guido van Rossum
- Jeremy Hylton
- Moshe Zadka