[Tutor] parsing configuration files
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Mon Dec 22 19:20:12 EST 2003
On Mon, 22 Dec 2003, Daniel Ehrenberg wrote:
> I'm trying to write a program to make it easier to
> automate the making of GTK+ GUIs by making
> configuration files that use a simplified YAML-like
> syntax (see yaml.org; it looks kinda like Python).
Sounds good. Let's take a look at the code:
> def parsefile(filename):
>     stuff2parse = open(filename)
>     data = {}
>     for line in stuff2parse.xreadlines():
>         if line:
>             if not line.startswith(' '):
>                 currentSection = line[:-1]
>                 data[currentSection] = []
>             else:
>                 try:
>                     data[currentSection] += line.strip()
>                 except: pass
>
>     return data
Side note: it might be nicer to have parsefile() take in a file-like
object, rather than a filename.
###
def parsefile(stuff2parse):
    data = {}
    for line in stuff2parse.xreadlines():
        ...
###
The reason this adjustment might help is because it becomes easier to test
out the code, since we can make a string look like a file with the
StringIO module:
http://www.python.org/doc/lib/module-StringIO.html
and be able to quickly test it with:
###
from StringIO import StringIO
sample_conffile = '''
gRadio:
    _Gedit
    gedit
    simple text editor
sRadio:
    _Synaptic
    synaptic
    GUI for apt-get
'''
print parsefile(StringIO(sample_conffile))
###
and it might be a good idea to make this a formal unit test. But I think
I'm getting off the subject. *grin*
Looking at the main loop:
###
for line in stuff2parse:
    if line:
        if not line.startswith(' '):
            currentSection = line[:-1]
            data[currentSection] = []
        else:
            try:
                data[currentSection] += line.strip()
            except: pass
###
There's a subtle type error going on, and it has to do with 'data'. In
the first case,

    data[currentSection] = []

shows that data[currentSection] must be a list, so we need to deal with it
with list methods. The second block:

    data[currentSection] += line.strip()

tries to deal with it as if it were a string! What ends up happening is
something akin to:
###
>>> l = []
>>> l += "hello world"
>>> l
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
###
By the way, this is bizarre! *grin* I had really expected a TypeError at
this point, like:
###
>>> [] + "foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can only concatenate list (not "str") to list
###
but the difference is due to the behavior of the '+=' operator on lists:
'+=' on a list is equivalent to list.extend(), which walks over its
argument one element at a time:
###
>>> l = []
>>> l.extend("hello")
>>> l
['h', 'e', 'l', 'l', 'o']
###
This explains why we got such strange string-to-character-transforming
behavior from '+='. But this is surprising; I'll have to keep my eye out
for this next time it happens. *grin*
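To put the three operations side by side, here's a small sketch. It's written for a modern Python 3 interpreter (that's my assumption, not something from the original session), and the variable names are made up purely for illustration:

```python
# Illustrative only: three ways of adding a string to an empty list.
via_iadd = []
via_iadd += "abc"          # '+=' iterates over the string, just like extend()

via_extend = []
via_extend.extend("abc")   # adds each character as a separate element

via_append = []
via_append.append("abc")   # adds the whole string as one element

print(via_iadd)     # ['a', 'b', 'c']
print(via_extend)   # ['a', 'b', 'c']
print(via_append)   # ['abc']
```

Only append() keeps the string in one piece, which is what a list of config lines wants.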
Anyway, you meant to use the append() method of lists, so instead of:

    data[currentSection] += line.strip()

use

    data[currentSection].append(line.strip())

and that should fix the problem.
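Putting the pieces together, here's a rough sketch of how the whole function might look with the append() fix and a file-like argument. A few things here are my own assumptions rather than part of the original code: it targets Python 3 (so StringIO comes from the io module), it skips blank lines, it uses strip() in place of line[:-1], and it swaps the try/except for an explicit check on a current_section variable that starts out as None.

```python
from io import StringIO  # assumption: Python 3, where StringIO lives in io


def parsefile(stuff2parse):
    """Parse a file-like object into {section name: [item, ...]}."""
    data = {}
    current_section = None            # no section seen yet
    for line in stuff2parse:
        if not line.strip():
            continue                  # skip blank lines (my addition)
        if not line.startswith(' '):
            # An unindented line starts a new section.
            current_section = line.strip()
            data[current_section] = []
        elif current_section is not None:
            # An indented line is one item of the current section.
            data[current_section].append(line.strip())
    return data


sample_conffile = '''
gRadio:
    _Gedit
    gedit
    simple text editor
sRadio:
    _Synaptic
    synaptic
    GUI for apt-get
'''

result = parsefile(StringIO(sample_conffile))
print(result)
```

With the sample above, each section name (colon included, as in the original) should map to a list of three whole strings rather than a pile of single characters.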
One other style note: when you're iterating over a file, you can just say:

    for line in some_file:

Files are iterable in Python, so there's no more need to say:

    for line in some_file.xreadlines()

as long as you're using a relatively recent version of Python.
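As a quick illustration, here's a tiny sketch of direct iteration. It assumes Python 3 and uses io.StringIO as a stand-in for a real file; the names some_file and seen are just placeholders:

```python
from io import StringIO

# A StringIO object stands in for a real file here (assumption for the demo).
some_file = StringIO("first line\nsecond line\n")

seen = []
for line in some_file:              # no xreadlines() call needed
    seen.append(line.rstrip("\n"))  # each line still carries its newline

print(seen)  # ['first line', 'second line']
```

A file object returned by open() can be looped over in the same way.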
I hope this helps!