DTD Parsing

Asun Friere afriere at yahoo.co.uk
Wed Nov 10 04:36:10 CET 2010

On Nov 10, 2:02 pm, Christian Heimes <li... at cheimes.de> wrote:
> Am 10.11.2010 03:44, schrieb Felipe Bastos Nunes:
> > I'd like to know too. I work with java and jdom, but I'm doing
> > personal things in python, and plan to go full python in the next 2
> > years. Xml is my first option for configuration files and simple
> > storages.
> Don't repeat the mistakes of others and use XML as a configuration
> language. XML isn't meant to be edited by humans.

Yes but configuration files are not necessarily meant to be edited by
humans either!

Having said that, I'm actually old school and prefer "setting=value"
human-editable config files, which are easily read into a dict with
code something like this:

import re
import sys

def read_config(file_obj):
    """Reads a config file and returns values as a dictionary

    Config file is a series of lines in the format:
        name = value #comment
    Neither name nor value may contain '#', '=', ':' nor any spaces.
    """
    config = {}
    # name, then '=' or ':', then value, then an optional trailing comment
    nameval = re.compile(
        r'^\s*([^#=:\s]+)\s*(?:=|:)\s*([^#=:\s]*)\s*(?:#|$)').search
    # blank lines and lines that hold only a comment
    comment = re.compile(r'^\s*($|#)').search
    for line in file_obj:
        if comment(line):
            continue
        try:
            name, value = nameval(line).groups()
        except AttributeError:
            # nameval() returned None, so the line is not a valid entry
            sys.stderr.write('WARNING: suspect entry: %s\n' % line.rstrip())
        else:
            config[name] = value
    return config

Thanks Christian, I might check out 'configobj', but my needs are
rarely more complicated than the above will satisfy.

In any case, Felipe, whether you intend to use XML for config or not
(or for any other reason), there are good tools for XML parsing in
Python, including ones with DTD validation.  Try the modules 'libxml2',
'lxml', or even, if your needs are modest, the poorly named

What I'm looking for instead is something to parse a DTD, such as
xmlproc's DTDConsumer.  It might even exist in the modules I've
mentioned, but I can't find it.  In the event, I think I'll use a
DTD->xsd conversion script and then simply use HTMLParser.  Unless
someone can point me in the way of a simple DTD parser, that is.
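
For what it's worth, one possible stopgap for simple DTDs is the
standard library's expat binding, which reports element and attribute
declarations through its ElementDeclHandler and AttlistDeclHandler
callbacks.  The sketch below is only that, a sketch: it assumes the DTD
can be pasted into the internal subset of a dummy document, which will
not work for DTDs that use parameter entity references inside
declarations or conditional sections.  The function name 'parse_dtd'
and the 'root' argument are made up for illustration.

```python
import xml.parsers.expat


def parse_dtd(dtd_text, root="root"):
    """Collect element and attribute declarations from a simple DTD.

    Hypothetical helper: wraps the DTD text in the internal subset of a
    dummy document so that expat will read the declarations.
    Returns (elements, attributes):
        elements[name]            -> expat's content model tuple
        attributes[elname][attr]  -> (type, default, required)
    """
    elements = {}
    attributes = {}

    def on_element(name, model):
        # Called once per <!ELEMENT ...> declaration.
        elements[name] = model

    def on_attlist(elname, attname, type_, default, required):
        # Called once per attribute declared in an <!ATTLIST ...>.
        attributes.setdefault(elname, {})[attname] = (
            type_, default, bool(required))

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element
    parser.AttlistDeclHandler = on_attlist

    # expat does not validate, so the dummy root element need not be
    # declared in the DTD.
    document = "<!DOCTYPE %s [\n%s\n]>\n<%s/>" % (root, dtd_text, root)
    parser.Parse(document, True)
    return elements, attributes


if __name__ == "__main__":
    sample = """
    <!ELEMENT config (setting*)>
    <!ELEMENT setting (#PCDATA)>
    <!ATTLIST setting name CDATA #REQUIRED>
    """
    elems, attrs = parse_dtd(sample)
    for name in sorted(elems):
        print(name, attrs.get(name, {}))
```

The content model tuples come straight from expat; the constants in
xml.parsers.expat.model can be used to interpret them.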
