[Tutor] xml question

Steven D'Aprano steve at pearwood.info
Tue Jul 27 02:29:53 CEST 2010


On Tue, 27 Jul 2010 05:09:09 am Albert-Jan Roskam wrote:
> Hi,
>
> I am making a data processing program that will use a configuration
> file. The file should contain information about: (1) source files
> used, (2) (intermediate) output files, (3) used parameters/estimation
> methods (4) manual data edits + datetime stamp + user name . I'd like
> to store this config file in xml.

Why XML?

Even though XML is plain text, it is *not* a human-writable format,
except perhaps for the simplest data. XML is one of those things which
has become "the in-thing" and is used in all sorts of inappropriate
places just because "everybody else uses XML". Even some *supporters*
of XML describe themselves as having "drunk the XML Kool-Aid".

(For those who are unaware, "drinking the Kool-Aid" refers to the
Jonestown mass murder-suicide of 1978, when more than nine hundred
followers of the Reverend Jim Jones died, most of them after drinking a
poison-laced soft drink popularly remembered as Kool-Aid.)

XML is extremely verbose and inefficient. It has its uses, but the best 
advice I can give is, don't use XML unless you need to communicate with 
something that expects XML, or unless your data is so complex that you 
need XML.

Some alternatives:

If you're using Python 2.6 or later, you might consider the plistlib
module, which is a thin wrapper around XML property-list files.
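
As a rough sketch of what that looks like (the keys and file names are
invented for illustration, and the dumps/loads spelling assumes Python
3.4 or later; 2.6 uses writePlistToString and readPlistFromString
instead):

```python
import plistlib

# Hypothetical settings; none of these names come from the original post.
config = {
    "sources": ["raw.csv"],
    "outputs": ["cleaned.csv"],
    "parameters": {"method": "ols", "tolerance": 0.001},
}

# Serialise to an XML property list (XML is the default format)...
xml_bytes = plistlib.dumps(config)

# ...and read it back into ordinary Python objects.
restored = plistlib.loads(xml_bytes)
```

You get XML on disk without ever building the elements by hand.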

JSON is often considered a more friendly format. Some people prefer YAML 
over JSON, although YAML isn't in the standard library.
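
Here is a minimal sketch of the JSON route, assuming the four kinds of
information in your question map onto a dict. Every key, file name and
value below is made up for illustration:

```python
import json

# Hypothetical config covering the four items from the question.
config = {
    "sources": ["raw_2009.csv", "raw_2010.csv"],
    "outputs": ["cleaned.csv"],
    "parameters": {"method": "ols", "tolerance": 0.001},
    "edits": [
        {
            "description": "set income of record 17 to 42000",
            "timestamp": "2010-07-26T14:05:00",
            "user": "some_user",
        },
    ],
}

# indent and sort_keys make the file easier to read and to diff.
text = json.dumps(config, indent=4, sort_keys=True)

# Reading it back gives plain dicts, lists, strings and numbers.
restored = json.loads(text)
```

Note that JSON has no datetime type, so the timestamp is stored as a
string and you have to parse it yourself.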

If your data is a flat set of option: value (or option = value) pairs
grouped into sections, then the ConfigParser module, which reads
Windows-style ini files, may be all you need.
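
A small sketch of the ini approach, assuming Python 3, where the module
is spelled configparser (in 2.x it is ConfigParser). The section and
option names are hypothetical:

```python
import configparser

# Hypothetical ini text; in practice this would live in a file and be
# loaded with parser.read("settings.ini").
INI_TEXT = """
[sources]
survey = raw_survey.csv
register = raw_register.csv

[parameters]
method = ols
tolerance = 0.001
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Values are stored as strings; the get* helpers convert on request.
method = parser.get("parameters", "method")
tolerance = parser.getfloat("parameters", "tolerance")
source_names = parser.options("sources")
```

This handles items (1) to (3) nicely, but ini files are flat, so a
growing list of manual edits with time stamps and user names is an
awkward fit.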

> However, I've never created
> something like this before. Is this a suitable format, and, if so,
> what would the elementtree look like? 

You tell us, it's your data :)


> Should I just use 'config'  or 
> something similar as root, and the information elements 1 through 3
> as child elements? And should the manual edits be stored as an
> element 'edit' with various attributes (the edit itself, the time
> stamp, etc.)?

How would you store the data in a Python class? Design your class first.
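
For instance, one possible starting point. This is purely a sketch: the
class and attribute names are my guesses at what your data looks like,
not something your program already has.

```python
from datetime import datetime


class Edit(object):
    """One manual data edit: what changed, when, and who did it."""

    def __init__(self, description, timestamp, user):
        self.description = description
        self.timestamp = timestamp
        self.user = user


class Config(object):
    """Holds the four kinds of information from the question."""

    def __init__(self, sources, outputs, parameters):
        self.sources = list(sources)        # (1) source files
        self.outputs = list(outputs)        # (2) intermediate output files
        self.parameters = dict(parameters)  # (3) parameters and methods
        self.edits = []                     # (4) manual edits

    def add_edit(self, description, user, timestamp=None):
        """Record an edit, stamped with the current time by default."""
        if timestamp is None:
            timestamp = datetime.now()
        edit = Edit(description, timestamp, user)
        self.edits.append(edit)
        return edit


# Hypothetical usage with invented file names.
config = Config(["raw.csv"], ["cleaned.csv"], {"method": "ols"})
config.add_edit("set income of record 17 to 42000", user="some_user")
```

Once the class looks right, the file layout mostly follows from it:
each attribute becomes an element, a key or an option, whichever format
you end up choosing.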



-- 
Steven D'Aprano

