Python to XML to Python conversion
Jeremy Bowers
newfroups at jerf.org
Thu Jul 11 23:01:51 EDT 2002
thehaas at binary.net wrote:
> I'd do the Python -> XML like this:
>
> outfile = file("out.xml")
>
> outfile.write("<pydict>")
> for key in dict.keys():
>     outfile.write("<%s>%s</%s>\n" %(key, dict[key], key) )
>
> outfile.write("</pydict>")
> outfile.close()
>
> How's that?? Well-formed XML, without any DOM-overhead.
This is common and incorrect; the XML is not going to be well formed for
any number of reasons. The keys of the dict are not required to be valid
XML tag names (consider a key "1 2", wrong for starting with a number
AND having a space in it). The keys of the dict may not be strings. The
values of the dict may not be strings either. The values of the dict may
contain any of several XML chars which must be encoded, such as &.
Goodness help your XML parser if the text happens to include XML or XML
fragments.
For each key in the dict, the odds become increasingly stacked against you.
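To make the failure modes above concrete, here is a small sketch that feeds a few dicts through the same naive approach and asks a parser whether the result is well formed. The helper names (naive_dump, is_well_formed) are mine, made up for illustration, and I am using the standard library's xml.etree.ElementTree as the parser only because it is handy.

```python
import xml.etree.ElementTree as ET


def naive_dump(d):
    # Same idea as the quoted code: keys become tag names and
    # values are written as raw, unescaped text.
    parts = ["<pydict>"]
    for key in d:
        parts.append("<%s>%s</%s>\n" % (key, d[key], key))
    parts.append("</pydict>")
    return "".join(parts)


def is_well_formed(text):
    # True if the parser accepts the text, False if it rejects it.
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


friendly = is_well_formed(naive_dump({"name": "spam"}))
bad_key = is_well_formed(naive_dump({"1 2": "spam"}))          # invalid tag name
bad_amp = is_well_formed(naive_dump({"name": "spam & eggs"}))  # unescaped &
bad_markup = is_well_formed(naive_dump({"name": "</pydict>"})) # value contains markup

print(friendly, bad_key, bad_amp, bad_markup)
```

Only the first dict survives; the other three are rejected by the parser.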
If you __know__ you have string keys and string vals, you can do
something like
from xml.sax.saxutils import quoteattr
...
# quoteattr escapes &, < and > and adds the surrounding quotes
outfile.write('<item name=%s value=%s/>\n' % (quoteattr(key),
                                              quoteattr(dict[key])))
...
(untested)
but it is still better to go with the XML marshaler or standard Pickle
module if at all possible.
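For what it's worth, the quoteattr idea can be fleshed out into a full round trip. This is only a sketch under the same assumption as before (string keys and string values); the function names dict_to_xml and xml_to_dict and the <pydict>/<item> layout are my own choices for illustration, and reading the data back with xml.etree.ElementTree is just one option among several.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def dict_to_xml(d):
    # Assumes every key and value is already a string.
    out = io.StringIO()
    out.write("<pydict>\n")
    for key, value in d.items():
        # Keys and values go into attributes, so they never have to be
        # valid tag names; quoteattr handles the escaping and quoting.
        out.write("<item name=%s value=%s/>\n"
                  % (quoteattr(key), quoteattr(value)))
    out.write("</pydict>\n")
    return out.getvalue()


def xml_to_dict(text):
    # Rebuild the dict from the name/value attributes of each <item>.
    root = ET.fromstring(text)
    return {item.get("name"): item.get("value")
            for item in root.findall("item")}


data = {"1 2": "spam & eggs", "tag": "<b>bold</b>", "quote": 'say "hi"'}
restored = xml_to_dict(dict_to_xml(data))
print(restored == data)
```

Note that this still does nothing for non-string keys or values, and characters that XML 1.0 forbids outright (a NUL byte, for example) will break it no matter how they are quoted.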
Also, part of being a good programmer is learning how to elicit good
requirements. Do you understand why you need XML? XML is a good transfer
language between programs and language boundaries. If you just need to
save some data for the same program to retrieve later, you actively
*don't* want XML. Use pickle. (Or 'shelve', which I like for quick
projects.) If you *are* going to transfer this data to another program,
then what do those other programs take naturally? If they have a native
format and you can match it, you can save yourself that much trouble.
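For the "same program saves and retrieves later" case, this is roughly all pickle asks of you. A minimal sketch: the file name data.pickle and the temporary directory are placeholders I picked for the example, not anything from the original question.

```python
import os
import pickle
import tempfile

# Non-string keys and nested values are fine here, unlike the XML case.
data = {"1 2": "spam & eggs", 42: [1, 2, 3], ("a", "b"): None}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.pickle")  # placeholder file name

    # Save the dict. Pickle files must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(data, f)

    # Load it back, as the same program would on a later run.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == data)
```

No escaping, no tag names, no parser to satisfy, and the keys and values come back with their original types.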
Understand the motivation. If XML is being used as a bullet point, you
may consider politely suggesting better, cheaper, faster,
faster-to-*develop* alternatives (cPickle). Failing that and if you
never intend to transfer the data anywhere, then use the XML marshaler
for the buzzword compliance and ease-of-use pickling.
(Thought: XML should never be your *first* choice of file format. It is
the choice of *last* resort, when you absolutely *need* easy parsing in
multiple languages or environments and can't get it any other way. It is
then a much better choice than other formats, but only under those
limited, albeit extremely popular, conditions.)