xml encoding in minidom

Joakim Storck joakim.storck at home.se
Tue Apr 9 13:41:26 EDT 2002


I've been playing around with xml.dom.minidom, but I'm having some
problems with the document encoding. What I'm trying to do is
basically to take data from a web form using the cgi module and
insert it into an XML document using minidom. However, the
data contains Swedish characters (å, ä, ö), which mess things up
more than I could have imagined.

Since minidom has no way of setting the document encoding directly, I
tried the following:

import xml.dom.minidom

doc = xml.dom.minidom.Document()
pi = doc.createProcessingInstruction(
    'xml', 'version="1.0" encoding="ISO-8859-1" ')
doc.appendChild(pi)

But then I get two XML declarations in the output:

<?xml version="1.0" ?>
<?xml version="1.0" encoding="ISO-8859-1" ?>

There is also some kind of problem with the minidom toxml() method.
From what I've read in other postings, it has to do with file encoding,
which seems to be set to UTF-8 by default.

Preferably, I'd like to take care of both problems by writing
something like this:

doc = xml.dom.minidom.Document()
doc.setEncoding('ISO-8859-1')
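
To make the desired result concrete, here is a minimal sketch of the
output being asked for. It assumes a Python version in which toxml()
accepts an encoding argument, which is an assumption about the installed
version and not something confirmed in this post. The element name
'form' is hypothetical and used only for illustration.

```python
import xml.dom.minidom

doc = xml.dom.minidom.Document()

# 'form' is a hypothetical element name, chosen only for this example.
root = doc.createElement('form')
root.appendChild(doc.createTextNode('å, ä, ö'))
doc.appendChild(root)

# Assumption: this toxml() accepts an encoding argument. When it does,
# the result is a bytes object with a single XML declaration that
# names the requested encoding.
data = doc.toxml(encoding='ISO-8859-1')
print(data)
```

With this approach no extra processing instruction is appended, so only
one declaration is written.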

I'm no expert in either XML or Python, but it seems I'm not the
only one who has experienced these problems.

Thanks for any suggestions!

Joakim


