[Tutor] Python - XML: How to write UNICODE to a file ?? (when using
LATIN-1 Chars)
Javier JJ
python.tutorial at jarava.org
Tue Aug 26 01:18:38 EDT 2003
Hi all!!
I'm getting my feet wet with Python + XML processing and I've run into
something that's stopping me:
I have an XML file with no encoding declared; it contains characters
that are valid in the Latin-1 charset (it's a log file generated by
MSN Messenger 6.0).
I am processing it with minidom as follows:
doc = minidom.parse("log_file.xml")
rootNode = doc.childNodes[1]
Now, I can do all sorts of manipulation on the nodes without any problem.
But afterwards I want to write the result back to disk as XML, so I do:
>>> out = open("salida.txt", "wb")
>>> listado = rootNode.toxml()
Now "listado" holds an XML-looking unicode string; it looks quite fine to
me... but when I try to write it to disk, I get:
>>> out.write(listado)
Traceback (most recent call last):
File "<pyshell#34>", line 1, in -toplevel-
out.write(listado)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in
position 2274: ordinal not in range(128)
The "offending" character is:
>>> listado[2274]
u'\xed'
>>> print listado[2274]
í
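For reference, here is a minimal sketch of the same failure in isolation, plus one possible way around it: encoding the unicode string explicitly before writing, so that no implicit ASCII conversion happens. The SAMPLE string is a made-up stand-in for the Messenger log (the real file is not reproduced here), UTF-8 is only an assumed output encoding, and the temporary output path is hypothetical.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical stand-in for the Messenger log; not the real file.
SAMPLE = u'<Log><Message><Text>bien aqu\xed</Text></Message></Log>'

doc = minidom.parseString(SAMPLE.encode("utf-8"))
root = doc.documentElement

# toxml() with no arguments returns a text (unicode) string.
listado = root.toxml()

# Converting that string to ASCII fails on the accented character,
# which is what an implicit conversion during write() runs into.
try:
    listado.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly (UTF-8 is an assumption here) yields plain bytes
# that a file opened in "wb" mode can accept.
data = listado.encode("utf-8")
path = os.path.join(tempfile.mkdtemp(), "salida.txt")
out = open(path, "wb")
out.write(data)
out.close()
```

minidom's toxml() also accepts an encoding argument, which might be a tidier route; the explicit encode() call above is just the smallest change to the session shown.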
In the original XML file (after all, I'm writing back the same thing I'm
reading) the char appears as follows:
</To><Text Style="font-family:Comic Sans MS; color:#000080; ">bien
aquÃ</Text></Message>
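The "Ã" shown above would be consistent with the file actually being stored as UTF-8 and displayed as Latin-1: in UTF-8, "í" is the two bytes C3 AD, and read as Latin-1 those are "Ã" followed by a soft hyphen that usually does not display. This is only a guess from how the text looks, not something checked against the log file itself. A small sketch of that effect:

```python
# The character minidom reports at position 2274.
original = u"aqu\xed"

# Its UTF-8 form is two bytes for the accented letter: C3 AD.
utf8_bytes = original.encode("utf-8")

# Viewing those bytes as Latin-1 gives "Ã" plus an invisible soft hyphen.
misread = utf8_bytes.decode("latin-1")

# Reversing the wrong decoding recovers the original text.
roundtrip = misread.encode("latin-1").decode("utf-8")
```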
If I cut&paste the text from IDLE into UltraEdit and then save it, and
try to view the result, the XSL bombs on the same character:
I've tried using both IDLE (python2.3 on cygwin) and PythonWin 2.2.2
(ActiveState) and both complain....
I _know_ that there has to be a way to be able to write back the XML to
the file, but I can't figure it out.
Any suggestions?
Thanks a lot!
Javier J