minidom and unicode errors
Abhimanyu Seth
abhimanyu.seth at gmail.com
Tue Mar 7 00:33:34 EST 2006
Hi all,
I'm trying to parse and modify an XML document using the xml.dom.minidom
module with Python 2.4.2:
>>> from xml.dom import minidom
>>> dom = minidom.parse("c:/test.txt")
If the XML file contains a non-ASCII character, I get a parse error.
I have the following line in my XML file:
<target>Exception beim Löschen des Audit-Moduls aufgetreten. Exception Stack
lautet: %1.</target>
ExpatError: not well-formed (invalid token): line 8, column 27
If I remove the ö character, it works fine. I'm guessing this has to do
with the default encoding, which is ASCII. I guess I can change the encoding
by modifying a file on my machine that the interpreter reads while loading,
but then how do I get my program to work on different machines?
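A minimal sketch for narrowing this down, assuming (unconfirmed) that the file was saved as ISO-8859-1 and carries no encoding declaration. In that case the parser treats the bytes as UTF-8, which is the XML default, and the single byte for ö is not valid UTF-8. The sample text and the helper name `parses` are hypothetical stand-ins for the real file:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample standing in for the line from the real file.
text = u"<target>Exception beim L\xf6schen</target>"
declaration = u'<?xml version="1.0" encoding="iso-8859-1"?>'


def parses(data):
    """Return True if minidom accepts the given byte string."""
    try:
        minidom.parseString(data)
        return True
    except ExpatError:
        return False


# Latin-1 bytes without a declaration: the parser assumes UTF-8.
no_declaration = parses(text.encode("iso-8859-1"))

# Same bytes, but the document now declares its own encoding.
with_declaration = parses((declaration + text).encode("iso-8859-1"))

# The text stored as real UTF-8 needs no declaration.
as_utf8 = parses(text.encode("utf-8"))
```

If this matches what is happening, the fix would live in the XML file itself (a declaration, or re-saving as UTF-8) rather than in any per-machine interpreter setting, so it would travel with the document.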
Also, when writing such a special character to a file, I get an error:
>>> document.writexml(file(myFile, "w"), encoding='utf-8')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position
16: ordinal not in range(128)
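A possible workaround, sketched under the assumption that the `encoding` argument of `writexml` only sets the encoding named in the XML declaration and leaves the actual byte conversion to the file object. Here the file is opened through `codecs.open`, so the writer itself encodes the text as UTF-8. The sample document and the temporary output path are hypothetical:

```python
import codecs
import os
import tempfile
from xml.dom import minidom

# Hypothetical document containing the problematic character.
document = minidom.parseString(u"<target>L\xf6schen</target>".encode("utf-8"))

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "out.xml")

# codecs.open returns a writer that encodes unicode text to UTF-8 bytes.
writer = codecs.open(path, "w", "utf-8")
try:
    document.writexml(writer, encoding="utf-8")
finally:
    writer.close()

# Read the raw bytes back to check what ended up on disk.
reader = open(path, "rb")
try:
    raw = reader.read()
finally:
    reader.close()
```

Keeping the codec name passed to `codecs.open` in sync with the `encoding` argument matters, since a mismatch would produce a declaration that does not describe the bytes in the file.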
Any help would be appreciated.
--
Regards,
Abhimanyu