XML: minidom toxml() does not work for non-English files! :-(
Jarosław Zabiełło (delete .PL)
webmaster at apologetyka.com.pl
Sat May 4 03:34:55 EDT 2002
I have a small piece of code:
from xml.dom import minidom
xmldoc = minidom.parse('myfile.xml')
print xmldoc.toxml()
It works fine for 7-bit text. The problem is that it works ONLY for
pure ASCII text. :-( If I use any non-English characters,
Python raises an exception:
UnicodeError: ASCII encoding error: ordinal not in range(128)
It does NOT work even for UTF-8 XML files that contain any character outside
the 7-bit ASCII character set. That is strange, because UTF-8 should be
parsed correctly by all XML tools.
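To narrow down where the error comes from, here is a minimal sketch that separates parsing from output. It assumes a current Python 3 interpreter (not the 2002-era one the traceback above came from), and the sample document is made up for illustration. Under those assumptions, parsing UTF-8 input succeeds, and the failure only shows up when the Unicode result of toxml() is forced into ASCII, which is roughly what printing to an ASCII-only stream does.

```python
from xml.dom import minidom

# Hypothetical sample document with Polish characters, encoded as UTF-8.
data = (
    '<?xml version="1.0" encoding="utf-8"?>'
    "<name>Jarosław Zabiełło</name>"
).encode("utf-8")

xmldoc = minidom.parseString(data)  # parsing the UTF-8 bytes succeeds
text = xmldoc.toxml()               # a Unicode string, not bytes

try:
    # Roughly what happens when the text is sent to an ASCII-only stream.
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Asking for an explicit encoding yields bytes that can be written as-is.
utf8_bytes = xmldoc.toxml(encoding="utf-8")
```

If this holds for the real file too, the parser is probably not at fault, and encoding the output explicitly may be enough.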
Does this mean the toxml() and toprettyxml() methods of minidom are useless for
non-English strings? I need them to cut one big XML file into smaller
pieces and write those pieces to several files.
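For concreteness, the splitting task could look roughly like the sketch below. It assumes Python 3, that each piece is one child element of the root, and that UTF-8 output is acceptable. The function name split_children, the piece%d.xml file names, and the sample data are all hypothetical, chosen only for illustration.

```python
import os
import tempfile
from xml.dom import minidom


def split_children(source_bytes, out_dir, encoding="utf-8"):
    """Write each child element of the root to its own file; return the paths."""
    doc = minidom.parseString(source_bytes)
    # Skip whitespace text nodes and comments between the child elements.
    elements = [
        node for node in doc.documentElement.childNodes
        if node.nodeType == node.ELEMENT_NODE
    ]
    paths = []
    for i, node in enumerate(elements):
        path = os.path.join(out_dir, "piece%d.xml" % i)
        # Open in binary mode and encode explicitly, so no implicit
        # ASCII conversion can happen on the way to disk.
        with open(path, "wb") as f:
            header = '<?xml version="1.0" encoding="%s"?>\n' % encoding
            f.write(header.encode("ascii"))
            f.write(node.toxml().encode(encoding))
        paths.append(path)
    return paths


# Hypothetical input: two records with non-ASCII text.
sample = "<people><p>Jarosław</p><p>Zabiełło</p></people>".encode("utf-8")
out_dir = tempfile.mkdtemp()
paths = split_children(sample, out_dir)
```

Writing bytes in binary mode keeps the encoding decision in one visible place instead of depending on the default encoding of the output stream.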
--
Jarosław Zabiełło (UIN: 6712522)
URL: http://www.pik-net.pl/~zbiru