minidom and unicode errors

Fredrik Lundh fredrik at pythonware.com
Tue Mar 7 02:16:42 EST 2006


Abhimanyu Seth wrote:

> Sorry, my mistake. The file was not saved as utf-8. Saving it as utf-8
> solves my problems.
> >> f = codecs.open ("c:/test.txt", "r", "utf-8")
> >> dom = minidom.parseString (codecs.encode (f.read(), "utf-8"))
>
> However, I still need to encode the string returned by f.read () before
> passing it to parseString. Otherwise I get an exception.

if the file contains UTF-8 data,

    dom = minidom.parse("c:/test.txt")

should be exactly equivalent to your recoding solution.  if it isn't, post a
copy of the sample file.
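
a rough way to check the equivalence yourself is sketched below.  it assumes a current Python 3 interpreter, uses the built-in open() in place of codecs.open, and writes its own small sample document to a temporary file.  the sample text and the helper names (SAMPLE, parse_both_ways, demo) are made up for illustration, since the original test file was never posted.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample document; stands in for the unposted c:/test.txt.
SAMPLE = '<?xml version="1.0" encoding="utf-8"?><root>caf\u00e9</root>'


def parse_both_ways(path):
    """Parse the same UTF-8 file via both routes and return the serialized results."""
    # Route 1: hand the file name to the parser and let it decode the bytes.
    direct = minidom.parse(path)
    # Route 2: decode the file to text, then re-encode to UTF-8 bytes
    # before calling parseString (the "recoding solution" from the thread).
    with open(path, "r", encoding="utf-8") as f:
        recoded = minidom.parseString(f.read().encode("utf-8"))
    return direct.toxml(), recoded.toxml()


def demo():
    """Write SAMPLE to a temporary file as UTF-8 and parse it both ways."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(SAMPLE.encode("utf-8"))
        return parse_both_ways(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    direct_xml, recoded_xml = demo()
    print(direct_xml == recoded_xml)
```

if the two results differ for your own file, that points at the file (or whatever saved it) rather than at the parser, which is why a copy of the sample file is the useful thing to post.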

(if you've double-checked, and are 100% certain that it's not your editor
or your environment that's playing tricks with you, you can also report this
over here:

    http://sourceforge.net/tracker/?group_id=5470&atid=105470

)

</F>





