ElementTree cannot parse UTF-8 Unicode?
fredrik at pythonware.com
Wed Jan 19 21:50:57 CET 2005
Erik Bethke wrote:
> I am getting an error of not well-formed at the beginning of the Korean
> text in the second example. I am doing something wrong with how I am
> encoding my Korean? Do I need more of a wrapper about it than simple
> quotes? Is there some sort of XML syntax for indicating a Unicode
> string, or does the Elementree library just not support reading of
XML is Unicode, and ElementTree supports all common encodings just
fine (including UTF-8).
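
As a minimal sketch of that claim (not code from this thread), the following parses a UTF-8 encoded document with a Korean attribute value. It assumes a current Python 3, where ElementTree ships in the standard library as xml.etree.ElementTree; in 2005 it was the separate elementtree package. The document is built in memory rather than read from the poster's file, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Same shape as the failing example, with the Korean text from later in this post.
xml_text = '<?xml version="1.0" encoding="UTF-8"?>\n<Word L1="어녕하세요!"></Word>'

# Encode to UTF-8 bytes, which is what a correctly saved file would contain.
xml_bytes = xml_text.encode("utf-8")

# The parser decodes the bytes according to the XML declaration.
root = ET.fromstring(xml_bytes)

print(root.tag)
print(root.get("L1"))
```

If this runs cleanly, the parser itself handles the text, and the thing to check is the bytes actually stored in the file.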
> this one fails:
> <?xml version="1.0" encoding="UTF-8"?>
> <Word L1="?????!"></Word>
this works just fine on my machine.
what's the exact error message?
print on your machine?
what happens if you attempt to parse
<Word L1="어녕하세요!" />
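
One way to separate a parser problem from a file-encoding problem, sketched here under stated assumptions: write the same text as numeric character references, which are pure ASCII and so do not depend on how the file was saved. The second half shows a hypothetical failure mode (bytes saved as EUC-KR but declared as UTF-8); that this is what happened to the original poster is an assumption, not something the thread confirms.

```python
import xml.etree.ElementTree as ET

text = "어녕하세요!"

# ASCII-only document: every character is written as a numeric reference (&#NNNN;).
refs = "".join("&#%d;" % ord(ch) for ch in text)
ascii_bytes = ('<Word L1="%s" />' % refs).encode("ascii")
root = ET.fromstring(ascii_bytes)
print(root.get("L1") == text)

# Hypothetical mismatch: the declaration says UTF-8, but the bytes are EUC-KR.
declared_utf8 = '<?xml version="1.0" encoding="UTF-8"?>\n<Word L1="%s"></Word>' % text
bad_bytes = declared_utf8.encode("euc-kr")
try:
    ET.fromstring(bad_bytes)
    failed = False
except ET.ParseError as exc:
    failed = True
    print("parse failed:", exc)
```

If the character-reference version parses but the literal one does not, the editor that saved the file is the likely suspect rather than ElementTree.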