pyexpat and unicode

Edoardo ''Dado'' Marcora marcora at colorado.edu
Mon Dec 17 14:32:28 EST 2001


I am getting the same error when I try to extract a Latin-1 string
from an XML doc I am parsing with minidom.
Is Python 2.2 going to make this Unicode mess transparent to the
user/programmer?

Dado

"mallum" <breakfast at 10.am> wrote in message
news:mailman.1008606560.15267.python-list at python.org...
> Hi all;
>
> I may be doing something stupid here and missing the obvious, but if I
> run the following:
>
> ...snip...
>
> import xml.parsers.expat
> parser = xml.parsers.expat.ParserCreate(encoding='utf8')
>
> data_uni = u"<?xml version='1.0' encoding='UTF-8' ?><hello>\202</hello>"
> data     = "<?xml version='1.0' encoding='UTF-8' ?><hello>there</hello>"
>
> data_uni.encode('utf8')
>
> parser.Parse(data)
> parser.Parse(data_uni)
>
> ......
>
> expat barfs with:
>
> Traceback (most recent call last):
>   File "./testuni.py", line 12, in ?
>     parser.Parse(data_uni)
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> Does this mean I'm unable to pass utf8 encoded strings to pyexpat?
> According to the docs it should work. Can anyone shed some light on this?
>
>   -- mallum
>
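
For what it's worth, one likely culprit in the quoted snippet:
data_uni.encode('utf8') returns a new string and leaves data_uni
untouched, so Parse() still receives the unicode object, which then gets
pushed through the default ASCII codec. Note also that an expat parser
object handles a single document, so the second Parse() call would need
its own parser anyway. A minimal sketch of the idea, assuming you keep
the encoded result and hand that to a fresh parser (the helper name
parse_text is made up for illustration, not part of pyexpat):

```python
import xml.parsers.expat

def parse_text(document):
    """Parse a unicode XML document and return its character data.

    parse_text is a hypothetical helper for illustration only.
    """
    collected = []
    # One parser per document: an expat parser cannot be reused
    # for a second document once the first has been fed in.
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several chunks, so gather them all.
    parser.CharacterDataHandler = collected.append
    # encode() returns a NEW object; keep it and pass it on.
    encoded = document.encode('utf-8')
    # The second argument marks this as the final chunk of input.
    parser.Parse(encoded, True)
    return u''.join(collected)

text = parse_text(u"<?xml version='1.0' encoding='UTF-8' ?><hello>\u00e9</hello>")
```

The only real change from the quoted code is that the return value of
encode() is bound to a name and passed to Parse() instead of being
thrown away.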




