minidom and åäö once again :-P

Magnus Heino magnus.heino at pleon.sigma.se
Wed Apr 17 09:07:17 EDT 2002


If I replace åäö below with codecs.utf_8_encode('åäö')[0], then parseString
is happy.

toxml() doesn't produce a u'' string though, so I have to do
unicode(doc.toxml()) and then feed that into parseString.

Eh??
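
For comparison, here is a minimal sketch of the same round trip on a
current Python 3, where toxml() accepts an encoding argument. This is an
assumption about a newer interpreter, not the Python 2.2 / PyXML setup
quoted below. The variable names are only illustrative.

```python
import xml.dom.minidom

# Same document as in the quoted session, as ISO-8859-15 bytes
# (\xe5\xe4\xf6 are the bytes for the three letters in that encoding).
source = b'<?xml version="1.0" encoding="iso-8859-15" ?><test>\xe5\xe4\xf6</test>'
doc = xml.dom.minidom.parseString(source)

# With an explicit encoding, toxml() returns bytes and writes a matching
# encoding declaration, so the parser knows how to decode the result.
serialized = doc.toxml(encoding="utf-8")

# Feeding the serialized bytes back in now works.
doc2 = xml.dom.minidom.parseString(serialized)
text = doc2.documentElement.firstChild.data
```

The point of the sketch is that the bytes and the encoding declaration
have to agree; passing the encoding to toxml() keeps them in step.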

/Magnus

Magnus Heino wrote:

> 
> 
> Hi.
> 
> I am (still) trying to get minidom and åäö to play together. And I guess
> the main problem really is me ;-)
> 
> But why doesn't this work? Shouldn't parseString eat what toxml produces?
> 
> /Magnus
> 
> [magnus at bombardier magnus]$ python2.2
> Python 2.2 (#1, Jan 23 2002, 14:50:45)
> [GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-98)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import xml.dom.minidom
> >>> d=xml.dom.minidom.parseString('<?xml version="1.0" encoding="iso-8859-15" ?><test>åäö</test>')
> >>> d.toxml()
> '<?xml version="1.0" ?>\n<test>\xe5\xe4\xf6</test>'
> >>> d=xml.dom.minidom.parseString(d.toxml())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.2/site-packages/_xmlplus/dom/minidom.py", line 965, in parseString
>     return _doparse(pulldom.parseString, args, kwargs)
>   File "/usr/lib/python2.2/site-packages/_xmlplus/dom/minidom.py", line 952, in _doparse
>     toktype, rootNode = events.getEvent()
>   File "/usr/lib/python2.2/site-packages/_xmlplus/dom/pulldom.py", line 256, in getEvent
>     self.parser.feed(buf)
>   File "/usr/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 143, in feed
>     self._parser.Parse(data, isFinal)
> UnicodeError: UTF-8 decoding error: invalid data
> >>>
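
A likely reading of the traceback above: the toxml() output carries no
encoding declaration, so the parser falls back to UTF-8, and the bytes
\xe5\xe4\xf6 are not valid UTF-8. The short sketch below (Python 3
syntax, names are illustrative) only demonstrates that byte-level fact,
it does not reproduce the PyXML code path.

```python
# The three letters as they appear in the toxml() output above.
latin_bytes = b"\xe5\xe4\xf6"

# Decoding them as UTF-8 fails: \xe5 announces a three-byte sequence,
# but \xe4 is not a valid continuation byte.
try:
    latin_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The same letters encoded as UTF-8 take two bytes each.
utf8_bytes = latin_bytes.decode("iso-8859-15").encode("utf-8")
```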



