[XML-SIG] sax expatreader and unicode
Joe Murray
jmurray@agyinc.com
Tue, 17 Apr 2001 18:21:04 -0700
What am I missing? The SAX expatreader apparently can't handle some
Unicode characters. I thought this was supported. I believe the xml.dom
modules handle Unicode characters just fine...
From the text:
"...LEX. IN NAÏVE H4 AND CHO CELLS, PS1 CO-IMM..."
Output:
.
.
.
  File "analyzexml.py", line 68, in analyze_sax
    parser.parse(stream)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 43, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/xmlreader.py", line 120, in parse
    self.feed(buffer)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 87, in feed
    self._parser.Parse(data, isFinal)
UnicodeError: UTF-8 decoding error: invalid data
jm
-- 
Joseph Murray
Bioinformatics Specialist, AGY Therapeutics
290 Utah Avenue, South San Francisco, CA 94080
(650) 228-1146