[XML-SIG] Copyright character chokes parser

Ng Pheng Siong ngps@post1.com
Sat, 16 Dec 2000 00:17:23 +0800


Hi,

I'm fiddling with XBEL using PyXML 0.6.2.

I have a bookmark entry as follows:

    <bookmark href="http://www.optioninsight.com/" added="946429657" visited="946444587" modified="946429652" >
      <title>Option Insight© - Home of the Greatest Option Program. Ever.</title>
    </bookmark>


The copyright character (you might see it as <A9>) in the title chokes 
xbel_parse.py:

$ python xbel_parse.py --xbel < bm.xml 
Traceback (most recent call last):
  File "xbel_parse.py", line 91, in ?
    p.parseFile( sys.stdin )
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/drivers/drv_pyexpat.py", line 68, in parseFile
    if self.parser.Parse(buf, 0) != 1:
xml.parsers.expat.error: not well-formed: line 68, column 27


A simple SAX-based parser written per the XML HOWTO throws an exception at 
the same spot:

$ python xbp.py < bm.xml
Traceback (most recent call last):
  File "xbp.py", line 19, in ?
    p.parse(sys.stdin)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 42, in parse
    xmlreader.IncrementalParser.parse(self, source)            
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/xmlreader.py", line 120, in parse
    self.feed(buffer)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 86, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <stdin>:68:27: not well-formed


Line 68, column 27 is exactly where the copyright character is.
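
For anyone who wants to reproduce this without my bookmark file, here is a minimal sketch. It rests on assumptions I have not verified against bm.xml: that the file has no encoding declaration (so expat falls back to UTF-8) and that the copyright character is stored as the single Latin-1 byte 0xA9. It is written for a present-day Python with the standard-library expat binding, not the PyXML 0.6.2 setup above, and try_parse is only an illustrative helper name.

```python
import xml.parsers.expat


def try_parse(data):
    """Parse a bytes document with expat.

    Returns None if the document is accepted, otherwise a
    (line, column, message) tuple describing the first error.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk
    except xml.parsers.expat.ExpatError as exc:
        return (exc.lineno, exc.offset, str(exc))
    return None


# Assumption: the copyright sign is the lone byte 0xA9 and there is
# no XML declaration, so the parser treats the input as UTF-8.
bare = b"<title>Option Insight\xa9 - Home</title>"

# Same bytes, but with the encoding stated explicitly.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n' + bare

# Same text, transcoded to UTF-8 (the copyright sign becomes 0xC2 0xA9).
utf8 = "<title>Option Insight\u00a9 - Home</title>".encode("utf-8")

for label, doc in (("bare", bare), ("declared", declared), ("utf8", utf8)):
    print(label, try_parse(doc))
```

If those assumptions hold, only the first document should be rejected, which would point toward adding an encoding declaration or transcoding the file to UTF-8. I have not confirmed that either one cures the PyXML case.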

Any hints on a workaround? (I'm not subscribed, so please cc replies to me.)

TIA. Cheers.
-- 
Ng Pheng Siong <ngps@post1.com> * http://www.post1.com/home/ngps