[XML-SIG] Copyright character chokes parser
uche.ogbuji@fourthought.com
Fri, 15 Dec 2000 09:49:40 -0700
> I'm fiddling with XBEL using PyXML 0.6.2.
>
> I have a bookmark entry as follows:
>
> <bookmark href="http://www.optioninsight.com/" added="946429657" visited="946444587" modified="946429652" >
> <title>Option Insight© - Home of the Greatest Option Program. Ever.</title>
> </bookmark>
I just went through encoding hell of a more involved sort, so I might as well
chip in here.
Add

<?xml version='1.0' encoding='ISO-8859-1'?>

as the first thing in your XML file (that is, even before any white space) and
you should be fine. If you don't specify an encoding, the parser assumes UTF-8
(unless the file starts with a byte-order mark, in which case it assumes UTF-16).
The copyright character in your file is stored as the single byte 0xA9, which is
not legal UTF-8: in UTF-8, any byte value above 127 must be part of a multi-byte
sequence. ISO-8859-1 (also known as Latin-1) lets you use single byte values
above 127.
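
To make the difference concrete, here is a minimal sketch. It assumes a modern Python 3 and the standard library's xml.dom.minidom (which sits on top of expat) rather than PyXML 0.6.2 itself, and the title is a shortened, hypothetical version of the one in the bookmark. It checks that a lone 0xA9 byte fails to decode as UTF-8, that the undeclared document is rejected, and that the same bytes parse once the ISO-8859-1 declaration is in place.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Shortened, hypothetical version of the bookmark title. The copyright
# sign is stored as the single ISO-8859-1 byte 0xA9.
body = b"<title>Option Insight\xa9 - Home</title>"

# A lone 0xA9 byte is not a valid UTF-8 sequence...
try:
    b"\xa9".decode("utf-8")
    utf8_decode_failed = False
except UnicodeDecodeError:
    utf8_decode_failed = True

# ...but it is a perfectly good ISO-8859-1 character.
copyright_sign = b"\xa9".decode("iso-8859-1")

# No encoding declaration: the parser assumes UTF-8 and rejects the byte.
try:
    xml.dom.minidom.parseString(body)
    parsed_without_declaration = True
except ExpatError:
    parsed_without_declaration = False

# Declaration as the very first bytes of the document: the parser now
# reads 0xA9 as the copyright sign.
declared = b"<?xml version='1.0' encoding='ISO-8859-1'?>\n" + body
doc = xml.dom.minidom.parseString(declared)
title_text = doc.documentElement.firstChild.data

print(utf8_decode_failed, parsed_without_declaration, repr(title_text))
```

The other way out is to leave the declaration off and re-save the file as real UTF-8, in which case the copyright sign becomes the two-byte sequence 0xC2 0xA9.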
-- 
Uche Ogbuji Principal Consultant
uche.ogbuji@fourthought.com +1 303 583 9900 x 101
Fourthought, Inc. http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python