xsl and unicode surrogate characters

Sakcee sakcee at gmail.com
Wed Jan 4 19:56:23 EST 2006


Hi

In one of my data files I am seeing the byte sequence
\xed\xa0\xa0. It seems to break the XSL transform.

---------------------------------------------------------------
Extra content at the end of the document
 XML/XSL Error: </data><data ><![CDATA[ í   Pls advice
----------------------------------------------------------------


This seems to break libxml2/libxslt.

Is this a Unicode UTF-16 surrogate pair?
To display it in XML/XSL, should I extract only \xa0?
Since this is higher than the 00-7f range, can I just strip it?
Under what conditions would the encoding software put this string in?
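
For reference, here is a minimal Python 3 sketch of how the questions above could be checked. Read as UTF-8 bit patterns, the three bytes \xed\xa0\xa0 work out to the single code point U+D820, which is one lone high surrogate rather than a pair. Strict UTF-8 does not allow surrogates, which is probably why the parser gives up at that point. The sample bytes and the helper name strip_surrogates are made up for illustration, and the sketch assumes the rest of the data is valid UTF-8.

```python
# Hypothetical sample: valid UTF-8 text with the problem bytes in the middle.
raw = b"abc \xed\xa0\xa0 def"

# A strict UTF-8 decoder rejects the sequence.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The "surrogatepass" error handler lets the surrogate through so it can be inspected.
text = raw.decode("utf-8", errors="surrogatepass")
code_points = [hex(ord(c)) for c in text if 0xD800 <= ord(c) <= 0xDFFF]
print(strict_ok, code_points)  # False ['0xd820']


def strip_surrogates(data: bytes) -> bytes:
    """Drop every surrogate code point (U+D800..U+DFFF) and re-encode as UTF-8.

    Assumes everything else in `data` is valid UTF-8; other invalid bytes
    will still raise UnicodeDecodeError.
    """
    decoded = data.decode("utf-8", errors="surrogatepass")
    cleaned = "".join(c for c in decoded if not 0xD800 <= ord(c) <= 0xDFFF)
    return cleaned.encode("utf-8")


print(strip_surrogates(raw))  # b'abc  def'
```

Keeping only \xa0 would leave a stray continuation byte, which is still invalid UTF-8, so removing all three bytes together looks like the safer option. If the data ever contained a high surrogate followed by a low surrogate (each encoded as three bytes), this sketch would throw away a real character, so that case would need separate handling.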


thanks for help,



