[XML-SIG] Re: Issues with Unicode type
Uche Ogbuji
uche.ogbuji@fourthought.com
Mon, 23 Sep 2002 11:53:28 -0600
> On Mon, 2002-09-23 at 18:14, Tom Emerson wrote:
>
> > By default Python is using UTF-16 as its Unicode encoding. The
> > code-point that you specify, U+10800, is outside the BMP and hence is
> > represented by two surrogate characters in UTF-16.
>
> Arg! Does that mean that by default Python isn't strictly conform to XML
> 1.0?
This is apples and oranges. Python is not an XML app, so I don't think it
means anything for Python to conform to XML.
The question is how easy it is to write a Python app that does conform to
XML. Even if Python does not support characters outside the BMP, this can
be handled by writing code that does the special processing for such
characters.
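
As a rough sketch of what such special processing might look like, the
code below works on a plain list of UTF-16 code units (integers), pairs up
surrogates into full code points, and checks each one against the Char
production of XML 1.0. All of the function names are made up for this
illustration; none of this is taken from PyXML or 4Suite, and a real
application would probably hook this in at the parser or serializer level.

```python
# Hypothetical helpers: the names and the list-of-ints input are assumptions
# made for illustration, not part of any XML library.

def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a code point outside the BMP")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def iter_code_points(units):
    """Yield code points from UTF-16 code units, joining surrogate pairs.

    A surrogate without a partner is yielded unchanged so that the caller
    can reject it.
    """
    i = 0
    n = len(units)
    while i < n:
        unit = units[i]
        if (0xD800 <= unit <= 0xDBFF and i + 1 < n
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            yield from_surrogate_pair(unit, units[i + 1])
            i += 2
        else:
            yield unit
            i += 1


def is_xml_char(code_point):
    """Return True if the code point matches the XML 1.0 Char production."""
    return (code_point in (0x9, 0xA, 0xD)
            or 0x20 <= code_point <= 0xD7FF
            or 0xE000 <= code_point <= 0xFFFD
            or 0x10000 <= code_point <= 0x10FFFF)


def all_xml_chars(units):
    """Return True if every code point in the UTF-16 sequence is a legal Char."""
    return all(is_xml_char(cp) for cp in iter_code_points(units))
```

With these helpers, to_surrogate_pair(0x10800) gives the two code units
0xD802 and 0xDC00, and a sequence containing that pair passes
all_xml_chars(), while a sequence with a lone 0xD802 does not.
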
The other question is whether PyXML and 4Suite are conformant, since they
are XML apps. That's what we're really trying to figure out here, I think.
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Apache 2.0 API - http://www-106.ibm.com/developerworks/linux/library/l-apache/
Python&XML column: Tour of Python/XML - http://www.xml.com/pub/a/2002/09/18/py.html
Python/Web Services column: xmlrpclib - http://www-106.ibm.com/developerworks/webservices/library/ws-pyth10.html