[XML-SIG] XML Unicode and UTF-8
Uche Ogbuji
uche.ogbuji at fourthought.com
Tue Aug 10 03:11:11 CEST 2004
It looks as if I should have read the whole thread before posting.
Martin's been a great help, but I still have a couple of observations.
On Thu, 2004-08-05 at 06:22, n.youngman at ntlworld.com wrote:
> OK. I read the opaque documentation^W^W fine manual for a while, then googled for a while, and finally decided to just hack about with what I had.
I personally think the Python/Unicode docs are pretty good, but Unicode
is *hard*. No getting around that.
> I now have
>
> charset_tag.appendChild( doc.createTextNode( segment[1] ) )
> unicode = segment[0].decode( segment[1] ).encode( "utf-8")
> unicode_tag = doc.createElement( 'unicode' )
> unicode_tag.appendChild( doc.createTextNode( unicode ) )
I wouldn't use "unicode" as a variable name if I were you: it's a
built-in in Python 2.2 and up, and rebinding it hides the built-in for
the rest of that scope.
I suggest

    unicode_tag = doc.createElement( u'unicode' )

rather than

    unicode_tag = doc.createElement( 'unicode' )
Remember that XML element and attribute names are also Unicode text,
even though the set of characters allowed in names is smaller than the
set allowed in character data.
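
For illustration only, here is a minimal sketch of the advice above. It
assumes the standard library's xml.dom.minidom (the thread does not say
which DOM implementation is in use) and a current Python, where the u
prefix is accepted but optional. The build_doc helper, the 'segment'
root element and the sample bytes are all hypothetical. The decoded
text goes into the DOM as a text string, under the name "text" rather
than "unicode", and encoding to UTF-8 is left to serialization.

```python
from xml.dom.minidom import getDOMImplementation


def build_doc(raw_bytes, charset):
    """Decode raw_bytes from charset and return a UTF-8 encoded XML document."""
    # Hypothetical root element name; element names are passed as text strings.
    doc = getDOMImplementation().createDocument(None, u'segment', None)
    root = doc.documentElement

    charset_tag = doc.createElement(u'charset')
    charset_tag.appendChild(doc.createTextNode(charset))
    root.appendChild(charset_tag)

    # Decode to text once; do not re-encode before handing it to the DOM.
    text = raw_bytes.decode(charset)
    unicode_tag = doc.createElement(u'unicode')
    unicode_tag.appendChild(doc.createTextNode(text))
    root.appendChild(unicode_tag)

    # Encoding happens here, when the tree is serialized to bytes.
    return doc.toxml(encoding='utf-8')


# Sample input (an assumption): "cafe" with e-acute, encoded as ISO-8859-1.
xml_bytes = build_doc(b'caf\xe9', 'iso-8859-1')
```

The result is a byte string carrying an XML declaration that names
utf-8, so the DOM only ever sees text and the bytes only exist at the
output boundary.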
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Decomposition, Process, Recomposition - http://www.xml.com/pub/a/2004/07/28/py-xml.html
Perspective on XML: Steady steps spell success with Google - http://www.adtmag.com/article.asp?id=9663
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" - http://www.adtmag.com/article.asp?id=9090
Harold's Effective XML - http://www.ibm.com/developerworks/xml/library/x-think25.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/