[XML-SIG] XML Unicode and UTF-8

Uche Ogbuji uche.ogbuji at fourthought.com
Tue Aug 10 03:11:11 CEST 2004

It looks as if I should have read the whole thread before posting. 
Martin's been a great help, but I still have a couple of observations.

On Thu, 2004-08-05 at 06:22, n.youngman at ntlworld.com wrote:
> OK. I read the opaque documentation^W^W fine manual for a while, then googled for a while, and finally decided to just hack about with what I had.

I personally think the Python/Unicode docs are pretty good, but Unicode
is *hard*.  No getting around that.

> I now have
>     charset_tag.appendChild( doc.createTextNode( segment[1] ) )
>     unicode = segment[0].decode( segment[1] ).encode( "utf-8")
>     unicode_tag = doc.createElement( 'unicode' )
>     unicode_tag.appendChild( doc.createTextNode( unicode ) )

I wouldn't use "unicode" as a variable name if I were you: it's the name
of a built-in in Python 2.2 and up, and rebinding it hides the built-in
for the rest of that scope.
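
Something like the following keeps the built-in out of harm's way. It is
only a sketch: the "segment" value is made up to stand in for your
(data, charset) pair, "decoded_text" is just a name I picked, and I am
assuming minidom as the DOM implementation and handing the text node the
decoded text directly, either of which may not match what you have.

```python
from xml.dom.minidom import getDOMImplementation

# Hypothetical input: (raw bytes, charset name), standing in for "segment"
segment = (b'caf\xe9', 'iso-8859-1')

# Assumption: a bare minidom document with a made-up root element
doc = getDOMImplementation().createDocument(None, u'message', None)

# Decode the raw bytes into a Unicode string; this name shadows no built-in
decoded_text = segment[0].decode(segment[1])

unicode_tag = doc.createElement(u'unicode')
unicode_tag.appendChild(doc.createTextNode(decoded_text))
doc.documentElement.appendChild(unicode_tag)
```

The only change that matters for this point is the variable name; the
rest is scaffolding so the snippet runs on its own.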

I suggest

    unicode_tag = doc.createElement( u'unicode' )

rather than

    unicode_tag = doc.createElement( 'unicode' )
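
A quick way to convince yourself that names are text just like content
is to try one that isn't ASCII. This is a rough illustration only: the
element names are invented, and I am again assuming minidom, which does
not itself check names for XML validity.

```python
from xml.dom.minidom import getDOMImplementation

# Assumption: a bare minidom document with a made-up root element
doc = getDOMImplementation().createDocument(None, u'doc', None)

# An element name containing non-ASCII characters, given as a Unicode string
name = u'r\xe9sum\xe9'
tag = doc.createElement(name)
doc.documentElement.appendChild(tag)

# Serialize to UTF-8 bytes; the name is encoded along with everything else
xml_bytes = doc.toxml(encoding='utf-8')
```

The name goes through the same encoding step as the character data when
the document is serialized, which is why I prefer to hand the DOM a
Unicode string for it from the start.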

Remember that XML element and attribute names are also (a subset of)
Unicode, even though they're a smaller subset than that of character

