Degree symbol (UTF-8 > ASCII)

Irmen de Jong irmen at -NOSPAM-REMOVE-THIS-xs4all.nl
Wed Apr 16 15:11:03 EDT 2003


Peter Clark wrote:
> I'm working with an XML document which doesn't include an encoding, so
> it defaults to UTF-8. Of course, all of the text is ASCII, and likely
> to remain so. I would like to insert the degree symbol (chr(176)), but
> because this is outside the bounds (chr(128) is the limit), Python
> raises an XML error. What's the simplest way of including an
> unadorned degree symbol? Again, it's not necessary to preserve the
> UTF-8 encoding, but I'm not quite certain as to how to tell Python
> that the XML document is plain ASCII (or I guess in the case of the
> degree symbol, latin-1).
>     Thanks,
>     :Peter

The "degree" symbol is chr(176) in which character encoding?
Certainly not in UTF-8, where a lone byte 176 is not a valid character.
It is 176 in ISO-8859-1 (Latin-1) and in Windows cp1252.
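
To make the difference concrete, here is a small sketch that decodes the
single byte 176 under each of those encodings. It assumes Python 3 (the
snippets in this message are Python 2 era), and the variable names are
only illustrative:

```python
# Interpret the single byte 176 (0xB0) under different encodings.
raw = bytes([176])

latin1_char = raw.decode("iso-8859-1")  # Latin-1 maps byte 176 to U+00B0
cp1252_char = raw.decode("cp1252")      # cp1252 agrees with Latin-1 here

# In UTF-8 a lone 0xB0 byte is a continuation byte with no lead byte,
# so decoding it on its own fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(latin1_char), repr(cp1252_char), utf8_ok)
```

So the number 176 only identifies the degree sign once you have said
which encoding the byte belongs to.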

Are you creating the UTF-8 encoded XML document using Python?
Then try the following to add a "degree" character to your
output file:

To get the (Unicode) character you want:
	degreeChar = u'\N{DEGREE SIGN}'

This is Unicode character U+00B0, so you could also write:
	degreeChar = u'\u00b0'

To write it using UTF-8 encoding to a file object 'output':
	output.write(degreeChar.encode('UTF-8'))

(If you look at what is written, you'll see that the character
is encoded as two bytes in UTF-8: 0xc2, 0xb0.)
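
For reference, a rough equivalent of the steps above in Python 3, where
strings are Unicode by default. The io.BytesIO object is only a stand-in
for a file opened in binary mode, and the names are illustrative:

```python
import io

# The degree sign, by name; "\u00b0" is the same character.
degree_char = "\N{DEGREE SIGN}"

# Stand-in for a file object opened with mode "wb".
output = io.BytesIO()
output.write(degree_char.encode("utf-8"))

# Inspect the bytes that were actually written.
encoded = output.getvalue()
print(encoded)
```

With a file opened in text mode using open(path, "w", encoding="utf-8"),
you would write degree_char directly and let the file object do the
encoding.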

--Irmen




