Degree symbol (UTF-8 > ASCII)
Erik Max Francis
max at alcyone.com
Wed Apr 16 21:56:34 EDT 2003
Peter Clark wrote:
> Since the output is meant to be read to be displayed by a font
> which is in essentially latin-1 encoding, I need to restrict the
> manner in which the degree symbol is displayed to one byte. Yet I
> cannot get it to behave, even though 'print chr(176) works perfectly
> fine at the prompt. My suspicion is that the default encoding of the
> system is messing python up somewhere along the way--is there any way
> to tell it to just print the stupid character and not be concerned
> with the output?
I've come into this conversation late, but could it be that what's
confusing you is that UTF-8 and Latin-1 are not the same thing? It
sounds like you want Latin-1 but are asking for UTF-8. UTF-8 is a
variable-length octet encoding of Unicode: any character outside the
ASCII range is represented as a sequence of two or more bytes. Latin-1
is a fixed eight-bit encoding, one byte per character. Both have the
property that pure-ASCII data will be represented without modification,
but they aren't the same beast. If you're converting to UTF-8 and are
puzzled why 8-bit data is expanding to multiple bytes, then chances
are UTF-8 isn't what you wanted.
[where u is a Unicode string representing the degree symbol]
>>> u.encode('latin-1')
'\xb0'
>>> u.encode('utf-8')
'\xc2\xb0'
>>> print u.encode('latin-1')
[the degree symbol]
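
The same comparison as a standalone script, sketched in Python 3 syntax
(the session above is Python 2, where encode returns a plain str rather
than bytes). The variable names here are made up for illustration and
are not from the original thread.

```python
# Python 3 sketch; names are illustrative assumptions.
degree = "\u00b0"  # DEGREE SIGN, code point 176 (0xB0)

latin1_bytes = degree.encode("latin-1")  # fixed width: a single byte
utf8_bytes = degree.encode("utf-8")      # variable width: two bytes here

print(len(latin1_bytes), latin1_bytes)
print(len(utf8_bytes), utf8_bytes)

# Pure-ASCII text comes out byte-for-byte identical in both encodings.
ascii_text = "37 20 N"
same_for_ascii = ascii_text.encode("latin-1") == ascii_text.encode("utf-8")
print(same_for_ascii)
```

If the consumer expects one byte per character, Latin-1 is the encoding
to ask for; the UTF-8 result is correct too, just for a different reader.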
--
Erik Max Francis / max at alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/ \ It was involuntary. They sank my boat.
\__/ John F. Kennedy (on how he became a war hero)
Bosskey.net: Return to Wolfenstein / http://www.bosskey.net/rtcw/
A personal guide to Return to Castle Wolfenstein.