char 128? no... 256
Afanasiy
abelikov72 at hotmail.com
Wed Feb 12 13:29:21 EST 2003
On Wed, 12 Feb 2003 10:44:18 -0600, Skip Montanaro <skip at pobox.com> wrote:
>
> >> To clarify, the TradeMark symbol is being transformed to Unicode
> >> #8482 automatically, presumably by COM or ADO. In Python, I do not
> >> know how I am supposed to be able to print (for example) the Unicode
> >> object I receive which contains this transformed TradeMark symbol.
>
>Print it where? To a file or a display device? Remember, display devices
>need to know the encoding of the data they receive as well. For example, it
>does me no good to print utf-8 encoded characters in an xterm, since it only
>understands iso-8859-1. On the other hand, I can set the charset of my
>Mac's Terminal app windows to utf-8 and display all sorts of cool stuff. In
>this case, it appears the trademark sign is not available in iso-8859-1:
>
> >>> tm = u"\N{TRADE MARK SIGN}"
> >>> tm
> u'\u2122'
> >>> tm.encode("latin-1")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeEncodeError: 'latin-1' codec can't encode character '\u2122' in position 0: ordinal not in range(256)
> >>> tm.encode("utf-8")
> '\xe2\x84\xa2'
>
>so you're scrod unless you can find an encoding your display device knows
>about which contains a trade mark sign.
>
>Skip
All of my devices can display the TradeMark symbol correctly.
None of them can print the Unicode character 8482 (U+2122). I never use Unicode.
The TradeMark symbol is being converted to that Unicode code point, 8482.
I would like to convert it back to the byte it came from, in what I assume is iso-8859-1.
However, encoding to iso-8859-1 only allows characters under 256, so that fails.
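
A minimal sketch of the round trip, under an assumption not confirmed anywhere in this thread: that the devices expect Windows-1252 (cp1252) rather than iso-8859-1. In cp1252 the trade mark sign is the single byte 0x99, a position that iso-8859-1 assigns to a control character. The helper name encode_for_display is hypothetical, chosen only for illustration.

```python
# Sketch only. ASSUMPTION: the display expects Windows-1252 (cp1252) bytes.
tm = u"\u2122"  # TRADE MARK SIGN, decimal 8482


def encode_for_display(text, encoding="cp1252"):
    """Hypothetical helper: encode text for a byte-oriented device.

    Characters the target encoding lacks are replaced with '?'
    instead of raising UnicodeEncodeError.
    """
    return text.encode(encoding, "replace")


# cp1252 has a trade mark sign, so this yields one byte below 256.
cp1252_bytes = encode_for_display(tm)

# iso-8859-1 has no trade mark sign, so the character is replaced.
latin1_bytes = encode_for_display(tm, "latin-1")

# Decoding the cp1252 byte gives back the original Unicode character.
round_trip = cp1252_bytes.decode("cp1252")
```

If the data does come from a cp1252 source, cp1252_bytes is what the device should receive. If it comes from some other encoding, the same helper can be called with that codec name instead.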