char 128? no... 256
Afanasiy
abelikov72 at hotmail.com
Wed Feb 12 13:30:32 EST 2003
On Wed, 12 Feb 2003 20:18:02 +0300 (MSK), Roman Suzi <rnd at onego.ru> wrote:
>On Wed, 12 Feb 2003, Afanasiy wrote:
>>On Wed, 12 Feb 2003 15:50:53 GMT, Afanasiy <abelikov72 at hotmail.com> wrote:
>>>On Wed, 12 Feb 2003 03:18:43 GMT, Afanasiy <abelikov72 at hotmail.com> wrote:
>
>>>Now, even encoding to 'latin-1' (8-bit) is problematic, because symbols
>>>which are 8-bit in Windows, such as the trademark symbol, will not encode
>>>into 8 bits, as the ordinal value in the Unicode object is 8482.
>>>
>>>This is hex 99 on a plain Windows 2000 install, which I presume is 'latin-1'.
>
>That is because your Windows doesn't use latin-1.
>
>$ grep -i trade /usr/local/lib/python2.3/encodings/*.py
>cp1250.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1251.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1252.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1253.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1254.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1255.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1256.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1257.py: 0x0099: 0x2122, # TRADE MARK SIGN
>cp1258.py: 0x0099: 0x2122, # TRADE MARK SIGN
>mac_cyrillic.py: 0x00aa: 0x2122, # TRADE MARK SIGN
>mac_greek.py: 0x0093: 0x2122, # TRADE MARK SIGN
>mac_iceland.py: 0x00aa: 0x2122, # TRADE MARK SIGN
>mac_latin2.py: 0x00aa: 0x2122, # TRADE MARK SIGN
>mac_roman.py: 0x00aa: 0x2122, # TRADE MARK SIGN
>mac_turkish.py: 0x00aa: 0x2122, # TRADE MARK SIGN
>palmos.py: 0x0099: 0x2122, # TRADE MARK SIGN
>
>So, you need to convert to one of these instead of latin-1.
>
>(Hmmm... I thought cp1250 is latin1.)
>
>Aliases of latin-1:
> '8859' : 'latin_1',
> 'cp819' : 'latin_1',
> 'csisolatin1' : 'latin_1',
> 'ibm819' : 'latin_1',
> 'iso8859' : 'latin_1',
> 'iso_8859_1' : 'latin_1',
> 'iso_8859_1_1987' : 'latin_1',
> 'iso_ir_100' : 'latin_1',
> 'l1' : 'latin_1',
> 'latin' : 'latin_1',
> 'latin1' : 'latin_1',
>
>
This is a wonderfully valid answer and it works, thank you!
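
For anyone finding this thread later, here is a minimal sketch of the difference, assuming a current Python 3 interpreter rather than the 2.3 install quoted above. The helper name encode_or_none is made up for illustration; the codec names come from the grep output and the alias list.

```python
import codecs

# TRADE MARK SIGN: code point 0x2122, i.e. ordinal 8482 as noted above.
TRADE_MARK = "\u2122"


def encode_or_none(text, encoding):
    """Hypothetical helper: return the encoded bytes, or None if the
    codec has no byte for one of the characters."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# cp1252 maps byte 0x99 to the trademark sign, so this succeeds.
cp1252_bytes = encode_or_none(TRADE_MARK, "cp1252")

# latin-1 only covers code points 0..255, so 8482 cannot be encoded.
latin1_bytes = encode_or_none(TRADE_MARK, "latin-1")

# Going the other way, the same byte decodes differently per codec.
from_cp1252 = b"\x99".decode("cp1252")
from_latin1 = b"\x99".decode("latin-1")

# 'latin1' and 'l1' are aliases of one codec; cp1252 is a separate one.
latin1_name = codecs.lookup("latin1").name
l1_name = codecs.lookup("l1").name
cp1252_name = codecs.lookup("cp1252").name

print(ord(TRADE_MARK), cp1252_bytes, latin1_bytes)
```

So the byte 0x99 only means "trademark" under one of the cp125x (or palmos) tables; under latin-1 the same byte is just code point 0x99, which is why encoding to latin-1 failed for me.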