[I18n-sig] Changing case
Guido van Rossum
guido@python.org
Tue, 11 Apr 2000 09:24:48 -0400
> I definitely do not understand. \202 is lower_e_egu (spelling?) and \204 is
> lower_a_umlaut. Upper case of these should be \220 and \216 respectively.
>
> (Probably will not display properly on all machines)
> --------------
> >>> s = u"éä"
> >>> s
> u'\202\204'
> >>> t = u"ÄÉ"
> >>> t
> u'\216\220'
> -------------
> Mark
Aha, *I* understand. You must be on Windows. Windows has its own
character encoding, where e-egu is \202 and a-umlaut is \204. However,
Python doesn't know what character set you are using, and when you
typed e-egu, all it knew was that you entered \202. If you type this
in a u"..." string, all codes are interpreted as if they were Latin-1,
which happens to be the first 256 code points of Unicode. The Latin-1
character \202 (which is NOT e-egu but a control character) has no
upper case equivalent.
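
A minimal sketch of the effect described above, written for present-day
Python 3 rather than the interpreter used in this thread. The code page
name "cp437" is an assumption: the thread never names the encoding, but
the byte values \202 and \204 match e-acute and a-umlaut in that table.

```python
# The two bytes from the session, written in octal as in the thread.
raw = bytes([0o202, 0o204])

# Interpreting each byte as a Latin-1 code point, as the u"..." literal did.
as_latin1 = raw.decode("latin-1")
print(ascii(as_latin1))                 # two C1 control characters
print(as_latin1.upper() == as_latin1)   # control characters have no case mapping

# ASSUMPTION: the bytes were produced under code page 437.
as_cp437 = raw.decode("cp437")
print(ascii(as_cp437))                  # e-acute, a-umlaut
print(as_cp437.upper().encode("cp437")) # bytes of the upper-case pair in cp437
```

Under that assumption the upper-case bytes come out as \220 and \216,
in that order.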
How do you get what you want?
Instead of typing u"éä", you should be able to type
unicode("éä", "mbcs").
HOWEVER, I can't get this to work either! I get
unicode('\202\204','mbcs') -> u"\u201A\u201E" and the latter string
doesn't have an upper case equivalent either! I had expected that
these would have translated to the Latin-1 characters. Maybe I'm using
the wrong MBCS code page???
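
The result above can be reproduced in Python 3 with a rough sketch like
the following. The "mbcs" codec exists only on Windows, so "cp1252" is
used here as a stand-in; that this was the code page in effect is an
assumption, made only because it yields the same two code points.

```python
import unicodedata

raw = b"\x82\x84"                 # same bytes as '\202\204'

# ASSUMPTION: "mbcs" resolved to Windows code page 1252 in that session.
decoded = raw.decode("cp1252")
print(ascii(decoded))

# Both code points are punctuation, so upper() has nothing to map.
names = [unicodedata.name(ch) for ch in decoded]
print(names)
print(decoded.upper() == decoded)
```

In cp1252 these byte values are low quotation marks, not letters, which
would explain why no upper case form appears.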
> <SNIP>
> Marc ->
> Those two characters don't have a lower/upper case mapping:
>
> <SNIP>
> .lower() and .upper() only modify chars which do have such a
> mapping -- all others are left untouched.
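
A short illustration of that rule in Python 3, using a made-up sample
string (not from the thread) that mixes cased letters with characters
that have no case mapping:

```python
# Hypothetical sample: letters, a low quotation mark, digits, a C1 control.
mixed = "caf\u00e9 \u201a 42 \u0082"
upper = mixed.upper()

# Only the letters change; everything else passes through untouched.
print(ascii(upper))
```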
--Guido van Rossum (home page: http://www.python.org/~guido/)