[I18n-sig] Changing case

Guido van Rossum guido@python.org
Tue, 11 Apr 2000 09:24:48 -0400


> I definitely do not understand. \202 is lower_e_egu (spelling?) and \204 is
> lower_a_umlaut. Upper case of these should be \220 and \216 respectively.
> 
> (Probably will not display properly on all machines)
> --------------
> >>> s = u"éä"
> >>> s
> u'\202\204'
> >>> t = u"ÄÉ"
> >>> t
> u'\216\220'
> -------------
> Mark

Aha, *I* understand.  You must be on Windows.  Windows has its own
character encoding, where e-acute is \202 and a-umlaut is \204.  However,
Python doesn't know what character set you are using, and when you
typed e-acute, all it knew was that you entered \202.  If you type this
in a u"..." string, all codes are interpreted as if they were Latin-1,
which happens to be the first 256 code points of Unicode.  The Latin-1
character \202 (which is NOT e-acute but a control character) has no
upper case equivalent.
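
To make the difference concrete, here is a minimal sketch of the two
readings of the same bytes.  It is written for a current Python 3, not
the interpreter used in the session above, and it assumes the console
was using the DOS code page cp437; that code page is a guess, not
something stated in this thread.

```python
import unicodedata

# The two bytes from the session above (octal \202 \204).
raw = b"\x82\x84"

# Reading the bytes as Latin-1 maps each byte to the code point
# with the same numeric value.
as_latin1 = raw.decode("latin-1")

# U+0082 and U+0084 are control characters (category "Cc"),
# so upper() leaves them untouched.
latin1_categories = [unicodedata.category(ch) for ch in as_latin1]
latin1_upper = as_latin1.upper()

# Assumption: the console encoding was cp437 (not stated in the thread).
as_cp437 = raw.decode("cp437")
cp437_upper = as_cp437.upper()

# Re-encode the upper-cased text to see which bytes it corresponds to.
upper_bytes = cp437_upper.encode("cp437")

print(ascii(as_latin1), latin1_categories, ascii(latin1_upper))
print(ascii(as_cp437), ascii(cp437_upper), upper_bytes)
```

Under that assumption the same two bytes decode to letters that do have
upper case forms, which is why the choice of encoding matters here.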

How do you get what you want?

Instead of typing u"éä", you should be able to type
unicode("éä", "mbcs").

HOWEVER, I can't get this to work either!  I get
unicode('\202\204','mbcs') -> u"\u201A\u201E" and the latter string
doesn't have an upper case equivalent either!  I had expected that
these would have been translated to Latin-1.  Maybe I'm using the wrong
MBCS code page???
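
For what it's worth, the result above can be reproduced with the
following sketch, written for a current Python 3.  It assumes that
"mbcs" resolved to Windows code page 1252 on the machine in question;
that is an assumption, and cp1252 is named explicitly so that the code
also runs on systems without the mbcs codec.

```python
import unicodedata

# The same two bytes as in the unicode(...) call above.
raw = b"\x82\x84"

# Assumption: "mbcs" meant code page 1252 on that machine.
decoded = raw.decode("cp1252")

# Look up what the resulting code points actually are.
names = [unicodedata.name(ch) for ch in decoded]

# Punctuation has no case mapping, so upper() changes nothing.
unchanged = decoded.upper() == decoded

print(ascii(decoded), names, unchanged)
```

If that assumption holds, the two code points are punctuation marks
rather than letters, which would be consistent with there being no
upper case form for them.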

> <SNIP>
> Marc ->
> Those two characters don't have a lower/upper case mapping:
> 
> <SNIP>
> .lower() and .upper() only modify chars which do have such a
> mapping -- all others are left untouched.

--Guido van Rossum (home page: http://www.python.org/~guido/)