String methods understanding anything but ASCII?

Magnus Lie Hetland mlh at vier.idi.ntnu.no
Mon Jan 20 15:42:42 EST 2003


In article <b0f9k4$n7v$00$1 at news.t-online.com>, Martin v. Löwis wrote:
>Magnus Lie Hetland wrote:
>> I just wondered -- is there hope that string methods such as upper()
>> or capitalize() will ever understand anything other than ascii? 
>
>They already do that, after you invoke locale.setlocale:
>
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL,"")
>'German_Germany.1252'
> >>> print "ö".upper()
>Ö

Right. Handy. :)

> For purposes of conversion to Unicode, 
>the "system default encoding" is also "ascii", unless overridden by the 
>administrator.

Ah. That's where I failed in my experiments, I guess -- I tried to use
unicode('ø'), but that, of course, tried to use 'ascii'. (u'ø',
though, uses 'iso8859-1', since that is what my C library uses, I
suppose.)
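
A minimal sketch of that failure, not taken from the thread: it assumes Python 3 syntax (where the decoding step is explicit on bytes) rather than the unicode() built-in used above, and try_decode is a hypothetical helper name. The byte 0xF8 is 'ø' in ISO 8859-1 but is not valid ASCII, so decoding with 'ascii' fails while naming the encoding explicitly works.

```python
# Python 3 sketch (an assumption; the thread itself used Python 2).
# 0xF8 is the ISO 8859-1 encoding of 'ø'; it lies outside the ASCII range.
raw = b"\xf8"


def try_decode(data, encoding):
    """Hypothetical helper: return the decoded text, or None on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding as ASCII fails, much as unicode('ø') did with the default encoding.
ascii_result = try_decode(raw, "ascii")

# Naming the encoding explicitly succeeds and yields the one-character string 'ø'.
latin1_result = try_decode(raw, "iso8859-1")

print(repr(ascii_result), repr(latin1_result))
```

In Python 2 terms, the equivalent would presumably be passing the encoding as a second argument, as in unicode('ø', 'iso8859-1'), instead of relying on the system default.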

[snip]
>As Irmen explains, you really should use Unicode strings for that - they 
>support uppercasing for all languages of the world, simultaneously.

Yes. Wonderful -- this sort of thing is what Unicode is all about, I
suppose. ;)
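
A small sketch of what that looks like, again assuming Python 3, where every str is a Unicode string (in Python 2 these would be u'...' literals). The sample characters are my own picks, not from the thread. No locale call is involved; the case mapping comes from the Unicode character database.

```python
# Python 3 sketch (an assumption; in Python 2 these would be u"..." literals).
# Escapes are used so the example does not depend on the source file encoding.
samples = [
    "\xf6",    # ö  LATIN SMALL LETTER O WITH DIAERESIS
    "\xf8",    # ø  LATIN SMALL LETTER O WITH STROKE
    "\u03b4",  # δ  GREEK SMALL LETTER DELTA
    "\u0436",  # ж  CYRILLIC SMALL LETTER ZHE
]

# upper() handles all of these at once, with no locale.setlocale() call.
uppercased = [s.upper() for s in samples]

# capitalize() uppercases the first character in the same way.
capitalized = "\xf8l".capitalize()  # "øl" becomes "Øl"

print(uppercased, capitalized)
```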

[snip]

- Magnus

-- 
Magnus Lie Hetland
http://hetland.org



