[I18n-sig] Perhaps the locale should matter?

Just van Rossum just@letterror.com
Sat, 6 May 2000 09:32:56 +0100


At 12:20 AM +0200 06-05-2000, Bruno Haible wrote:
>It is reasonably standardized. But it doesn't help Guido: When he is faced
>with a locale named "ru" or "ru_RU", he wouldn't know whether its character
>set is ISO-8859-5 or KOI8-R.
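
A minimal sketch of the ambiguity Bruno describes, written in present-day
Python 3 rather than the Python of this thread. The sample word and the
variable names are illustrative assumptions. The same bytes decode to
different text depending on which of the two character sets is assumed, and
the locale name "ru_RU" alone does not say which one applies.

```python
# Illustrative only: a Russian word stored as KOI8-R bytes.
word = "привет"
raw = word.encode("koi8_r")

# Decode the identical bytes under each candidate character set.
as_koi8 = raw.decode("koi8_r")      # the encoding the bytes were written in
as_iso = raw.decode("iso8859_5")    # the other plausible guess for "ru_RU"

print(as_koi8 == word)   # the right guess round-trips
print(as_iso == word)    # the wrong guess silently yields different text
print(repr(as_iso))
```

Note that the wrong guess raises no error; it just produces other Cyrillic
letters, so the mistake cannot be detected from the bytes alone.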

Hm, is this the show stopper that it appears to be?

I have no idea how the locale stuff works, nor how exactly it relates to
standard C functions like islower() and toupper(), but I do know that these
do the "right" thing on my platform. That is, they assume MacRoman, and
work correctly with accented characters. Peter Funk's post reminded me of
this -- there's probably lots of code out there that depends on it :-(. So
far this has been the only argument that has convinced me that the
8-bit/Latin-1 proposal really *is* flawed (Guido should thank you, Peter! ;-).
Now I'm not even sure that using the locale to aid narrow-to-wide conversion
(and vice versa) is such a good idea -- even if it were possible. The 7-bit
proposal may be the only wise choice after all.
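
To make the concern concrete, here is a rough sketch in present-day Python 3
(an assumption; it is not the interpreter under discussion). It takes one
8-bit byte, interprets it as MacRoman and as Latin-1, and compares the case
behaviour. The byte value and the names are chosen purely for illustration.
The last step shows what a strict 7-bit rule would do with the same byte.

```python
raw = b"\x8e"  # a single 8-bit byte with no charset attached

# MacRoman maps 0x8E to a lowercase accented letter (e with acute).
as_mac = raw.decode("mac_roman")
# Latin-1 maps the same byte to U+008E, a control character with no case.
as_latin1 = raw.decode("latin-1")

print(repr(as_mac), as_mac.islower(), repr(as_mac.upper()))
print(repr(as_latin1), as_latin1.islower(), repr(as_latin1.upper()))

# Under a strict 7-bit (ASCII-only) rule the conversion is refused
# outright instead of guessing a character set.
try:
    raw.decode("ascii")
    seven_bit_ok = True
except UnicodeDecodeError:
    seven_bit_ok = False
print(seven_bit_ok)
```

Code that relies on islower()/toupper() doing the MacRoman thing would get
different answers if the same bytes were silently treated as Latin-1.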


Just

PS: Has any progress been made on adding an encoding pragma to source files?
Or is this 1.7 stuff?

PPS: Shouldn't u'\337'.upper() yield u'SS'? (\337 is the German "sharp s".)
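
For comparison, a small check of how present-day Python 3 answers that
question (an assumption about the reader's interpreter; it says nothing
about what the implementation discussed in this thread did). The variable
names are illustrative.

```python
sharp_s = u"\337"          # octal escape for U+00DF, the German sharp s

upper = sharp_s.upper()    # full case mapping expands one character to two
print(upper)
print(len(sharp_s), len(upper))

# The mapping does not round-trip: lowercasing gives "ss", not the sharp s.
print(upper.lower())
```

The length change is the awkward part: upper() can no longer be assumed to
return a string of the same length as its input.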