Some information about locale (was Re: [Python-Dev] repr vs. str and locales again)

Peter Funk pf@artcom-gmbh.de
Mon, 22 May 2000 19:17:40 +0200 (MEST)


Hi!

Fredrik Lundh:
[...]
> > > so in order to provide platform-independent unicode support, Python 1.6
> > > comes with unicode-aware and fully portable replacements for the ctype
> > > functions.
> > 
> > For those who only need Latin-1 or another 8-bit ASCII superset, the
> > Unicode stuff is overkill.
> 
> why?

Going from 8 bit strings to 16 bit strings doubles the memory 
requirements, right?

As long as we only deal with English, Spanish, French, Swedish, Italian
and several other languages, 8 bit strings work out pretty well.  
Unicode will be neat if you can afford the additional space.  
People using Python on small computers in western countries
probably don't want to double the size of their data structures
for no reasonable benefit.
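
To put a number on the doubling, here is a rough sketch.  It assumes a
present-day Python 3 interpreter (not the 1.5.2 shown elsewhere in this
message), and the helper name encoded_sizes is made up for illustration.
It only compares the size of the encoded character data and ignores any
per-object overhead:

```python
def encoded_sizes(text):
    """Return (size in bytes as Latin-1, size in bytes as 16-bit units)."""
    latin1 = text.encode("latin-1")    # one byte per character
    utf16 = text.encode("utf-16-le")   # two bytes per character here, no BOM
    return len(latin1), len(utf16)

# 3000 characters that all fit into Latin-1
narrow, wide = encoded_sizes("äöü" * 1000)
print(narrow, wide)
```

For text that fits into Latin-1, the second number is always twice the
first.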

> > This is a figment of your imagination.  You can use 8-bit text strings
> > to contain Latin-1, but you have to set your locale to match.
> 
> if that's a supported feature (instead of being deprecated in favour
> for unicode), maybe we should base the default unicode/string con-
> versions on the locale too?

Many locales effectively use Latin-1, but for some other locales there
is a difference:

$ LANG="es_ES" python  # Español uses Latin-1, the same as "de_DE"
Python 1.5.2 (#1, Jul 23 1999, 06:38:16)  [GCC egcs-2.91.66 19990314/Linux (egcs- on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import string; print string.upper("äöü")
ÄÖÜ

$ LANG="ru_RU" python  # This uses ISO 8859-5 
Python 1.5.2 (#1, Jul 23 1999, 06:38:16)  [GCC egcs-2.91.66 19990314/Linux (egcs- on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import string; print string.upper("äöü")
Ä¦¬
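
The effect does not need an installed ru_RU locale to be reproduced.
The following is only a sketch under assumptions: it uses present-day
Python 3 and its built-in codecs in place of the C library's locale
tables, and the helper name upper_in_charset is made up for illustration.
It treats the same three bytes first as Latin-1 and then as ISO 8859-5:

```python
def upper_in_charset(raw, charset):
    """Uppercase 8-bit data as if it were text in the given character set.

    Approximates a locale-aware toupper(); it raises UnicodeEncodeError
    when an uppercase letter has no code in the charset, whereas the C
    function would leave such a byte unchanged.
    """
    return raw.decode(charset).upper().encode(charset)

raw = "äöü".encode("latin-1")          # the bytes 0xE4 0xF6 0xFC

for charset in ("latin-1", "iso8859_5"):
    result = upper_in_charset(raw, charset)
    # Show the resulting bytes the way a Latin-1 terminal would draw them.
    print(charset, result.hex(), result.decode("latin-1"))
```

In ISO 8859-5 those three bytes are lowercase Cyrillic letters, so their
uppercase forms sit at different code positions than Ä, Ö and Ü do in
Latin-1.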

I don't know how many people, for example in Russia, already depend 
on this behaviour.  I suggest it should stay as it is.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)