[Python-3000] string module trimming

Wed Apr 18 23:18:47 CEST 2007

On 4/18/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 4/17/07, Guido van Rossum <guido at python.org> wrote:
> > The locale module doesn't deal with Unicode, only with 8-bit characters (not
> > multi-byte characters). You'll lose this anyway. Certainly
> > string.letters is not going to provide this functionality.
>
> But for languages in Latin1, 8-bit characters are sufficient --
> anything with more than 8 bits is by definition not a (local) letter.

Latin-1 is just another encoding (and not a very useful one given that
it can't encode all of Unicode). I don't want to define a feature that
only works for Latin-1.
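
As a rough illustration of that limitation (a minimal sketch, not part
of any proposal; the helper name fits_latin1 is made up for this
example), the Latin-1 codec rejects any character above U+00FF:

```python
def fits_latin1(text):
    """Return True if every character of text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # Raised for any code point above U+00FF.
        return False

print(fits_latin1("caf\u00e9"))           # e-acute is U+00E9, inside Latin-1
print(fits_latin1("\u0142\u00f3d\u017a")) # U+0142 and U+017A are outside Latin-1
```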

> I won't swear that localizations currently replace string.letters with
> the appropriately ordered (slight) superset, but it is a valid use
> case, and string* (or text*) is clearly the right place.

The right solution for locale-dependent collation is certainly not a
string containing all the letters in the right order. There
are plenty of languages where that approach doesn't even work.
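
A small sketch of one such case, under stated assumptions: ALPHABET and
alphabet_key are hypothetical stand-ins for a localized "letters in the
right order" string, and locale_sorted is a made-up helper around the
standard locale.strxfrm. Czech treats "ch" as a single letter that
sorts after "h", which a per-character ranking cannot express:

```python
import locale

# Hypothetical ordered alphabet, standing in for a localized string.letters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def alphabet_key(word):
    """Rank each character by its position in ALPHABET."""
    return [ALPHABET.index(ch) for ch in word.lower()]

words = ["hrad", "chata", "cihla"]

# Per-character ranking puts "chata" first, because it sees "c" then "h".
# Czech collation expects cihla, hrad, chata, since "ch" follows "h".
print(sorted(words, key=alphabet_key))

def locale_sorted(items, locale_name):
    """Sort using the platform's collation for locale_name, if installed.

    Returns None when the locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(items, key=locale.strxfrm)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)

# The locale name is an assumption; the result depends on the platform.
print(locale_sorted(words, "cs_CZ.UTF-8"))
```

The point of the second helper is only that collation is delegated to
rules the locale supplies, rather than to a flat list of letters.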

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)