Some information about locale (was Re: [Python-Dev] repr vs. str and locales again)
Guido van Rossum
guido@python.org
Mon, 22 May 2000 15:38:17 -0500
[Fredrik]
> > > - 8-bit binary arrays. may contain binary goop, or text in some strange
> > > encoding. upper, strip, etc should not be used.
[Guido]
> > These are not strings.
[Ping]
> Indeed -- but at the moment, we're letting people continue to
> use strings this way, since they already do it.
Oops, my mistake. I thought that Fredrik (not Fred! that's another
person in this context!) meant the array module, but upon re-reading
I see he didn't.
> > > - 8-bit text strings using the system encoding. upper, strip, etc works
> > > as long as the locale is properly configured.
> > >
> > > - 8-bit unicode text strings. upper, strip, etc may work, as long as the
> > > system encoding is a subset of unicode -- which means US ASCII or
> > > ISO Latin 1.
> >
> > This is a figment of your imagination. You can use 8-bit text strings
> > to contain Latin-1, but you have to set your locale to match.
>
> I would like it to be only the latter, as Fred, i, and others
Fredrik, right?
> have previously suggested, and as corresponds to your ASCII
> proposal for treatment of 8-bit strings.
>
> But doesn't the current locale-dependent behaviour of upper()
> etc. mean that strings are getting interpreted in the first way?
That's what I meant to say -- 8-bit strings are interpreted in the
system encoding, as selected by the locale.
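
To make the point concrete, here is a rough sketch of why the
interpretation of an 8-bit string matters for upper(). It is written for
modern Python 3, which is an assumption of the example and not the
interpreter under discussion: there, bytes.upper() maps ASCII letters
only and ignores the locale, so the encoding has to be named explicitly.
The sample value and variable names are made up for illustration.

```python
# Illustrative sketch only (Python 3), not the 1.6-era behaviour
# discussed in this thread.
raw = b"caf\xe9"  # the text "cafe" with an acute e, encoded as Latin-1

# bytes.upper() only maps ASCII letters, so the 0xE9 byte is untouched.
ascii_only = raw.upper()

# Naming the encoding makes the interpretation explicit: decode,
# uppercase as text, then encode back to 8-bit Latin-1.
as_latin1 = raw.decode("latin-1").upper().encode("latin-1")

print(ascii_only)
print(as_latin1)
```

The two results differ only in the last byte, which is exactly the byte
whose meaning depends on which encoding the 8-bit string is assumed to
hold.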
> > > is this complexity really worth it?
> >
> > From a backwards compatibility point of view, yes. Basically,
> > programs that don't use Unicode should see no change in semantics.
>
> I'm afraid i have to agree with this, because i don't see any
> other option that lets us escape from any of these four ways
> of using strings...
Which is why I find Fredrik's attitude unproductive.
And where's the SRE release?
--Guido van Rossum (home page: http://www.python.org/~guido/)