[Python-3000] locale-aware strings ?

Paul Prescod paul at prescod.net
Mon Sep 4 03:55:20 CEST 2006

On 9/3/06, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 9/1/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Fredrik Lundh wrote:
> > > today's Python supports "locale aware" 8-bit strings ...
> > > to what extent should this be supported by Python 3000 ?
> > Since all strings will be Unicode by then:
> >  >>> u"åäö".isalpha()
> > True
> Two followup questions, then ...
> (1)  To what extent should python support files (including stdin,
> stdout) in local (non-unicode) encodings?  (not at all, per-file,
> settable global default?)

I presume that Python's support for these will not change from today's. I
don't think that locale changes file decoding today, nor should it. After
all, files are emailed from place to place all the time, so the reader's
locale says little about the encoding a file was written in.
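
To make the "per-file" option concrete, here is a minimal sketch in which the caller names the encoding explicitly and the locale is never consulted. It assumes the open(..., encoding=...) signature of current Python 3, which was not settled when this was written; read_text and write_sample are hypothetical helper names used only for illustration.

```python
import os
import tempfile


def read_text(path, encoding):
    """Decode a file using an explicitly named encoding (locale is not consulted)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def write_sample(text, encoding):
    """Hypothetical helper: write text as raw bytes in the given encoding, return the path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(text.encode(encoding))
    return path


# A Latin-1 file decodes the same way on any machine, whatever its locale,
# as long as the encoding travels with the file (or is otherwise known).
path = write_sample("åäö", "latin-1")
try:
    decoded = read_text(path, "latin-1")
finally:
    os.remove(path)
```

The point of the sketch is only that the encoding is a property of the file, supplied by whoever opens it, rather than something inferred from the machine the file happens to land on.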

> (2)  To what extent will strings have an opaque (or at least
> on-demand) backing store, so that decoding/encoding could be delayed?
> (For example, Swedish text could be stored in single-byte characters,
> and only converted to standard unicode on the rare occasions when it
> met strings in an incompatible encoding.)

I don't see this as particularly related to the locale issue either. It is
being discussed in other threads under the name "Polymorphic strings."
Fredrik Lundh said:

"I think just delaying decoding would take us most of the way.  the big
advantage of storage polymorphism is that you can avoid decoding and
encoding (and having to pay for the cycles and bytes needed for that) if
you don't have to."

I believe he is working on a prototype.
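
To illustrate what "just delaying decoding" might look like, here is a rough sketch in Python 3 syntax. It is not Fredrik's prototype, which I have not seen; LazyText and its methods are hypothetical names. It keeps the encoded bytes and decodes only when text in a different encoding is involved.

```python
class LazyText:
    """Hypothetical wrapper: keeps encoded bytes and decodes only on demand."""

    def __init__(self, raw, encoding):
        self._raw = raw              # bytes exactly as received
        self._encoding = encoding    # name of the encoding of those bytes
        self._text = None            # cached decoded form, filled lazily
        self.decode_count = 0        # bookkeeping for the demonstration only

    def __str__(self):
        # Decode at most once, the first time real text is needed.
        if self._text is None:
            self._text = self._raw.decode(self._encoding)
            self.decode_count += 1
        return self._text

    def encoded_as(self, encoding):
        """Return bytes in the requested encoding, skipping the round trip on a match."""
        if encoding == self._encoding:
            return self._raw
        return str(self).encode(encoding)

    def __add__(self, other):
        # Same encoding: join the raw bytes and stay undecoded. This is only
        # safe for encodings without a byte-order mark or shift state
        # (Latin-1 and UTF-8 qualify, UTF-16 with a BOM does not).
        if isinstance(other, LazyText) and other._encoding == self._encoding:
            return LazyText(self._raw + other._raw, self._encoding)
        # Incompatible encodings: fall back to ordinary decoded text.
        return str(self) + str(other)


swedish = LazyText("åäö".encode("latin-1"), "latin-1")
more = LazyText("xyz".encode("latin-1"), "latin-1")
joined = swedish + more              # no decoding has happened yet
```

In this sketch the encoding names are compared as plain strings, so "latin-1" and "iso-8859-1" would be treated as different and simply take the slower decoded path.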

 Paul Prescod