[Python-Dev] Generalised String Coercion

Guido van Rossum gvanrossum at gmail.com
Sun Aug 7 03:56:39 CEST 2005

[Removed python-list CC]

On 8/6/05, Terry Reedy <tjreedy at udel.edu> wrote:
> > PEP: 349
> > Title: Generalised String Coercion
> ...
> > Rationale
> >    Python has had a Unicode string type for some time now but use of
> >    it is not yet widespread.  There is a large amount of Python code
> >    that assumes that string data is represented as str instances.
> >    The long term plan for Python is to phase out the str type and use
> >    unicode for all string data.
> This PEP strikes me as premature, as putting the toy wagon before the
> horse, since it is premised on a major change to Python, possibly the most
> disruptive and controversial ever, being a done deal.  However, there is, as
> far as I could find, no PEP on Making Strings be Unicode, let alone a
> discussed, debated, and finalized PEP on the subject.

True. OTOH, Jython and IronPython already have this, and it is my
definite plan to make all strings Unicode in Python 3000. The rest
(such as a bytes datatype) is details, as they say. :-)

My first response to the PEP, however, is that instead of a new
built-in function, I'd rather relax the requirement that str() return
an 8-bit string -- after all, int() is allowed to return a long, so
why couldn't str() be allowed to return a Unicode string?
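
To make the int()/long analogy concrete, here is a minimal sketch in
present-day Python, where bytes stands in for the 8-bit string type and
str for Unicode. The name relaxed_str and its "return the narrow type when
it fits, otherwise the wide one" rule are assumptions for illustration
only; neither the PEP nor this message specifies them.

```python
def relaxed_str(value):
    """Hypothetical coercion: narrow (8-bit) result if possible, else Unicode.

    Mirrors how int() could hand back a long when the value did not fit.
    """
    if isinstance(value, bytes):
        # Already an 8-bit string: return it unchanged.
        return value
    # Get a Unicode rendering of the object first.
    text = value if isinstance(value, str) else format(value)
    try:
        # Pure-ASCII text fits in the narrow type.
        return text.encode("ascii")
    except UnicodeEncodeError:
        # Otherwise fall back to the wide type, as int() fell back to long.
        return text


narrow = relaxed_str(42)           # an 8-bit string
wide = relaxed_str("caf\u00e9")    # stays Unicode
```

Callers that only pass the result along would not need to care which of
the two types they received, which is the point of the relaxation.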

The main problem for a smooth Unicode transition remains I/O, in my
opinion; I'd like to see a PEP describing a way to attach an encoding
to text files, and a way to decide on a default encoding for stdin,
stdout, stderr.
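
As a rough illustration of what "attaching an encoding to a text file"
can look like, here is a sketch using io.TextIOWrapper from later Python
versions. It is not something proposed in this thread; the helper name
open_text and the "ascii" fallback are assumptions made for the example.

```python
import io
import sys


def open_text(raw, encoding):
    """Wrap a binary stream so reads and writes go through `encoding`.

    open_text is a hypothetical helper name, not an existing API.
    """
    # newline="" disables newline translation so bytes round-trip exactly.
    return io.TextIOWrapper(raw, encoding=encoding, newline="")


# Writing: Unicode text goes in, encoded bytes land in the raw stream.
raw = io.BytesIO()
writer = open_text(raw, "utf-8")
writer.write("caf\u00e9\n")
writer.flush()
encoded = raw.getvalue()

# Reading: the same encoding turns the bytes back into Unicode text.
reader = open_text(io.BytesIO(encoded), "utf-8")
decoded = reader.read()

# A default for the standard streams; "ascii" is an assumed fallback
# for streams that do not report an encoding.
default_stdout_encoding = getattr(sys.stdout, "encoding", None) or "ascii"
```

The encoding lives on the file object rather than on each string, so
code that reads and writes text never handles raw bytes directly.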

--Guido van Rossum (home page: http://www.python.org/~guido/)
