[Python-Dev] Re: [I18n-sig] Re: Unicode debate

Guido van Rossum guido@python.org
Fri, 28 Apr 2000 10:32:28 -0400


> This is the exact reason that Unicode should be used for all string
> literals: from a language design perspective I don't understand the
> rationale for providing "traditional" and "unicode" string.

In Python 3000, you would have a point.  In current Python, there
simply are too many programs and extensions written in other languages
that manipulate 8-bit strings to ignore their existence.  We're
trying to add Unicode support to Python 1.6 without breaking code that
used to run under Python 1.5.x; practicalities just make it impossible
to go with Unicode for everything.

I think that if Python didn't have so many extension modules (many
maintained by third parties) it would be a lot easier to switch to
Unicode for all strings (I think JavaScript has done this).

In Python 3000, we'll have to seriously consider having separate
character string and byte array objects, along the lines of Java's
model.  Note that I say "seriously consider."  We'll first have to see
how well the current solution works *in practice*.  There's time
before we fix Py3k in stone. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)