[Python-Dev] Unicode debate
Fredrik Lundh <effbot@telia.com>
Wed, 3 May 2000 12:31:34 +0200
Ka-Ping Yee <ping@lfw.org> wrote:
> > to throw some extra gasoline on this, how about allowing
> > str() to return unicode strings?
>
> You still need to *print* them somehow. One way or another,
> stdout is still just a stream with bytes on it, unless we
> augment file objects to understand encodings.
>
> stdout sends bytes to something -- and that something will
> interpret the stream of bytes in some encoding (could be
> Latin-1, UTF-8, ISO-2022-JP, whatever). So either:
>
> 1. You explicitly downconvert to bytes, and specify
> the encoding each time you do. Then write the
> bytes to stdout (or your file object).
>
> 2. The file object is smart and can be told what
> encoding to use, and Unicode strings written to
> the file are automatically converted to bytes.
which one's more convenient?
(no, I won't tell you what I prefer. guido doesn't want
more arguments from the old "characters are characters"
proponents, so I gotta trick someone else to spell them
out ;-)
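
(for comparison, a minimal sketch of the two options quoted above. it's written against present-day Python's io module, not the interpreter under discussion in this thread; the in-memory streams stand in for stdout, and the sample text and the Latin-1 encoding are arbitrary choices for illustration.)

```python
import io

text = "caf\u00e9"  # a string with one non-ASCII character

# Option 1: convert explicitly at every write, naming the encoding each time,
# then hand the resulting bytes to the stream.
raw1 = io.BytesIO()
raw1.write(text.encode("latin-1"))
option1_bytes = raw1.getvalue()

# Option 2: tell the file object its encoding once; strings written to it
# are converted to bytes automatically.
raw2 = io.BytesIO()
smart = io.TextIOWrapper(raw2, encoding="latin-1")
smart.write(text)
smart.flush()  # push buffered text through to the underlying byte stream
option2_bytes = raw2.getvalue()
```

both paths put the same bytes on the stream; the difference is whether the encoding is named at every call site or once, where the stream is set up.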
> > (extra questions: how about renaming "unicode" to "string",
> > and getting rid of "unichr"?)
>
> Would you expect chr(x) to return an 8-bit string when x < 128,
> and a Unicode string when x >= 128?
that will break too much existing code, I think. but what
about replacing 128 with 256?
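
(a rough sketch of what that would mean, for the sake of argument. proposed_chr is a hypothetical name, not an existing builtin, and present-day Python's bytes type stands in for the 8-bit string; the threshold parameter is only there so the 128 and 256 variants can be compared.)

```python
def proposed_chr(x, threshold=256):
    """Hypothetical chr(): 8-bit string below the threshold, Unicode above."""
    if x < threshold:
        # Fits in a single byte: return a one-byte 8-bit string.
        return bytes([x])
    # Otherwise return a one-character Unicode string.
    return chr(x)


below_128 = proposed_chr(65)                     # 8-bit under either threshold
latin1_256 = proposed_chr(233)                   # 8-bit with threshold 256
latin1_128 = proposed_chr(233, threshold=128)    # Unicode with threshold 128
above_255 = proposed_chr(0x20AC)                 # Unicode under either threshold
```

with 256 as the cutoff, every value that existing code can pass to chr() today still comes back as an 8-bit string; only the new, larger values yield Unicode.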
</F>