Re: [Python-Dev] Unicode debate

3 May 2000

Ka-Ping Yee wrote:
...
...
> > to throw some extra gasoline on this, how about allowing
> > str() to return unicode strings?
>
> You still need to *print* them somehow.  One way or another,
> stdout is still just a stream with bytes on it, unless we
> augment file objects to understand encodings.
>
> stdout sends bytes to something -- and that something will
> interpret the stream of bytes in some encoding (could be
> Latin-1, UTF-8, ISO-2022-JP, whatever).  So either:
>
>     1.  You explicitly downconvert to bytes, and specify
>         the encoding each time you do.  Then write the
>         bytes to stdout (or your file object).
>
>     2.  The file object is smart and can be told what
>         encoding to use, and Unicode strings written to
>         the file are automatically converted to bytes.
which one's more convenient?

(no, I won't tell you what I prefer. guido doesn't want
more arguments from the old "characters are characters"
proponents, so I gotta trick someone else to spell them
out ;-)
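
For comparison, the two options quoted above might look roughly like this. It is only a sketch in present-day Python 3, not anything proposed in this thread: `io.BytesIO` stands in for stdout or a file opened in binary mode, `io.TextIOWrapper` plays the part of the "smart" file object, and the sample text and the choice of UTF-8 are assumptions made for illustration.

```python
import io

text = "caf\u00e9"  # a Unicode string with one non-ASCII character

# Option 1: downconvert explicitly, naming the encoding at every write,
# then hand the resulting bytes to the byte stream.
raw1 = io.BytesIO()  # stand-in for stdout or a binary file object
raw1.write(text.encode("utf-8"))

# Option 2: tell the file object its encoding once, then write Unicode
# strings to it and let it convert them to bytes.
raw2 = io.BytesIO()
smart = io.TextIOWrapper(raw2, encoding="utf-8")
smart.write(text)
smart.flush()  # push the encoded bytes through to the underlying stream

option1_bytes = raw1.getvalue()
option2_bytes = raw2.getvalue()
```

Both streams end up holding the same bytes; the difference is only where the encoding gets named, at every write or once when the stream is set up.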
...
...
> > (extra questions: how about renaming "unicode" to "string",
> > and getting rid of "unichr"?)
>
> Would you expect chr(x) to return an 8-bit string when x < 128,
> and a Unicode string when x >= 128?
that will break too much existing code, I think.  but what
about replacing 128 with 256?
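
To make the question concrete, here is a rough sketch of such a chr(). The name `mixed_chr` and its `threshold` parameter are hypothetical, and Python 3's `bytes` and `str` are used as stand-ins for the 8-bit and Unicode string types under discussion.

```python
def mixed_chr(x, threshold=256):
    """Hypothetical chr(): 8-bit string below threshold, Unicode string otherwise."""
    if x < threshold:
        # One-byte 8-bit string (bytes is the stand-in type here).
        return bytes([x])
    # Code points at or above the threshold become a Unicode string.
    return chr(x)


as_8bit = mixed_chr(233)                    # below 256: a one-byte 8-bit string
as_unicode = mixed_chr(233, threshold=128)  # with the 128 cutoff: a Unicode string
```

With the cutoff at 256, every value that fits in one byte keeps coming back as an 8-bit string; with the cutoff at 128, values from 128 to 255 would change type.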

</F>

Fredrik Lundh