Ka-Ping Yee
to throw some extra gasoline on this, how about allowing str() to return unicode strings?
You still need to *print* them somehow. One way or another, stdout is still just a stream with bytes on it, unless we augment file objects to understand encodings.
stdout sends bytes to something -- and that something will interpret the stream of bytes in some encoding (could be Latin-1, UTF-8, ISO-2022-JP, whatever). So either:
1. You explicitly downconvert to bytes, and specify the encoding each time you do. Then write the bytes to stdout (or your file object).
2. The file object is smart and can be told what encoding to use, and Unicode strings written to the file are automatically converted to bytes.
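For comparison, here is a minimal sketch of the two options in present-day Python. It assumes the `io` module, which postdates this thread, and uses an in-memory `io.BytesIO` as a stand-in for stdout. The names `text`, `raw1`, `raw2` and `smart` are purely illustrative.

```python
import io

text = "na\u00efve caf\u00e9"  # a Unicode string with non-ASCII characters

# Option 1: downconvert to bytes explicitly, naming the encoding at every write.
raw1 = io.BytesIO()                  # stand-in for a byte-oriented stdout
raw1.write(text.encode("utf-8"))

# Option 2: tell the file object its encoding once; it converts on each write.
raw2 = io.BytesIO()
smart = io.TextIOWrapper(raw2, encoding="utf-8")
smart.write(text)
smart.flush()                        # push buffered text down to the byte stream

# Both routes put the same bytes on the underlying stream.
same_bytes = raw1.getvalue() == raw2.getvalue()
```

Either way the stream ends up carrying the same bytes; the difference is only whether the encoding is named at each call site or once on the file object.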
which one's more convenient? (no, I won't tell you what I prefer. guido doesn't want more arguments from the old "characters are characters" proponents, so I gotta trick someone else to spell them out ;-)
(extra questions: how about renaming "unicode" to "string", and getting rid of "unichr"?)
Would you expect chr(x) to return an 8-bit string when x < 128, and a Unicode string when x >= 128?
that will break too much existing code, I think. but what about replacing 128 with 256?

</F>
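To make the threshold question concrete, here is a rough sketch in present-day Python, where `bytes` stands in for the 8-bit string type and `str` for the Unicode string type. `proposed_chr` and its `threshold` parameter are hypothetical names used only for illustration; nothing like this is an actual built-in.

```python
def proposed_chr(x, threshold=256):
    """Hypothetical chr(): 8-bit string below the threshold, Unicode otherwise."""
    if x < threshold:
        return bytes([x])   # one-byte 8-bit string
    return chr(x)           # one-character Unicode string

# With the threshold at 256, every value that fits in a byte stays 8-bit.
assert proposed_chr(65) == b"A"
assert proposed_chr(255) == b"\xff"
assert proposed_chr(256) == "\u0100"

# With the threshold at 128, values 128..255 would switch to Unicode.
assert proposed_chr(200, threshold=128) == "\u00c8"
```

The existing code most at risk is anything that uses chr() on values 128..255 to build binary data. That case keeps working with a threshold of 256 and changes type with a threshold of 128.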