
[me]
So what do you think of my new proposal of using ASCII as the default "encoding"?
[Paul]
I can live with it. I am mildly uncomfortable with the idea that I could write a whole bunch of software that works great until some European inserts one of their name characters.
Better than that when some Japanese insert *their* name characters and it produces gibberish instead.
Nevertheless, being hard-assed is better than being permissive because we can loosen up later.
Exactly -- just as nobody should *count* on 10**10 raising OverflowError, nobody (except maybe parts of the standard library :-) should *count* on unicode("\347") raising ValueError. I think that's fine.
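To make the strict behaviour concrete, here is a minimal sketch. It assumes present-day Python 3, where `unicode()` no longer exists and `bytes.decode("ascii")` is the closest equivalent of `unicode("\347")` under an ASCII default. The helper names `decode_strict_ascii` and `try_decode` are hypothetical and used only for illustration.

```python
def decode_strict_ascii(raw: bytes) -> str:
    """Decode bytes as ASCII only; any byte >= 0x80 raises an error."""
    return raw.decode("ascii")


def try_decode(raw: bytes):
    """Return (text, None) on success or (None, exception) on failure."""
    try:
        return decode_strict_ascii(raw), None
    except ValueError as exc:
        # UnicodeDecodeError is a subclass of ValueError, so this catches it.
        return None, exc


if __name__ == "__main__":
    print(try_decode(b"abc"))    # plain ASCII decodes cleanly
    print(try_decode(b"\347"))   # octal 347 is byte 0xE7, outside ASCII
```

In line with the point above, code should treat the exception as a safety net and not as documented behaviour to rely on, since a later, more permissive default could make the same call succeed.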
What do we do about str( my_unicode_string )? Perhaps escape the Unicode characters with backslashed numbers?
Hm, good question. Tcl displays unknown characters as \x or \u escapes. I think this may make more sense than raising an error. But there must be a way to turn on Unicode-awareness on e.g. stdout and then printing a Unicode object should not use str() (as it currently does).

--Guido van Rossum (home page: http://www.python.org/~guido/)
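The two options mentioned here, escaping characters that do not fit versus writing to a Unicode-aware stream, can be sketched roughly as follows. This again assumes Python 3 and is not the mechanism the thread settled on. The `backslashreplace` error handler and `io.TextIOWrapper` are standard-library features; the function names `escape_non_ascii` and `write_unicode_aware`, and the choice of UTF-8, are assumptions made for the example.

```python
import io


def escape_non_ascii(text: str) -> str:
    """Render text as pure ASCII, replacing other characters with
    backslash escapes instead of raising an error."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


def write_unicode_aware(text: str, encoding: str = "utf-8") -> bytes:
    """Write text through a stream that knows its own encoding and
    return the bytes that reached the underlying buffer."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()  # push buffered text through the encoder
    return raw.getvalue()


if __name__ == "__main__":
    print(escape_non_ascii("Fran\u00e7ois"))   # Latin-1 range character
    print(escape_non_ascii("\u65e5\u672c"))    # characters beyond Latin-1
    print(write_unicode_aware("\u65e5\u672c"))
```

The first function never fails but loses readability for non-ASCII text; the second keeps the characters intact, provided whatever consumes the stream agrees on the encoding.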