[Fredrik]
-- my proposal: expose both types, but let them contain characters from the same character set -- at least when used as strings.
as before, 8-bit strings can be used to store binary data, so we don't need a separate ByteArray type. in an 8-bit string, there's always one character per byte.
[imho: small changes to the existing code base, about as efficient as can be, no attempt to second-guess the user, fully backwards compatible, fully compliant with the definition of strings in the language reference, patches are available, etc...]
Sorry, all this proposal does is change the default encoding on conversions from UTF-8 to Latin-1. That's very western-culture-centric. You already have control over the encoding: use unicode(s, "latin-1"). If there are places where you don't have enough control (e.g. file I/O), let's add control there.

--Guido van Rossum (home page: http://www.python.org/~guido/)
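
To make the difference concrete, here is a minimal sketch of what choosing the encoding explicitly looks like. It is written for a current Python 3 interpreter, where bytes.decode("latin-1") plays roughly the role that unicode(s, "latin-1") plays in the message above; that mapping, the sample bytes, and the variable names are assumptions for illustration, not part of the original proposal or patches.

```python
# Illustrative sketch only: Python 3 stand-in for unicode(s, "latin-1").
# The sample data and names below are made up for this example.

# Four bytes; the last one (0xE9) is outside ASCII.
raw = bytes([0x63, 0x61, 0x66, 0xE9])

# Explicit Latin-1 decoding: every byte maps to the code point with the
# same numeric value, so there is exactly one character per byte.
as_latin1 = raw.decode("latin-1")
one_char_per_byte = [ord(ch) for ch in as_latin1] == list(raw)

# The same bytes are not valid UTF-8, so a UTF-8 conversion rejects them
# instead of silently guessing.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(one_char_per_byte, utf8_ok)
```

The point of the sketch is only that the caller, not the default, picks the interpretation of the bytes; which default is appropriate is exactly what the two positions above disagree about.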