Just van Rossum wrote:
I wrote:
A utf-8-encoded 8-bit string in Python is *not* a string, but a "ByteArray".
Another way of putting this is:
- utf-8 in an 8-bit string is to a unicode string what a pickle is to an object.
- defaulting to utf-8 upon coercing is like implicitly trying to unpickle an 8-bit string when comparing it to an instance. Bad idea.
Defaulting to Latin-1 is the only logical choice, no matter how western-culture-centric this may seem.
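To illustrate the distinction drawn above, here is a minimal sketch. It assumes a modern Python 3 interpreter, where `bytes` stands in for the 8-bit string and `str` for the Unicode string discussed in this thread; the helper name `decodes_as_utf8` is hypothetical and exists only for this example. It shows that Latin-1 maps every byte value to the code point with the same number, while UTF-8 is a serialization that arbitrary 8-bit data often fails to satisfy.

```python
# Sketch only: Python 3 bytes/str stand in for the 8-bit and Unicode
# strings of the original discussion.

raw = bytes(range(256))  # every possible 8-bit value, once each

# Latin-1 maps byte N to code point N, so decoding cannot fail
# and encoding the result gives back the original bytes.
text = raw.decode("latin-1")
assert [ord(ch) for ch in text] == list(raw)
assert text.encode("latin-1") == raw


def decodes_as_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether data is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same character has different byte representations per encoding.
utf8_bytes = "\u00e9".encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = "\u00e9".encode("latin-1")  # one byte: 0xE9

print(decodes_as_utf8(raw))           # arbitrary bytes are not valid UTF-8
print(decodes_as_utf8(utf8_bytes))    # a real UTF-8 serialization is
print(decodes_as_utf8(latin1_bytes))  # a lone 0xE9 byte is not
```

Reading UTF-8 data as Latin-1 never raises, but it yields one character per byte rather than the character that was serialized, which is the "pickle versus object" mismatch in miniature.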
Please note that the support for mixing strings and Unicode objects is really only there to aid porting applications to Unicode. New code should use Unicode directly and apply all needed conversions explicitly using one of the many ways to encode or decode Unicode data. The auto-conversions are only there to help out and provide some convenience.
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/