Re: Python 1.6a2 Unicode bug (was Re: comparing strings and ints) - Python-Dev

26 Apr 2000


Fredrik Lundh replied to himself in c.l.py:
...
...
> as far as I can tell, it's supposed to be a feature.
> if you mix 8-bit strings with unicode strings, python 1.6a2
> attempts to interpret the 8-bit string as an utf-8 encoded
> unicode string.
>
> but yes, I also think it's a bug.  but this far, my attempts
> to get someone else to fix it has failed.  might have to do
> it myself... ;-)
>
> postscript: the powers-that-be has decided that this is not
> a bug.  if you thought that strings were just sequences of
> characters, just as in Perl and Tcl, you're in for one big
> surprise in Python 1.6...

I just read the last few posts of the powers-that-be-list on this subject
(Thanks to Christian for pointing out the archives in c.l.py ;-), and I
must say I completely agree with Fredrik. The current situation sucks. A
string should always be a sequence of characters. A utf-8-encoded 8-bit
string in Python is *not* a string, but a "ByteArray". An 8-bit string
should never be assumed to be utf-8 because of that distinction. (The
default encoding for the builtin unicode() function may be another story.)
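
To make the distinction concrete, here is a rough sketch. It assumes a
present-day Python 3 interpreter, where str and bytes are separate types,
so it does not reproduce the 1.6a2 behaviour itself; the sample word and
the variable names are only illustrative.

```python
# Sketch in Python 3 (an assumption; this is not how 1.6a2 behaves).
text = "caf\u00e9"                        # a string: 4 characters

utf8_bytes = text.encode("utf-8")         # byte data: 5 bytes, the last
                                          # character takes two bytes
latin1_bytes = text.encode("latin-1")     # byte data: 4 bytes

# Decoding round-trips only when the encoding is named explicitly.
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("latin-1") == text

# Assuming utf-8 for arbitrary 8-bit data can fail outright.
try:
    latin1_bytes.decode("utf-8")
    guessed_ok = True
except UnicodeDecodeError:
    guessed_ok = False

print(len(text), len(utf8_bytes), len(latin1_bytes), guessed_ok)
```

The same four characters come out as byte sequences of different lengths
depending on the encoding, which is why treating an 8-bit string as utf-8
by default amounts to a guess.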

Just

Just van Rossum
