[Python-Dev] Re: Multibyte repr()
Guido van Rossum
guido@python.org
Wed, 09 Oct 2002 15:04:08 -0400
> I told this all to Tim, and he had one comment. The repr() function
> of an 8-bit string can now return characters with the high bit set.
> This was the direct cause of the failures. It was introduced in the
> following patch:
>
> ----------------------------
> revision 2.190
> date: 2002/10/07 13:55:50; author: loewis; state: Exp; lines: +68 -15
> Patch #479898: Use multibyte C library for printing strings if available.
> ----------------------------
>
> Was this really a good idea???
Here's an example of what I mean.
Python 2.2:
>>> u = u'\u1f40'
>>> s = u.encode('utf8')
>>> s
'\xe1\xbd\x80'
>>>
Python 2.3:
>>> u = u'\u1f40'
>>> s = u.encode('utf8')
>>> s
'á½\x80'
>>>
The latter output is not helpful: s holds UTF-8 data, but repr() now
renders its bytes as if they were in the locale's encoding, so the
first two bytes show up as unrelated characters.
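
The difference can be reproduced outside the interpreter with a rough
sketch. It is written for Python 3 and assumes a Latin-1 locale, since
that is what the 2.3 output above looks like. The helper names
escaped_repr and locale_style_repr are hypothetical, and the second one
only approximates what the patch does with the C library.

```python
# Python 3 sketch; the sessions above used Python 2 8-bit strings.
u = "\u1f40"
s = u.encode("utf-8")  # the three bytes e1 bd 80


def escaped_repr(data):
    """Escape every non-ASCII byte, like the Python 2.2 output above."""
    parts = []
    for b in data:
        ch = chr(b)
        if 0x20 <= b < 0x7F and ch not in "'\\":
            parts.append(ch)
        else:
            parts.append("\\x%02x" % b)
    return "'" + "".join(parts) + "'"


def locale_style_repr(data, encoding="latin-1"):
    """Show bytes that are printable in the assumed single-byte locale
    encoding and escape the rest (an approximation, not the real patch)."""
    parts = []
    for b in data:
        ch = bytes([b]).decode(encoding)
        if ch.isprintable() and ch not in "'\\":
            parts.append(ch)
        else:
            parts.append("\\x%02x" % b)
    return "'" + "".join(parts) + "'"


print(escaped_repr(s))
print(locale_style_repr(s))
```

The first function never guesses at an encoding, so its output is the
same in every locale. The second one has to assume one, and for UTF-8
data in a Latin-1 locale that assumption is wrong.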
--Guido van Rossum (home page: http://www.python.org/~guido/)