[Python-Dev] Re: Multibyte repr()

Guido van Rossum guido@python.org
Wed, 09 Oct 2002 15:04:08 -0400


> I told this all to Tim, and he had one comment.  The repr() function
> of an 8-bit string can now return characters with the high bit set.
> This was the direct cause of the failures.  It was introduced in the
> following patch:
> 
> ----------------------------
> revision 2.190
> date: 2002/10/07 13:55:50;  author: loewis;  state: Exp;  lines: +68 -15
> Patch #479898: Use multibyte C library for printing strings if available.
> ----------------------------
> 
> Was this really a good idea???

Here's an example of what I mean.

Python 2.2:

  >>> u = u'\u1f40'
  >>> s = u.encode('utf8')
  >>> s
  '\xe1\xbd\x80'
  >>> 

Python 2.3:

  >>> u = u'\u1f40'
  >>> s = u.encode('utf8')
  >>> s
  'á½\x80'
  >>>

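The latter output is not helpful: s holds UTF-8 bytes, but repr() now
renders them as if they were in the locale's encoding, so the
characters shown are not the ones the string actually encodes.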
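
The mismatch can be reproduced in a present-day interpreter. This is only a
rough sketch, assuming Python 3 and assuming Latin-1 stands in for the
locale's encoding (the original session does not say which locale was
active); the variable names are illustrative.

```python
# UTF-8 bytes for U+1F40, as in the sessions above.
u = "\u1f40"
s = u.encode("utf-8")

# Escaped form: unambiguous whatever the locale is.
escaped = repr(s)

# Assumption: Latin-1 plays the role of the locale's encoding here.
# Reading the UTF-8 bytes that way yields unrelated characters.
misread = s.decode("latin-1")

# Only the encoding that produced the bytes recovers the original text.
roundtrip = s.decode("utf-8")

print(escaped)
print(ascii(misread))
print(roundtrip == u)
```

The first two bytes happen to be printable in Latin-1, which is why they
show up as accented-looking characters, while the third is not and stays
escaped.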

--Guido van Rossum (home page: http://www.python.org/~guido/)