[Python-3000] Displaying strings containing unicode escapes

Thu May 1 19:16:26 CEST 2008

On Thu, May 1, 2008 at 1:06 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> atsuo ishimoto writes:
>
>   > > And where does Atsuo fall?
>   >
>   > Sorry, I cannot understand word 'fall', perhaps a colloquial expression?
>
>  In this case, it means "what is your opinion, compared to Stephen and
>  Martin?"

Oh, I see. Thank you. As I wrote, I think these proposals are not
competing, so I don't 'fall' on either side.

In my PEP, I proposed using Unicode properties, based on the proposal
from Michael and Martin. It's almost identical to what Martin wrote,
but I added Zs (Separator, Space), excluding the ASCII space ('\x20').
This category contains the characters listed at the end of this mail.
I assume these characters should be hex-escaped, although I know
nothing about them.
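
To illustrate the Zs rule, here is a minimal sketch, not the code
from the PEP. The name escape_spaces is made up for this mail, and it
handles only the Zs rule, not the other categories a real repr() would
need to consider:

```python
import unicodedata

def escape_spaces(s):
    """Hex-escape every Zs character except the ASCII space.

    Hypothetical helper: it covers only the Zs rule described above.
    """
    out = []
    for ch in s:
        if ch != ' ' and unicodedata.category(ch) == 'Zs':
            cp = ord(ch)
            if cp <= 0xff:
                out.append('\\x%02x' % cp)
            elif cp <= 0xffff:
                out.append('\\u%04x' % cp)
            else:
                out.append('\\U%08x' % cp)
        else:
            out.append(ch)
    return ''.join(out)
```

With this, the ASCII space passes through unchanged, while
IDEOGRAPHIC SPACE (0x3000) comes out as the six characters \u3000.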

I think readability beats unambiguity for repr(), so I don't agree
with Stephen's view that "repr is like quoted-printable encoding in
MIME". If the standard repertoire Stephen proposed is desired, the
conversion based on that repertoire should be applied to the strings
that repr() produces. Such a repertoire would be more useful if we had:

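import sys

def standard_string(s):
    # _convert_ambiguous_chars is a placeholder for the conversion
    # based on the standard repertoire; it is not defined here.
    return _convert_ambiguous_chars(s)

print(standard_string(repr(obj)), standard_string(sys.stdin.readline()))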
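
To make the idea concrete, here is one possible sketch of the
placeholder. The repertoire is an assumption: I use printable ASCII
only as a stand-in, since the repertoire Stephen has in mind is not
spelled out in this mail. Everything outside it is hex-escaped:

```python
def _convert_ambiguous_chars(s):
    # Assumption: the "standard repertoire" is printable ASCII
    # (0x20 to 0x7e). This is a stand-in, not Stephen's definition.
    out = []
    for ch in s:
        cp = ord(ch)
        if 0x20 <= cp < 0x7f:
            out.append(ch)
        elif cp <= 0xff:
            out.append('\\x%02x' % cp)
        elif cp <= 0xffff:
            out.append('\\u%04x' % cp)
        else:
            out.append('\\U%08x' % cp)
    return ''.join(out)

def standard_string(s):
    # Applied to the output of repr(), not built into repr() itself.
    return _convert_ambiguous_chars(s)
```

The point of this design is that repr() stays readable by default, and
only callers who need the unambiguous form pay for the conversion.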

>  Great!  I'll take a look tomorrow or Friday.
>

Thank you. I'm looking forward to your feedback.

Characters defined as Zs::
---------------------------------------------------------
0x20 SPACE
0xa0 NO-BREAK SPACE
0x1680 OGHAM SPACE MARK
0x2000 EN QUAD
0x2001 EM QUAD
0x2002 EN SPACE
0x2003 EM SPACE
0x2004 THREE-PER-EM SPACE
0x2005 FOUR-PER-EM SPACE
0x2006 SIX-PER-EM SPACE
0x2007 FIGURE SPACE
0x2008 PUNCTUATION SPACE
0x2009 THIN SPACE
0x200a HAIR SPACE
0x200b ZERO WIDTH SPACE
0x202f NARROW NO-BREAK SPACE
0x205f MEDIUM MATHEMATICAL SPACE
0x3000 IDEOGRAPHIC SPACE
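
A list like the one above can be regenerated with the unicodedata
module. This is a sketch; zs_characters is a name made up for this
mail. The result depends on the version of the Unicode database that
ships with the interpreter, so it may not match the list above exactly:

```python
import sys
import unicodedata

def zs_characters():
    """Return (code point, name) pairs for every character in Zs."""
    result = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == 'Zs':
            result.append((cp, unicodedata.name(ch, '<unnamed>')))
    return result

for cp, name in zs_characters():
    print('0x%x %s' % (cp, name))
```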