[Python-3000] Displaying strings containing unicode escapes
Stephen J. Turnbull
stephen at xemacs.org
Tue Apr 29 20:01:50 CEST 2008
atsuo ishimoto writes:
> 2008/4/17 Stephen J. Turnbull <stephen at xemacs.org>:
> > How about choosing a standard Python repertoire (based on the Unicode
> > standard, of course) of which characters get a graphic repr and which
> > ones get \u-escaped, and have a post-hook for repr which gets passed
> > the string repr proposes to print out?
>
> Will the standard repertoire exclude Cyrillic or full-width ASCII?
"Exclude"? Nothing is "excluded". In my proposal, compatibility
(full-width) "ASCII" will be \u-escaped by repr, yes. Cyrillic
characters that can be confused with ASCII characters will be
\u-escaped, yes.
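
A minimal sketch of what such a repertoire rule could look like. The
names `CONFUSABLE_CYRILLIC`, `needs_escape` and `standard_repr` are
hypothetical, and the confusable list is only an illustrative guess,
not a proposed standard:

```python
import unicodedata

# Hypothetical, incomplete list of Cyrillic letters that resemble ASCII
# letters. Written as escapes so the source itself is unambiguous.
CONFUSABLE_CYRILLIC = set(
    "\u0410\u0412\u0415\u041a\u041c\u041d\u041e\u0420\u0421\u0422\u0425"
    "\u0430\u0435\u043e\u0440\u0441\u0443\u0445"
)

def needs_escape(ch):
    """Return True if ch falls outside this sketch's graphic repertoire."""
    if ch in CONFUSABLE_CYRILLIC:
        return True
    # Full-width compatibility forms carry a "<wide>" decomposition tag.
    if unicodedata.decomposition(ch).startswith("<wide>"):
        return True
    return not ch.isprintable()

def standard_repr(s):
    """Quote s, escaping every character outside the repertoire."""
    out = []
    for ch in s:
        if ch == "\\":
            out.append("\\\\")
        elif ch == "'":
            out.append("\\'")
        elif needs_escape(ch):
            cp = ord(ch)
            out.append("\\u%04x" % cp if cp <= 0xFFFF else "\\U%08x" % cp)
        else:
            out.append(ch)
    return "'" + "".join(out) + "'"
```

Under these assumed rules, kanji and non-confusable Cyrillic letters
print as themselves, while full-width "A" (U+FF21) and Cyrillic "a"
(U+0430) come out as \uff21 and \u0430.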
> If so, I (Japanese) will disable the hook
[[ You have the way my proposal works backwards. The post-hook
may be provided by the user to convert the unambiguous standard
representation into one the user prefers. Python may or may not
provide a library of convenient functions. ]]
> because full-width ASCII characters are not ambiguous to me.
That depends on the font(s) you use. Many fonts used with word
processors make very little distinction and leave it up to the layout
manager to create enough space. If you use different fonts for ASCII
and JIS, as will be the case in many environments (e.g., an Emacs shell
or python-mode buffer), who knows which will look wider?
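
A user who does find full-width ASCII unambiguous could supply a
post-hook along the following lines. This is only a sketch: how the
hook would be registered with repr is left open, and `make_post_hook`
and `is_fullwidth_ascii` are hypothetical names:

```python
import re

# Match an escaped backslash first so that a literal backslash followed
# by "uXXXX" is never mistaken for a \u escape.
_ESCAPE = re.compile(r"\\\\|\\u([0-9a-fA-F]{4})")

def make_post_hook(allowed):
    """Build a hook that restores only the characters `allowed` accepts."""
    def hook(text):
        def restore(match):
            if match.group(1) is None:      # escaped backslash: keep as is
                return match.group(0)
            ch = chr(int(match.group(1), 16))
            return ch if allowed(ch) else match.group(0)
        return _ESCAPE.sub(restore, text)
    return hook

def is_fullwidth_ascii(ch):
    """True for the full-width ASCII variants U+FF01..U+FF5E."""
    return 0xFF01 <= ord(ch) <= 0xFF5E

# The hook receives the unambiguous standard representation and turns
# it into the one this particular user prefers.
hook = make_post_hook(is_fullwidth_ascii)
```

Here the standard output stays the default; the hook only changes what
this one user sees, and the confusable Cyrillic escapes are left alone.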
> I think ambiguity will occur when we meet with unfamiliar
> characters. So choosing a repertoire everybody can accept will be
> difficult.
We already know that.
The point is that repr is like quoted-printable encoding in MIME. It
should be mostly readable for programmers. There will be situations
where it's a horrible choice from the point of view of readability,
but the considerations of (1) consistency and (2) removal of ambiguity
should be given precedence IMO.
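
The quoted-printable analogy can be seen with the standard library. In
this small illustration (the sample string is arbitrary), the ASCII
part stays readable and only the non-ASCII character is escaped, in
both encodings:

```python
import quopri

text = "caf\u00e9"

# Quoted-printable: ASCII bytes pass through, other bytes become =XX.
qp = quopri.encodestring(text.encode("utf-8"))

# ascii() is the all-escaping extreme of repr: consistent and
# unambiguous, at some cost in readability.
escaped = ascii(text)
```

In both cases the result is unambiguous on any display, and decoding
recovers the original text exactly.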