[Python-3000] Displaying strings containing unicode escapes
"Martin v. Löwis"
martin at v.loewis.de
Thu Apr 17 23:40:17 CEST 2008
> I do think we should use some kind of Unicode-standard-endorsed
> definition of "printable" (as long as it excludes all ASCII escapes),
I think
unicodedata.category(c)[0] != "C"
is fairly close. That excludes control characters (Cc), format
characters (Cf), surrogates (Cs), private-use (Co) and unassigned
characters (Cn). We should then also escape \, ' and ", following
the traditional algorithm.
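A minimal sketch of that rule, assuming Python 3. The helper names is_printable and escape_for_repr are hypothetical (not from this thread or the standard library), and the choice of \x, \u and \U forms for escaped characters is likewise an assumption rather than a settled design:

```python
import unicodedata

def is_printable(c):
    # Major category "C" covers Cc, Cf, Cs, Co and Cn.
    return unicodedata.category(c)[0] != "C"

def escape_for_repr(s):
    # Hypothetical helper: escape backslash and both quote characters,
    # keep printable characters, and hex-escape everything else.
    out = []
    for c in s:
        if c in "\\'\"":
            out.append("\\" + c)
        elif is_printable(c):
            out.append(c)
        else:
            cp = ord(c)
            if cp <= 0xFF:
                out.append("\\x%02x" % cp)
            elif cp <= 0xFFFF:
                out.append("\\u%04x" % cp)
            else:
                out.append("\\U%08x" % cp)
    return "".join(out)
```

With this sketch, letters such as "ö" pass through unchanged, while a newline or a private-use character comes out escaped.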
Printable then would be all letters, numbers, punctuation, symbols,
but also marks (e.g. COMBINING TILDE, COMBINING RIGHT HARPOON ABOVE) and
separators (SPACE, NO-BREAK SPACE, THREE-PER-EM SPACE, LINE SEPARATOR,
PARAGRAPH SEPARATOR). It might be reasonable to also exclude line
separators (Zl) and paragraph separators (Zp); each of those categories
contains only one character.
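To see how the characters named above are classified, and what the stricter variant would look like, here is a small sketch assuming Python 3. The name is_printable_strict is hypothetical and only illustrates the optional Zl/Zp exclusion:

```python
import unicodedata

def is_printable_strict(c):
    # Hypothetical variant: exclude all "C" categories and also
    # LINE SEPARATOR (Zl) and PARAGRAPH SEPARATOR (Zp).
    cat = unicodedata.category(c)
    return cat[0] != "C" and cat not in ("Zl", "Zp")

samples = (
    "\N{SPACE}",
    "\N{NO-BREAK SPACE}",
    "\N{THREE-PER-EM SPACE}",
    "\N{LINE SEPARATOR}",
    "\N{PARAGRAPH SEPARATOR}",
    "\N{COMBINING RIGHT HARPOON ABOVE}",
)

# Print code point, name, category and the strict-printable verdict.
for c in samples:
    print("U+%04X %-32s %s %s" % (
        ord(c), unicodedata.name(c),
        unicodedata.category(c), is_printable_strict(c)))
```

The space characters report Zs and the combining mark reports Mn, so all of them stay printable; only the Zl and Zp characters are rejected by the stricter test.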
Regards,
Martin