[Python-3000] Displaying strings containing unicode escapes

"Martin v. Löwis" martin at v.loewis.de
Thu Apr 17 23:40:17 CEST 2008


> I do think we should use some kind of Unicode-standard-endorsed
> definition of "printable" (as long as it excludes all ASCII escapes),

I think

  unicodedata.category(c)[0] != "C"

is fairly close. That excludes control characters (Cc), format
characters (Cf), surrogates (Cs), private-use (Co) and unassigned
characters (Cn). We should then also escape \, ' and ", following
the traditional algorithm.
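
As a minimal sketch of the rule above (not the actual repr() implementation), the category test and the extra escapes could be combined as follows. The names is_printable, escape_char and escape_string are hypothetical. As a simplification, every non-printable character is written as a numeric escape; this sketch does not special-case \n, \t or \r.

```python
import unicodedata

def is_printable(c):
    # Hypothetical helper: printable unless the general category
    # starts with "C" (Cc, Cf, Cs, Co, Cn).
    return unicodedata.category(c)[0] != "C"

def escape_char(c):
    # Backslash and both quote characters are always escaped.
    if c in "\\'\"":
        return "\\" + c
    if is_printable(c):
        return c
    # Simplification: numeric escapes only, sized by code point.
    cp = ord(c)
    if cp <= 0xFF:
        return "\\x%02x" % cp
    if cp <= 0xFFFF:
        return "\\u%04x" % cp
    return "\\U%08x" % cp

def escape_string(s):
    return "".join(escape_char(c) for c in s)
```

With this sketch, escape_string("caf\u00e9\n") keeps the LATIN SMALL LETTER E WITH ACUTE as is and escapes only the newline.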

Printable would then be all letters, numbers, punctuation, and symbols,
but also marks (e.g. COMBINING TILDE, COMBINING RIGHT HARPOON ABOVE) and
separators (SPACE, NO-BREAK SPACE, THREE-PER-EM SPACE, LINE SEPARATOR,
PARAGRAPH SEPARATOR). It might be reasonable to also exclude the line
separator (Zl) and paragraph separator (Zp) categories, each of which
contains only one character.
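
To check the categories involved, here is a small sketch of the stricter variant that also excludes Zl and Zp. The name is_printable_strict is hypothetical. The loop prints the general category of some marks and separators; note that plain TILDE (U+007E) is a symbol (Sm), while COMBINING TILDE (U+0303) is a mark (Mn).

```python
import unicodedata

def is_printable_strict(c):
    # Hypothetical variant: excludes all "C" categories and, in
    # addition, LINE SEPARATOR (Zl) and PARAGRAPH SEPARATOR (Zp).
    cat = unicodedata.category(c)
    return cat[0] != "C" and cat not in ("Zl", "Zp")

samples = ["~", "\u0303", "\u20d1", " ", "\u00a0", "\u2004", "\u2028", "\u2029"]
for c in samples:
    # Code point, character name, general category, printable?
    print("U+%04X %-30s %s %s" % (
        ord(c), unicodedata.name(c), unicodedata.category(c),
        is_printable_strict(c)))
```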

Regards,
Martin

