[issue9198] Should repr() print unicode characters outside the BMP?
Amaury Forgeot d'Arc
report at bugs.python.org
Thu Jul 8 10:53:07 CEST 2010
New submission from Amaury Forgeot d'Arc <amauryfa at gmail.com>:
On wide unicode builds, '\U00010000'.isprintable() returns True, and repr() returns the character unmodified.
Is this good behavior, given that very few fonts can display this character?
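The behavior can be checked with a short script. This is only a sketch: it assumes an interpreter where a non-BMP character is a single code point (a wide build, or any CPython from 3.3 on), and the variable name `ch` is just for illustration.

```python
import unicodedata

# U+10000 is the first code point outside the Basic Multilingual Plane.
ch = "\U00010000"

# General category of the character in the Unicode character database.
print(unicodedata.category(ch))

# str.isprintable() reports whether repr() would leave the character as-is.
print(ch.isprintable())

# True when repr() returned the character unmodified between the quotes.
print(repr(ch) == "'" + ch + "'")

# ascii() escapes every non-ASCII character, whatever isprintable() says.
print(ascii(ch))
```

ascii() is shown only for comparison: it is the existing way to get an escaped form, independent of what repr() decides.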
Marc-Andre Lemburg wrote:
> The "printable" property is a Python invention, not a Unicode property,
> so we do have some freedom in deciding what is printable and what
> is not.
The current implementation follows this rule: """All the characters except those characters defined in the Unicode character database as following categories are considered printable.
* Cc (Other, Control)
* Cf (Other, Format)
* Cs (Other, Surrogate)
* Co (Other, Private Use)
* Cn (Other, Not Assigned)
* Zl (Separator, Line; '\u2028', LINE SEPARATOR)
* Zp (Separator, Paragraph; '\u2029', PARAGRAPH SEPARATOR)
* Zs (Separator, Space) other than ASCII space ('\x20')."""
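The rule quoted above can be approximated in pure Python with the unicodedata module. This is a rough sketch, not the actual C implementation; the names `NON_PRINTABLE_CATEGORIES` and `is_printable_sketch` are made up for this example.

```python
import unicodedata

# Categories that the quoted rule treats as non-printable.
NON_PRINTABLE_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp", "Zs"}


def is_printable_sketch(ch):
    """Approximate the quoted rule for a single character."""
    # ASCII space is in category Zs but is explicitly exempted.
    if ch == " ":
        return True
    return unicodedata.category(ch) not in NON_PRINTABLE_CATEGORIES


# A few spot checks against the built-in method.
for sample in ["a", " ", "\n", "\u00a0", "\u2028", "\U00010000"]:
    print(ascii(sample), is_printable_sketch(sample), sample.isprintable())
```

Note that nothing in the rule depends on the code point value, which is why a non-BMP letter such as '\U00010000' comes out as printable.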
We could also arbitrarily exclude all the non-BMP chars.
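To make that option concrete, here is a hypothetical sketch of what excluding non-BMP characters could look like from the Python side. Both helper names are invented for illustration, and the escaping helper only handles the non-BMP case; it is not a replacement for repr().

```python
def is_printable_bmp_only(ch):
    """Hypothetical variant: printable only if inside the BMP."""
    return ord(ch) <= 0xFFFF and ch.isprintable()


def escape_non_bmp(s):
    """Replace each non-BMP character with a \\UXXXXXXXX escape.

    Characters inside the BMP are left untouched.
    """
    return "".join(
        ch if ord(ch) <= 0xFFFF else "\\U%08x" % ord(ch)
        for ch in s
    )


print(is_printable_bmp_only("a"))
print(is_printable_bmp_only("\U00010000"))
print(escape_non_bmp("a\U00010000b"))
```

The cost of such a rule is that users whose fonts do cover these characters would see escapes instead of readable text.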
nosy: amaury.forgeotdarc, ezio.melotti, lemburg
title: Should repr() print unicode characters outside the BMP?
versions: Python 3.2
Python tracker <report at bugs.python.org>