[issue5127] Use Py_UCS4 instead of Py_UNICODE in unicodectype.c

Marc-Andre Lemburg report at bugs.python.org
Thu Jul 8 09:50:49 CEST 2010


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Ezio Melotti wrote:
> 
> Ezio Melotti <ezio.melotti at gmail.com> added the comment:
> 
> Given that '\U00010000'.isprintable() returns True, I would say yes. If someone needs to print this char and has an appropriate font to do it, I don't see why it shouldn't work.

Note that Python 3 will send printable code points as-is to the console,
so whether or not a code point is considered printable should take into
account how commonly available fonts are that can display it.
Otherwise, a user would just see a square box instead of
the much more useful escape sequence.
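
To make the pass-through behaviour concrete, here is a small sketch
for a current CPython 3 interpreter. The helper name
repr_passes_through is made up for this illustration; it only
compares the output of repr() with the raw character.

```python
def repr_passes_through(ch: str) -> bool:
    """True if repr() emits the character itself rather than an escape.

    Hypothetical helper for illustration only; do not pass a quote
    character, since repr() switches quoting style for those.
    """
    return repr(ch) == f"'{ch}'"


# Print only code point numbers and flags, never the character itself,
# so the output does not depend on the console font or encoding.
for ch in ["a", "\x07", "\u00a0", "\U00010000"]:
    print(f"U+{ord(ch):04X}", ch.isprintable(), repr_passes_through(ch))
```

For each of these samples the two flags should agree: whatever
isprintable() accepts is what repr() writes out unescaped.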

The "printable" property is a Python invention, not a Unicode property,
so we do have some freedom in deciding what is printable and what
is not.
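
As a rough illustration of how the property is derived: the
documentation for str.isprintable() describes nonprintable characters
as those in the Unicode general categories "Other" and "Separator",
with ASCII space as the one exception. The function below is only an
approximation written from that description (the name looks_printable
is made up), not the code in unicodectype.c.

```python
import unicodedata


def looks_printable(ch: str) -> bool:
    """Approximate str.isprintable() from the general category.

    Assumption: nonprintable means category C* (Other) or Z* (Separator),
    except for ASCII space, as the str.isprintable() docs describe.
    """
    return ch == " " or unicodedata.category(ch)[0] not in ("C", "Z")


for ch in ["a", " ", "\n", "\u00a0", "\U00010000"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch),
          looks_printable(ch), ch.isprintable())
```

Since the rule is ours, changing it (for example, to exclude non-BMP
code points) would not conflict with the Unicode standard.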

In recent years the situation has just started clearing up
for fonts covering the assigned BMP range, mostly due to Microsoft actively
working to get their fonts to cover the Unicode 2.0 assigned code points
(BMP only):

    http://support.microsoft.com/kb/287247

The only font set I know of that tries to go beyond BMP is this
shareware one:

    http://code2000.net/

Most other fonts just cover small parts of the Unicode assigned
code point ranges:

    http://unicode.org/resources/fonts.html
    http://www.wazu.jp/
    http://www.unifont.org/fontguide/

I suppose that in a few years we'll see OSes and GUIs mix and match the
available fonts to display Unicode code points.

Given the font situation, I don't think we should have repr()
pass through printable non-BMP code points as-is. Perhaps we shouldn't
give those code points the printable property to begin with.
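
A minimal sketch of what the first option could look like at the
Python level, assuming we simply post-process the output of repr().
The name bmp_only_repr is hypothetical; this is not a proposed patch,
just an illustration of the intended result.

```python
import ast


def bmp_only_repr(s: str) -> str:
    """Like repr(), but escape every code point above U+FFFF.

    Hypothetical helper: repr() has already escaped the nonprintable
    characters, so anything non-BMP that is left was passed through
    because it is considered printable.
    """
    out = []
    for ch in repr(s):
        if ord(ch) > 0xFFFF:
            out.append(f"\\U{ord(ch):08x}")  # same form repr() uses
        else:
            out.append(ch)
    return "".join(out)


text = "a\U00010000"
escaped = bmp_only_repr(text)
print(escaped)
# The escaped form is still a valid literal for the same string.
print(ast.literal_eval(escaped) == text)
```

The second option, dropping the printable property for those code
points, would give the same repr() output without any extra step, but
it would also change what isprintable() returns.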

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5127>
_______________________________________

