[Python-ideas] Support Unicode code point notation
Terry Reedy
tjreedy at udel.edu
Fri Aug 2 00:09:09 CEST 2013
On 8/1/2013 1:14 PM, Bruce Leban wrote:
> I wonder if this should also support the special labels for characters
> without names:
>
> control-NNNN
> reserved-NNNN
> noncharacter-NNNN
> private-use-NNNN
> surrogate-NNNN
>
> see p. 138 of http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf
>
> I would think that unicodedata.name should not
> return these, but perhaps unicodedata.lookup should accept them. Note
> that the doc says that these are frequently displayed enclosed in <>, so
> perhaps
>
> unicodedata.lookup('U+0001')
> == unicodedata.lookup('control-0001')
> == unicodedata.lookup('<control-0001>')
> == '\x01'
That is a lot of added complication of both doc and code for what seems
like little gain. Why would someone write 'control-' instead of 'U+'?
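
For reference, the quoted proposal could be approximated outside the standard library, leaving unicodedata itself unchanged. The following is only a minimal sketch: lookup_with_labels and _LABEL_RE are hypothetical names, not part of unicodedata, and the sketch assumes the labels are spelled exactly as listed in the quote and followed by 4 to 6 hex digits. It does not check that the label matches the code point's actual category, so 'control-0041' would be accepted.

```python
import re
import unicodedata

# Hypothetical pattern for 'U+NNNN' and the code point labels listed above.
# Assumption: 4 to 6 hex digits follow the prefix.
_LABEL_RE = re.compile(
    r"(?:U\+|(?:control|reserved|noncharacter|private-use|surrogate)-)"
    r"([0-9A-Fa-f]{4,6})"
)


def lookup_with_labels(name):
    """Hypothetical wrapper around unicodedata.lookup.

    Accepts 'U+NNNN', 'label-NNNN', and the same forms enclosed in <>,
    and falls back to unicodedata.lookup for ordinary character names.
    """
    text = name
    # Strip the enclosing angle brackets only when both are present.
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    match = _LABEL_RE.fullmatch(text)
    if match:
        # chr raises ValueError for values above 0x10FFFF.
        return chr(int(match.group(1), 16))
    # Raises KeyError if the name is not a known character name.
    return unicodedata.lookup(name)
```

With this sketch, lookup_with_labels('U+0001'), lookup_with_labels('control-0001') and lookup_with_labels('<control-0001>') all return the same one-character string, which is the equivalence the quoted message asks for.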
--
Terry Jan Reedy