[Python-ideas] Support Unicode code point notation

Nick Coghlan ncoghlan at gmail.com
Fri Aug 2 01:20:46 CEST 2013


On 2 Aug 2013 09:00, "Alexander Belopolsky" <alexander.belopolsky at gmail.com>
wrote:
>
>
> On Thu, Aug 1, 2013 at 6:09 PM, Terry Reedy <tjreedy at udel.edu> wrote:
>>
>> Why would someone write 'control-' instead of 'U+'?
>
>
> Because this is the recommended way to form the code-point labels:
>
> "For each code point type without character names, code point labels are
> constructed by using a lowercase prefix derived from the code point type,
> followed by a hyphen-minus and then a 4- to 6-digit hexadecimal
> representation of the code point."
>
> "To avoid any possible confusion with actual, non-null Name property
> values, constructed Unicode code point labels are often displayed between
> angle brackets: <control-0009>, <noncharacter-FFFF>, and so on. This
> convention is used consistently in the data files for the Unicode Character
> Database."
>
> "A constructed code point label is distinguished from the designation of
> the code point itself (for example, “U+0009” or “U+FFFF”), which is also a
> unique identifier, as described in Appendix A, Notational Conventions."
> <http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf>
>
> I would rather see unicodedata.lookup() extended to accept code-point
> labels than "the designation of the code point itself." The same applies
> to the \N escape: I would rather see \N{control-NNNN} or
> \N{surrogate-NNNN} in string literals than some mysterious \N{U+NNNN}.

-1. I'd never even heard of code point labels before this thread, while the
"U+" notation is incredibly common.
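
For anyone else who has not met the two notations, here is a rough sketch of
how they differ. Nothing below exists in the stdlib today: code_point_label()
and from_u_plus() are hypothetical helper names, and the prefix choice is my
reading of the chapter 4 text quoted above, so treat it as an illustration
rather than a reference implementation.

```python
import unicodedata


def is_noncharacter(cp):
    # Noncharacters are U+FDD0..U+FDEF plus the last two code points
    # of every plane (U+xxFFFE and U+xxFFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def code_point_label(cp):
    """Hypothetical helper: build a label such as '<control-0009>'."""
    category = unicodedata.category(chr(cp))
    if category == "Cc":
        prefix = "control"
    elif category == "Cs":
        prefix = "surrogate"
    elif category == "Co":
        prefix = "private-use"
    elif is_noncharacter(cp):
        prefix = "noncharacter"
    elif category == "Cn":
        prefix = "reserved"
    else:
        # Assumption: anything else is treated as having a real name,
        # so no label is constructed for it.
        raise ValueError("U+%04X does not take a code point label" % cp)
    # 4- to 6-digit uppercase hex, wrapped in angle brackets.
    return "<%s-%04X>" % (prefix, cp)


def from_u_plus(designation):
    """Hypothetical helper: turn 'U+0009' into the one-character string."""
    if not designation.upper().startswith("U+"):
        raise ValueError("expected U+XXXX, got %r" % designation)
    return chr(int(designation[2:], 16))
```

With these, code_point_label(0x0009) gives '<control-0009>' and
from_u_plus("U+0009") gives the tab character. The label needs the code
point's type as well as its number, while the "U+" form needs only the
number.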

Cheers,
Nick.
