[Python-ideas] Support Unicode code point notation

Fri Aug 2 03:15:42 CEST 2013

Alexander Belopolsky writes:
 > On Thu, Aug 1, 2013 at 8:04 PM, Bruce Leban <bruce at leapyear.org> wrote:

 >> I'm +/-0 on having 'control-' and 'reserved-' etc. simply being
 >> different spellings of 'U+' so that '\N{control-0021}' == '\N{U+0021}'
 >> == '\x21' == '!' even though that isn't a control character.

 > This misses the point of adding the code point type prefix.

Not really.  That would just pass the responsibility for enforcing
consistency to linters, instead of the translator.  You can't just
make this a syntax error because a code point may be reserved one
Python version and a letter in another, depending on which versions of
the Unicode tables are being used by those versions of Python.  That
would conflict with Unicode itself, which says that unknown code
points must be treated as characters.  This is way too fragile to be
allowed to cause syntax errors.

 > If you fat-finger \N{control-0021} instead of intended \N{control-0012}
 > you would want a quick syntax error rather than an obscure bug.
 > Similarly, when you are reading someone else's code, you don't want
 > to consult the code table every time you see \N{control-NNNN} to
 > assure that this is really a control character rather than a
 > surrogate- or private-use- one.

+0 on Bruce's idea, -1 on syntax errors

It might be on rare occasions be useful to be strict about fixed-for-
all-time types like surrogate and private use.  (But even those
weren't fixed for all time in the past!)  Really, this is an editor or
linter function.