[Python-ideas] Support Unicode code point notation

Stephen J. Turnbull stephen at xemacs.org
Fri Aug 2 09:36:57 CEST 2013


Alexander Belopolsky writes:
 > On Thu, Aug 1, 2013 at 9:15 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
 >> Alexander Belopolsky writes:

 >>> This misses the point of adding the code point type prefix.

 >> Not really.  That would just pass the responsibility for enforcing
 >> consistency to linters, instead of the translator.

 > I have not seen a linter yet that would suggest that "\x41" should be
 > written as "A".

Irrelevant.  All I suggest the linter do is the "is \N{control-0x21}
consistent in the sense that U+0021 is a control character?" check.
That's what you said is the point.  I just want that check done
outside of the compiler.

 >> You can't just make this a syntax error because a code point may
 >> be reserved one Python version and a letter in another, depending
 >> on which versions of the Unicode tables are being used by those
 >> versions of Python.

 > That's true, but why would you write \N{reserved-NNNN} instead of
 > \uNNNN to begin with?

I wouldn't.  The problem isn't writing "\N{reserved-50000}".  It's
the other way around: I want to *write* "\N{control-50000}" which
expresses my intent in Python 3.5 and not have it blow up in Python
3.4 which uses an older UCD where U+50000 is unassigned.

 > With the possible exception or reserved-, on a rare occasion when you
 > want to be explicit about the character type, it is useful to be
 > strict.

As explained above, strictness is not backward compatible with older
versions of the UCD that might be in use in older versions of Python.



More information about the Python-ideas mailing list