[Python-3000] Unicode IDs -- why NFC? Why allow ligatures?
Stephen J. Turnbull
stephen at xemacs.org
Wed Jun 6 10:26:33 CEST 2007
Rauli Ruohonen writes:
> There are some cases where users might in the future want to make
> a distinction between "compatibility" characters, such as these:
> http://en.wikipedia.org/wiki/Mathematical_alphanumeric_symbols
I don't think they belong in identifiers in a general-purpose
programming language, though their usefulness for mathematical
typesetting is obvious. I think programs should be verbalizable,
unlike math, where most of the text is not intended to correspond to
any reality but is purely syntactic transformation.
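
For what it's worth, those symbols all carry compatibility mappings to
the ordinary letters and digits they resemble, so a compatibility
normalization erases the very distinction Rauli wants to preserve. A
quick sketch with the stdlib unicodedata module:

    >>> import unicodedata
    >>> bold_a = "\N{MATHEMATICAL BOLD CAPITAL A}"   # U+1D400
    >>> unicodedata.normalize("NFC", bold_a) == bold_a
    True
    >>> unicodedata.normalize("NFKC", bold_a) == "A"
    True

NFC keeps the bold A distinct; NFKC folds it to a plain 'A'.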
> For this reason I think that compatibility transformation, if any,
> should only be applied to characters where there's a practical
> reason to do so, and for other cases punting (=syntax error) is
> safest.
"Banzai Python!" and all that, but even if Python is in use 10,000
years from now, I think compatibility characters will still be a
YAGNI. I admit that's a reasonable compromise, and allows future
extension without gratuitously making existing programs illegal; I
could live with it very easily (but I'd want those full-width ASCII
decomposed :-). I just feel it would be wiser to limit Python
identifiers to NFKC.
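
To make the full-width case concrete: NFC leaves full-width ASCII
letters alone, while NFKC decomposes them to their ordinary ASCII
counterparts. Something like this shows the difference:

    >>> import unicodedata
    >>> fw = "Ｐｙｔｈｏｎ"              # full-width Latin letters
    >>> unicodedata.normalize("NFC", fw) == fw
    True
    >>> unicodedata.normalize("NFKC", fw)
    'Python'

So under NFC the full-width spelling would be a distinct identifier,
while under NFKC it would simply be "Python".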
> I use two Japanese input methods (MS IME and scim/anthy), but only the
> latter one daily. When I type text that mixes Japanese and other
> [...]
> For code that uses a lot of Japanese this may not be convenient,
> but then you'd want to set your input method to use ASCII for ASCII
> anyway,
Both of those go a long way toward addressing the annoyance of syntax
errors in original code, but not in debug/maintenance mode, where you
type only a few characters of code at a time and typically enter them
from user mode.
> You have to go out of your way to type halfwidth katakana, and it
> isn't really useful in identifiers IMHO.
I agree, but then I don't work for the Japanese Social Security
Administration.